Model for classifying one large object based on small details that are on the object


I have a dataset of large images, each containing a single large object. The object has small details on it, and I want to classify it into one of 6 classes based on those details. I don't know exactly where the details are on the object, but they are visible. I'm looking for a model (possibly a vision transformer) that can help me classify it. I've tried ResNet50, and the results were not that good.
Any suggestions on which model I should use?

I'm afraid you need to be a little more specific. It sounds like a straightforward multiclass classification problem to me, but that alone doesn't explain why it isn't working well. ResNet50 seems like a good place to start…

One source of issues could be that your images are significantly larger than the input size ResNet normally works with (typically 224×224), so the small details disappear when the images are resized down. Transformers might be better suited.
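
To make the resizing concern concrete, here is a rough sketch in plain Python. It assumes a 224-pixel model input, and the image and detail sizes are made-up numbers, since the actual sizes were not given. The helper names (`detail_size_after_resize`, `tile_origins`, `tile_boxes`) are hypothetical and not part of any library. The first function estimates how small a detail becomes after resizing; the other two compute overlapping crop boxes, which is one possible way to feed the image at full resolution instead.

```python
from typing import List, Tuple


def detail_size_after_resize(image_side: int, detail_side: int,
                             target_side: int = 224) -> float:
    """Approximate side length (in pixels) of a detail after the whole
    image is resized from image_side to target_side."""
    return detail_side * target_side / image_side


def tile_origins(length: int, tile: int, stride: int) -> List[int]:
    """Start offsets of tiles covering [0, length) along one axis.
    The last tile is shifted back so it ends flush with the image edge."""
    if length <= tile:
        # Image is smaller than one tile; the caller would need to pad.
        return [0]
    origins = list(range(0, length - tile + 1, stride))
    if origins[-1] != length - tile:
        origins.append(length - tile)
    return origins


def tile_boxes(width: int, height: int, tile: int = 224,
               stride: int = 192) -> List[Tuple[int, int, int, int]]:
    """Crop boxes as (left, top, right, bottom), row by row."""
    return [(x, y, x + tile, y + tile)
            for y in tile_origins(height, tile, stride)
            for x in tile_origins(width, tile, stride)]


# Assumed example: a 40 px detail in a 4000 px wide image.
shrunk = detail_size_after_resize(4000, 40)
print(f"detail after resize: about {shrunk:.2f} px")

# Assumed example: tiling a 500 x 224 image with overlapping 224 px tiles.
print(tile_boxes(500, 224))
```

With these assumed numbers the detail ends up only a couple of pixels wide, which would likely be too small for the network to use. If that matches your data, one option is to crop each box from the full-resolution image, run the classifier on every crop, and combine the per-crop predictions (for example by averaging them). Whether that helps depends on the actual image and detail sizes.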