Doubts about image transformations

Hello guys,

I have gone through the first two lectures of the Part 1 course, and I started a project on handwriting classification for Tamil alphabets. There are 156 classes and about 60,000 images of different handwriting samples (about 250-350 images per class). When I trained a ResNet and used techniques like unfreeze and lr_find, I got an accuracy of 91%, which is pretty cool. I wanted to increase the accuracy, so I looked at the alphabets where the errors occurred.

I found the alphabets which had errors; I have given two examples here. These are four classes of alphabets which were misclassified about 30 times each. As you can see, classes 1 and 2 are somewhat similar, and if some cropping happened to class 2, it would look like class 1. So my doubt is that maybe some cropping or some other kind of image transformation is happening. All the images in my dataset have different sizes. I have used this command to collect the images as a DataBunch:

data = ImageDataBunch.from_name_re(path, fnames, pat, size=224, bs=bs,
                                   ds_tfms=get_transforms(do_flip=False, max_rotate=0, max_zoom=1,
                                                          max_lighting=0, max_warp=0, flip_vert=False,
                                                          p_affine=1, p_lighting=1)
                                   ).normalize(imagenet_stats)
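
Is something like this the right way to check what a single image actually looks like after these transforms? A rough sketch of what I tried, using the data object from the call above (I'm not sure this is the canonical approach):

from fastai.vision import *
import matplotlib.pyplot as plt

# pull one item through the full pipeline (transforms + resize to 224),
# then open the same file directly from disk for comparison
fname = data.train_ds.x.items[0]
raw = open_image(fname)       # the image as stored on disk
seen = data.train_ds[0][0]    # what the model is given (random tfms re-run on each access)

fig, axes = plt.subplots(1, 2, figsize=(8, 4))
raw.show(ax=axes[0], title=f'raw {raw.size}')
seen.show(ax=axes[1], title=f'after pipeline {seen.size}')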

Any suggestions appreciated

Hi Jayaram Subramanian, hope all is well!

Having built a number of image classifiers using lessons 1 and 2, I can confirm your network is performing as designed. From my experience and testing, the more similar the images in your classes are, the greater the errors and confusion at inference.

I built a wristwatch classifier; it was fine for two classes, but the accuracy fell with every additional class I added.

I think you may find, if you look at your cropped images, that not very much is cropped out, but it's worth a look to be sure.

This link shows ways of examining cropped images.
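
Roughly, the approach shown there (a small sketch along the lines of the fastai v1 docs, assuming open_image and apply_tfms, so treat the exact calls as my approximation) is to apply your training transforms to one and the same image several times and plot the results, which makes any cropping easy to spot:

from fastai.vision import *
import matplotlib.pyplot as plt

tfms = get_transforms(do_flip=False, max_rotate=0, max_zoom=1,
                      max_lighting=0, max_warp=0)

img_path = fnames[0]   # any example file from your dataset

fig, axes = plt.subplots(2, 4, figsize=(12, 6))
for ax in axes.flatten():
    # apply_tfms re-resolves the random transforms on every call, so each
    # panel is one possible crop/resize the model could be shown
    open_image(img_path).apply_tfms(tfms[0], size=224).show(ax=ax)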


I believe people use techniques such as feature engineering to help make their classifiers more accurate.

Cheers mrfabulous1 :smiley: :smiley:

Thanks @mrfabulous1 for your reply. I looked at the docs, but I wasn't exactly clear on how to find out which transformations are happening to the images I send to the model.

I used data.show_batch(rows=8, figsize=(6,6)), similar to lessons 1 and 2, to look at the images. Is that the correct way to look at the images that are sent to the model?
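This is roughly what I ran for the training and validation sets separately (I am assuming ds_type=DatasetType.Train/Valid is the right way to pick which set show_batch draws from):

from fastai.vision import *

# training batches: the randomly transformed images the model is trained on
data.show_batch(rows=8, figsize=(6, 6), ds_type=DatasetType.Train)

# validation batches: the deterministic versions used at evaluation time
data.show_batch(rows=8, figsize=(6, 6), ds_type=DatasetType.Valid)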
I tried this a couple of times to look at different images from my dataset. I found that, on average, I myself could not identify 5/64 (7.8%) of the images because of the cropping/transformations. That puts my own accuracy at about 92%, so when I get 91% accuracy from the model I am not surprised.
I know for a fact that I can accurately predict close to 99% or more of the images when presented with them straight from the dataset, without transformations. If I am not providing the model with the correct data, I feel I cannot expect good accuracy from it. How do I feed my dataset to the model as accurately as possible, without any transformations?
data = ImageDataBunch.from_name_re(path, fnames, pat, size=112, bs=bs,
                                   ds_tfms=get_transforms(do_flip=False, max_rotate=0, max_zoom=1,
                                                          max_lighting=0, max_warp=0, flip_vert=False,
                                                          p_affine=1, p_lighting=1)
                                   ).normalize(imagenet_stats)
My images have sizes varying from 34x34 to 200x150 pixels (approximately). Is that a problem? Should I modify my dataset of images? Another question: how can I send my images to the model as accurately as possible if I don't want any transformations?
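
One thing I am planning to try (just a sketch, and I am assuming from_name_re forwards resize_method down to the transform step) is to drop the augmentation transforms entirely and resize by squishing instead of cropping, so that no part of the character is cut off:

from fastai.vision import *

# no augmentation transforms at all; only the resize to 112 is applied.
# ResizeMethod.SQUISH rescales the whole image to 112x112 (the aspect ratio
# may distort slightly) instead of cropping it, so nothing is lost.
data = (ImageDataBunch.from_name_re(path, fnames, pat, size=112, bs=bs,
                                    ds_tfms=None,
                                    resize_method=ResizeMethod.SQUISH)
        .normalize(imagenet_stats))

Would that be the recommended way, or is there a better one?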