I am working on the Kaggle Cervical Cancer competition and following this topic to do some transfer learning with ResNet.
When I used the VGG16 model with Keras and Theano (the part 1 environment), the train and test images loaded just fine. Now that I am using Keras 2.0 with TensorFlow as the backend (the part 2 environment), some of the train images seem to have corrupted EXIF data and are being ignored. Specifically, when I ran:
# precompute convolutional output
trn_conv_features_resnet = resnet_model_conv.predict_generator(trn_batches, trn_batches.samples)
I got a bunch of messages like:
/home/shi/anaconda3/lib/python3.6/site-packages/PIL/TiffImagePlugin.py:692: UserWarning: Possibly corrupt EXIF data. Expecting to read 524288 bytes but only got 0. Skipping tag 3
  "Skipping tag %s" % (size, len(data), tag))
/home/shi/anaconda3/lib/python3.6/site-packages/PIL/TiffImagePlugin.py:692: UserWarning: Possibly corrupt EXIF data. Expecting to read 393216 bytes but only got 0. Skipping tag 3
  "Skipping tag %s" % (size, len(data), tag))
/home/shi/anaconda3/lib/python3.6/site-packages/PIL/TiffImagePlugin.py:692: UserWarning: Possibly corrupt EXIF data. Expecting to read 33554432 bytes but only got 0. Skipping tag 4
I manually counted and found that 258 of the 1281 training images were ignored. My questions are:
- Is there a way to fix this EXIF corruption issue? I searched but had no luck so far…
- How can I figure out which 258 images were ignored? In the worst case, I could at least manually remove them from the training data.
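For the second question, one approach I can think of is to open each file individually and record which ones trigger a warning. Below is a minimal sketch of that idea, not something I have confirmed against this dataset. It assumes the warning is raised when Pillow opens or decodes the file, which may depend on the Pillow version. The names find_images_with_warnings and open_with_pillow, the extension list, and the "train" directory in the usage comment are all my own placeholders.

```python
import os
import warnings


def open_with_pillow(path):
    """Open and fully decode one image with Pillow (imported lazily)."""
    from PIL import Image

    with Image.open(path) as img:
        img.load()  # force decoding so problems surface here


def find_images_with_warnings(root_dir, opener=open_with_pillow,
                              extensions=(".jpg", ".jpeg", ".png")):
    """Return a list of (path, [messages]) for files that warn or fail.

    `opener` is any callable taking a file path; it defaults to the
    Pillow-based helper above. `extensions` must be a tuple.
    """
    flagged = []
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for name in sorted(filenames):
            if not name.lower().endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            # Record warnings for this file only, including repeated ones.
            with warnings.catch_warnings(record=True) as caught:
                warnings.simplefilter("always")
                try:
                    opener(path)
                except Exception as exc:
                    # The file could not be opened or decoded at all.
                    flagged.append((path, ["ERROR: %r" % (exc,)]))
                    continue
            if caught:
                flagged.append((path, [str(w.message) for w in caught]))
    return flagged


# Hypothetical usage ("train" is a placeholder directory name):
# for path, messages in find_images_with_warnings("train"):
#     print(path, messages)
```

If this works, the returned paths would be the candidates to inspect or remove. One thing I am unsure about: the message itself says a tag is being skipped, so a warning on a file may not by itself prove that the image was dropped from the batches.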
Thank you!