I’m trying to apply ResNet50 to the Dogs vs. Cats competition. Previously I used VGG19 and got a log loss of 0.088. I was expecting better results with ResNet, and for the most part I did: my validation accuracy was 0.9890. But when I submitted to the competition I got a loss of around 1.6, which is very strange. I’m reproducing the relevant bits of the code here. Please take a look and let me know if I’m missing something.
I’m really confused here. It would be great if you could shed some light on this.
Thanks in advance.
EDIT: I had an issue with matching filenames while generating predictions. After fixing it, I did get a slightly better result than VGG19 and moved up 5 places on the leaderboard. I improved things a bit more by adding batch norm to the average pooling model, since I saw it was overfitting. Now trying data augmentation, etc.
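For reference, a minimal sketch of the kind of head change described above (BatchNormalization added after the pooling layer). The feature shape, dropout rate, layer sizes and optimizer are assumptions, not the poster's exact code; it uses Keras 2-style functional API arguments.

```python
from keras.layers import BatchNormalization, Dense, Dropout, GlobalAveragePooling2D, Input
from keras.models import Model

# Head trained on precomputed ResNet50 conv features
# ((7, 7, 2048) is the usual shape for 224x224 channels-last input)
inp = Input(shape=(7, 7, 2048))
x = GlobalAveragePooling2D()(inp)
x = BatchNormalization()(x)          # added to fight overfitting
x = Dropout(0.5)(x)
out = Dense(2, activation='softmax')(x)

head = Model(inputs=inp, outputs=out)
head.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
```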
I was following the default notebook when it comes to clipping the values. I tried a different range of values, which gave me different results on the leaderboard. This one seems to be giving a good result so far. But I didn’t understand what you meant by clipping it “asymmetrically”.
0.05 from the bottom and 0.025 from the top. I had been going for the same margin from both directions, but I guess it could actually make sense to treat them separately.
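If it helps, this is roughly what that asymmetric clipping looks like before writing a submission; `preds` here is just a placeholder array standing in for the predicted dog probabilities.

```python
import numpy as np

preds = np.array([0.0, 0.37, 0.99, 1.0])   # placeholder predictions

# Clip 0.05 from the bottom and 0.025 from the top: log loss punishes
# confidently wrong answers (0.0 or 1.0) extremely hard.
preds_clipped = np.clip(preds, 0.05, 1 - 0.025)
```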
Also, I’m noticing something quite weird and I don’t know why it happens. When I try to use just the ResNet50 model to fine-tune and fit the data, the results are pretty far off. Here’s a sample of my code. Do let me know if you can see why this is happening.
Is it because I’ve set shuffle=False? If my understanding of shuffling is right, it shouldn’t make a difference when both my train and val batches have shuffle=False set, right?
There is something more serious amiss: you are getting a nan value, which can result from, for example, dividing by 0 or taking the log of 0. Once you get a nan value it can pollute all your other calculations.
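A quick numpy illustration of what’s meant here:

```python
import numpy as np

np.log(0.0)                    # -inf (with a RuntimeWarning)
0.0 * np.log(0.0)              # nan: 0 * -inf is undefined
np.mean([0.1, np.nan, 0.2])    # nan: one bad value pollutes the whole mean
```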
Exactly! I tried this in a new notebook and I’m still getting the exact same issue. I don’t have this issue when running VGG or the other custom convnets I’m using; it’s only ResNet run this way.
Can you try it on the redux dataset and see if you run into the same issue? That would help a lot.
No, not normal. Not shuffling is definitely an error, so you should at least fix that. I can’t tell what other errors you have without seeing all your code.
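For what it’s worth, the usual reason not shuffling hurts training is that flow_from_directory yields images sorted by class folder, so with shuffle=False each mini-batch can be all one class. A minimal sketch of the two generator setups (directory names, image size and batch size are placeholders):

```python
from keras.preprocessing.image import ImageDataGenerator

gen = ImageDataGenerator()

# For fitting: shuffle so every mini-batch mixes classes
trn_batches = gen.flow_from_directory('train/', target_size=(224, 224),
                                      batch_size=64, shuffle=True)

# For precomputing features or generating predictions: keep the order fixed
# so outputs line up with filenames and labels
val_batches = gen.flow_from_directory('valid/', target_size=(224, 224),
                                      batch_size=64, shuffle=False)
```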
Right. I thought that not shuffling could be the problem, so I created a new notebook replicating the same code but with shuffling, and I’m still getting nan as the loss.
I’ve just created a gist for you to go through. I can’t seem to find an obvious error, or else I’m making a very, very stupid mistake.
I did too! Did you have any luck in figuring out the issue? I ran a couple of print statements, and the type of the ResNet model was actually keras.engine.training.Model, whereas the type of the Vgg16BN model was keras.models.Sequential. I didn’t understand why that was the case, but I could not perform the pop() operation on the ResNet50 model.
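In case it’s useful, a small sketch of that difference: Sequential keeps a mutable stack of layers and exposes pop(), while the ResNet50 from keras.applications is a functional Model and doesn’t. Class names and the exact module path are from Keras 2 and vary a bit by version.

```python
from keras.applications.resnet50 import ResNet50
from keras.layers import Dense
from keras.models import Sequential

resnet = ResNet50(weights='imagenet')
print(type(resnet))        # e.g. <class 'keras.engine.training.Model'> - functional API
# resnet.pop()             # AttributeError: functional models have no pop()

seq = Sequential([Dense(10, input_shape=(5,)), Dense(2)])
seq.pop()                  # fine: Sequential models can drop their last layer
print(len(seq.layers))     # 1
```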
Please let me know if anybody else has had luck with ResNet.
Silly mistake: I forgot to precompute the features and then add the remaining fully connected/GlobalAveragePooling2D layers. Jeremy got amazing results on cats vs dogs, but I’m trying it on the Fisheries contest and getting really bad results. Has anybody had any luck getting good results with the ResNet50 model on the Fisheries contest?
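Roughly the precompute-then-head pattern being referred to, assuming Keras 2-style arguments; trn_data, trn_labels, val_data, val_labels and num_classes are placeholders, not the poster’s actual variables.

```python
from keras.applications.resnet50 import ResNet50
from keras.layers import Dense, GlobalAveragePooling2D, Input
from keras.models import Model

# 1) Run the convolutional part of ResNet50 once over the data and save the features
base = ResNet50(include_top=False, weights='imagenet', input_shape=(224, 224, 3))
trn_features = base.predict(trn_data, batch_size=64)   # trn_data/val_data: preprocessed image arrays
val_features = base.predict(val_data, batch_size=64)

# 2) Train only a small head on those precomputed features
inp = Input(shape=trn_features.shape[1:])
x = GlobalAveragePooling2D()(inp)
out = Dense(num_classes, activation='softmax')(x)

head = Model(inputs=inp, outputs=out)
head.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
head.fit(trn_features, trn_labels, validation_data=(val_features, val_labels))
```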
I had a problem with pop(), but model.layers.pop() worked. Similarly, model.add() doesn’t work; model.layers.append() works to some degree, but then it complains about the model not being built!
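If it helps, the usual workaround is not to mutate model.layers at all (popping or appending edits the list but never rewires the graph). Instead, take an existing layer’s output tensor and define a new Model around it. A sketch, assuming Keras 2-style arguments and a 2-class problem:

```python
from keras.applications.resnet50 import ResNet50
from keras.layers import Dense
from keras.models import Model

base = ResNet50(include_top=True, weights='imagenet')

# Output of the layer just before the final 1000-way ImageNet classifier
penultimate = base.layers[-2].output
new_output = Dense(2, activation='softmax')(penultimate)

model = Model(inputs=base.input, outputs=new_output)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
```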
I am getting the same error! I added shuffle=True and it doesn’t help. I’m trying ResNet on a dataset other than cats vs dogs and the same issue remains. Was anybody able to solve this?
I’m also getting nan loss values with the provided ResNet class. I’m working with the Invasive Species Kaggle competition, which only has two classes, so I’ve tried the following (both setups are sketched below):
- Setting the final dense layer to have 1 output with sigmoid activation and setting the loss function to binary_crossentropy
- Setting the final dense layer to have 2 outputs with softmax activation and setting the loss function to categorical_crossentropy or mean_squared_error (neither worked)
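For reference, a minimal sketch of those two head setups (Keras 2-style arguments; the pooling layer and optimizer are assumptions). The key practical difference is the label format each expects: a flat 0/1 vector for the sigmoid head, one-hot rows for the softmax head.

```python
from keras.applications.resnet50 import ResNet50
from keras.layers import Dense, GlobalAveragePooling2D
from keras.models import Model

base = ResNet50(include_top=False, weights='imagenet', input_shape=(224, 224, 3))
pooled = GlobalAveragePooling2D()(base.output)

# Option 1: single sigmoid output, labels are a vector of 0/1,
# loss is binary_crossentropy
out_1 = Dense(1, activation='sigmoid')(pooled)
model_1 = Model(inputs=base.input, outputs=out_1)
model_1.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Option 2: two softmax outputs, labels are one-hot rows like [1, 0] / [0, 1],
# loss is categorical_crossentropy
out_2 = Dense(2, activation='softmax')(pooled)
model_2 = Model(inputs=base.input, outputs=out_2)
model_2.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
```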