I am averaging predictions currently.
Which Kaggle competition is this?
Yeah, I’m interested to see what @jeremy has to say about the value of K-fold CV for neural networks in general and CNN architectures specifically.
It looks like @sermakarevich used it to great effect. I’m not sure exactly what his process was (e.g., did he use the same architecture or several, did he train each model the same way or vary it, etc.).
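For anyone following along, here is a minimal sketch of what K-fold averaging usually looks like, assuming one model is trained per fold and the per-model test predictions are then averaged. It is not necessarily the process @sermakarevich used; kfold_indices and average_predictions are hypothetical helper names, and the training step itself is left out.

```python
def kfold_indices(n_items, k):
    """Split range(n_items) into k (train_idxs, val_idxs) pairs."""
    # Fold i holds every k-th item starting at i, so folds are disjoint.
    folds = [list(range(i, n_items, k)) for i in range(k)]
    splits = []
    for i, val_idxs in enumerate(folds):
        train_idxs = sorted(
            j for f_i, fold in enumerate(folds) if f_i != i for j in fold
        )
        splits.append((train_idxs, val_idxs))
    return splits


def average_predictions(per_model_probs):
    """Average class probabilities across models.

    per_model_probs[m][i][c] is the probability that model m assigns
    to class c for test item i.
    """
    n_models = len(per_model_probs)
    n_items = len(per_model_probs[0])
    n_classes = len(per_model_probs[0][0])
    return [
        [sum(p[i][c] for p in per_model_probs) / n_models
         for c in range(n_classes)]
        for i in range(n_items)
    ]


# Toy usage: two "models" predicting two classes for one test image.
splits = kfold_indices(10, 5)
avg = average_predictions([[[0.8, 0.2]], [[0.6, 0.4]]])
```

Each fold still has a real validation set, so every model in the ensemble can be checked for over- or underfitting before its predictions are averaged in.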
I started with resnet34 (using validation set) and got to rank 60 (the model was decent). I then moved to resnext101_64 with the same data split, and the ranking improved to 22.
I am now trying to use this resnext101_64 to train using the full dataset and val_idxs = , but I get an accuracy of 1. and training loss of about 0.62. I am not quite sure I am comfortable with what I see.
This means I’m overfitting, right (given there is no validation set)? Am I right in assessing that this does not look as good as it should?
If you’re changing architecture, you need a proper validation set. Don’t use a validation set with just one image in!
My bad. I think I got it now.
So just to clarify, in order to utilize the complete training data in this case, if I train my model first using some data split and then recreate the data object and set val_idxs = and train again, that should give me better results?
Yes - you need to exactly replicate a full process that worked with the validation set in place. Otherwise you don’t know if you’re over- or underfitting!
These bigger models have far more parameters, so they overfit easily…
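A rough sketch of that workflow, assuming the training schedule is written down once as data and then replayed unchanged on all rows. Everything here is hypothetical: make_val_idxs, RECIPE, run_recipe and train_stage are illustrative names rather than fastai functions, and the learning rates and epoch counts are placeholders, not recommendations.

```python
import random


def make_val_idxs(n_items, val_pct=0.2, seed=42):
    """Pick a reproducible random subset of row indices for validation."""
    rng = random.Random(seed)
    idxs = list(range(n_items))
    rng.shuffle(idxs)
    return sorted(idxs[:int(n_items * val_pct)])


# The schedule that was tuned while a real validation set was in place.
# Placeholder values only.
RECIPE = [
    {"stage": "head only", "lr": 1e-2, "epochs": 2},
    {"stage": "unfrozen", "lr": 1e-3, "epochs": 3},
]


def run_recipe(train_stage, recipe, train_idxs):
    """Replay every stage of the recipe on the given training rows."""
    log = []
    for step in recipe:
        train_stage(train_idxs, step["lr"], step["epochs"])
        log.append((step["stage"], step["lr"], step["epochs"]))
    return log


def train_stage(train_idxs, lr, epochs):
    """Stand-in for the real training call; does nothing here."""


n = 100
val_idxs = make_val_idxs(n)
held_out = set(val_idxs)
train_idxs = [i for i in range(n) if i not in held_out]

# 1) Tune the recipe with validation; 2) replay it unchanged on all rows.
log_split = run_recipe(train_stage, RECIPE, train_idxs)
log_full = run_recipe(train_stage, RECIPE, list(range(n)))
```

The point of keeping the recipe as data is that the second run cannot drift from the first: same stages, same learning rates, same number of epochs, with only the set of training rows changed.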
Thanks! Makes much more sense now.
To confirm my understanding:
return data if sz>300 else data.resize(340, 'tmp')
If the image size (sz) is greater than 300, then we use the images as they are in the ‘train’, ‘valid’ and ‘test’ folders.
If the image size is 300 or less, we resize the images to size 340, place them in the /tmp folder, and use those.
Is my understanding correct?
And any request we make for a size less than 300 will simply use the saved images in the /tmp folder. So if you first train against size 224 and then 299, the transforms will simply be resizing the same set of saved images.
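The branch being discussed can be sketched in isolation like this. FakeData is a hypothetical stand-in for the real data object, used only so the example runs without the library; the "tmp/340" label is illustrative and not a claim about the actual folder layout.

```python
class FakeData:
    """Hypothetical stand-in that only records where images come from."""

    def __init__(self, source="original"):
        self.source = source

    def resize(self, targ, new_path):
        # The real call writes resized copies to disk; here we only
        # return an object labelled with the cache location.
        return FakeData(f"{new_path}/{targ}")


def pick_data(data, sz, threshold=300, cache_sz=340, cache_dir="tmp"):
    """Mirror of: return data if sz>300 else data.resize(340, 'tmp')."""
    return data if sz > threshold else data.resize(cache_sz, cache_dir)


# Both small sizes read from the same 340 cache; a larger size does not.
src_224 = pick_data(FakeData(), 224).source
src_299 = pick_data(FakeData(), 299).source
src_320 = pick_data(FakeData(), 320).source
```

So training at 224 and then at 299 reuses one set of pre-shrunk images, and only the per-batch transform changes.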
ok. Thank you.
I tried to use val_idxs= but I get an assertion error when trying to load the pretrained convnet.
@stathis I think it’s because you have precompute=True. It’s giving the error because you previously had activations generated on the validation set, but now the validation set isn’t there. I believe if you switch to precompute=False then it should work. In other words, training on all of the data doesn’t really work with precompute=True.
@jamesrequa Thanks. Actually that was just loading the pretrained model from scratch, so it should work. Pulling from GitHub solved it.
Yes, you are correct that it will work with precompute=True if you do it from scratch without ever having created those activations, but I don’t think you would ever want to do that, since you should never start by training with all of the data.
Is there anything similar for the ImageClassifierData.from_paths API? If I leave the validation folder empty, the code doesn’t like it very much.
I had the same question and found the answer here (How to train on the full dataset using ImageClassifierData.from_csv)
The transforms downsize the images to 224 or 299. Reading the jpgs and resizing is slow for big images, so resizing them all to 340 first saves time.
With ImageClassifierData.from_paths(), is there any alternative to copying the validation set back into the training path?