Thanks @sermakarevich. For my submission I just used the preds after applying np.exp to the output of learn.predict(is_test=True). For the ids of the test images that correspond to these predictions I used sample_submission.csv and took the id column.
I want to check that I have understood your response correctly: before running learn.predict() on the test set, test.index needs to be set as you described, and the ids entered in the submission file also need to be taken from that same index. Is this correct?
import numpy as np
import pandas as pd

# column order (breed names) comes from the sample submission
columns = pd.read_csv('../../../dogs/sample_submission.csv', index_col='id').columns
# TTA returns log-probabilities; exponentiate to get probabilities
test = pd.DataFrame(np.exp(learn.TTA(is_test=True)[0]))
test.columns = columns
# use the test loader's filenames (minus path and '.jpg') as the ids,
# so each row of predictions is matched to the right image
test.index = [i.split('.jpg')[0].split('/')[-1] for i in data.test_dl.dataset.fnames]
test.index.name = 'id'
test.to_csv('submission.csv')  # write the submission file
Huh. At this point the only other thing I'd recommend trying is to reinstall kg and recreate your Kaggle credentials. It definitely looks like a credentials issue.
I wish I could say I had some "secret formula", but honestly I have just been following the general steps we have learned so far. The only difference is that for dog breed I skip the unfreeze section entirely, so I am not using any differential learning rates, but I am using cycle_len and cycle_mult. I am also being a bit selective with my ensemble, which is simply a bagged average of a few models. So if you are taking the average of multiple models, make sure you check how each one is contributing to the overall score: if you just combine a bunch of them together from the start, you may not realize that one of them is actually hurting your score.
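A quick way to check each model's contribution is to score every model alone on a held-out set, then the blend, then the blend with each model left out. This is a sketch with random stand-in data (the model predictions and labels here are fabricated; in practice use each model's validation-set probabilities and the true labels):

```python
import numpy as np

def log_loss(y_true, probs, eps=1e-15):
    """Multiclass log loss, as Kaggle scores it."""
    p = np.clip(probs, eps, 1 - eps)
    return -np.mean(np.log(p[np.arange(len(y_true)), y_true]))

# Stand-in data: 3 models' validation probabilities over 5 classes.
rng = np.random.default_rng(0)
y_val = rng.integers(0, 5, size=200)
preds = []
for seed in (1, 2, 3):
    p = np.random.default_rng(seed).random((200, 5))
    preds.append(p / p.sum(axis=1, keepdims=True))  # normalise rows to probabilities

# Score each model alone, then the bagged average ...
for i, p in enumerate(preds):
    print(f"model {i}: {log_loss(y_val, p):.4f}")
avg = np.mean(preds, axis=0)
print(f"average of all: {log_loss(y_val, avg):.4f}")

# ... and check whether dropping any one model helps the blend.
for i in range(len(preds)):
    rest = np.mean([p for j, p in enumerate(preds) if j != i], axis=0)
    print(f"without model {i}: {log_loss(y_val, rest):.4f}")
```

If "without model i" beats "average of all", that model is hurting the ensemble and should be dropped.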
Anyway, there are now a few more people on the LB who are around 0.10 log loss, so I think there is still room for improvement (assuming they are not cheating).
Some of them look too solid for cheating, so yeah, it looks like <0.1 is possible.
I try to use these out-of-fold (OOF) predictions on the train set to decide what to blend and how. But I am stuck trying to make all the steps from the lessons work: I can't achieve any improvement using cycle_len and cycle_mult compared to just a step-by-step learning rate decrease.
There's a kernel that gets 0.07 IIRC. The trick is to simply use the ImageNet model directly without removing any layers at all, i.e. take advantage of the pre-trained fully connected layer too. In practice this is almost never useful, but because the competition data is a subset of ImageNet, you can use this trick here.
How would we implement this with fastai? If I understand correctly, all of the models we are using here have the original top layers switched out.
Thanks @sermakarevich; like you, though, I have no secret here.
My approach seems almost the same as @jamesrequa's.
I skipped unfreeze, and trained with all but one of the training data. I have tried 3 models so far, and submitted an ensemble of some of them; each of them scored around 0.19-0.20.
Other than that, I just followed what @jeremy taught in the lesson.
Check out the CIFAR 10 post I just posted - it shows how to use a model without changing it at all. You can use this approach to use a full imagenet model.
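Once you have the unmodified model's 1000-class ImageNet probabilities for the test set, the remaining step is to keep only the columns for the competition's breeds and renormalise. A sketch with a fabricated probability matrix and a placeholder index range (ImageNet's dog classes sit roughly at ids 151-268; the exact breed-to-id mapping is an assumption here and must be built from the ImageNet class list and the competition's breed names):

```python
import numpy as np

# Stand-in for the unmodified model's softmax output on 4 test images
# over ImageNet's 1000 classes (random here; use real predictions in practice).
rng = np.random.default_rng(0)
logits = rng.standard_normal((4, 1000))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

# Hypothetical: ImageNet class ids for the 120 competition breeds,
# ordered to match the submission's breed columns.
breed_idx = np.arange(151, 271)

# Keep only the breed columns and renormalise so each row sums to 1.
breed_probs = probs[:, breed_idx]
breed_probs = breed_probs / breed_probs.sum(axis=1, keepdims=True)
```

breed_probs then slots straight into the submission DataFrame in place of the retrained-head predictions.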