Do you guys switch to differential learning rates only after your normal learn.fit stops reducing validation loss or is it one of those things you need to get a feel for?
Anytime I'm finetuning lower layers with unfreeze, I always use differential learning rates. If I'm just training the top layers, I stick with one learning rate.
I might have phrased that poorly - it was my understanding that I should always start by training the top layers, then finetune lower layers with differential learning rates, and I was wondering how I decide that it's time to train the lower layers too. Is that not correct? Are you finetuning all layers from the start?
The truth is it really depends on the dataset/problem. But my own personal general rule of thumb is usually to initially train top layers followed by finetuning lower layers. For the top layers, I don't typically do more than 3 cycles or so, just enough to get the loss rates to a reasonable starting point before finetuning lower layers.
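For anyone new to the term: "differential learning rates" just means giving earlier layer groups smaller rates than the top group. Here's a small stand-alone sketch (plain Python, not fastai code; the 3-group split and factor of 3 mirror the lr/9, lr/3, lr spread used in the lessons, but both numbers are assumptions):

```python
def differential_lrs(base_lr, n_groups=3, factor=3.0):
    """Spread a base learning rate across layer groups.

    The last (top) group trains at base_lr; each earlier group trains
    `factor` times slower, so the earliest layers change the least.
    """
    return [base_lr / factor ** (n_groups - 1 - i) for i in range(n_groups)]

lrs = differential_lrs(1e-2)  # earliest group first, top group last
```

You'd then hand a list like this to the fit call in place of a single learning rate.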
That's what I wanted to know. Thank you James!
Woah. That's pretty remarkable. Can you share any general insights on fine-tuning while avoiding overfitting?
tbh everything you need to know is in the lesson notebooks that we have covered so far in this course. The more practice you get training different models and observing how the loss rates change, the better intuition you will have. For finetuning, if you get the learning rates right and take advantage of cycle_len and cycle_mult you should be able to avoid overfitting. It can also be quite helpful to save your weights at the end of each cycle with cycle_save_name so you can basically just cherry-pick the best loss rate that occurred before you started overfitting. This can often take quite a bit of trial and error to get right, so don't expect everything to just "work out" on the first try.
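For intuition on what cycle_len and cycle_mult actually do to the schedule, here is a small sketch of SGDR-style cosine restarts (plain Python, not the fastai implementation; `steps_per_epoch` is a made-up stand-in for your batch count):

```python
import math

def sgdr_schedule(max_lr, n_cycles, cycle_len=1, cycle_mult=2, steps_per_epoch=10):
    """Per-step learning rates for SGDR-style cosine restarts.

    Each cycle anneals from max_lr down toward 0, then restarts at max_lr.
    cycle_mult stretches every subsequent cycle, so the epoch lengths are
    cycle_len, cycle_len*cycle_mult, cycle_len*cycle_mult**2, ...
    """
    lrs = []
    length = cycle_len
    for _ in range(n_cycles):
        n_steps = length * steps_per_epoch
        for step in range(n_steps):
            lrs.append(0.5 * max_lr * (1 + math.cos(math.pi * step / n_steps)))
        length *= cycle_mult
    return lrs

schedule = sgdr_schedule(0.01, n_cycles=3)  # cycles cover 1, 2, then 4 epochs
```

Each restart jumps the rate back up to max_lr, which is exactly the moment cycle_save_name snapshots the weights for, so you can later pick the cycle-end checkpoint with the best validation loss.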
Thanks James. I've yet to start saving weights at the end of each cycle. After tweaking learning rates (slightly), cycle_len and cycle_mult, I still find that sudden overfitting happens now and then, ruining everything that comes afterwards. I'll try saving/using the weights at the end of the cycles as well.
It works in the data frame (12 species). When I pass the same csv file into ImageClassifierData.from_csv, I get 19 species. The problem is caused by white space. How do I fix it?
See earlier in this thread
Kaggle provided the training dataset in folders. Why do you prefer from_csv to from_path?
It seems Jeremy should also make one more prediction on LB…
pred_classes = [data.classes[i].replace(" ", "_") for i in preds]
Something like this…
Or replace them in the csv directly…
Can't upload the .csv here…
I guess there are plenty of ways to do this. Here's another way to do it in the dataframe:
import re
label_df.species = label_df.species.apply(lambda x: re.sub(' ', '_', x))
I used the script you mentioned to create labels.csv and also did the following steps to remove the folders from the train folder. But after that my data frame is empty and the csv file doesn't show up in my PATH.
What am I doing wrong?
Your df.to_csv call is not storing the csv file in PATH but in the current working directory (where your notebook is). To save it to PATH, you need to add that in the call to to_csv.
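A minimal sketch of that fix, assuming pandas; the PATH value and the dataframe contents here are made up for the demo:

```python
import os
import pandas as pd

PATH = 'data/dogbreeds/'          # wherever your data lives (assumed)
os.makedirs(PATH, exist_ok=True)

df = pd.DataFrame({'id': ['a1', 'b2'], 'species': ['great_dane', 'pug']})

# A bare filename writes to the current working directory:
# df.to_csv('labels.csv', index=False)

# Joining with PATH writes the file next to your data instead:
df.to_csv(os.path.join(PATH, 'labels.csv'), index=False)
```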
It's actually saving in the fastai directory as per your method… Simply do a mv.
I did a mv but the data frame still shows empty.
Just checking if I got it right: the downloaded data is in folders for each species. So should I be creating the "labels.csv" first and then move the images to the parent train folder and delete the empty species folders? OR
First move images to the parent train folder, create the csv, and then delete the empty species folders?
I tried both ways and it is still the same.
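fwiw the ordering only matters in one respect: the species information lives in the folder names, so the labels must be recorded before (or while) you flatten. A self-contained sketch of the first approach, with made-up filenames and a temp directory standing in for your train folder:

```python
import csv
import os
import shutil
import tempfile

# Build a tiny fake dataset: train/<species>/<image> (demo data, made up)
train = os.path.join(tempfile.mkdtemp(), 'train')
for species, imgs in {'pug': ['0001.jpg'], 'great_dane': ['0002.jpg', '0003.jpg']}.items():
    os.makedirs(os.path.join(train, species))
    for img in imgs:
        open(os.path.join(train, species, img), 'w').close()

# 1) Record the labels while the species folders still exist
rows = []
for species in sorted(os.listdir(train)):
    for img in sorted(os.listdir(os.path.join(train, species))):
        rows.append((os.path.splitext(img)[0], species))

with open(os.path.join(train, os.pardir, 'labels.csv'), 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['id', 'species'])
    writer.writerows(rows)

# 2) Only then flatten: move images up and remove the empty species folders
for species in os.listdir(train):
    folder = os.path.join(train, species)
    for img in os.listdir(folder):
        shutil.move(os.path.join(folder, img), os.path.join(train, img))
    os.rmdir(folder)
```

If the dataframe still comes back empty after this, the usual culprit is reading labels.csv from a different directory than it was written to, as discussed above.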