Lesson 3 In-Class Discussion


(Ramesh Sampath) #171

Are you using EC2 P2 machines or G2?

If you search the forums, there was some problem with Paperspace. I would stick with AWS (if you also have the credits) over the next 6 weeks, particularly if you have a setup that’s working well with the AMI.


(Nikhil B ) #172

Thanks, I’ve been using P2 with the Fastai AMI. This has been running for several hours now to get to 86% done on the last learn model. Which makes knowing how to load/save models quite important I guess. I was looking for the saved model (f’{sz}’) but couldn’t find it yet…do you know where that is stored?
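
One way to track down a saved file when you are not sure which directory it was written to is to search under the data directory for filenames that start with the name passed to save. A minimal sketch using only the standard library; `find_saved_files` is a hypothetical helper, not part of fastai, and the `data/planet/` path is an assumption:

```python
import os

def find_saved_files(root, stem):
    """Return paths under `root` whose filename starts with `stem`.

    Hypothetical helper for illustration; it makes no claim about
    where any particular library stores its weights.
    """
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.startswith(stem):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)

if __name__ == "__main__":
    sz = 256
    # 'data/planet/' is an assumed location; substitute your own PATH.
    for path in find_saved_files("data/planet/", f"{sz}"):
        print(path)
```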


(Ramesh Sampath) #173

That’s strange that it’s taking several hours. Which notebook is this? Do you have your code on GitHub? Can you give some more specifics so I can try to replicate? I use AWS P2 as well and am mostly happy with it.


(Maureen Metzger) #174

@beecoder @ramesh I am running the same planets notebook on Crestle right now and can tell you that my times are in line with Nikhil’s. I’m getting about 986s/it on sz=256 (after unfreezing).


(Nikhil B ) #175

Thanks @memetzgz. Time to get familiar with loading/saving models. @ramesh this is the usual lesson file in the path “fastai/courses/dl1/lesson2-image_models.ipynb”. I’m not doing anything fancy; I did a git pull ~28 hours ago. The AMI is fastai-part1v2-p2 (ami-8c4288f4).


(Ramesh Sampath) #176

Yeah…Interesting. I am running it now. With sz=64 (sz is the image dimension) it runs OK. So a larger image size does cause slower runs in general (since the input data has bigger dimensions).

Unfreezing all of the layers with a larger sz definitely makes it much slower. So maybe we unfreeze with smaller dimensions but keep it frozen for larger sz arrays. It might be useful to have a parameter in unfreeze so that we can choose to unfreeze only a limited number of top layers rather than all of them. I will add this to the other thread on feature requests - Wiki: Fastai Library Feature Requests


(Jeremy Howard (Admin)) #177

@ramesh you can use freeze_to() for that :slight_smile:
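
For readers unfamiliar with the idea, here is a toy sketch of what freezing up to a layer group means: every group before index n stays frozen, and every group from n onward becomes trainable. The `LayerGroup` class, the group names, and this standalone `freeze_to` function are illustrative stand-ins written for this sketch, not the fastai implementation:

```python
from dataclasses import dataclass

@dataclass
class LayerGroup:
    """Toy stand-in for a group of layers (not a fastai class)."""
    name: str
    trainable: bool = False

def freeze_to(groups, n):
    """Freeze every group, then make groups[n:] trainable (toy version)."""
    for g in groups:
        g.trainable = False
    for g in groups[n:]:
        g.trainable = True
    return groups

# Group names are made up for illustration.
groups = [LayerGroup("early_conv"), LayerGroup("late_conv"), LayerGroup("head")]
freeze_to(groups, 1)
print([(g.name, g.trainable) for g in groups])
```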


(Ramesh Sampath) #178

I looked into it…but it has only two layers above for Resnet34. They were both huge. Is it possible to freeze_to a sub-layer, or to break the pretrained network down into more layers?

The caveat is that we would have to give more learning rates. It might be better to have an option to specify a dictionary of the layer names we want to unfreeze and the learning rates for them? Thoughts / suggestions - @jeremy


(K Sreelakshmi) #179

What is “num_workers” used for in the ImageClassifierData function?


#180

Number of CPU cores you want to use


(K Sreelakshmi) #181

Thank you. If we don’t specify anything, what is it by default?


#182

Not sure :slight_smile: away from computer - maybe 4 or -1. -1 means use max.

BTW to see this play out I would recommend installing htop and running it from terminal, really nice way to visualize what your CPU cores are doing.


(K Sreelakshmi) #183

:+1: Okay


(Ramesh Sampath) #184

It’s 8. You can always get the default value by looking at the source code: type ??ImageClassifierData.from_csv or ??ImageClassifierData.from_paths, depending on which one you are using.
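
Outside IPython, where `??` is not available, the standard library's `inspect` module can report a parameter's default. A minimal sketch; `from_csv_like` is a made-up stand-in function for illustration, and apart from the num_workers default of 8 mentioned above, its signature is an assumption and not copied from fastai:

```python
import inspect

def from_csv_like(path, folder, csv_fname, bs=64, num_workers=8):
    """Made-up stand-in; the real signature may differ."""
    return None

def default_of(func, param_name):
    """Return the default value of `param_name` in `func`'s signature."""
    param = inspect.signature(func).parameters[param_name]
    if param.default is inspect.Parameter.empty:
        raise ValueError(f"{param_name} has no default")
    return param.default

print(default_of(from_csv_like, "num_workers"))
```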


(K Sreelakshmi) #185

Jeremy mentioned earlier that 20% of the training data is a good size for the validation set. I have a dataset just like the dogscats format - train and test only. This is how we do it with the csv file:
label_csv = f'{PATH}train_v2.csv'
n = len(list(open(label_csv)))-1
val_idxs = get_cv_idxs(n)

How do I get the validation set from train when I don’t have a csv labels file?
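
For context on the three lines above: the idea is to count the labelled rows (minus the header) and hold out a random 20% of the row indices for validation. A standard-library sketch of that idea; `pick_val_idxs` is a hypothetical name, and its seed and rounding are assumptions rather than a copy of `get_cv_idxs`:

```python
import random

def pick_val_idxs(n, val_pct=0.2, seed=42):
    """Return a reproducible random sample of row indices for validation.

    Hypothetical helper; the seed and the truncating int() are assumptions.
    """
    rng = random.Random(seed)
    idxs = list(range(n))
    rng.shuffle(idxs)
    return idxs[: int(val_pct * n)]

val_idxs = pick_val_idxs(1000)
print(len(val_idxs))
```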


(sergii makarevych) #186

How about creating such a file? os.listdir, split, to_csv, etc.?


(Ramesh Sampath) #187

Take a look at this post to see how you can shuffle some data from train and move it to a validation directory - Faster experimentation for better learning
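
A minimal sketch of that idea, assuming a dogscats-style layout with one subfolder per class under train/. `move_to_valid` is a hypothetical helper, and the 20% fraction follows the figure mentioned earlier in the thread:

```python
import os
import random
import shutil

def move_to_valid(train_dir, valid_dir, val_pct=0.2, seed=42):
    """Move a random val_pct of each class folder from train_dir to valid_dir.

    Hypothetical helper; assumes train_dir/<class>/<image> layout.
    Returns the number of files moved.
    """
    rng = random.Random(seed)
    moved = 0
    for cls in sorted(os.listdir(train_dir)):
        src = os.path.join(train_dir, cls)
        if not os.path.isdir(src):
            continue  # skip stray files next to the class folders
        dst = os.path.join(valid_dir, cls)
        os.makedirs(dst, exist_ok=True)
        files = sorted(os.listdir(src))
        for name in rng.sample(files, int(val_pct * len(files))):
            shutil.move(os.path.join(src, name), os.path.join(dst, name))
            moved += 1
    return moved
```

Note that this physically moves files, so it is worth trying on a copy of the data first.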


(K Sreelakshmi) #188

I didn’t get that. Are you saying:

  1. create csv file?
  2. split training set?

(K Sreelakshmi) #189

This really helps. Thank you!


(sergii makarevych) #190

You would like to split train/valid using the from_csv method but don’t have a csv with labels. I assumed you would then have a train set with labels either encoded in the image filenames (dog12345.jpg) or already stored in specific folder names (dog/12345.jpg). It is possible to create the necessary csv file from both of these types of train set.

With the from_csv method you should not move files between folders (train > valid > train).
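
A sketch of building such a file with the standard library, assuming the labels are stored as folder names (dog/12345.jpg). The `write_labels_csv` name and the `id,label` header are assumptions; check the column layout your loader expects before using it:

```python
import csv
import os

def write_labels_csv(train_dir, csv_path):
    """Write one 'relative/path,label' row per image under train_dir.

    Hypothetical helper; assumes train_dir/<label>/<image> layout and an
    assumed 'id,label' header. Returns the number of rows written.
    """
    rows = []
    for label in sorted(os.listdir(train_dir)):
        folder = os.path.join(train_dir, label)
        if not os.path.isdir(folder):
            continue  # ignore files that are not class folders
        for fname in sorted(os.listdir(folder)):
            rows.append((f"{label}/{fname}", label))
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["id", "label"])
        writer.writerows(rows)
    return len(rows)
```

For labels encoded in filenames (dog12345.jpg), the label could instead be taken from the leading letters of the name, for example with `re.match(r"[A-Za-z]+", fname)`.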