Lesson 3 In-Class Discussion

Are you using EC2 P2 machines or G2?

If you search in the forums, there was some problem with Paperspace. I would stick with AWS (if you also have the credits) over the next 6 weeks, particularly if you have a setup that’s working well with the AMI.

Thanks, I’ve been using P2 with the fastai AMI. This has been running for several hours now and is 86% done on the last learn model, which makes knowing how to load/save models quite important, I guess. I was looking for the saved model (f'{sz}') but couldn’t find it yet…do you know where that is stored?
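
For reference, a minimal sketch of saving and loading with the fastai library used in this course (the save location is an assumption based on the library’s defaults, so double-check on your own setup):

learn.save(f'{sz}')  # saves the weights, by default under {PATH}models/{sz}.h5 (assumed default)
learn.load(f'{sz}')  # restores the same weights later without re-training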

That’s strange that it’s taking several hours. Which notebook is this? Do you have your code on GitHub? Can you give some more specifics so I can try to replicate? I use AWS P2 as well and am mostly happy with it.

@beecoder @ramesh I am running the same planets notebook on Crestle right now and can tell you that my times are in line with Nikhil’s. I’m getting about 986s/it on sz=256 (after unfreezing).

Thanks @memetzgz. Time to get familiar with loading/saving models. @ramesh this is the usual lesson file at “fastai/courses/dl1/lesson2-image_models.ipynb”. I’m not doing anything fancy; I did a git pull ~28 hours ago. The AMI is fastai-part1v2-p2 (ami-8c4288f4).

Yeah…interesting. I am running it now. With sz=64 (the dimensions of the image) it runs OK. So larger image sizes do cause slow runs in general (since the input data dimensions are bigger).

Unfreezing all of the layers with a larger sz image definitely makes it much slower. So maybe we unfreeze with smaller dimensions but keep the layers frozen for larger sz arrays. It might be useful to have a parameter in unfreeze so that we can choose to unfreeze only a limited number of top layers and not all the way. I will add this to the other thread on feature requests - Wiki: Fastai Library Feature Requests

@ramesh you can use freeze_to() for that :slight_smile:
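
A quick sketch of how that might look with the lesson’s learn object (the layer-group index here is illustrative, not from the thread):

learn.freeze_to(-2)  # freeze all layer groups except the last two, which stay trainable
learn.freeze_to(0)   # equivalent to unfreezing everything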

I looked into it…but it has only two layer groups above for ResNet34, and they were both huge. Is it possible to freeze_to a sub-layer, or break the pretrained network down into more layer groups?

The caveat is we have to give more learning rates. It might be better to give an option to specify a dictionary of layer names we want to unfreeze and the learning rates for them? Thoughts / suggestions - @jeremy

What is “num_workers” used for in the ImageClassifierData function?

The number of CPU worker processes you want to use for loading the data

Thank you. If we don’t specify anything, what is it by default?

Not sure :slight_smile: I’m away from my computer - maybe 4 or -1, where -1 means use the max.

BTW, to see this play out I would recommend installing htop and running it from the terminal - it’s a really nice way to visualize what your CPU cores are doing.

:+1: Okay

It’s 8. You can always get the default value by looking at the source code…type ??ImageClassifierData.from_csv or ??ImageClassifierData.from_paths, depending on which you are using.
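
For anyone reading along, a sketch of setting it explicitly (8 matches the default above; tfms, PATH and resnet34 are assumed to follow the lesson setup):

tfms = tfms_from_model(resnet34, sz)  # transforms as in the lesson notebooks
data = ImageClassifierData.from_paths(PATH, tfms=tfms, num_workers=8)  # 8 data-loading workers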

Jeremy mentioned earlier that 20% of the training data is good for the validation set. I have a dataset just like the dogscats format - train and test only. This is how we do it with the csv file:
label_csv = f'{PATH}train_v2.csv'
n = len(list(open(label_csv))) - 1
val_idxs = get_cv_idxs(n)

How do I get the validation set from train when I don’t have a csv labels file?

How about creating such a file? os.listdir, split, to_csv, etc.?

Take a look at this post to see how you can shuffle some data from train and move it to a validation directory - Faster experimentation for better learning
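
The gist of that approach, as a hedged sketch (it assumes a flat train folder and a 20% holdout; with class subfolders you would repeat this per class, and all paths are illustrative):

import os, random, shutil

fnames = os.listdir(f'{PATH}train')                   # all training images
val_fnames = random.sample(fnames, len(fnames) // 5)  # hold out ~20% at random
os.makedirs(f'{PATH}valid', exist_ok=True)
for f in val_fnames:
    shutil.move(f'{PATH}train/{f}', f'{PATH}valid/{f}')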

I didn’t get that. Are you saying:

  1. create a csv file?
  2. split the training set?

This really helps. Thank you!

You would like to split train/valid using the from_csv method but don’t have a csv with labels. I assume you then have a train set with labels either encoded in the image filenames (dog12345.jpg) or stored in specific folder names (dog/12345.jpg). It is possible to create the necessary csv file from both of these types of train set.
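
For the folder-names case, a minimal sketch of building that csv (the column names and output filename are illustrative; depending on the suffix argument you pass to from_csv you may need to strip the file extensions):

import os
import pandas as pd

rows = [(f'{label}/{fname}', label)
        for label in os.listdir(f'{PATH}train')
        for fname in os.listdir(f'{PATH}train/{label}')]
pd.DataFrame(rows, columns=['id', 'label']).to_csv(f'{PATH}labels.csv', index=False)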

With the from_csv method you should not move files between folders (train > valid > train).
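
Once such a csv exists, the split stays virtual - a sketch along the lines of the earlier snippet (get_cv_idxs picks a random 20% of indices by default; tfms as in the lesson):

n = len(list(open(label_csv))) - 1  # rows minus the header line
val_idxs = get_cv_idxs(n)           # random 20% of indices by default
data = ImageClassifierData.from_csv(PATH, 'train', label_csv, tfms=tfms, val_idxs=val_idxs)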