Lesson 2 In-Class Discussion

How do we SSH with Cygwin? I have a Windows 7 machine.

Try http://www.putty.org/ :slight_smile:


@jeremy For those of us with home machines that are designed for deep learning, is there a list of the packages that are installed in the AMI?

I built a deep learning box after the last course and I want to get it set up equivalent to the AMI.

Install the environment based on the environment.yml file.

  1. Git clone the fastai repo.
  2. Run conda env create -f environment.yml (I found that you will need a *nix system, as some packages are not supported on Windows).
  3. Activate the env and run pip install -r requirements.txt to install the Python packages needed.
  4. ???
  5. profit.

Use the conda command that he ran: conda env update. There is also a way to create a conda environment from scratch, but I can't remember it off the top of my head; hopefully somebody else does. Basically, you point it at the environment.yml file and it magically installs everything into a new environment. It is really slick and awesome.

Note: conda env update must be run in the same directory as the environment.yml file.

Any idea where to get the pretrained resnext50 model from? It doesn’t seem to be on model zoo.

Edit: Found this github project (https://github.com/clcarwin/convert_torch_to_pytorch) that can convert the model available here (https://github.com/facebookresearch/ResNeXt) to one fastai can use.

Thanks again @jeremy and @yinterian for this lesson. :clap:

I have a couple of general/ML questions, in the context of Deep learning.

  • I see that we held out 20% of the data for validation in one of the examples. What about cross-validation, such as k-fold validation? Is it perhaps too expensive to compute?

  • We are using accuracy to measure model performance. I’m assuming this is (True Positives + True Negatives) / Total. Just curious: is it uncommon or complicated to use other measures, such as precision, recall, the PR curve, AUC (ROC curve), etc.?

  • How do we deal with unbalanced classes? (I think Jeremy mentioned a paper on balancing datasets.)
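On the second question, here is a minimal sketch of how accuracy, precision and recall all fall out of the same four confusion-matrix counts. The name `classification_metrics` is made up for illustration and is not part of fastai; it assumes binary labels where 1 is the positive class.

```python
def classification_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall) for binary labels, 1 = positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when nothing is predicted/labelled positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# One positive in ten samples; the "model" always predicts negative.
acc, prec, rec = classification_metrics([1] + [0] * 9, [0] * 10)
print(acc, prec, rec)
```

In this toy case accuracy is 0.9 while recall is 0.0, which is why accuracy alone can be misleading when classes are unbalanced.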

Thanks @charlielee this was very helpful. :slight_smile:

Regarding point 3, Jeremy said that the paper suggests just copying examples of the underrepresented classes to increase their count. I also want to experiment with combining some data augmentation tricks with copying.
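A rough sketch of the "just copy the underrepresented classes" idea, assuming we copy at random until every class matches the largest one. The paper is not named in this thread, so this is not its exact procedure, and `oversample_by_copying` is a hypothetical helper name.

```python
import random
from collections import Counter, defaultdict


def oversample_by_copying(samples, labels, seed=0):
    """Duplicate samples of smaller classes until all classes are the same size."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)

    target = max(len(items) for items in by_class.values())
    out_samples, out_labels = [], []
    for label, items in by_class.items():
        # Randomly re-draw existing items to fill the gap up to the target size.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for sample in items + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


files = ["a.jpg", "b.jpg", "c.jpg", "d.jpg", "e.jpg"]  # made-up file names
new_files, new_labels = oversample_by_copying(files, [0, 0, 0, 0, 1])
print(Counter(new_labels))
```

This should only be applied to the training set; copying samples before splitting would leak duplicates into validation.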

Because the training loss includes dropout. We’ll learn about this soon.


That paper is only relevant to folks with effectively infinite resources. IMHO it’s of little if any practical value.


We’re only using the learning rate finder from that paper, not the cyclical learning rates themselves. The annealing method we use is from https://arxiv.org/abs/1608.03983
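For anyone who has not read that paper, its schedule anneals the learning rate along a cosine curve from a maximum down to a minimum and then restarts. A minimal sketch, assuming every cycle has the same length (the paper also allows cycles to grow); `cosine_annealed_lr` and its parameters are illustrative names, not the fastai API.

```python
import math


def cosine_annealed_lr(step, steps_per_cycle, lr_max, lr_min=0.0):
    """Cosine-annealed learning rate with a restart every `steps_per_cycle` steps."""
    t_cur = step % steps_per_cycle  # position within the current cycle
    cosine = math.cos(math.pi * t_cur / steps_per_cycle)
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + cosine)


# Print two cycles of five steps each; the rate jumps back up at step 5.
for step in range(10):
    print(step, round(cosine_annealed_lr(step, 5, lr_max=0.1), 4))
```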

Yes, exactly

Yup that’s exactly what we’ll be doing :slight_smile:


You got it right

Yes there’s dropout. We’ll learn about that soon.


We don’t appear to be using any k-fold cross validation so far. Is this not needed because of TTA?
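For context, k-fold cross-validation splits the data into k parts and uses each part once as the validation set, so the model is trained k times. A bare-bones sketch of the index bookkeeping, with a hypothetical helper `k_fold_indices`; it does not shuffle, which you would normally do first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, val_indices) for each of k contiguous folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, val
        start += size


for train_idx, val_idx in k_fold_indices(10, 5):
    print(len(train_idx), val_idx)
```

The k separate training runs are the cost asked about earlier in the thread.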

+ train prediction through CV

Yes that’s a real issue in the unicorn community. In practice, you have to adjust your probabilities to ‘undo’ the over-sampling.
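One common way to make that adjustment for a binary classifier is to reweight the predicted probability by the ratio of the original class frequency to the over-sampled one. This is a sketch of that standard prior correction and may not be the exact method meant above; `undo_oversampling` and its arguments are illustrative names.

```python
def undo_oversampling(p_model, train_pos_rate, true_pos_rate):
    """Map a positive-class probability from the resampled prior back to the original prior.

    p_model:        probability predicted by a model trained on over-sampled data
    train_pos_rate: fraction of positives in the over-sampled training set
    true_pos_rate:  fraction of positives in the original data
    """
    pos = p_model * (true_pos_rate / train_pos_rate)
    neg = (1 - p_model) * ((1 - true_pos_rate) / (1 - train_pos_rate))
    return pos / (pos + neg)


# Trained on a 50/50 resampled set, but positives are really 10% of the data.
print(undo_oversampling(0.5, train_pos_rate=0.5, true_pos_rate=0.1))
```

A prediction of 0.5 on the balanced set maps back to roughly 0.1, the original base rate, which is what you would expect from a model that has learned nothing beyond the prior.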

Yup I did say that. Adding the validation set back in right at the end is fine - you can get a better final model this way. cc @yinterian