Wiki: Lesson 2

Check your understanding of the lesson 2

<<< Check your understanding of the lesson 1 | Check your understanding of the lesson 3 >>>

(original post in Portuguese at Deep Learning Brasília - Lição 2)

Hi guys,

I watched the lesson 2 (part 1) video again to get the whole picture, and I took notes of the vocabulary used by @jeremy.

Let’s play! OK? :wink:
Can you give a definition / a URL / an explanation for all the following terms and expressions?

If yes, you are done with the 2nd lesson!!! :sunglasses: :sunglasses: :sunglasses:

PS: you do not want to test yourself, or you want to check your answers? Go to the blog post “Deep Learning 2: Part 1 Lesson 2” by @hiromi: “great work!!!” :slight_smile:

  • Image classifier
  • 3 lines of code
  • the key point to train the model is the PATH to data
  • particular structure of the data folder with train, validation, test folders
  • each folder with subfolders cats and dogs
  • pretty standard data folder structure
  • validation accuracy
  • image visualization
  • learning rate
  • epoch
  • dataset
  • training set loss
  • validation set loss
  • Deep Learning
  • minimum point of a function
  • function which has hundreds of million parameters
  • algorithm
  • what is the loss or error at this point chosen at random
  • what is the gradient at this point
  • learning rate
  • learning rate finder
  • mini batch
  • GPU
  • use the parallel processing power of the GPU effectively (generally 64 or 128 images at a time)
  • mini batch = iteration
  • learn.lr_find()
  • learn.sched.plot()
  • 10^-1 = 1e-1
  • we are not looking for the point with the lowest loss, but for the highest learning rate where the loss is still clearly decreasing (see the learning rate finder sketch after this list)
  • hyper parameters
  • fastai library
  • Adam
  • momentum
  • learning rate optimizer
  • dynamic learning rate
  • make your model better: give it more data
  • overfitting
  • specialization of the model
  • the model has to generalize
  • more labeled data
  • data augmentation
  • rotation, flip, zoom
  • aug_tfms
  • tfms_from_model
  • transforms_side_on
  • transforms_top_down
  • each type of image has its own suitable data augmentation (see the augmentation sketch after this list)
  • learning rate versus loss
  • the highest learning rate that still gives a decreasing loss
  • all local minima are the same
  • unfreezing layers
  • precompute=True
  • activation is a number (feature, level of confidence, probability)
  • save the activations for all images of the dataset
  • the first time you train the model with precompute=True it takes longer (the activations are computed and cached); after that, training is much faster
  • when precompute=True, data augmentation has no effect
  • learn.precompute=False (see the precompute sketch after this list)
  • overfitting = your model does not generalize
  • stochastic gradient descent with restarts (SGDR)
  • cycle_len=1
  • learning rate annealing
  • cosine annealing
  • learning rate schedule
  • learn.sched.plot_lr()
  • with SGDR, you do not need to retrain your model with new random values of the model parameters
  • even better for generalization: you can save the parameter values before each restart and average them (cycle_save_name); see the SGDR sketch after this list
  • learn.save()
  • learn.load()
  • each time you create a new learn object, you start with a new model with a new set of weights
  • weights / parameters
  • fine tuning
  • pretrained model
  • we’ve learned new layers on top of the pretrained model
  • frozen layers are layers not trained
  • learn.unfreeze()
  • layer 1 looks for edges and gradients
  • the early layers need little or no training
  • differential learning rates
  • lr = array([lr1,lr2,lr3])
  • if your images look like the ImageNet ones, divide the learning rate by 10 between layer groups; if not, divide by 3 (see the differential learning rates sketch after this list)
  • learn.fit(lr, 3, cycle_len=1, cycle_mult=2)
  • size of images
  • learn.freeze_to (if you want to freeze a particular layer)
  • number of cycles
  • cycle_mult multiplies the length of each successive cycle
  • hyper parameters
  • all our input images must be square, as the dimensions have to be consistent for the GPU to compute them efficiently
  • predictions on validation set
  • TTA: Test Time Augmentation on the validation or test data set when you want prediction values (take the average of the predictions over augmented versions of each image); see the TTA sketch after this list
  • log_preds,y=learn.TTA()
  • accuracy(log_preds,y)
  • cropping images is not recommended
  • fastai library is Open Source
  • pytorch
  • library on top of pytorch because pytorch is not so easy to use
  • model to phone, tensorflow
  • confusion matrix
  • plot_confusion_matrix(cm, data.classes)
  • training a world class image classifier in 8 steps (there are 3 main steps in fact)
  • dogbreed kaggle competition
  • kaggle cli script
  • pandas
  • csv files
  • label_df = pd.read_csv(label_csv)
  • a pivot table
  • max_zoom: random zoom up to the chosen factor (like 1.1); see the dog breed data sketch after this list
  • cross validation indexes
  • val_idxs = get_cv_idxs(n)
  • ImageNet models mostly use image sizes of 224 x 224 or 299 x 299
  • dictionary comprehension
  • list comprehension (see the image size histogram sketch after this list)
  • matplotlib
  • histogram
  • size of validation set
  • when I start training a model, I want to iterate very fast: so I need small images (like 224 x 224 or smaller)
  • CUDA memory error (you should decrease your batch size)
  • Kernel restart
  • the more classes you have, the more difficult it is to get a very high accuracy
  • if you trained on a smaller size, you can call learn.set_data(get_data(299, bs)) and pass in a larger dataset (see the progressive resizing sketch after this list)
  • learn.freeze()
  • fully convolutional architectures can handle pretty much arbitrary sizes
  • underfitting means that cycle_len is too short
  • dataset balanced or unbalanced
  • mainly 3 steps: search for the learning rate, train with frozen layers, train with unfrozen layers
  • improve the training by increasing image sizes and get a better architecture
  • deconvolution
  • resnet34
  • resnext50
  • satellite images (Planet dataset)
  • learn.sched.plot_loss()
  • Crestle
  • Paperspace
  • AWS, AWS credit
  • AWS setup: console, EC2, launch an instance, Community AMIs, fastai, p2.xlarge, launch, key pair, public IP address, ssh
  • cd fastai
  • git pull
  • conda env update
  • don’t forget to stop your GPU instance on Crestle/Paperspace/AWS!
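
To help you check the code-related items above, here are a few minimal sketches. They all assume the fastai 0.7-era API used in the course notebooks; names such as PATH, arch, sz and bs are placeholders, not exact course code.

First, the learning rate finder: run mini batches with a growing learning rate, plot loss versus learning rate, and pick the highest rate where the loss is still clearly decreasing (not the bottom of the curve).

```python
from fastai.imports import *        # numpy (np), pandas (pd), matplotlib (plt), ...
from fastai.transforms import *     # tfms_from_model, transforms_side_on, transforms_top_down
from fastai.dataset import *        # ImageClassifierData, get_cv_idxs
from fastai.conv_learner import *   # ConvLearner
from fastai.plots import *          # plot_confusion_matrix

PATH = 'data/dogscats/'             # placeholder: standard train/valid folder structure
arch = resnet34
sz = 224                            # image size
tfms = tfms_from_model(arch, sz)    # transforms matching the pretrained model
data = ImageClassifierData.from_paths(PATH, tfms=tfms)
learn = ConvLearner.pretrained(arch, data, precompute=True)

learn.lr_find()      # trains over mini batches while the learning rate increases
learn.sched.plot()   # loss vs learning rate; choose e.g. 1e-2 where the loss is still falling
```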
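
Data augmentation, sketched with aug_tfms: transforms_side_on suits photos taken from the side (cats, dogs), transforms_top_down suits images seen from above (satellite). Remember that it has no effect while precompute=True.

```python
# random horizontal flips, small rotations and zoom up to 1.1x, suited to side-on photos
tfms = tfms_from_model(arch, sz, aug_tfms=transforms_side_on, max_zoom=1.1)
data = ImageClassifierData.from_paths(PATH, tfms=tfms)
# for satellite-style images you would pass aug_tfms=transforms_top_down instead
```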
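
precompute=True caches the activations of the frozen pretrained layers for every image, so only the new last layers are trained; that is fast but ignores augmentation. Turning it off makes the augmented images actually flow through the network.

```python
learn = ConvLearner.pretrained(arch, data, precompute=True)
learn.fit(1e-2, 1)           # first run is slower: activations are computed and cached

learn.precompute = False     # from now on data augmentation really has an effect
learn.fit(1e-2, 3, cycle_len=1)
```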
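
Stochastic gradient descent with restarts: the learning rate is cosine-annealed inside each cycle, then jumps back up at the start of the next cycle; cycle_len is the cycle length in epochs.

```python
learn.fit(1e-2, 3, cycle_len=1)   # 3 cycles of 1 epoch, cosine annealing within each cycle
learn.sched.plot_lr()             # shows the learning rate schedule and its restarts

learn.save('224_lastlayer')       # checkpoint the weights...
learn.load('224_lastlayer')       # ...and reload them instead of retraining from scratch
```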
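
Unfreezing and differential learning rates: the early layers (edges, gradients) get a smaller learning rate than the later, more task-specific layers. A sketch for ImageNet-like images, with a factor of 10 between the three layer groups:

```python
learn.unfreeze()                              # all layers become trainable
lr = np.array([1e-4, 1e-3, 1e-2])             # early / middle / last layer groups
learn.fit(lr, 3, cycle_len=1, cycle_mult=2)   # cycles of 1, 2 and then 4 epochs
```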
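
Test Time Augmentation plus a confusion matrix: predict on augmented copies of the validation set, average the predictions, then look at which classes get confused. The exact accuracy/averaging helpers vary between fastai versions, so treat this as a sketch.

```python
log_preds, y = learn.TTA()                # predictions on augmented validation images
probs = np.mean(np.exp(log_preds), 0)     # average the predictions over the augmentations
preds = np.argmax(probs, axis=1)          # predicted class per image

from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y, preds)
plot_confusion_matrix(cm, data.classes)   # plotting helper from the course's fastai.plots
```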
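
The dog breed competition ships a labels CSV instead of one folder per class, so pandas, a pivot table and a random cross-validation split are used. A sketch (labels.csv and the breed column are the Kaggle dataset's own names):

```python
label_csv = f'{PATH}labels.csv'
n = len(list(open(label_csv))) - 1    # number of rows minus the header
val_idxs = get_cv_idxs(n)             # random ~20% of the indexes for the validation set

label_df = pd.read_csv(label_csv)
label_df.pivot_table(index='breed', aggfunc=len)   # images per breed: balanced or not?

tfms = tfms_from_model(arch, sz, aug_tfms=transforms_side_on, max_zoom=1.1)
data = ImageClassifierData.from_csv(PATH, 'train', label_csv, test_name='test',
                                    val_idxs=val_idxs, suffix='.jpg', tfms=tfms, bs=bs)
```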
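
A dictionary comprehension and a matplotlib histogram, as used in the lesson to look at the image sizes before choosing sz (a sketch; plt comes from the notebook imports above):

```python
import PIL.Image

# map every training file name to its (width, height)
size_d = {k: PIL.Image.open(PATH + k).size for k in data.trn_ds.fnames}

row_sz, col_sz = list(zip(*size_d.values()))   # unpack widths and heights
plt.hist(row_sz)                               # histogram of image widths
```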
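
Starting with small images and then increasing their size is a handy trick against overfitting. get_data is the notebook-style helper referenced in the list above; here is one possible sketch of it:

```python
def get_data(sz, bs):
    # rebuild the data object for a given image size and batch size
    tfms = tfms_from_model(arch, sz, aug_tfms=transforms_side_on, max_zoom=1.1)
    return ImageClassifierData.from_csv(PATH, 'train', label_csv, test_name='test',
                                        val_idxs=val_idxs, suffix='.jpg', tfms=tfms, bs=bs)

learn = ConvLearner.pretrained(arch, get_data(224, 64), precompute=True)
learn.fit(1e-2, 3, cycle_len=1)
learn.precompute = False            # switch off precomputed activations before changing the data

learn.set_data(get_data(299, 32))   # bigger images, smaller batch size to fit in GPU memory
learn.freeze()                      # early layers stay frozen while the head adapts
learn.fit(1e-2, 3, cycle_len=1)
```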