Check your understanding of the lesson 2
<<< Check your understanding of the lesson 1 | Check your understanding of the lesson 3 >>>
(original post in Portuguese at Deep Learning Brasilia - Lição 2)
Hi guys,
I watched the Lesson 2 video (Part 1) again to get the full picture, and I took notes on the vocabulary used by @jeremy.
Let’s play! OK?
Can you give a definition / a URL / an explanation for all of the following terms and expressions?
If yes, you are done with the 2nd lesson!!!
PS: You do not want to test yourself, or you want to check your answers? Go to @hiromi's blog post “Deep Learning 2: Part 1 Lesson 2”: great work!!!
- Image classifier
- 3 lines of code
- the key point to train the model is the PATH to data
- particular structure of the data folder with train, validation, test folders
- each folder with subfolders cats and dogs
- pretty standard data folder structure
- validation accuracy
- image visualization
- learning rate
- epoch
- dataset
- training set loss
- validation set loss
- Deep Learning
- minimum point of a function
- function which has hundreds of million parameters
- algorithm
- what is the loss or error at this point chosen at random
- what is the gradient at this point
- learning rate
- learning rate finder
- mini batch
- GPU
- use the parallel processing power of the GPU effectively (generally 64 or 128 images at a time)
- mini batch = iteration
- learn.lr_find()
- learn.sched.plot()
- 10⁻¹ = 1e-1
- we are not looking for the point with the lowest loss on the plot (see the learning-rate-finder sketch after this list)
- hyper parameters
- fastai library
- Adam
- momentum
- learning rate optimizer
- dynamic learning rate
- make your model better : give more data
- overfitting
- specialization of the model
- the model has to generalize
- more labeled data
- data augmentation
- rotation, flip, zoom
- aug_tfms
- tfms_from_model
- transforms_side_on
- transforms_top_down
- each type of image has a particular data augmentation
- learning rate versus loss
- the highest learning rate that still gives a clearly decreasing loss
- all local minima are the same
- unfreezing layers
- precompute=True
- activation is a number (feature, level of confidence, probability)
- save the activations for all images of the dataset
- the first time you train the model with precompute=True, it takes longer than later runs, because the activations have to be computed and cached first
- when precompute=True, data augmentation has no effect
- learn.precompute=False (see the precompute/augmentation sketch after this list)
- overfitting = your model does not generalize
- stochastic gradient descent with restarts (SGDR)
- cycle_len=1
- learning rate annealing
- cosine annealing
- learning rate schedule
- learn.sched.plot_lr()
- with SGDR, you do not need to retrain your model with new random values of the model parameters
- even better for generalization: you can save the parameter values before each restart and average them (cycle_save_name)
- learn.save()
- learn.load()
- each time you create a new object learn, you start with a new model with new sets of weights
- weights / parameters
- fine tuning
- pretrained model
- we’ve learned new layers on top of the pretrained model
- frozen layers are layers not trained
- learn.unfreeze()
- layer 1 looks for edges and gradients
- the early layers need little or no training
- differential learning rates
- lr = np.array([lr1, lr2, lr3])
- if the images look like the ImageNet ones, divide by 10; if not, divide by 3
- learn.fit(lr, 3, cycle_len=1, cycle_mult=2) (see the differential-learning-rates sketch after this list)
- size of images
- learn.freeze_to() (if you want to freeze the layers up to a particular one)
- number of cycles
- cycle_mult multiplies the length of each cycle after the previous cycle finishes
- hyper parameters
- all our input images must be square, as the dimensions have to be consistent for the GPU to compute them efficiently
- predictions on validation set
- TTA: Test Time Augmentation on the validation or test data set when you want to get the prediction values (take the average of the predictions over the augmented images; see the TTA sketch after this list)
- log_preds,y=learn.TTA()
- accuracy(log_preds,y)
- cropping images is not recommended
- fastai library is Open Source
- pytorch
- library on top of pytorch because pytorch is not so easy to use
- deploying the model to a phone, TensorFlow
- confusion matrix
- plot_confusion_matrix(cm, data.classes) (see the confusion-matrix sketch after this list)
- training a world class image classifier in 8 steps (there are 3 main steps in fact)
- the Dog Breed Identification Kaggle competition
- kaggle-cli script
- pandas
- csv files
- label_df = pd.read_csv(label_csv)
- a pivot table
- max_zoom: random zoom up to the chosen number (like 1.1)
- cross validation indexes
- val_idxs = get_cv_idxs(n) (see the Dog Breed data-setup sketch after this list)
- ImageNet models are mostly trained on images of size 224 x 224 or 299 x 299
- dictionary comprehension
- list comprehension
- matplotlib
- histogram
- size of validation set
- when I start training a model, I want to iterate very fast, so I use small images (224 x 224 or smaller)
- CUDA memory error (you should decrease your batch size)
- Kernel restart
- the more classes you have, the more difficult it is to get a very high accuracy
- if you train on a smaller image size, you can call learn.set_data(get_data(299, bs)) to pass in a dataset of larger images (see the progressive-resizing sketch after this list)
- learn.freeze()
- fully convolutional architectures can handle pretty much arbitrary sizes
- underfitting means that cycle_len is too short
- dataset balanced or unbalanced
- mainly 3 steps : search for the learning rate, train with frozen layers, train with unfrozen layers
- improve the training by increasing the image size and using a better architecture
- deconvolution
- resnet34
- resnext50
- satellite images (Planet dataset)
- learn.sched.plot_loss()
- Crestle
- Paperspace
- AWS, AWS credit
- AWS setup: console, EC2, launch an instance, Community AMIs, fastai, p2.xlarge, launch, key pair, public IP address, ssh
- cd fastai
- git pull
- conda env update
- don’t forget to stop your GPU instance on Crestle/Paperspace/AWS!
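
A few short code sketches referenced in the list above follow. They are minimal sketches, not the exact course notebooks: they assume the fastai 0.7 library used in the course (from fastai.conv_learner import *), and PATH, arch, sz and bs are placeholder choices.

Learning-rate-finder sketch: pick the learning rate from the plot of loss versus learning rate.

```python
from fastai.conv_learner import *  # fastai 0.7, as used in the course notebooks

PATH = 'data/dogscats/'  # placeholder: train/ and valid/ each contain one subfolder per class
arch, sz, bs = resnet34, 224, 64

tfms = tfms_from_model(arch, sz)
data = ImageClassifierData.from_paths(PATH, tfms=tfms, bs=bs)
learn = ConvLearner.pretrained(arch, data, precompute=True)

# The finder raises the learning rate a little every mini batch and stops when the loss blows up.
learn.lr_find()
learn.sched.plot()  # loss versus learning rate, on a log scale
# Do not pick the rate with the lowest loss: pick the highest rate at which the loss
# is still clearly decreasing, roughly 10x below the minimum (e.g. 1e-2 here).
learn.fit(1e-2, 2)
```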
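
Precompute/augmentation sketch: why data augmentation has no effect while precompute=True, and the switch to SGDR with cycle_len=1. Same placeholder names as above; transforms_side_on and max_zoom=1.1 are the values mentioned in the list.

```python
# Data augmentation: side-on flips, small rotations/lighting changes, random zoom up to 1.1x.
tfms = tfms_from_model(arch, sz, aug_tfms=transforms_side_on, max_zoom=1.1)
data = ImageClassifierData.from_paths(PATH, tfms=tfms, bs=bs)

# precompute=True caches the activations of the frozen pretrained layers once,
# so only the new final layers are trained -- fast, but augmented images are never seen.
learn = ConvLearner.pretrained(arch, data, precompute=True)
learn.fit(1e-2, 1)

# Turn precomputation off so every epoch sees freshly augmented images, then train
# with SGDR: cycle_len=1 gives one cosine-annealing cycle (one restart) per epoch.
learn.precompute = False
learn.fit(1e-2, 3, cycle_len=1)
learn.sched.plot_lr()        # the learning rate schedule, showing the restarts
learn.save('224_lastlayer')  # checkpoint the weights
```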
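
Differential-learning-rates sketch: unfreeze the pretrained layers and give the early layers a much smaller rate than the new head. The three values are example rates, not the only valid choice.

```python
learn.unfreeze()                   # all layer groups become trainable
lr = np.array([1e-4, 1e-3, 1e-2])  # early / middle / last layer groups
# If your images look like ImageNet images, the earlier groups get ~10x smaller rates;
# if not (e.g. satellite images), ~3x smaller is a better starting point.

# 3 cycles with cycle_mult=2: cycle lengths of 1, 2 and 4 epochs (7 epochs total).
learn.fit(lr, 3, cycle_len=1, cycle_mult=2)
learn.sched.plot_lr()    # the three cycles, each twice as long as the previous one
learn.sched.plot_loss()  # the training loss over the iterations
learn.save('224_all')
```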
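
TTA sketch: Test Time Augmentation averages predictions over several augmented copies of each validation image. Averaging the exponentiated log-predictions and using accuracy_np follows the course notebook; older notebook versions call accuracy(log_preds, y) directly, as in the list.

```python
# One set of log-predictions per augmented copy of the validation set, plus the targets.
log_preds, y = learn.TTA()
probs = np.mean(np.exp(log_preds), 0)  # average the probabilities over the copies
print(accuracy_np(probs, y))           # validation accuracy after TTA
```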
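
Confusion-matrix sketch: confusion_matrix comes from scikit-learn, plot_confusion_matrix from fastai.plots.

```python
from sklearn.metrics import confusion_matrix
from fastai.plots import *               # provides plot_confusion_matrix

preds = np.argmax(probs, axis=1)         # predicted class index for each validation image
cm = confusion_matrix(y, preds)
plot_confusion_matrix(cm, data.classes)  # rows: actual class, columns: predicted class
```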
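
Dog Breed data-setup sketch: reading the labels CSV with pandas, building the validation indexes, and the dictionary/list comprehensions used to inspect image sizes. It assumes the Kaggle data sits under a placeholder PATH containing train/, test/ and labels.csv.

```python
import PIL

PATH = 'data/dogbreed/'             # placeholder: train/, test/, labels.csv
label_csv = f'{PATH}labels.csv'
n = len(list(open(label_csv))) - 1  # number of labelled images (minus the header row)
val_idxs = get_cv_idxs(n)           # a random 20% of the indexes for the validation set

label_df = pd.read_csv(label_csv)
# pivot table: number of images per breed, most frequent breeds first
label_df.pivot_table(index='breed', aggfunc=len).sort_values('id', ascending=False)

tfms = tfms_from_model(arch, sz, aug_tfms=transforms_side_on, max_zoom=1.1)
data = ImageClassifierData.from_csv(PATH, 'train', label_csv, test_name='test',
                                    val_idxs=val_idxs, suffix='.jpg', tfms=tfms, bs=bs)

# dictionary comprehension: filename -> (width, height); then a list comprehension
# over the widths, plotted as a histogram with matplotlib
size_d = {k: PIL.Image.open(PATH + k).size for k in data.trn_ds.fnames}
row_sz = np.array([s[0] for s in size_d.values()])
plt.hist(row_sz)
```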
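
Progressive-resizing sketch: start training fast on small images, then call learn.set_data() with larger ones; get_data is a small helper function (the name used in the course notebook) that rebuilds the data object for a given image size and batch size.

```python
def get_data(sz, bs):
    # Rebuild the data object for a given image size and batch size.
    tfms = tfms_from_model(arch, sz, aug_tfms=transforms_side_on, max_zoom=1.1)
    return ImageClassifierData.from_csv(PATH, 'train', label_csv, test_name='test',
                                        val_idxs=val_idxs, suffix='.jpg', tfms=tfms, bs=bs)

# Train quickly on small images first...
learn = ConvLearner.pretrained(arch, get_data(224, bs), precompute=True)
learn.precompute = False
learn.fit(1e-2, 3, cycle_len=1)

# ...then switch to larger images; fully convolutional architectures handle the new size.
learn.set_data(get_data(299, bs))
learn.freeze()                   # keep the pretrained layers frozen for now
learn.fit(1e-2, 3, cycle_len=1)
```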