Check your understanding of the lesson 2
<<< Check your understanding of the lesson 1 | Check your understanding of the lesson 3 >>>
(original post in Portuguese at Deep Learning Brasilia - Lição 2)
Hi guys,
I watched the Lesson 2 video (Part 1) again to get the full picture, and I took notes on the vocabulary used by @jeremy.
Let’s play! OK?
Can you give a definition / a URL / an explanation for all of the following terms and expressions?
If yes, you are done with the 2nd lesson!!!
PS: You do not want to test yourself, or you want to check your answers? Go to @hiromi's blog post “Deep Learning 2: Part 1 Lesson 2”: great work!!!
- Image classifier
- 3 lines of code
- the key point to train the model is the PATH to data
- particular structure of the data folder with train, validation, test folders
- each folder with subfolders cats and dogs
- pretty standard data folder structure
- validation accuracy
- image visualization
- learning rate
- epoch
- dataset
- training set loss
- validation set loss
- Deep Learning
- minimum point of a function
- function which has hundreds of million parameters
- algorithm
- what is the loss or error at this point chosen at random
- what is the gradient at this point
- learning rate
- learning rate finder
- mini batch
- GPU
- use the parallel processing power of the GPU effectively (generally 64 or 128 images at a time)
- mini batch = iteration
- learn.lr_find()
- learn.sched.plot()
- 10⁻¹ = 1e-1
- we are not looking for the point with the lowest loss on the plot (see the learning-rate-finder sketch after this list)
- hyper parameters
- fastai library
- Adam
- momentum
- learning rate optimizer
- dynamic learning rate
- make your model better : give more data
- overfitting
- specialization of the model
- the model has to generalize
- more labeled data
- data augmentation
- rotation, flip, zoom
- aug_tfms
- tfms_from_model
- transforms_side_on
- transforms_top_down
- each type of image has a particular data augmentation
- learning rate versus loss
- the highest learning rate that still gives a clearly decreasing loss
- all local minima are the same
- unfreezing layers
- precompute=True
- activation is a number (feature, level of confidence, probability)
- save the activations for all images of the dataset
- the first time you train the model with precompute=True, it takes longer than later runs, because the activations have to be computed and cached first
- when precompute=True, data augmentation has no effect
- learn.precompute=False (see the precompute/augmentation sketch after this list)
- overfitting = your model does not generalize
- stochastic gradient descent with restarts (SGDR)
- cycle_len=1
- learning rate annealing
- cosine annealing
- learning rate schedule
- learn.sched.plot_lr()
- with SGDR, you do not need to retrain your model with new random values of the model parameters
- even better for generalization: you can save the parameter values before each restart and average them (cycle_save_name)
- learn.save()
- learn.load()
- each time you create a new object learn, you start with a new model with new sets of weights
- weights / parameters
- fine tuning
- pretrained model
- we’ve learned new layers on top of the pretrained model
- frozen layers are layers not trained
- learn.unfreeze()
- layer 1 looks for edges and gradients
- the early layers need little or no training
- differential learning rates
- lr = np.array([lr1, lr2, lr3])
- if the images look like the ImageNet ones, divide by 10; if not, divide by 3
- learn.fit(lr, 3, cycle_len=1, cycle_mult=2) (see the differential-learning-rates sketch after this list)
- size of images
- learn.freeze_to() (if you want to freeze the layers up to a particular one)
- number of cycles
- cycle_mult multiplies the length of each cycle after the previous cycle finishes
- hyper parameters
- all our input images must be square, as the dimensions have to be consistent for the GPU to compute them efficiently
- predictions on validation set
- TTA: Test Time Augmentation on the validation or test data set when you want to get the prediction values (take the average of the predictions over the augmented images; see the TTA sketch after this list)
- log_preds,y=learn.TTA()
- accuracy(log_preds,y)
- cropping images is not recommended
- fastai library is Open Source
- pytorch
- library on top of pytorch because pytorch is not so easy to use
- deploying the model to a phone, TensorFlow
- confusion matrix
- plot_confusion_matrix(cm, data.classes) (see the confusion-matrix sketch after this list)
- training a world class image classifier in 8 steps (there are 3 main steps in fact)
- the Dog Breed Identification Kaggle competition
- kaggle-cli script
- pandas
- csv files
- label_df = pd.read_csv(label_csv)
- a pivot table
- max_zoom: random zoom up to the chosen number (like 1.1)
- cross validation indexes
- val_idxs = get_cv_idxs(n) (see the Dog Breed data-setup sketch after this list)
- ImageNet models are mostly trained on images of size 224 x 224 or 299 x 299
- dictionary comprehension
- list comprehension
- matplotlib
- histogram
- size of validation set
- when I start training a model, I want to iterate very fast, so I use small images (224 x 224 or smaller)
- CUDA memory error (you should decrease your batch size)
- Kernel restart
- the more classes you have, the more difficult it is to get a very high accuracy
- if you train on a smaller image size, you can call learn.set_data(get_data(299, bs)) to pass in a dataset of larger images (see the progressive-resizing sketch after this list)
- learn.freeze()
- fully convolutional architectures can handle pretty much arbitrary sizes
- underfitting means that cycle_len is too short
- dataset balanced or unbalanced
- mainly 3 steps : search for the learning rate, train with frozen layers, train with unfrozen layers
- improve the training by increasing the image size and using a better architecture
- deconvolution
- resnet34
- resnext50
- satellite images (Planet dataset)
- learn.sched.plot_loss()
- Crestle
- Paperspace
- AWS, AWS credit
- AWS setup: console, EC2, launch an instance, Community AMIs, fastai, p2.xlarge, launch, key pair, public IP address, ssh
- cd fastai
- git pull
- conda env update
- don’t forget to stop your GPU instance on Crestle/Paperspace/AWS!
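
A few short code sketches referenced in the list above follow. They are minimal sketches, not the exact course notebooks: they assume the fastai 0.7 library used in the course (from fastai.conv_learner import *), and PATH, arch, sz and bs are placeholder choices.

Learning-rate-finder sketch: pick the learning rate from the plot of loss versus learning rate.

```python
from fastai.conv_learner import *  # fastai 0.7, as used in the course notebooks

PATH = 'data/dogscats/'  # placeholder: train/ and valid/ each contain one subfolder per class
arch, sz, bs = resnet34, 224, 64

tfms = tfms_from_model(arch, sz)
data = ImageClassifierData.from_paths(PATH, tfms=tfms, bs=bs)
learn = ConvLearner.pretrained(arch, data, precompute=True)

# The finder raises the learning rate a little every mini batch and stops when the loss blows up.
learn.lr_find()
learn.sched.plot()  # loss versus learning rate, on a log scale
# Do not pick the rate with the lowest loss: pick the highest rate at which the loss
# is still clearly decreasing, roughly 10x below the minimum (e.g. 1e-2 here).
learn.fit(1e-2, 2)
```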
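
Precompute/augmentation sketch: why data augmentation has no effect while precompute=True, and the switch to SGDR with cycle_len=1. Same placeholder names as above; transforms_side_on and max_zoom=1.1 are the values mentioned in the list.

```python
# Data augmentation: side-on flips, small rotations/lighting changes, random zoom up to 1.1x.
tfms = tfms_from_model(arch, sz, aug_tfms=transforms_side_on, max_zoom=1.1)
data = ImageClassifierData.from_paths(PATH, tfms=tfms, bs=bs)

# precompute=True caches the activations of the frozen pretrained layers once,
# so only the new final layers are trained -- fast, but augmented images are never seen.
learn = ConvLearner.pretrained(arch, data, precompute=True)
learn.fit(1e-2, 1)

# Turn precomputation off so every epoch sees freshly augmented images, then train
# with SGDR: cycle_len=1 gives one cosine-annealing cycle (one restart) per epoch.
learn.precompute = False
learn.fit(1e-2, 3, cycle_len=1)
learn.sched.plot_lr()        # the learning rate schedule, showing the restarts
learn.save('224_lastlayer')  # checkpoint the weights
```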
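
Differential-learning-rates sketch: unfreeze the pretrained layers and give the early layers a much smaller rate than the new head. The three values are example rates, not the only valid choice.

```python
learn.unfreeze()                   # all layer groups become trainable
lr = np.array([1e-4, 1e-3, 1e-2])  # early / middle / last layer groups
# If your images look like ImageNet images, the earlier groups get ~10x smaller rates;
# if not (e.g. satellite images), ~3x smaller is a better starting point.

# 3 cycles with cycle_mult=2: cycle lengths of 1, 2 and 4 epochs (7 epochs total).
learn.fit(lr, 3, cycle_len=1, cycle_mult=2)
learn.sched.plot_lr()    # the three cycles, each twice as long as the previous one
learn.sched.plot_loss()  # the training loss over the iterations
learn.save('224_all')
```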
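
TTA sketch: Test Time Augmentation averages predictions over several augmented copies of each validation image. Averaging the exponentiated log-predictions and using accuracy_np follows the course notebook; older notebook versions call accuracy(log_preds, y) directly, as in the list.

```python
# One set of log-predictions per augmented copy of the validation set, plus the targets.
log_preds, y = learn.TTA()
probs = np.mean(np.exp(log_preds), 0)  # average the probabilities over the copies
print(accuracy_np(probs, y))           # validation accuracy after TTA
```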
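
Confusion-matrix sketch: confusion_matrix comes from scikit-learn, plot_confusion_matrix from fastai.plots.

```python
from sklearn.metrics import confusion_matrix
from fastai.plots import *               # provides plot_confusion_matrix

preds = np.argmax(probs, axis=1)         # predicted class index for each validation image
cm = confusion_matrix(y, preds)
plot_confusion_matrix(cm, data.classes)  # rows: actual class, columns: predicted class
```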
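
Dog Breed data-setup sketch: reading the labels CSV with pandas, building the validation indexes, and the dictionary/list comprehensions used to inspect image sizes. It assumes the Kaggle data sits under a placeholder PATH containing train/, test/ and labels.csv.

```python
import PIL

PATH = 'data/dogbreed/'             # placeholder: train/, test/, labels.csv
label_csv = f'{PATH}labels.csv'
n = len(list(open(label_csv))) - 1  # number of labelled images (minus the header row)
val_idxs = get_cv_idxs(n)           # a random 20% of the indexes for the validation set

label_df = pd.read_csv(label_csv)
# pivot table: number of images per breed, most frequent breeds first
label_df.pivot_table(index='breed', aggfunc=len).sort_values('id', ascending=False)

tfms = tfms_from_model(arch, sz, aug_tfms=transforms_side_on, max_zoom=1.1)
data = ImageClassifierData.from_csv(PATH, 'train', label_csv, test_name='test',
                                    val_idxs=val_idxs, suffix='.jpg', tfms=tfms, bs=bs)

# dictionary comprehension: filename -> (width, height); then a list comprehension
# over the widths, plotted as a histogram with matplotlib
size_d = {k: PIL.Image.open(PATH + k).size for k in data.trn_ds.fnames}
row_sz = np.array([s[0] for s in size_d.values()])
plt.hist(row_sz)
```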
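
Progressive-resizing sketch: start training fast on small images, then call learn.set_data() with larger ones; get_data is a small helper function (the name used in the course notebook) that rebuilds the data object for a given image size and batch size.

```python
def get_data(sz, bs):
    # Rebuild the data object for a given image size and batch size.
    tfms = tfms_from_model(arch, sz, aug_tfms=transforms_side_on, max_zoom=1.1)
    return ImageClassifierData.from_csv(PATH, 'train', label_csv, test_name='test',
                                        val_idxs=val_idxs, suffix='.jpg', tfms=tfms, bs=bs)

# Train quickly on small images first...
learn = ConvLearner.pretrained(arch, get_data(224, bs), precompute=True)
learn.precompute = False
learn.fit(1e-2, 3, cycle_len=1)

# ...then switch to larger images; fully convolutional architectures handle the new size.
learn.set_data(get_data(299, bs))
learn.freeze()                   # keep the pretrained layers frozen for now
learn.fit(1e-2, 3, cycle_len=1)
```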