Hello,
Here is a timeline of yesterday's lesson 1 video, with links to the corresponding parts of the video.
Welcome
- (not related to DL) Welcome speech by Pete Baker: https://www.youtube.com/watch?v=7hX8yKCX6xM&t=1580
- (not related to DL) Welcome speech by David Uminsky (Director of the USF Data Institute): https://www.youtube.com/watch?v=7hX8yKCX6xM&t=1869
Jeremy Howard
- Home and general information: https://www.youtube.com/watch?v=7hX8yKCX6xM&t=1964
– Thread on lesson 1 in the forum: https://www.youtube.com/watch?v=7hX8yKCX6xM&t=2415
– Thread: https://forums.fast.ai/t/lesson-1-class-discussion-and-resources/27332
– Docs on the course: http://course-v3.fast.ai/
– Fastai docs in HTML: http://docs.fast.ai
– Fastai docs on GitHub: https://github.com/fastai/fastai_docs
– Fastai docs in Jupyter Notebooks: https://github.com/fastai/fastai_docs/tree/master/docs_src
- Online GPU setup: https://www.youtube.com/watch?v=7hX8yKCX6xM&t=2502
– Set up a GPU: http://course-v3.fast.ai/#using-a-gpu
– FAQ on the course: https://forums.fast.ai/t/faq-and-resources-read-this-first/24987
- Beginning of the lesson: https://www.youtube.com/watch?v=7hX8yKCX6xM&t=3142
- Jupyter notebook: https://github.com/fastai/course-v3/blob/master/nbs/dl1/00_notebook_tutorial.ipynb
Notebook 1 (https://github.com/fastai/course-v3/blob/master/nbs/dl1/lesson1-pets.ipynb)
- Step 1: import and prepare the images (https://www.youtube.com/watch?v=7hX8yKCX6xM&t=3890)
– creation of the databunch, the general dataset object that contains the 3 datasets (train, val and test)
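To make the idea of separate train and val sets concrete, here is a minimal sketch in plain Python. It is not fastai code: the function name, the 20% hold-out, the seed and the file names are all assumptions chosen for illustration.

```python
import random

def split_train_valid(items, valid_pct=0.2, seed=42):
    """Shuffle `items` reproducibly and hold out `valid_pct` of them for validation."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_valid = int(len(shuffled) * valid_pct)
    # The first n_valid shuffled items become the val set, the rest the train set.
    return shuffled[n_valid:], shuffled[:n_valid]

# Hypothetical file names, for illustration only.
fnames = [f"img_{i}.jpg" for i in range(10)]
train, valid = split_train_valid(fnames)
```

Fixing the seed keeps the same images in the val set from one run to the next, so error values stay comparable between runs.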
- Step 2: create the model (https://www.youtube.com/watch?v=7hX8yKCX6xM&t=5274)
– creation of the learn model, which contains the neural network architecture and the databunch dataset (we can also pass the error metric to evaluate on the val set as an argument if we want)
– Transfer learning: we start from the parameters of a model already trained to recognize objects in images (resnet34)
– Overfitting: to check that during training our model does not specialize on the train set but really learns the general characteristics of the objects to detect, we use a val set on which the error is calculated (see the metric above) in the learn model
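As a rough illustration of the error metric computed on the val set, here is a small sketch in plain Python. It is not the fastai implementation, and the labels are made up for the example.

```python
def error_rate(predicted, actual):
    """Fraction of examples whose predicted label differs from the actual label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    wrong = sum(p != a for p, a in zip(predicted, actual))
    return wrong / len(actual)

# Hypothetical labels, for illustration only.
val_actual    = ["pug", "beagle", "birman", "ragdoll"]
val_predicted = ["pug", "beagle", "ragdoll", "ragdoll"]
val_error = error_rate(val_predicted, val_actual)  # 1 wrong out of 4
```

If this value stops improving on the val set while the loss on the train set keeps falling, that is the sign of overfitting described above.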
- Step 3: train the model with the fit_one_cycle() method and not fit() as in the previous version of the course (explanation of the Leslie Smith paper in the article by @sgugger: The 1cycle policy)
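For intuition about what a one-cycle schedule looks like, here is a simplified sketch in plain Python: the learning rate ramps up to a maximum and then decays. This is an illustration only, not what fit_one_cycle() does internally; the cosine shape and the pct_start and div_factor values are assumptions.

```python
import math

def one_cycle_lr(step, total_steps, max_lr, pct_start=0.3, div_factor=25.0):
    """Learning rate at `step` for a simplified one-cycle schedule (illustrative)."""
    warmup_steps = max(1, int(total_steps * pct_start))
    start_lr = max_lr / div_factor
    if step < warmup_steps:
        # Phase 1: cosine ramp from start_lr up to max_lr.
        progress = step / warmup_steps
        return start_lr + (max_lr - start_lr) * (1 - math.cos(math.pi * progress)) / 2
    # Phase 2: cosine decay from max_lr towards zero.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return max_lr * (1 + math.cos(math.pi * progress)) / 2

# Learning rate for each of 100 training steps, with a hypothetical peak of 3e-3.
schedule = [one_cycle_lr(s, 100, max_lr=3e-3) for s in range(100)]
```

Plotting schedule against the step number shows a single hump: a short warm-up followed by a longer decay.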
After the break: https://www.youtube.com/watch?v=7hX8yKCX6xM&t=6536
- Step 4: analyze the predictions made by the model to understand how it works and possibly improve it (https://www.youtube.com/watch?v=7hX8yKCX6xM&t=7922)
– use of the interp object created by the ClassificationInterpretation.from_learner(learn) method
– 3 methods to use on the interp object:
  – plot_top_losses() to view the images on which the model makes the biggest errors (highest loss),
  – plot_confusion_matrix(), which displays the confusion matrix,
  – most_confused(), which lists the pairs of classes (actual, predicted) that the model confuses most often
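The idea behind most_confused() can be sketched in a few lines of plain Python: count the (actual, predicted) pairs that disagree and sort them by frequency. This is a stand-in written for illustration, not the fastai implementation, and the labels below are invented.

```python
from collections import Counter

def most_confused(actual, predicted, min_val=1):
    """Return (actual, predicted, count) for misclassified pairs, most frequent first."""
    pairs = Counter((a, p) for a, p in zip(actual, predicted) if a != p)
    return [(a, p, n) for (a, p), n in pairs.most_common() if n >= min_val]

# Hypothetical labels, for illustration only.
actual    = ["beagle", "beagle", "pug", "pug", "pug", "ragdoll", "birman", "birman"]
predicted = ["beagle", "pug", "pug", "pug", "beagle", "birman", "ragdoll", "ragdoll"]

# Keep only the pairs that were confused at least twice.
confused = most_confused(actual, predicted, min_val=2)
```

The same counts, arranged in a grid with one row per actual class and one column per predicted class, form the confusion matrix.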
- Step 5: improve the model (https://www.youtube.com/watch?v=7hX8yKCX6xM&t=8310)
– find the best learning rate with the lr_find() method and then recorder.plot() (to display the loss-vs-lr curve)
– then use the unfreeze() method on the learn model in order to train all the layers of the resnet34 network, and not only the layers added at the end of the model (the ones that give a probability for each of the 37 classes), BUT using different learning rates for different layers via learn.fit_one_cycle(2, max_lr=slice(1e-6,1e-4)): the idea is that the first layers do not need to be modified much because they have already been trained to detect simple geometric shapes that are found in all images.
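To see how a slice(1e-6, 1e-4) can turn into one learning rate per group of layers, here is a sketch in plain Python. It assumes the rates are spaced geometrically between the two ends and that the network is split into three layer groups; both are assumptions for illustration, and spread_lrs is a hypothetical helper, not a fastai function.

```python
def spread_lrs(lr_slice, n_groups):
    """Spread learning rates geometrically from lr_slice.start to lr_slice.stop."""
    if n_groups == 1:
        return [lr_slice.stop]
    # Constant multiplicative step between consecutive layer groups.
    ratio = (lr_slice.stop / lr_slice.start) ** (1 / (n_groups - 1))
    return [lr_slice.start * ratio ** i for i in range(n_groups)]

# Earliest layers get the smallest rate, the last layers the largest.
lrs = spread_lrs(slice(1e-6, 1e-4), n_groups=3)
```

With these assumptions the first group trains at about 1e-6, the middle group at about 1e-5 and the last group at about 1e-4, so the early layers change the least.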
- Step 6: we can still get a better result (a lower error) by changing the model and using a deeper, more complex model like resnet50 (https://www.youtube.com/watch?v=7hX8yKCX6xM&t=9018)