A Guided Walk-through of 2.0 (Like Practical Deep Learning for Coders)

(Zachary Mueller) #1

For a bit of background: this past year at my university I ran my own “study group,” where I led my own rendition of Practical Deep Learning for Coders. Each lecture focused on a single data type and covered it in depth. I knew I eventually wanted to redo these notebooks in 2.0 for the Spring, when I kick the group off again, but since I designed the whole thing to be intro-friendly, I have decided to port them over to 2.0 now. In these notebooks I will go over the high-level API differences, so that those who may only stay at this level for now don’t get too overwhelmed by all the new information that 2.0 brings.

The first few notebooks will be very similar to the original course, as it’s a great introduction, and will then branch off from there. The first one, where I go over PETs, is available here! Over the next week or two I’ll slowly be converting more notebooks and bringing them over.

Notebooks Available:

01 Image Classification (and an introduction to the library!)
02 Custom Image Classification (and how to use .label_from_folder())
03a Tabular Data (and how to use labeled test sets)
03b K-Fold Validation and Ensembling
04b Permutation Importance
05 Multi-Label and Variations with the DataBlock API
06 Utilizing the State of the Art
07 Image Regression
08a IMDB Sample (text in a DataFrame or csv)
10 Segmentation

  • Note these notebooks were originally made in Colab so they will work in that environment as well as regular Jupyter :slight_smile:
30 Likes

(Amrit ) #2

@muellerzr just what I was looking for! Cheers!

1 Like

(Zachary Mueller) #3

Thanks @amritv :slight_smile: I just uploaded a tabular example. I’m working on an example with tabular RAPIDS next, and then Bayesian optimization, k-folds, and regression (Rossmann).

2 Likes

(Amrit ) #4

Sounds great - got this error:

AttributeError: type object 'Image' has no attribute 'size'

trying to figure out why - any ideas?

Ran the following lines in Colab to load all dependencies:

> !pip3 install torch===1.3.0 torchvision===0.4.1 -f https://download.pytorch.org/whl/torch_stable.html
> !pip install git+https://github.com/fastai/fastai_dev > /dev/null

However getting this error:

AttributeError                            Traceback (most recent call last)

<ipython-input-3-352c3a9f46af> in <module>()
      1 from fastai2.basics import *
----> 2 from fastai2.callback.all import *
      3 from fastai2.vision.all import *

2 frames

/usr/local/lib/python3.6/dist-packages/fastai2/vision/core.py in <module>()
     20 #Cell
     21 if not hasattr(Image,'_patched'):
---> 22     _old_sz = Image.Image.size.fget
     23     @patch_property
     24     def size(x:Image.Image): return Tuple(_old_sz(x))

AttributeError: type object 'Image' has no attribute 'size'

When running:

> from fastai2.basics import *
> from fastai2.callback.all import *
> from fastai2.vision.all import *
1 Like

(Zachary Mueller) #5

Ah yes I think I need to update that notebook. You need to install the most recent Pillow for that to work.

!pip install Pillow --upgrade

Thanks!

0 Likes

(Amrit ) #6

Got it, I saw that in this post: Fastai-v2 - read this before posting please! 😊

Works now!

1 Like

(Zachary Mueller) #7

I’ve added a k-fold validation example notebook; at the bottom of it I show how to ensemble models together.
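For readers who have not opened the notebook, here is a rough sketch of the two ideas in plain Python. It is not the notebook's code and uses no fastai calls: `k_fold_indices` and `ensemble_mean` are hypothetical helper names, and the ensembling shown is simple averaging of per-model class probabilities, which is only one of several possible schemes.

```python
import random

def k_fold_indices(n_items, k, seed=0):
    """Yield (train_idx, valid_idx) pairs that split n_items into k folds."""
    idx = list(range(n_items))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k folds of roughly equal size
    for i in range(k):
        valid = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, valid

def ensemble_mean(per_model_probs):
    """Average class probabilities across models, sample by sample."""
    n_models = len(per_model_probs)
    return [
        [sum(vals) / n_models for vals in zip(*rows)]
        for rows in zip(*per_model_probs)
    ]

# Five folds over ten items: each item is used for validation exactly once.
splits = list(k_fold_indices(10, k=5))

# Two hypothetical models' predictions for two samples and two classes.
probs_a = [[0.9, 0.1], [0.2, 0.8]]
probs_b = [[0.7, 0.3], [0.4, 0.6]]
avg = ensemble_mean([probs_a, probs_b])
```

In practice one model would be trained per fold on `train` and checked on `valid`, and the fold models' test-set predictions would be passed to something like `ensemble_mean`.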

0 Likes

(Jeremy Howard (Admin)) #8

@muellerzr very nice! Note that as of today there’s a full lesson 1 nb available, thanks to @sgugger

5 Likes

(Zachary Mueller) #9

Thanks @Jeremy! It means a lot hearing that from you :slight_smile: I’ll try to not step on @sgugger’s toes too much then. These will focus more on a direct comparison of what’s new and some nice implementations (feature importance, Ranger, etc) :slight_smile:

1 Like

(Jeremy Howard (Admin)) #10

Don’t worry about his toes - I was only pointing it out in case there were some useful ideas you could steal.

3 Likes

#11

Yes, it’s never a problem to have too many resources :slight_smile:

2 Likes

(Zachary Mueller) #12

Thanks Jeremy! There certainly were!

Sounds good Sylvain I’ll do my best!

I’ve now updated notebooks 1 and 2 with ImageDataBunch examples as well as the full Pipeline (also, from folder now works the way we’re used to! Thanks Sylvain!)

0 Likes

(Zachary Mueller) #13

I’ve added a notebook detailing permutation importance based on Pak’s work in 1.0 here

2 Likes

(Zachary Mueller) #14

So I was hoping for a RAPIDS notebook, but Colab doesn’t seem to let me use a T4 instance… :frowning: Otherwise, I have a multi-label classification notebook based on sgugger’s example :slight_smile:

0 Likes

(Pavel) #15

Hi.
@muellerzr, I was trying to understand why the importances in your notebook look unfamiliar to me (if I read them correctly, the top feature is responsible for only 3% of accuracy) compared with my football case, where the top feature is 30% important, and I noticed that you calculate FI on a separate set. Have I understood that correctly?
If so, it raises a serious question: should we calculate FI on the whole set (including the data the model was trained on) or on a separate one (to try to be unbiased)?
I have thought about this a bit before, and for now my view is the following. Since this type of FI calculation, strictly speaking, analyses the model itself more than the data, we can use the whole dataset. That lets us a) use more data (which can be important when data is scarce) and b) maybe get clearer results, because the error on the test set (never seen by the model) is large and often of the same order of magnitude as the error after permutation. I mean that (base_error - value) can be more chaotic when base_error is as big as value.
By the way, I don’t really remember whether Jeremy said anything on this topic in the video that introduces the Feature Importance concept.

1 Like

(Zachary Mueller) #16

Hi @Pak! To answer your question, I chose (and have been using) an entirely separate test set for a few reasons. I wanted FI to focus on how my model behaves on unseen data, to see what it will do in the real world. The benefit is two-fold: it eliminates any biases toward my features picked up during training, and it gives a better understanding of the model’s behavior once it is in production.

Jeremy went over it briefly in the Intro to ML course, though I can’t recall whether he covered permutation importance specifically (it was for random forests then).

Let me know if you have any thoughts or questions! :slight_smile:

Also, here is a note from the scikit-learn documentation:

Using a held-out set makes it possible to highlight which features contribute the most to the generalization power of the inspected model. Features that are important on the training set but not on the held-out set might cause the model to overfit.
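To make the held-out idea concrete, here is a minimal sketch of permutation importance in plain Python. It is not the notebook's code and not scikit-learn's implementation: `accuracy` and `permutation_importance` are hypothetical helpers, and the "model" is a toy rule that only looks at the first column, standing in for a trained learner.

```python
import random

def accuracy(predict, rows, labels):
    """Fraction of rows for which predict(row) equals the label."""
    hits = sum(predict(r) == y for r, y in zip(rows, labels))
    return hits / len(rows)

def permutation_importance(predict, rows, labels, n_repeats=5, seed=0):
    """Mean drop in accuracy per column when that column is shuffled."""
    rng = random.Random(seed)
    base = accuracy(predict, rows, labels)
    importances = []
    for col in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle one column, leaving every other column untouched.
            shuffled = [r[col] for r in rows]
            rng.shuffle(shuffled)
            permuted = [r[:col] + [v] + r[col + 1:]
                        for r, v in zip(rows, shuffled)]
            drops.append(base - accuracy(predict, permuted, labels))
        importances.append(sum(drops) / n_repeats)
    return importances

# Toy held-out set: the label depends only on column 0; column 1 is noise.
data_rng = random.Random(1)
held_out = [[data_rng.random(), data_rng.random()] for _ in range(200)]
labels = [int(r[0] > 0.5) for r in held_out]

def toy_model(row):
    """Stand-in for a trained model; it ignores column 1 entirely."""
    return int(row[0] > 0.5)

fi = permutation_importance(toy_model, held_out, labels)
```

Here `fi[1]` is exactly zero because the toy model never reads column 1, while `fi[0]` is large. Running the same function on the training rows and on the held-out rows, then comparing the two lists, is one way to look for the features a model leans on too heavily.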

1 Like

(Pavel) #17

I think I do understand what FI shows on the training set. It determines which columns the model binds itself to the most; in some sense, it shows which columns contain the most usable information about the dependent variable. But I don’t quite have an intuition for what this method of FI ‘means’ on unseen data.
And I’m afraid it can be hard to catch the ‘useful signal’ of importance in the chaos of low accuracy on unseen data. A similar problem (a low signal-to-noise ratio) turned me away from calculating FI by retraining :frowning:

This may be part of the answer. I don’t know what test-set FI means, but the difference between FI on the training and test sets is a good source of information about which features the model may tend to bind to too much.
I should check that out, thanks.

1 Like

(Zachary Mueller) #18

I’ve added a notebook detailing how to use the new Ranger optimizer and a new fit function, as well as the Mish activation function :slight_smile: Still working on an update to PyTorch before I can continue on to any of the other vision-based tasks. There are a few implementation issues that I’m working on sorting out, so you won’t get quite the accuracy you expect yet.

1 Like

(Jeremy Howard (Admin)) #19

@muellerzr FYI there’s a Lookahead wrapper for optimizers, instead of a Ranger optimizer. Use it like so:

def optfunc(p, lr=defaults.lr): return Lookahead(RAdam(p, lr=lr))
2 Likes

(Zachary Mueller) #20

Exactly what I did, thanks :wink:

def opt_func(ps, lr=defaults.lr): return Lookahead(RAdam(ps, wd=1e-2, mom=0.95, eps=1e-6, lr=lr))

(I lined up our hyperparameters)
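For anyone wondering what a Lookahead wrapper does around the inner optimizer, here is a rough, library-free sketch of the slow/fast weight scheme. It deliberately swaps RAdam for plain SGD to stay short, and `lookahead_sgd`, `grad_fn`, and the values of `k` and `alpha` are illustrative assumptions, not fastai2's implementation or defaults.

```python
def lookahead_sgd(grad_fn, w0, lr=0.1, k=5, alpha=0.5, n_steps=50):
    """Minimize with plain SGD wrapped in Lookahead's slow/fast weights."""
    slow = list(w0)  # slow weights, updated once every k inner steps
    fast = list(w0)  # fast weights, updated by the inner optimizer
    for step in range(1, n_steps + 1):
        grads = grad_fn(fast)
        fast = [w - lr * g for w, g in zip(fast, grads)]  # inner SGD step
        if step % k == 0:
            # Move the slow weights a fraction alpha toward the fast weights,
            # then restart the fast weights from the new slow weights.
            slow = [s + alpha * (f - s) for s, f in zip(slow, fast)]
            fast = list(slow)
    return slow

def quadratic_grad(w):
    """Gradient of f(w) = sum(w_i ** 2), whose minimum is at the origin."""
    return [2.0 * wi for wi in w]

w_final = lookahead_sgd(quadratic_grad, [1.0, -2.0])
```

The wrapper only touches the weights every `k` steps, which is why it can sit on top of any inner optimizer, RAdam included.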

1 Like