Wiki: Lesson 3

ibunny · March 14, 2018, 1:16am

Finally got it! Thanks a million for your patience :relaxed:

srmsoumya · March 22, 2018, 10:17am

I faced the same issue, and calling learn.predict() solved it. Were you able to figure out the actual problem, though?

hanumant · March 23, 2018, 5:44am

When I try to run the Dog Breed test set, I get log_preds.shape as (5, 10357, 120) and y.shape as (10357, 1).
I am not sure what the 5 is. I imagine the 120 is the one-hot encoded version of the dog breed classes?

log_preds,y = learn.TTA(is_test=True)
log_preds.shape
(5, 10357, 120)

In any case I am guessing I cannot submit this to Kaggle. Can someone advise on how to correct the log_preds?

wdhorton · March 23, 2018, 11:15am

The 5 comes from the number of test-time augmentations (from learn.TTA). Basically, you have a set of predictions for each of 5 differently transformed versions of every image. You’d want to average them over that first axis in order to submit to Kaggle (or use learn.predict, which doesn’t do any augmentation).
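For anyone following along, the averaging step can be sketched with plain NumPy. This is only an illustration under stated assumptions: the array below is synthetic and merely mimics the (augmentations, images, classes) layout reported above, with far fewer images, and it assumes the values are log-probabilities, as the name log_preds suggests.

```python
import numpy as np

# Synthetic stand-in for the output of learn.TTA(is_test=True) (assumption):
# shape (n_augmentations, n_images, n_classes), holding log-probabilities.
rng = np.random.default_rng(0)
n_aug, n_images, n_classes = 5, 8, 120
logits = rng.normal(size=(n_aug, n_images, n_classes))
log_preds = logits - np.log(np.exp(logits).sum(axis=-1, keepdims=True))

# Convert log-probabilities to probabilities, then average over the augmentation axis.
probs = np.mean(np.exp(log_preds), axis=0)

# One predicted class index per image.
pred_classes = np.argmax(probs, axis=1)

print(probs.shape)         # (8, 120)
print(pred_classes.shape)  # (8,)
```

The averaged probs array has one row per test image and one column per class, which is the layout a per-class probability submission normally needs.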

hanumant · April 10, 2018, 12:35am

Hi,

I am trying to do the State Farm classification problem on Kaggle, using resnext101.

I looked at the image sizes: they are 480x480, whereas I believe resnext101 uses 299?

Also I don’t think I can flip the images vertically as the labels care about the orientation of the object under classification.

Can anyone suggest what augmentations to apply?

fone · April 10, 2018, 12:38am

I want to get a “Confusion Matrix” for the results, but I get the error:

ValueError: Classification metrics can't handle a mix of multilabel-indicator and multiclass targets.

Maybe this is the wrong type of graph / visualization? Do you not do CMs for multi-label classifiers? And if you do, how would you do this?

I tried:

import numpy as np
from sklearn.metrics import confusion_matrix
probs = np.mean(np.exp(multi_preds),0)
preds2 = np.argmax(probs, axis=1)
cm = confusion_matrix(y, preds2)
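The error message suggests that y is a multi-hot indicator matrix (one column per label) while preds2 holds a single class index per row, so the two cannot be compared directly. One common option for multi-label problems is a separate 2x2 matrix per label. The sketch below is only an illustration: the data is made up, the 0.5 threshold is an arbitrary assumption, and per_label_confusion is a hypothetical helper, not part of fastai or scikit-learn.

```python
import numpy as np

def per_label_confusion(y_true, y_pred):
    """Return an array of shape (n_labels, 2, 2), each laid out as [[TN, FP], [FN, TP]]."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_pred).astype(bool)
    tp = (y_true & y_pred).sum(axis=0)
    tn = (~y_true & ~y_pred).sum(axis=0)
    fp = (~y_true & y_pred).sum(axis=0)
    fn = (y_true & ~y_pred).sum(axis=0)
    top = np.stack([tn, fp], axis=-1)     # shape (n_labels, 2)
    bottom = np.stack([fn, tp], axis=-1)  # shape (n_labels, 2)
    return np.stack([top, bottom], axis=-2)

# Made-up targets: 4 images, 3 labels, multi-hot encoded.
y = np.array([[1, 0, 1],
              [0, 1, 0],
              [1, 1, 0],
              [0, 0, 1]])

# Made-up per-label probabilities (e.g. sigmoid outputs averaged over TTA).
probs = np.array([[0.9, 0.2, 0.4],
                  [0.1, 0.8, 0.3],
                  [0.7, 0.4, 0.2],
                  [0.6, 0.1, 0.9]])

preds = probs > 0.5  # threshold is an assumption; tune it for your metric
cms = per_label_confusion(y, preds)
print(cms[0])  # confusion matrix for the first label
```

Recent scikit-learn releases also provide sklearn.metrics.multilabel_confusion_matrix, which uses the same [[TN, FP], [FN, TP]] layout per label, if your installed version includes it.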

amir01 · April 19, 2018, 7:42am

@jeremy, in the Planet competition, why is data = data.resize(int(sz*1.3), 'tmp') used instead of data = data.resize(int(sz), 'tmp')? I couldn’t find an answer about the 1.3 on the forums.

I also wanted to ask about learn.set_data. Is it further fine-tuning the CNN (which we trained on small images), now with the same images but at bigger dimensions? It seems strange, because I thought training on the same data for more epochs leads to overfitting, but here just changing the image dimensions and training again avoids overfitting.

pierreguillou · April 25, 2018, 1:16am

Check your understanding of lesson 3

(original post in Portuguese)

Hi guys,

I watched the video of lesson 3 (part 1) again to get the whole picture, and I took notes of the vocabulary used by @jeremy.

Let’s play! OK?
Can you give a definition, a URL or an explanation for all the following terms and expressions?

If yes, you are done with the 3rd lesson!

PS: if you do not want to test yourself, or you want to check your answers, go to the blog post “Deep Learning 2: Part 1 Lesson 3” by @hiromi: great work!

try to teach what you learned by posting in a blog
wiki thread in the Fastai forum
AWS fastai AMI
Github
Tmux (Ubuntu, macOS)
understand why some validation images are not classified well
learning rate
why is a low learning rate safer but slower for training a NN?
why can a high learning rate increase the value of the loss function?
learn.lr_find(); learn.sched.plot()
batch size
SGDR
fastai vs pytorch
CNN, or Convolutional Neural Network
Resnet
Beginner Fastai forum
Kaggle site
How to download data from Kaggle: the kaggle-cli script
pip install kaggle-cli
accept the competition rules on the Kaggle site
kg download -u user -p 'password' -c competition
How to download images from any site
CurlWget, a Google Chrome extension
symlinks
ls -l in a terminal
Quick DogsCats
fastai.conv_learner
tfms, data transformation
data object
shift + tab
test_name="test"
learn object
precompute=True
learn.unfreeze()
learn.bn_freeze(True) for deeper networks (resnet50 and above) when your dataset is very similar to the ImageNet dataset (as ours, dogs and cats, is): it stops the batch normalization statistics from being updated
batch normalization
use TTA to get validation predictions
tensorflow, keras // pytorch, fastai
mobile applications
create a submission file
individual prediction
http://setosa.io/ev/image-kernels/
difference between element-wise product and matrix product?
Video by Otavio Good: “A visual and intuitive understanding of deep learning”
convolutional kernel / filter with a shape of 3 x 3
search for edges (left and top)
feature maps
non linearity, relu
max pooling
fastai/courses/dl1/excel
MNIST database
filter to detect top edges
we get an activation by taking the element-wise product of the convolutional filter with an image patch and summing the result
an activation is a number that is calculated
ReLU means max(0, value)
PyTorch stores convolutional filters as a tensor
a tensor is an array with more dimensions (additional axes)
the depth of each hidden layer in a CNN is the number of convolutional filters used to get its feature maps
a convolutional kernel has 3 dimensions, and the third one is the number of feature maps in its input layer
max pooling: reduce the spatial dimensions by sub-sampling (keeping the max) over non-overlapping windows
fully connected layer (linear matrix product)
but a big CNN gives a big number of weights in the fully connected layers: risk of overfitting!
VGG (16 layers): 138 million weights
VGG (19 layers): more than 143 million weights
in these CNNs, the number of weights in the convolutional filters is about 20 million: the majority of the weights come from the fully connected layers
ResNet and ResNeXt do not use large fully connected layers
the 50-layer ResNet network has about 26 million weight parameters and computes ~16 million activations in the forward pass (https://www.graphcore.ai/posts/why-is-so-much-memory-needed-for-deep-neural-networks)
the fully connected layers do a classic matrix product
last layer: there is no ReLU (so we can have negative values)
softmax is an activation function that produces probabilities
softmax tends to pick one thing over the others (i.e., give it a probability clearly higher than the rest): its “personality” is to pick a single thing (so it is perfect for a classifier that assigns one label per image)
sigmoid is an activation function used for multi-label classifiers because it gives a number between 0 and 1 (which looks like a probability) for each label
ReLU is an activation function too, but it does not produce probabilities
an activation function is a function applied on activations
in Deep Learning, an activation function adds a non-linearity
we must know log, exp
activation functions have a personality
we cannot use softmax for multi-label classification
if your objective is to classify multi-label images, you cannot use ImageClassifierData.from_paths because an image cannot be in more than one folder. Instead, you need to use ImageClassifierData.from_csv
Good news: the fastai library will recognize from your csv file whether images have more than one label (multi-label classification)
data.val_ds (ds as in dataset in PyTorch): gives you a single image (or object) back
data.val_dl (dl as in data loader in PyTorch): gives you a transformed mini-batch
in PyTorch, to get the next mini-batch, we use an iterator: next(iter(data.val_dl))
if you know python, you learn pytorch naturally
zip takes 2 lists and combines them : list(zip(data.classes,y[0]))
one-hot encoded vector
DogsCats and Dog Breed were single-label classification problems
images from the Planet competition are not like the ones used in the ImageNet competition
you can change the input image size during training for networks that have an adaptive pooling layer before the first fully connected layer, like ResNet (but not VGG): learn.set_data(get_data(sz))
resize the images before passing them to the data object with data.resize(int(sz*1.3), 'tmp'): a speed-up! (faster than resizing on the fly in the tfms)
after dogsbreed, try to run the Planet jupyter notebook
metrics for accuracy : metrics = [f2] (f2 uses fbeta_score) and pass it to the learn object : learn = ConvLearner.pretrained(arch, data, metrics=metrics)
in the Fastai library, everything can be changed
sigmoid function is used for logistic regression
fastai chooses automatically softmax or sigmoid activation function
when you use a pretrained CNN, the weights of the early layers of your new model are not random, but the weights of the last fully connected layers you added are random. So you need to train these last layers first, before unfreezing and training the whole network. If not, the random weights of the last layers will destroy the weights of the early layers (from the pretrained model)
a center crop of size sz is taken from each input image. That’s why it is important to apply data augmentation to the input dataset beforehand
in the fastai library, there is a concept of layer groups
learn.summary()
tables of data : structured data
audio, images, natural language: unstructured data
Grocery Sales Forecasting competition on Kaggle
Rossmann data
from fastai.structured import *
from fastai.column_data import *
pandas (book: Python for Data Analysis)
test = pd.read_csv(f'{PATH}test.csv', parse_dates=['Date'])
there is a difference from the DogsCats dataset: we do a lot of preprocessing on this structured data
join Kaggle and do competitions!
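To make the softmax / sigmoid / ReLU items in the list above concrete, here is a small NumPy sketch. It is only an illustration: the activation values and the label names are made up, and the functions are written by hand rather than taken from fastai or PyTorch.

```python
import numpy as np

def relu(x):
    # ReLU: max(0, value), applied element-wise
    return np.maximum(0, x)

def softmax(x):
    # Subtract the max for numerical stability; the outputs sum to 1
    e = np.exp(x - np.max(x))
    return e / e.sum()

def sigmoid(x):
    # Each output lies independently between 0 and 1
    return 1.0 / (1.0 + np.exp(-x))

# Made-up activations from a final layer (no ReLU, so negatives are allowed)
activations = np.array([2.0, -1.0, 0.5, 3.0])
classes = ["agriculture", "clear", "primary", "water"]  # illustrative label names

sm = softmax(activations)   # single-label: tends to pick one winner
sg = sigmoid(activations)   # multi-label: several labels can be "on" at once

print(list(zip(classes, np.round(sm, 3))))
print(list(zip(classes, np.round(sg, 3))))
print(relu(activations))
```

With softmax the probabilities compete with each other, so one class ends up dominating. With sigmoid several labels can score above 0.5 at the same time, which is what a multi-label problem like Planet needs.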

jcatanza · April 30, 2018, 4:14pm

If you are heading into Lesson 4, don’t miss @Rachel’s post An Introduction to Deep Learning for Tabular Data

sashank06 · May 2, 2018, 8:57pm

Hi @jeremy and @rachel,
I was about to start Lesson 3 of the part 1 course, and I keep getting a YouTube error: “Watch this video on YouTube. Playback on other sites has been disabled by the video owner.” If I use the YouTube link, there is a caption posted by Jeremy which states: “NB: Please go to http://course.fast.ai to view this video since there is important updated information there. If you have questions, use the forums at http://forums.fast.ai”. Which one should I view?

jcatanza · May 3, 2018, 11:47pm

Hi all.

I am working through the Lesson 3 Rossmann notebook.

When I get to the command (just below the Sample subsection)

m.fit(lr, 3, metrics=[exp_rmspe])

I get KeyError: <weakref at 0x7fb6eb4e2688; to 'tqdm' at 0x7fb6eed23c88>

The details are below. Has anyone dealt with this error, or know what to do about it?

KeyError Traceback (most recent call last)
in ()
----> 1 m.fit(lr, 3, metrics=[exp_rmspe])

~/fastai/courses/dl1/fastai/learner.py in fit(self, lrs, n_cycle, wds, **kwargs)
285 self.sched = None
286 layer_opt = self.get_layer_opt(lrs, wds)
--> 287 return self.fit_gen(self.model, self.data, layer_opt, n_cycle, **kwargs)
288
289 def warm_up(self, lr, wds=None):

~/fastai/courses/dl1/fastai/learner.py in fit_gen(self, model, data, layer_opt, n_cycle, cycle_len, cycle_mult, cycle_save_name, best_save_name, use_clr, use_clr_beta, metrics, callbacks, use_wd_sched, norm_wds, wds_sched_mult, use_swa, swa_start, swa_eval_freq, **kwargs)
232 metrics=metrics, callbacks=callbacks, reg_fn=self.reg_fn, clip=self.clip, fp16=self.fp16,
233 swa_model=self.swa_model if use_swa else None, swa_start=swa_start,
--> 234 swa_eval_freq=swa_eval_freq, **kwargs)
235
236 def get_layer_groups(self): return self.models.get_layer_groups()

~/fastai/courses/dl1/fastai/model.py in fit(model, data, n_epochs, opt, crit, metrics, callbacks, stepper, swa_model, swa_start, swa_eval_freq, **kwargs)
121 if hasattr(cur_data, 'val_sampler'): cur_data.val_sampler.set_epoch(epoch)
122 num_batch = len(cur_data.trn_dl)
--> 123 t = tqdm(iter(cur_data.trn_dl), leave=False, total=num_batch)
124 if all_val: val_iter = IterBatch(cur_data.val_dl)
125

~/fastai/courses/dl1/fastai/imports.py in tqdm(*args, **kwargs)
45 if in_notebook():
46 def tqdm(*args, **kwargs):
---> 47 clear_tqdm()
48 return tq.tqdm(*args, file=sys.stdout, **kwargs)
49 def trange(*args, **kwargs):

~/fastai/courses/dl1/fastai/imports.py in clear_tqdm()
41 inst = getattr(tq.tqdm, '_instances', None)
42 if not inst: return
---> 43 for i in range(len(inst)): inst.pop().close()
44
45 if in_notebook():

~/anaconda3/envs/fastai/lib/python3.6/site-packages/tqdm/_tqdm.py in close(self)
1096 # decrement instance pos and remove from internal set
1097 pos = abs(self.pos)
-> 1098 self._decr_instances(self)
1099
1100 # GUI mode

~/anaconda3/envs/fastai/lib/python3.6/site-packages/tqdm/_tqdm.py in _decr_instances(cls, instance)
436 with cls._lock:
437 try:
--> 438 cls._instances.remove(instance)
439 except KeyError:
440 if not instance.gui: # pragma: no cover

~/anaconda3/envs/fastai/lib/python3.6/_weakrefset.py in remove(self, item)
107 if self._pending_removals:
108 self._commit_removals()
--> 109 self.data.remove(ref(item))
110
111 def discard(self, item):

KeyError: <weakref at 0x7fb6eb4e2688; to 'tqdm' at 0x7fb6eed23c88>

pasqal · May 7, 2018, 4:12pm

Same thing here: I get the “Watch this video on YouTube” message too if I try to play the embedded video directly from http://course.fast.ai/lessons/lesson3.html.

I am browsing from Barcelona, Spain. I thought that might be the problem, but I switched to browsing through a proxy in the US and still got the same message.

shubham3121 · May 8, 2018, 5:02am

Hi guys, here is a multi-label image classification challenge where you can apply the skills learnt in this lesson:
https://www.hackerearth.com/challenge/competitive/deep-learning-3/

shwetap7 · May 8, 2018, 6:41am

Hey @shubham3121,
it’s a nice problem for working on multi-label classification.

esc9 · May 8, 2018, 10:23am

Hi everyone, I tried fine-tuning ResNet50 with Keras. It runs perfectly fine, but the validation accuracy is stuck at 0.50. I really can’t figure out why, because the code is very simple and it works without problems with custom sequential models.

I attach a PDF of the notebook with the relevant code.

Thank you so much.

resnet50_keras_fine_tune.pdf (70.6 KB)

RogerS49 · May 9, 2018, 11:12pm

http://forums.fast.ai/t/subject-lesson-3-video-is-this-the-intended-behaviour/16072

I think it’s fixed now

RogerS49 · May 9, 2018, 11:18pm

Please note that the kaggle client interface has changed. Please go to

https://github.com/Kaggle/kaggle-api

RogerS49 · May 9, 2018, 11:39pm

Not sure you have a problem. Please note some of your code is missing in your PDF at a page break.

Okay, I see what’s at the page break.

Your problem may be that you’ve taken two steps out of the notebook, if you compare to this one. Each step builds on the previous one. So you are missing the 3-epoch block, which is followed by the split block and then the fine-tuning fit. Your trained model has had only one epoch.
The other big difference is that you only have 8000 images belonging to 2 classes; I think it should be 23000.
See:

https://github.com/fastai/fastai/blob/master/courses/dl1/keras_lesson1.ipynb

esc9 · May 10, 2018, 12:05am

Thank you so much for your help, @RogerS49. The dataset has fewer images because I intentionally reduced the number of images, and there is only one epoch because even with more the validation accuracy is stuck at 0.5.

I noticed that the original code has a similar problem too, and for exactly this reason I’d like to know if there is a possible solution, or at least a reasonable explanation. Thanks.

pzyxian1 · May 14, 2018, 6:07am

Question on how to save weights on every restart of cosine annealing:

learn.fit(lrs, 3, cycle_len=1, cycle_mult=2)
This runs 3 cycles with 2 restarts: (1), (1,2), (1,2,3,4), where each digit represents one epoch within a cycle.

Is there any way to save the weights at each cosine restart, rather than only at the end?
Sometimes, by the time training is done, the model might be quite overfitted.
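As a rough aid for reasoning about where the restarts fall, the epoch boundaries of each cycle can be worked out in plain Python. This is only a sketch of the arithmetic implied by cycle_len and cycle_mult; cycle_schedule is a hypothetical helper and does not call fastai.

```python
def cycle_schedule(n_cycle, cycle_len=1, cycle_mult=1):
    """Return (start_epoch, end_epoch) pairs, one per cycle, with end exclusive."""
    schedule, start, length = [], 0, cycle_len
    for _ in range(n_cycle):
        schedule.append((start, start + length))
        start += length
        length *= cycle_mult  # each cycle is cycle_mult times longer than the last
    return schedule

# Mirrors learn.fit(lrs, 3, cycle_len=1, cycle_mult=2)
schedule = cycle_schedule(3, cycle_len=1, cycle_mult=2)
print(schedule)          # [(0, 1), (1, 3), (3, 7)]
print(schedule[-1][1])   # 7 epochs in total
```

So the restarts would fall after epochs 1 and 3. The fit_gen signature quoted in the traceback earlier in this thread lists a cycle_save_name parameter, which may be worth trying for saving at the end of each cycle; that has not been verified here.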