Lesson 6 - Official topic

Oh you should make a PR to fix the notebook then :slight_smile:

Pretty fitting that Jeremy mentioned the movie “The Mask.” :slight_smile:

3 Likes

I was corrected below: fast.ai does support this.

Fast.ai doesn’t support it out of the box, but you can modify the network by changing the initial convolution from three input channels to four or more, and initializing those weights from the pretrained ones, either by taking their mean or by copying an existing channel. The pretrained model will probably require more training than when used on a three-channel image.

You can see an example from Iafoss on Kaggle here.
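For instance, here’s a minimal sketch of the weight-copying idea in plain PyTorch (the mean-initialization of the extra channel is just one of the options above, not the only way):

import torch
import torch.nn as nn
from torchvision.models import resnet34

model = resnet34(pretrained=True)
old = model.conv1  # Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False)

# Rebuild the stem conv with 4 input channels instead of 3
new = nn.Conv2d(4, old.out_channels, kernel_size=old.kernel_size,
                stride=old.stride, padding=old.padding, bias=False)

with torch.no_grad():
    new.weight[:, :3] = old.weight                            # copy pretrained RGB filters
    new.weight[:, 3:] = old.weight.mean(dim=1, keepdim=True)  # init 4th channel with their mean

model.conv1 = new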

1 Like

Yet it does :wink:

3 Likes

Without cross-validation, how do we measure the bias/variance tradeoff in deep learning models? Or is that also not a thing in deep learning?

1 Like

Don’t know about any tutorial, but you may like this pull request and related discussion on the forums. The short answer is that you can pass in something like:

model = create_cnn_model(resnet34, n_in=4, n_out=dls.c)

(Also tagging @giacomov and @bwarner in case it’s useful – check out the updated fastai2 codebase!)

7 Likes

Can we say collaborative filtering is the same as a tabular dataset with only categorical variables, where we convert the categories into embeddings and train a neural network?

If I used fit_one_cycle for, say, 20 epochs but find that there is still need for more training, running fit_one_cycle again starts the learning rate schedule all over from the beginning.
What is the way to resume training from a previous checkpoint? I loaded the weights of the previous checkpoint. How should I adjust the parameters of fit_one_cycle so it resumes from where it left off?

2 Likes

I am having a hard time convincing myself of this.
I guess the same reasoning applies to progressive resizing, e.g. changing image shapes along the way.

My point is that we are basically applying an already trained model, which has weight matrices of specific shapes, so we are applying the exact same matrices to our problem, and I am trying to visualize how it all fits together.

I get that convolutions are image-shape independent and, at the end of the day, it all boils down to how many filters we use. Still, I am trying to wrap my head around it :smiley:
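To make that concrete, here’s a toy sketch (not from the lesson): the conv weights never depend on the image size, only the feature-map size does, and an adaptive pooling layer at the head absorbs that difference:

import torch
import torch.nn as nn

conv = nn.Conv2d(3, 16, kernel_size=3, padding=1)  # weights: (16, 3, 3, 3), size-independent
pool = nn.AdaptiveAvgPool2d(1)                     # squashes any spatial size down to 1x1

for size in (64, 128, 224):
    x = torch.randn(1, 3, size, size)
    feats = conv(x)  # (1, 16, size, size): spatial dims follow the input
    print(size, feats.shape, pool(feats).shape)  # pooled shape is always (1, 16, 1, 1)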

To put it in simpler terms: in a tabular dataset, you have Xs and ys. In collaborative filtering, you have a sort of table for each candidate, with holes in the table to be filled. It doesn’t necessarily need to have categorical variables.
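For what it’s worth, here’s a minimal sketch of the embedding view from the question (sizes are made up): both approaches map category indices to embeddings, and collab filtering just combines the user and item embeddings directly.

import torch
import torch.nn as nn

class DotProductCF(nn.Module):
    def __init__(self, n_users, n_movies, n_factors=50):
        super().__init__()
        self.user_emb  = nn.Embedding(n_users, n_factors)
        self.movie_emb = nn.Embedding(n_movies, n_factors)

    def forward(self, users, movies):
        # predicted rating = dot product of the two embeddings
        return (self.user_emb(users) * self.movie_emb(movies)).sum(dim=1)

model = DotProductCF(n_users=1000, n_movies=1700)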

1 Like

I never found a proper way. If I can’t restart from scratch and just use more epochs in fit_one_cycle, I usually keep going with fit and a small learning rate instead of fit_one_cycle. But I’d love more guidance here.

2 Likes

Yes, I also tend to adopt a similar approach, except that I also reduce pct_start to, say, 0.1 so as not to overshoot the previously learned weights again.
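In code, that strategy looks something like this (the learner and checkpoint names are hypothetical):

learn.load('after-first-20-epochs')                 # weights saved from the earlier run
learn.fit_one_cycle(5, lr_max=1e-4, pct_start=0.1)  # short warmup, low peak lr
# ...or just continue with a flat schedule instead:
learn.fit(5, lr=1e-5)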

Question: what if there are some “super popular” movies that everyone watches? Does that affect how we train a collaborative filtering model?

1 Like

Jeremy will answer that question a bit later :slight_smile:

1 Like

Have you looked at Resume training with fit_one_cycle? It’s for fastai v1, but the API is the same, I guess.

5 Likes

Just to answer my own question, and for reference for others who might not have gotten to it yet while reviewing today’s lesson: Jeremy said today that he will touch on NLP in one of the following lessons. Yay! :slight_smile:

5 Likes

By adding a channel to an image, I am referring to something like what Jeremy mentioned about encoding time in the image frame. Basically, it’s about handling cases in which you have an image and some other vector of numbers that you want to feed into the model.
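A toy sketch of that idea (the constant time value is made up): stack the extra number as one more plane on the image, and the model then just needs n_in=4 as in the snippet earlier in the thread.

import torch

img = torch.rand(3, 224, 224)          # ordinary RGB image
t   = torch.full((1, 224, 224), 0.75)  # e.g. time, broadcast to a constant plane
x   = torch.cat([img, t], dim=0)       # (4, 224, 224): feed to a model with n_in=4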

1 Like

Thanks for the pointer, but it’s not the same problem. Here we are talking about having successfully completed an entire fit_one_cycle run, but realizing that it was not long enough (say, the train loss and validation loss were still decreasing and accuracy was still going up). So now what do we do? Of course we can restart from scratch with a larger number of epochs, but for big models this is very time consuming. So, is there a way to “keep going” without having to restart?

What I’m doing is to “keep going” with fit and a small lr, instead of fit_one_cycle. Another possibility would be to use another fit_one_cycle with a very low lr.

3 Likes

In the original dataset, there are a number of blank ratings. Does the loss function assume these are ‘zero’? If so, why does that not matter?

2 Likes

No, those blank ratings are the things we have to predict.
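To spell that out with a made-up example: the ratings are stored as observed (user, movie, rating) rows rather than a dense matrix, so blank cells never reach the loss at all; they’re what the trained model fills in at prediction time.

import pandas as pd

ratings = pd.DataFrame({
    'user':   [1, 1, 2, 3],
    'movie':  [10, 12, 10, 11],
    'rating': [4.0, 3.5, 5.0, 2.0],  # only observed cells appear as rows
})
# (user=2, movie=12) is "blank": it's simply absent from the table,
# so no loss is ever computed for it.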

1 Like