Lesson 6 - Official topic

Oh you should make a PR to fix the notebook then :slight_smile:

Pretty fitting that Jeremy mentioned the movie “The Mask.” :slight_smile:

3 Likes

I was corrected below: fast.ai does support this.

Fast.ai doesn’t support it out of the box, but you can modify the network by changing the initial convolution from three input channels to four or more, and initializing those weights from the pretrained ones, either by taking their mean or by copying an existing channel. The pretrained model will probably require more training than when used on a three-channel image.

You can see an example from Iafoss on Kaggle here.
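For instance, here’s a minimal sketch of the weight-copying idea in plain PyTorch (the mean-initialization of the extra channel is just one of the options above, not the only way):

import torch
import torch.nn as nn
from torchvision.models import resnet34

model = resnet34(pretrained=True)
old = model.conv1  # Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False)

# Rebuild the stem conv with 4 input channels instead of 3
new = nn.Conv2d(4, old.out_channels, kernel_size=old.kernel_size,
                stride=old.stride, padding=old.padding, bias=False)

with torch.no_grad():
    new.weight[:, :3] = old.weight                            # copy pretrained RGB filters
    new.weight[:, 3:] = old.weight.mean(dim=1, keepdim=True)  # init 4th channel with their mean

model.conv1 = new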

1 Like

Yet it does :wink:

3 Likes

Without cross-validation, how do we measure the bias/variance tradeoff in deep learning models? Or is that also not a thing in deep learning?

1 Like

Don’t know about any tutorial, but you may like this pull request and related discussion on the forums. The short answer is that you can pass in something like:

model = create_cnn_model(resnet34, n_in=4, n_out=dls.c)

(Also tagging @giacomov and @bwarner in case it’s useful – check out the updated fastai2 codebase!)

7 Likes

Can we say collaborative filtering is the same as a tabular dataset with only categorical variables, where we convert the categories into embeddings and train a neural network?

If I used fit_one_cycle for, say, 20 epochs but find that there is still need for more training, running fit_one_cycle again starts the learning rate schedule all over from the beginning.
What is the way to resume training from a previous checkpoint? I loaded the weights of the previous checkpoint. How should I adjust the parameters of fit_one_cycle so it resumes from where it left off?

2 Likes

I am having a hard time convincing myself of this.
I guess the same reasoning applies to progressive resizing, e.g. changing image shapes along the way.

My point is that we are basically applying an already trained model, which has weight matrices of specific shapes, so we are applying the exact same matrices to our problem, and I am trying to visualize how it all fits together.

I get that convolutions are image-shape independent and, at the end of the day, it all boils down to how many filters we use. Still, I am trying to wrap my head around it :smiley:
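To make that concrete, here’s a toy sketch (not from the lesson): the conv weights never depend on the image size, only the feature-map size does, and an adaptive pooling layer at the head absorbs that difference:

import torch
import torch.nn as nn

conv = nn.Conv2d(3, 16, kernel_size=3, padding=1)  # weights: (16, 3, 3, 3), size-independent
pool = nn.AdaptiveAvgPool2d(1)                     # squashes any spatial size down to 1x1

for size in (64, 128, 224):
    x = torch.randn(1, 3, size, size)
    feats = conv(x)  # (1, 16, size, size): spatial dims follow the input
    print(size, feats.shape, pool(feats).shape)  # pooled shape is always (1, 16, 1, 1)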

To put it in simpler terms: in a tabular dataset, you have Xs and ys. In collaborative filtering, you have a sort of table for each candidate, with holes in the table to be filled. It doesn’t necessarily need to have categorical variables.
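For what it’s worth, here’s a minimal sketch of the embedding view from the question (sizes are made up): both approaches map category indices to embeddings, and collab filtering just combines the user and item embeddings directly.

import torch
import torch.nn as nn

class DotProductCF(nn.Module):
    def __init__(self, n_users, n_movies, n_factors=50):
        super().__init__()
        self.user_emb  = nn.Embedding(n_users, n_factors)
        self.movie_emb = nn.Embedding(n_movies, n_factors)

    def forward(self, users, movies):
        # predicted rating = dot product of the two embeddings
        return (self.user_emb(users) * self.movie_emb(movies)).sum(dim=1)

model = DotProductCF(n_users=1000, n_movies=1700)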

1 Like

I never found a proper way. If I can’t restart from scratch and just use more epochs in fit_one_cycle, I usually keep going with fit and a small learning rate instead of fit_one_cycle. But I’d love more guidance here.

2 Likes

Yes, I also tend to adopt a similar approach, except that I also reduce pct_start to, say, 0.1 so as not to overshoot the previously learned weights again.
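In code, that strategy looks something like this (the learner and checkpoint names are hypothetical):

learn.load('after-first-20-epochs')                 # weights saved from the earlier run
learn.fit_one_cycle(5, lr_max=1e-4, pct_start=0.1)  # short warmup, low peak lr
# ...or just continue with a flat schedule instead:
learn.fit(5, lr=1e-5)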

Question: what if there are some “super popular” movies that everyone watches? Does that affect how we train a collaborative filtering model?

1 Like

Jeremy will answer that question a bit later :slight_smile:

1 Like

Have you looked at Resume training with fit_one_cycle? It’s for fastai v1, but the API is the same, I guess.

5 Likes

Just to answer my own question, and for reference for others who might not have gotten to it yet while reviewing today’s lesson: Jeremy said today that he will touch on NLP in one of the following lessons. Yay! :slight_smile:

5 Likes

By adding a channel to an image, I am referring to something like what Jeremy mentioned about encoding time in the image frame. Basically, it’s about handling cases in which you have an image and some other vector of numbers that you want to feed into the model.
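A toy sketch of that idea (the constant time value is made up): stack the extra number as one more plane on the image, and the model then just needs n_in=4 as in the snippet earlier in the thread.

import torch

img = torch.rand(3, 224, 224)          # ordinary RGB image
t   = torch.full((1, 224, 224), 0.75)  # e.g. time, broadcast to a constant plane
x   = torch.cat([img, t], dim=0)       # (4, 224, 224): feed to a model with n_in=4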

1 Like

Thanks for the pointer, but it’s not the same problem. Here we are talking about having successfully completed an entire fit_one_cycle run, but realizing that it was not long enough (say, the train loss and validation loss were still decreasing and accuracy was still going up). So now what do we do? Of course we can restart from scratch with a larger number of epochs, but for big models this is very time consuming. So, is there a way to “keep going” without having to restart?

What I’m doing is to “keep going” with fit and a small lr, instead of fit_one_cycle. Another possibility would be to use another fit_one_cycle with a very low lr.

3 Likes

In the original dataset, there are a number of blank ratings. Does the loss function assume these are ‘zero’? If so, why does that not matter?

2 Likes

No, those blank ratings are the things we have to predict.
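To spell that out with a made-up example: the ratings are stored as observed (user, movie, rating) rows rather than a dense matrix, so blank cells never reach the loss at all; they’re what the trained model fills in at prediction time.

import pandas as pd

ratings = pd.DataFrame({
    'user':   [1, 1, 2, 3],
    'movie':  [10, 12, 10, 11],
    'rating': [4.0, 3.5, 5.0, 2.0],  # only observed cells appear as rows
})
# (user=2, movie=12) is "blank": it's simply absent from the table,
# so no loss is ever computed for it.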

1 Like