Lesson 6 - Official topic

The matrix is not full; it is sparse. (I mean the matrix mapping users/movies to ratings.)

1 Like
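To make the sparsity point concrete, here is a minimal sketch (with made-up toy numbers, not the course dataset) of storing only the observed (user, movie) → rating entries instead of a full dense grid:

```python
# Hypothetical sketch: the user/movie ratings matrix is mostly empty,
# so we keep only the observed (user, movie) -> rating entries.
n_users, n_movies = 1000, 1700

# made-up sample of observed ratings
ratings = {
    (0, 12): 4.0,
    (0, 431): 5.0,
    (7, 12): 3.0,
}

density = len(ratings) / (n_users * n_movies)
print(f"{len(ratings)} stored entries, density {density:.6f}")

# missing cells are simply absent, not zero
print(ratings.get((5, 99)))  # -> None
```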

This is a general problem in AI that doesn’t have easy answers. Recognizing out-of-distribution samples is difficult and usually requires a multi-pronged approach (i.e., some sort of unsupervised learning). See, for example, this paper on discovering new categories, or this one on the issue of “domain adaptation” (when your training data set is shifted from the test set in some way).

5 Likes

But SVD is still possible on a non-full-rank matrix.

2 Likes
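Right, and the rank deficiency simply shows up as (near-)zero singular values. A small NumPy sketch with a deliberately rank-1 matrix:

```python
# Sketch: SVD runs fine on a rank-deficient matrix; the deficiency
# appears as (near-)zero singular values.
import numpy as np

A = np.array([[1., 2.],
              [2., 4.],   # rows 2 and 3 are multiples of row 1 -> rank 1
              [3., 6.]])

U, s, Vt = np.linalg.svd(A, full_matrices=False)
print(s)  # one large singular value, one ~0

rank = int(np.sum(s > 1e-10))
print(rank)  # -> 1
```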

I think Jeremy might talk about this, but practically speaking these latent factors are much more important and result in a much simpler model. Video-based architectures tend to be very heavy.

Do we always need to build the path for the image and pass it to the ImageBlock? Earlier versions of fastai did this automatically, building path + item + suffix.

Aha! Got him! He ran out of batteries
 so he IS an AI :wink:

10 Likes

It depends on which API you feel like using. For example, in the mid-level API, you can do something like

block = DataBlock(
    [...]
    get_x=ColReader(col_name, pref=f'{PATH}/to/images/', suff='.jpg'),
    get_y=[...]
)
1 Like

Is “@” for matmul overloaded in PyTorch, fastai.core, or base Python?

Dumb question: would collaborative filtering be similar to using SVD for finding similar documents, or similarity in a corpus, in NLP?

The `@` operator itself is base Python (PEP 465, added in Python 3.5); PyTorch overloads it for tensors via `Tensor.__matmul__`.

1 Like
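To illustrate the dispatch without needing PyTorch installed, here is a toy class (purely illustrative, not PyTorch's implementation) that opts into `@` by defining `__matmul__`:

```python
# Sketch: `@` is Python's matrix-multiplication operator (PEP 465);
# any class can support it by defining __matmul__, which is how
# PyTorch wires it up for tensors.

class TinyMat:
    def __init__(self, rows):
        self.rows = rows

    def __matmul__(self, other):
        # naive O(n^3) matrix multiply, just to show the dispatch
        cols = list(zip(*other.rows))
        return TinyMat([[sum(a * b for a, b in zip(row, col))
                         for col in cols]
                        for row in self.rows])

a = TinyMat([[1, 2], [3, 4]])
b = TinyMat([[5, 6], [7, 8]])
print((a @ b).rows)  # -> [[19, 22], [43, 50]]
```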

thanks

Do the docs show usage for the various ways of building DataBlocks (from_df, from a function, etc.) that we had in the earlier version?

What might help here is converting from e.g. RGB to CMYK – both standard color models in common use. CMYK is more typical for print, whereas RGB is more typical for computer displays – but they can often be used interchangeably, and conversion is pretty easy.

1 Like
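For reference, a minimal sketch of the common naive RGB → CMYK formula (no color-profile handling, just the textbook conversion):

```python
# Naive RGB -> CMYK conversion sketch (no ICC profiles or gamma handling).

def rgb_to_cmyk(r, g, b):
    """r, g, b in 0..255; returns c, m, y, k in 0..1."""
    if (r, g, b) == (0, 0, 0):
        return 0.0, 0.0, 0.0, 1.0          # pure black
    r_, g_, b_ = r / 255, g / 255, b / 255
    k = 1 - max(r_, g_, b_)
    c = (1 - r_ - k) / (1 - k)
    m = (1 - g_ - k) / (1 - k)
    y = (1 - b_ - k) / (1 - k)
    return c, m, y, k

print(rgb_to_cmyk(255, 0, 0))   # red -> (0.0, 1.0, 1.0, 0.0)
```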

Isn’t that crazy expensive? An O(n²) operation versus an O(1) operation?

2 Likes

ah, ok, answered already in the class :wink:

dunder = double under[score]

3 Likes
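A quick illustrative example of what dunder methods buy you (toy class, just to show the naming convention in action):

```python
# Sketch: "dunder" (double-underscore) methods are the hooks Python
# uses to wire objects into built-in syntax.

class Deck:
    def __init__(self, cards):   # dunder: called by Deck(...)
        self.cards = cards

    def __len__(self):           # dunder: called by len(deck)
        return len(self.cards)

    def __getitem__(self, i):    # dunder: called by deck[i]
        return self.cards[i]

deck = Deck(["A", "K", "Q"])
print(len(deck))   # -> 3
print(deck[0])     # -> 'A'
```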

Do DNN-based models for collaborative filtering work better than more traditional approaches like SVD or other matrix decompositions?

3 Likes

Yep, you should definitely look these up in the documentation! For vision the DataBlock/DataLoaders page is very straightforward, e.g., http://dev.fast.ai/vision.data#ImageDataLoaders.from_df

1 Like

Thanks for the detailed explanation @jwuphysics! That was my point - AFAIK it’s not easy/trivial to acknowledge an unknown class, and I wouldn’t expect that simply using multi-label would solve this problem. Unless the BCEWithLogitsLoss loss function mentioned by @imrandude is robust enough to handle this type of situation. If so, that would be great news! :slight_smile: But it seems I will have to test it myself to see what happens.

2 Likes

Can any matrix factorization be modeled as a (deep) neural network? Are there papers that explain this?

In theory a NN can approximate any function, so it should be feasible.
 I don’t know of any papers specifically on matrix factorization, though.
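One way to see the connection: the dot-product collaborative filtering model is itself a tiny neural network with two embedding layers and no hidden activations. A minimal NumPy sketch (toy made-up ratings, plain per-sample SGD on squared error):

```python
# Sketch: matrix factorization as a shallow "NN" -- user and item
# embedding vectors whose dot product predicts the rating.
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, k = 5, 4, 3

# toy observed ratings: (user, item, rating)
data = [(0, 1, 4.0), (0, 3, 5.0), (2, 0, 3.0), (4, 2, 2.0)]

U = rng.normal(scale=0.1, size=(n_users, k))  # user embeddings
V = rng.normal(scale=0.1, size=(n_items, k))  # item embeddings

lr = 0.05
for epoch in range(1000):
    for u, i, r in data:
        err = U[u] @ V[i] - r     # prediction error
        gU = err * V[i]           # gradients of 0.5 * err**2
        gV = err * U[u]
        U[u] -= lr * gU
        V[i] -= lr * gV

# training error on the observed entries should be tiny after fitting
mse = float(np.mean([(U[u] @ V[i] - r) ** 2 for u, i, r in data]))
print(mse)
```

Swapping the dot product for an MLP over the concatenated embeddings gives the "deep" variants of this idea.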