Lesson 2 - Official Topic

When we look at what is being recognized at these low layers, it's us human beings doing the recognizing. Isn't it true that often we won't know what the machine is actually 'recognizing'? Aren't we just cherry-picking the layers that we can recognize?

I think class activation maps could answer this question. I believe they will be covered later, since I can see them in the book.
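For anyone curious before that chapter, here's a minimal sketch of what a class activation map computes, assuming a plain torchvision resnet (global average pool plus a single fc head); the input tensor is a random stand-in for a preprocessed image:

```python
import torch
import torch.nn.functional as F
from torchvision.models import resnet34

model = resnet34(pretrained=True).eval()

feats = {}
def hook(module, inp, out):
    feats['act'] = out.detach()

# layer4 is the last convolutional block in a torchvision resnet
model.layer4.register_forward_hook(hook)

x = torch.randn(1, 3, 224, 224)  # stand-in for a preprocessed image
with torch.no_grad():
    logits = model(x)
cls = logits.argmax(dim=1).item()

# Weight each feature map by the fc weight for the predicted class,
# then sum across channels to get a coarse spatial "where" map
w = model.fc.weight[cls]                           # shape (512,)
cam = (w[:, None, None] * feats['act'][0]).sum(0)  # shape (7, 7)
cam = F.relu(cam)
cam = cam / (cam.max() + 1e-8)                     # normalize to [0, 1]
# Upsample to input resolution so it can be overlaid on the image
cam = F.interpolate(cam[None, None], size=(224, 224), mode='bilinear')[0, 0]
```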

From what I saw, Captum provides a cool UI that shows which parts of the image are used to classify, say, a cat image as a cat instead of a dog… Would love to integrate it with fastai.

It needs to be the same kind of model, so it won't work across tasks like CV vs. text, since those don't use the same kinds of models.

Yes, I see that the Deconvolution and NeuronDeconvolution functionality in Captum implements the same paper Jeremy is discussing now. It would be helpful if there were a fastai callback for this.
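For reference, a short sketch of calling Captum's Deconvolution attribution on a torchvision model; the image tensor and target class here are just stand-ins:

```python
import torch
from torchvision.models import resnet34
from captum.attr import Deconvolution

model = resnet34(pretrained=True).eval()
deconv = Deconvolution(model)

x = torch.randn(1, 3, 224, 224, requires_grad=True)  # stand-in image
attribution = deconv.attribute(x, target=281)  # 281 = ImageNet "tabby cat"
# `attribution` has the same shape as `x` and can be visualized with
# captum.attr.visualization.visualize_image_attr
```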

This is being looked at by fastai users: Captum model interpretability library

If you use transfer learning for text classification and your model was trained on web data, is there a possibility of copyright infringement from training on such data? Same question for images.

Not a data scientist here, but if we add more layers, would it make the model better because it can recognize more in the images, or would it actually overfit or get worse just because it has more layers? I guess what I'm asking is: is there a good rule of thumb for selecting the architecture, resnet50 vs. resnet34?

That is a complicated question and we don't know the answer. Neural nets can certainly pick up more than you would think, so you should try to train models on the data you want to publish before making it public, and see whether you can recover sensitive information.

Great question! It really depends on the data you have. For small datasets, resnet34 will probably be better, but if you have more data, resnet50 will likely get better results.
The right answer is: try it! It's not as if it takes a long time to train a model with transfer learning :wink:
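As a rough sketch of "just try it", using the Pets dataset from lesson 1 as a stand-in for your own data:

```python
from fastai.vision.all import *

path = untar_data(URLs.PETS)/'images'

def is_cat(f): return f[0].isupper()  # in Pets, cat filenames are capitalized
dls = ImageDataLoaders.from_name_func(
    path, get_image_files(path), valid_pct=0.2, seed=42,
    label_func=is_cat, item_tfms=Resize(224))

# Train both architectures the same way and compare validation error_rate
for arch in (resnet34, resnet50):
    learn = cnn_learner(dls, arch, metrics=error_rate)
    learn.fine_tune(1)
```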

Are there cases where we would use a pretrained model other than ImageNet? For sound, for example?

We don't have pretrained models for sound AFAIK, but for text, pretrained models have been everywhere since… last year.

I believe this is the post mentioned: Splunk and Tensorflow for Security: Catching the Fraudster with Behavior Biometrics

Edit: I’ll add these to the lecture wiki during break/after the lecture :slight_smile:

Where in the book are these terms defined?

Not sure about audio, but different domains do have different pretrained models. For example, in many natural language processing problems, models are pretrained on a dataset called WikiText (a large collection of Wikipedia articles).
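Here's a hedged sketch of how that looks in fastai, where the AWD_LSTM backbone comes pretrained on WikiText-103 (the ULMFiT approach), using the IMDB sample as a stand-in dataset:

```python
from fastai.text.all import *

path = untar_data(URLs.IMDB_SAMPLE)
dls = TextDataLoaders.from_csv(path, 'texts.csv',
                               text_col='text', label_col='label')
# AWD_LSTM ships pretrained on WikiText-103; fine_tune adapts it to IMDB
learn = text_classifier_learner(dls, AWD_LSTM, metrics=accuracy)
learn.fine_tune(1)
```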

Yes, if you have a model pretrained on data more similar to your dataset, you should use that one.

For example, if you are doing something with x-rays, a model pretrained on x-rays would be a better starting point than one pretrained on ImageNet.
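As a sketch of what that could look like in fastai, assuming a hypothetical checkpoint `xray_pretrained.pth` and a `dls` you've already built:

```python
from fastai.vision.all import *
import torch

# Build the learner without ImageNet weights, then load domain weights
learn = cnn_learner(dls, resnet34, pretrained=False, metrics=error_rate)
state = torch.load('xray_pretrained.pth')            # hypothetical checkpoint
learn.model[0].load_state_dict(state, strict=False)  # body only; head stays random
learn.fine_tune(3)
```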

Best way to learn! Code experimentation! :slight_smile:

From Dinesh C.: During fine-tuning, should we focus solely on the metric, or should we compare training loss vs. validation loss to understand underfitting/overfitting?

this chapter

Are filters independent? By that I mean: if filters are pretrained, might they become less good at detecting features of the original images when fine-tuned?
