Lesson 2: further discussion ✅

Q: I want the model to generalize well, even to the extent that the ‘test’ images look ‘different’ in some ways from the images the model was trained on. The ‘test’ images are not available; they are just expected to look ‘different’ in some ways. Are the ‘standard’ overfitting techniques good enough, or do I need to do something extra, like higher dropout, more aggressive transformations in data augmentation, fewer cycles/epochs, or a higher learning rate? Even if it means sacrificing the validation score?
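To make the question concrete, these are the kinds of knobs I mean (just a sketch with fastai v1; the path and the exact values are illustrative, not recommendations):

from fastai.vision import *

# More aggressive augmentation than the defaults (values are illustrative)
tfms = get_transforms(max_rotate=20., max_zoom=1.3, max_lighting=0.4, max_warp=0.3)
data = ImageDataBunch.from_folder(path, ds_tfms=tfms, size=224).normalize(imagenet_stats)

# Higher dropout on the head via ps (default is 0.5), fewer epochs, etc.
learn = create_cnn(data, models.resnet34, metrics=error_rate, ps=0.6)
learn.fit_one_cycle(4)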

Q: Sometimes (especially after training the head layers and unfreezing) the learning rate finder doesn’t show a characteristic downslope:
[learning rate finder plot without a clear downslope, taken from the lesson 1 nb: https://github.com/fastai/course-v3/blob/master/nbs/dl1/lesson1-pets.ipynb]
From last year’s course we are used to looking for a figure like:
[learning rate finder plot with a clear downslope, taken from https://github.com/fastai/course-v3/blob/master/nbs/dl1/lesson2-planet.ipynb]

My question: How to interpret the learning rate finder results if it doesn’t have a downslope?

27 Likes

So normally in DL there’s a rather vague understanding of what we mean by generalization. Sure, we would want the model to be invariant to all kinds of transforms and even some distortions, but there are limitations to this. One of the most crucial assumptions when we develop DL models is that the train, val and test sets come from the same distribution, which simply means the same dataset.

Although what you suggested might make the model more robust to those variations, I wonder if it would lead to better generalization.

One other thing to mention is that most real-world examples are noisy in nature; what we often don’t realize about popular datasets is that they have been created through careful curation. So to make the model robust to noise as well, adversarial training can be done, as sketched below.
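A minimal sketch of one common form of adversarial training (FGSM-style perturbations); the function name is made up, and it assumes inputs scaled to [0, 1]:

import torch
import torch.nn.functional as F

def fgsm_adversarial(model, x, y, epsilon=0.01):
    """Perturb the batch x in the direction that increases the loss (FGSM).
    Mixing such examples into the training batches is one simple form of
    adversarial training."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # assumes pixel values scaled to [0, 1]
    return (x + epsilon * x.grad.sign()).clamp(0, 1).detach()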

1 Like

What if we wanted, for example, to take a model trained on images from the ‘south’ and apply it to the ‘north’, or from the ‘east’ to the ‘west’? We could expect some differences in the images. How do we train a model that works on both image sets without having access to both? We can only train on, say, the ‘south’ dataset but must make inference on the ‘north’ dataset. Are there techniques for that?

1 Like

I didn’t understand the inference part. Why are we creating a new data bunch instead of using the previous one?

Can’t we just use the older data bunch and input our image?

empty_data = ImageDataBunch.single_from_classes(path, data.classes, tfms=get_transforms()).normalize(imagenet_stats)

I have a question for after class regarding the “delete photos from dataset” concept (new widget) introduced:

In which cases does it make sense to delete images that “don’t belong”?
In which cases is it better to create a new “other” category in order for the network to be able to discern between the actual classes and random bullshit (“none of the above”)?

Especially in “real-world” multiclass settings involving real people, you will always get those (people uploading hotdog photos to the cat/dog classifier app, etc.).

I have wondered about this, e.g. with the Google Quickdraw dataset. There is no “none”/“other”/“random” category, although clearly a lot of the time people just doodle random stuff not belonging to any of the 340/345 categories. Would it not be helpful to distinguish this instead of predicting one of the existing known classes? Or would this hinder the network from learning the actual classes?

Is it better to train only on the correct categories and then have a mechanism that, based on very low probabilities across all categories, says “none of the above”? (Isn’t this difficult when using softmax, because it will still give you some “winner” category most of the time?)
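Something like this is the kind of mechanism I have in mind (a rough sketch; the threshold is arbitrary and would need tuning):

import torch.nn.functional as F

def predict_with_reject(logits, classes, threshold=0.8):
    """Pick the softmax winner, but answer 'none of the above' when even the
    winner's probability is below a confidence threshold."""
    probs = F.softmax(logits, dim=-1)
    conf, idx = probs.max(dim=-1)
    return classes[idx.item()] if conf.item() >= threshold else 'none of the above'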

4 Likes

You can, while you are still in the same notebook and the same session and have everything initialized anyway.

What this method refers to is the situation where you have trained a model, that “phase” of the project is finished, and now you just want to run the model as part of an app (most likely not within a notebook). You don’t want to load any training or validation data then; you just want to reload your trained model and weights and do inference, meaning making predictions with the learned model.
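For example, roughly along the lines of the lesson 2 workflow (just a sketch; the class list, the weight file name ‘stage-2’, the size, and the image path are assumptions you’d replace with your own):

from fastai.vision import *

# Recreate an "empty" DataBunch: no train/valid data, just the classes plus
# the same transforms, size and normalization used during training
classes = ['black', 'grizzly', 'teddys']   # whatever your classes were
data2 = ImageDataBunch.single_from_classes(
    path, classes, tfms=get_transforms(), size=224).normalize(imagenet_stats)

learn = create_cnn(data2, models.resnet34)
learn.load('stage-2')                      # weights saved after training

img = open_image(path/'some_image.jpg')    # hypothetical image path
pred_class, pred_idx, outputs = learn.predict(img)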

2 Likes

I want to add a note about downloading images. This is the process I did:

  • Download some images from Google to my laptop using this tool.
  • Then clean out bad images by hand.
  • Resize them and create a tarball.
  • Upload it into a GitHub release.
  • Then use that dataset in the notebook as usual.

Here’s the whole process with some more details.

Unfortunately, I couldn’t use untar_data due to an issue. So, I had to come up with a replacement function.
But I’ll try to fix it and do a PR this week.
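For anyone hitting the same issue, a simple workaround (a rough sketch, not my actual code; fetch_dataset is just a made-up name) is to download and extract the tarball manually:

import tarfile
import urllib.request
from pathlib import Path

def fetch_dataset(url, dest='data'):
    """Download a .tgz archive (e.g. a GitHub release asset) and extract it."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / Path(url).name
    if not archive.exists():
        urllib.request.urlretrieve(url, archive)
    with tarfile.open(archive) as tar:
        tar.extractall(dest)
    return dest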

5 Likes

Data curation. Deleting images that don’t belong is a part of that. In our case, in the download-images example, we downloaded the data from the website and had no filter to check whether it actually belonged to the classes we wanted.
So when we “deleted” the images which didn’t belong, we were simply curating the data. Another reason we delete such images is that ultimately every image in the dataset gets assigned to some class, and if we keep the wrong ones, two things can happen: the network might learn a wrong representation of that class, and it might misclassify. None of this usually matters much since the number of such examples is very low, but I think it’s more of a “better safe than sorry” practice.

I think having an “other” category is more of a choice than a necessity, in the sense that if you are sure you’ll only input images belonging to the classes you have, it would make little sense to add extra classes. On the flip side, not breaking the model when you input images that don’t belong to any of the classes is a big reason to have an “other” category.

In the case of the Quickdraw dataset, I wonder if random doodles were actually included. I think the dataset was preprocessed and these outliers were removed before being released, but I’m not entirely sure.

1 Like

If I have several instances of the fastai.vision.image.Image class, what is the best way to display them in a grid?

For example:

x1 = open_image('tmp_027.jpg')
x2 = open_image('tmp_029.jpg')
x1.show(title='027',figsize=(5,5))
x2.show(title='029',figsize=(5,5))

will place the two images vertically, but I’d like to put them side by side. If I have more images, then I’ll want to put them in a grid.

Although the above example uses open_image to create the image, in general, the images I want to display are calculated rather than read from disk.

Can plt’s subplot be used? An example would help. Thanks.

1 Like

Which data loader did you use for the Quickdraw dataset?

Question: After yesterday’s class I’m excited to deploy a previous model I wrote and create a web app around it. I wrote my data preparation code in Apache Spark. What is the recommended way to prepare the data during inference for realtime predictions?

I’m not sure Spark is the right choice, since at inference we won’t have huge batches of data but single data points to transform for prediction. But at the same time, if we choose some other data processing engine, I have to rewrite the data processing code for it.

I might be completely wrong here, but in Leslie Smith’s paper it was said that the loss reaches a minimum and then shoots up; so, following back from the shoot-up, the minimum loss should be the lowest point. Here, for the first figure, that is at 1e-4 and the most recent bulge is at 1e-5, so the slice should be slice(1e-5, 1e-4), but I don’t know why the nb has it as slice(1e-6, 1e-4). If the initial part before the shoot-up is completely flat, it would be better if we could zoom in. Can we do that in a Jupyter notebook?
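Maybe something like this works for zooming in, assuming the recorder keeps the recorded learning rates and losses after lr_find (I haven’t verified the attribute names):

import matplotlib.pyplot as plt

# Re-plot the LR finder curve and zoom into the range of interest
lrs, losses = learn.recorder.lrs, learn.recorder.losses
plt.plot(lrs, losses)
plt.xscale('log')
plt.xlim(1e-6, 1e-2)        # zoom window, adjust as needed
plt.xlabel('learning rate')
plt.ylabel('loss')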

1 Like

Take a look at the source code for show_batch and tell us what you can figure out from that. Let us know if you get stuck!
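For example, something along these lines should work, assuming Image.show accepts a matplotlib axis the way show_batch uses it (a sketch, not tested):

import matplotlib.pyplot as plt

imgs   = [x1, x2]                    # any list of fastai Image objects
titles = ['027', '029']

fig, axes = plt.subplots(1, len(imgs), figsize=(10, 5))
for img, title, ax in zip(imgs, titles, axes):
    img.show(ax=ax, title=title)     # draw each Image onto its own axis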

I’m currently participating in the Human Protein Atlas Image Classification challenge on Kaggle and trying to use FastAI V1 for it.

The challenge is that instead of a single 3-channel RGB image, you have 4 grayscale images of the same subcellular structure under different filters (i.e. different chemicals). Each image highlights a different part of the cell, shown below: the protein (green), microtubules (red), nucleus (blue), and endoplasmic reticulum (yellow).

The green image is the one that needs to be classified, and the rest are for reference (but surely useful!). It’s a multilabel classification problem with 28 classes (like “Cytosol” and “Plasma membrane” above).

I have 2 questions:

  1. How to load the 4 images together into a single 4-channel image using FastAI’s ImageDataBunch?
  2. How can we do transfer learning using Resnet34, since the backbone expects a 3-channel RGB image, but here there are 4?
11 Likes

Hi.

I have a question about deployment. Would it be a good idea to use a minimalist Python codebase at inference time?

I wonder if I should export the trained fastai network and use pytorch-cpu (without the fastai library) in my web app. I guess the overhead of fastai would be minimal, but it’d be one less package to install/worry about…

Does this make sense?
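One option I’m considering (just a sketch; I’m assuming the fastai model can be traced with TorchScript, and the preprocessing such as resize/normalize would still have to be replicated in the app):

import torch

# Trace the trained network so the app only needs plain (CPU) PyTorch to run it
model = learn.model.eval().cpu()
example = torch.rand(1, 3, 224, 224)          # dummy input with the training image size
traced = torch.jit.trace(model, example)
traced.save('model_traced.pt')

# In the web app (no fastai import required):
# model = torch.jit.load('model_traced.pt')
# probs = torch.softmax(model(x), dim=1)      # x preprocessed the same way as in training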

AFAIK there is no pretrained architecture for 4-channel images currently available. So the easiest way to make use of fastai is to first generate RGB images and save them to disk. After that you can use the standard approach learned in class. Have a look at the kernels; there are different methods of doing that, the simplest being to drop the yellow channel entirely, or to use cv2 to merge/blend the 4 images into 3-channel images.

The other approach would be to modify an architecture to load in 4 channels in the first layer instead of 3, but then you have to do a lot of stuff manually. There is actually also a kernel using fastai 0.7 showing that.
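A rough sketch of that first-layer modification with plain torchvision (not the actual code from the fastai 0.7 kernel) could look like this:

import torch
import torch.nn as nn
from torchvision.models import resnet34

model = resnet34(pretrained=True)
old = model.conv1                     # Conv2d(3, 64, kernel_size=7, stride=2, padding=3)
new = nn.Conv2d(4, 64, kernel_size=7, stride=2, padding=3, bias=False)
with torch.no_grad():
    new.weight[:, :3] = old.weight                # reuse the pretrained RGB filters
    new.weight[:, 3]  = old.weight.mean(dim=1)    # init the 4th channel from their mean
model.conv1 = new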

I think there was nothing discussed here that isn’t on Kaggle already, but just as a reminder, questions regarding ongoing competitions should be asked on Kaggle.

6 Likes

Thanks for the detailed explanation @marcmuc !

I’m not really looking to discuss any strategy; I simply wanted to figure out a good way to load the data and train using FastAI v1. The idea of separately generating RGB images seems like a good starting point. But it would be nice to have the ability to specify custom image loading logic in ImageDataBunch.

I checked out the fastai 0.7 kernel you mentioned. From what I understand, it rewrites some internal classes from scratch. It’s probably going to be a bit harder to do with v1, especially since it’s under active development. But I’ll eventually give it a shot.

Cheers!

1 Like

Interesting question.
Do we need to clean our data of photos with text or watermarks?
Nearly half of all the photos include text or a watermark.
So what should we do?

1 Like

Jeremy said that the training loss being lower than the validation loss does not always mean we are overfitting. Honestly, this conflicts with what I have learned before. Is this specific to deep learning, due to things I don’t know yet?

I always thought that the training loss being lower than the validation loss meant the model was memorizing the specific pixel values that make up a certain structure rather than learning the general structure. Do traditional machine learning algorithms differ from deep learning in this regard?

1 Like