Lesson 3 In-Class Discussion ✅

You know, I still don’t understand what happens if you feed a pretrained model an image bigger than the ones it was trained on (224 or 299).

I imagined the tensor would simply be cropped, but from what you say, I think that won’t happen. Am I right?

But you would be using the LR finder for such a task, and in doing so you’d end up in a recursive disaster.

Any suggestions on this question or any advice would be very helpful!

If you are training the model with separate folders, you are training a multi-class, single-label classification model. The learner will create a model whose last layer has a function (softmax) that “likes to pick just one value”: the class with the highest probability. (In fact, you get a probability for each class, all of them adding up to 1, but only the highest one is used for the prediction.) If you then show your model an image with multiple animals, it will predict just one of them.

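Instead, if you train a multi-label model (like the one for the Amazon from Space dataset), the last layer will be a sigmoid layer, which produces an independent probability between 0 and 1 for each class. Comparing each probability to a threshold then gives a yes/no prediction per class.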
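To make the softmax vs. sigmoid difference concrete, here is a minimal pure-Python sketch. The class names, the raw scores, and the 0.5 threshold are all made up for illustration; this is not how fastai computes its predictions internally, just the arithmetic behind the two kinds of final layer.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1 (single-label case)."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(logits):
    """Squash each raw score independently into (0, 1) (multi-label case)."""
    return [1.0 / (1.0 + math.exp(-x)) for x in logits]

classes = ["lion", "monkey", "croc"]  # hypothetical class names
logits = [2.0, 1.8, 1.9]              # made-up scores for an image showing all three

single = softmax(logits)
multi = sigmoid(logits)

# Single-label: only the class with the highest probability is reported.
predicted_single = classes[single.index(max(single))]

# Multi-label: every class whose probability clears the threshold is reported.
threshold = 0.5  # illustrative value
predicted_multi = [c for c, p in zip(classes, multi) if p > threshold]

print(predicted_single, predicted_multi)
```

With these scores the softmax version reports a single animal even though all three score almost equally, while the sigmoid version can report all of them.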

Thank you for the reply. In the Amazon from Space dataset, though, the training data itself contains images with multiple labels. Consider instead a case where the training data has only single-label images (lion, monkey, croc), with no images showing them together. If we then pass in an image that contains all three, what will the model do?

Question re: threshold selection: I understand the motivation here, but just wondering…is this approach viewed as a tradeoff of ease-of-use vs. accuracy? Are the results of grouping all classes together, setting prediction probability thresholds, and returning the classes that meet that threshold roughly comparable to having multiple models that classify their particular category (so in the satellite image example, you’d have one trained model for terrain type, another one for weather, etc.). Seems like the results are really good in this example, but are there others (if there are tons of categories, or a more complex image set) where the threshold approach doesn’t adequately predict and you’d need to break the problem out into several single-classification methods? Hope that made sense, thanks!!

Even if that slope is positive?

Yes, it has. Look at the unlabeled dents.

Eight. Mhh.

You can think of the convolutional layers as a “feature extractor” that detects a lot of patterns in the images. Then the final linear layer, with one output per class, just matches certain patterns from the “feature extractor” to specific classes. In this way you are reusing most of the network, instead of creating multiple single-classification models. On top of that, I guess this approach captures relationships between classes (e.g., in the satellite images, if it’s “clear” it is not “cloudy”) that training independent models would lose. Finally, regarding the thresholds that you mention, I think you can set a different value for each class.
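As a rough sketch of the per-class threshold idea: the helper name `labels_above_threshold`, the probabilities, and the threshold values below are all hypothetical, chosen only to show the mechanics. It is plain Python applied to the sigmoid outputs, not a fastai API.

```python
def labels_above_threshold(probs, thresholds, default=0.2):
    """Return the class names whose probability meets that class's own threshold.

    probs:      dict of class name -> predicted probability
    thresholds: dict of class name -> threshold (classes not listed use `default`)
    """
    return [name for name, p in probs.items() if p >= thresholds.get(name, default)]

# Made-up sigmoid outputs for one satellite image.
probs = {"clear": 0.91, "cloudy": 0.04, "primary": 0.88, "road": 0.31}

# Hypothetical choice: be stricter about "road" than about the other classes.
thresholds = {"road": 0.4}

result = labels_above_threshold(probs, thresholds)
print(result)
```

Here "road" would pass the default threshold but not its own stricter one, which is the kind of per-class tuning the reply is suggesting.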

Is anyone else facing: https://forums.fast.ai/t/unet-binary-segmentation/29833/ . Any fix?

Thanks; I had tried many different lr’s but hadn’t tried changing ps (dropout). Now I’ve done that, but even cutting ps down to 0.1 (which seems like a lot) doesn’t get train_loss < valid_loss. Here’s an example with resnet34, 24 epochs:


At this point I think I’ll just wait until we cover regularization in class so I know what I’m doing! I noticed that in almost every example Jeremy showed last night train_loss was > valid_loss, so it appears to be a very common situation.

Between 1e-2 and 1e-1, you have 8 graduations for 2e-2, 3e-2, 4e-2, 5e-2, 6e-2, 7e-2, 8e-2 and 9e-2 :wink:
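If it helps with reading the plot, the graduations within one decade of a log axis can be listed with a few lines of Python. The function name `minor_ticks` is just something made up for this sketch.

```python
import math

def minor_ticks(decade_start):
    """Values of the graduations strictly between decade_start and 10 * decade_start."""
    return [k * decade_start for k in range(2, 10)]

ticks = minor_ticks(1e-2)

# On a log axis the graduations are not evenly spaced: each one sits at log10(value),
# so they bunch up toward the right-hand end of the decade.
positions = [math.log10(t) for t in ticks]

for t, pos in zip(ticks, positions):
    print(f"{t:.0e} is drawn at position {pos:.3f}")
```

So counting dents from the 1e-2 mark, the first is 2e-2, the second 3e-2, and so on up to 9e-2 just before 1e-1.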


Folders whose names start with . are hidden folders. You will not see them in the Jupyter notebook browser. In a terminal, type ls -la to list them.
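The same check can be done from a notebook cell. This is a small sketch using only the standard library; the folder names `.fastai` and `data` are placeholders created in a temporary directory, not paths from the course setup.

```python
import os
import tempfile

def split_hidden(directory):
    """Return (visible, hidden) entry names; hidden names are those starting with a dot."""
    names = sorted(os.listdir(directory))
    hidden = [n for n in names if n.startswith(".")]
    visible = [n for n in names if not n.startswith(".")]
    return visible, hidden

with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, ".fastai"))  # placeholder hidden folder
    os.mkdir(os.path.join(tmp, "data"))     # placeholder visible folder
    visible, hidden = split_hidden(tmp)

print("visible:", visible)
print("hidden:", hidden)
```

`os.listdir` returns dot-entries too, so nothing is actually missing from disk; they are only filtered out by the file browser.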

Since every filter detects some feature, the filters are able to detect the same features even if the image dimensions increase.
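Here is a toy illustration of why a larger input is not simply cropped. It assumes the network pools each feature map down to a fixed size before the final linear layer (I believe the fastai heads do this with an adaptive pooling layer, but treat that as an assumption). The hand-written 2x2 filter and the synthetic images are made up for the example.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b] for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def global_max_pool(feature_map):
    """Collapse a feature map of any size to a single number."""
    return max(max(row) for row in feature_map)

def make_image(size, edge_col):
    """size x size image: 0 left of edge_col, 1 from edge_col onward (a vertical edge)."""
    return [[1 if c >= edge_col else 0 for c in range(size)] for _ in range(size)]

vertical_edge = [[-1, 1], [-1, 1]]  # toy filter that responds to dark-to-bright edges

small = conv2d_valid(make_image(4, 2), vertical_edge)  # 3 x 3 feature map
large = conv2d_valid(make_image(8, 5), vertical_edge)  # 7 x 7 feature map

print(len(small), len(large))
print(global_max_pool(small), global_max_pool(large))
```

The feature map grows with the image, but the same filter weights are reused at every position, and after pooling both images give the same fixed-size summary for the layer that follows.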

Probably because you’re running multiple Jupyter notebooks on the same machine.
Run nvidia-smi.
Check if there are multiple python instances:
| Processes:                                       GPU Memory |
| GPU   PID  Type  Process name                    Usage      |
|   0  1955     C  /opt/anaconda3/bin/python          9075MiB |
|   0  3325     C  /opt/anaconda3/bin/python          7191MiB |

If so, just shut down the notebook and run it again.
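If you want to do this check from Python rather than by eye, something along these lines could work. It is only a sketch: it assumes the CSV output of `nvidia-smi --query-compute-apps=pid,process_name,used_memory --format=csv,noheader`, and the sample text below is hand-written from the numbers in the table above rather than captured from a real run. The function name `parse_compute_apps` is made up.

```python
import csv
import io

def parse_compute_apps(csv_text):
    """Parse 'pid, process_name, used_memory' rows into a list of dicts."""
    rows = []
    for fields in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        pid, name, mem = (f.strip() for f in fields)
        # used_memory is assumed to look like "9075 MiB"; keep only the number
        rows.append({"pid": int(pid), "name": name, "mem_mib": int(mem.split()[0])})
    return rows

# Hand-written sample in the assumed CSV format. On a real machine the text could
# come from subprocess.run([...], capture_output=True, text=True).stdout instead.
sample = (
    "1955, /opt/anaconda3/bin/python, 9075 MiB\n"
    "3325, /opt/anaconda3/bin/python, 7191 MiB\n"
)

procs = parse_compute_apps(sample)
python_procs = [p for p in procs if p["name"].endswith("python")]

if len(python_procs) > 1:
    total = sum(p["mem_mib"] for p in python_procs)
    print(f"{len(python_procs)} python processes are holding {total} MiB of GPU memory")
```

More than one python process in the list usually means another notebook kernel is still holding GPU memory.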

Hi - I am stuck trying to change my classifier to a regression model. I am extracting the labels from the image filenames and want to use them as numbers, with RMSE as my loss function.

data = ImageDataBunch.from_name_func(path_img, fnames, label_func=get_float_labels, ds_tfms=get_transforms(),
                                     size=224, bs=bs)

The labels come out as strings just fine ‘24’ etc. But if I try to change them to floats using a label_function, I get 0 classes.

My label function is a slight modification of the original:

pat = r'/([^/]+)_\d+.jpg$'

pat = re.compile(pat)
def get_float_labels(fn): return float(pat.search(str(fn)).group(1))
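One way to narrow this down is to test the label function on its own, outside of fastai. The sketch below uses hypothetical file paths and a slightly stricter variant of the pattern (dot escaped, end of string anchored), and raises a clear error when nothing matches. It only checks the regex and the float conversion; it does not explain the "0 classes" result by itself.

```python
import re

# Same idea as the pattern in the post, with the dot escaped and the end anchored.
pat = re.compile(r'/([^/]+)_\d+\.jpg$')

def get_float_label(fn):
    """Extract the part before the trailing _<digits>.jpg and convert it to float."""
    match = pat.search(str(fn))
    if match is None:
        raise ValueError(f"no label found in {fn!r}")
    return float(match.group(1))

# Hypothetical file names, only for checking the function in isolation.
paths = ["/data/images/24_001.jpg", "/data/images/3.5_17.jpg"]
labels = [get_float_label(p) for p in paths]

print(labels)
print({type(v).__name__ for v in labels})
```

If this returns Python floats as expected, the label function is fine and the problem is more likely in how the library interprets the returned labels.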

Another weird thing is that I was at least able to convert the strings to floats until today, when I updated the fastai library. Not sure if something changed to cause this…

First try converting to int and see if that works.

ints work! But floats do not - weird

I guess I don’t care as long as it does the loss like I want it.

learn = create_cnn(data, models.resnet34)
learn.loss_func = MSELossFlat()

If I fit it though I get an error:
RuntimeError: Expected object of scalar type Float but got scalar type Long for argument #2 ‘target’
