If you are training the model with separate folders, you are training a multi-class, single-label classification model, so the learner will create a model whose last layer uses a function (softmax) that "likes to pick just one value": the class with the highest probability. In fact, you get probabilities for each class, all summing to 1, but only the highest one is used. If you then show your model an image with multiple animals, it will predict just one of them.
Instead, if you train a multi-label model (like the Amazon from Space dataset), the last layer will be a sigmoid layer, which makes an independent prediction for each individual class (a probability that is then thresholded to 0 or 1).
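A minimal sketch of the difference in plain NumPy (the logit values are made up for illustration):

```python
import numpy as np

# Hypothetical raw scores (logits) from the last linear layer
# for three classes, e.g. lion, monkey, croc
logits = np.array([2.0, 1.5, -0.5])

# Softmax: probabilities compete and must sum to 1, so one class "wins"
softmax = np.exp(logits) / np.exp(logits).sum()

# Sigmoid: each class gets its own independent probability in (0, 1),
# so several classes can be above the threshold at once
sigmoid = 1 / (1 + np.exp(-logits))

print(softmax.round(3))  # sums to 1; only the argmax matters
print(sigmoid.round(3))  # each value is compared to a threshold (e.g. 0.5) separately
```

With these made-up logits, two of the three sigmoid outputs land above 0.5, so a multi-label model would predict two classes, while softmax would still pick only one.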
Thank you for the reply, but in the Amazon from Space dataset the training data itself has multiple labels inside a single image. Consider a case where our training data has only single-label images (lion, monkey, croc), with no images containing more than one animal. If we then pass in an image that supposedly contains all three, what will the model do?
Question re: threshold selection: I understand the motivation here, but just wondering…is this approach viewed as a tradeoff of ease-of-use vs. accuracy? Are the results of grouping all classes together, setting prediction probability thresholds, and returning the classes that meet that threshold roughly comparable to having multiple models that classify their particular category (so in the satellite image example, you’d have one trained model for terrain type, another one for weather, etc.). Seems like the results are really good in this example, but are there others (if there are tons of categories, or a more complex image set) where the threshold approach doesn’t adequately predict and you’d need to break the problem out into several single-classification methods? Hope that made sense, thanks!!
You can think of the convolutional layers as a "feature extractor" that detects a lot of patterns in the images. Then the final linear layer, with one output per class, just matches certain patterns from the "feature extractor" to specific classes. In this way you are reusing most of the network, instead of creating multiple single-classification models. On top of that, I guess this approach captures relationships between classes (e.g., in the satellite images, if it's "clear" it is not "cloudy") that training independent models would lose. Finally, regarding the thresholds that you mention, I think you can set different values for each class.
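For example, per-class thresholds can be applied to the sigmoid outputs with a simple elementwise comparison (the probabilities and threshold values below are made up for illustration):

```python
import numpy as np

# Hypothetical sigmoid outputs for one image over four classes:
# ["clear", "cloudy", "road", "water"]
probs = np.array([0.92, 0.08, 0.40, 0.75])

# A different threshold per class instead of a single global one,
# e.g. a lower bar for classes where recall matters more
thresholds = np.array([0.5, 0.5, 0.3, 0.6])

predicted = probs > thresholds  # boolean mask of predicted labels
print(predicted)
```

Here "clear", "road", and "water" would all be predicted for the same image, which a softmax model could never do.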
Thanks; I had tried many different learning rates but hadn't tried changing ps (dropout). Now I've done that, but even cutting ps down to 0.1 (which seems like a lot) doesn't get train_loss < valid_loss. Here's an example with resnet34, 24 epochs:
At this point I think I'll just wait until we cover regularization in class so I know what I'm doing! I noticed that in almost every example Jeremy showed last night, train_loss was > valid_loss, so it appears to be a very common situation.
Probably because you're running multiple Jupyter notebooks on the same machine.
Run nvidia-smi and check if there are multiple Python instances:
```
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1955     C    /opt/anaconda3/bin/python                  9075MiB  |
|    0      3325     C    /opt/anaconda3/bin/python                  7191MiB  |
+-----------------------------------------------------------------------------+
```
If so, just shut down the other notebook and run yours again.
Hi, I am stuck trying to change my classifier into a regression model. I am extracting labels from the image filenames and want to use them as numbers, with RMSE as my loss function.
```python
data = ImageDataBunch.from_name_func(path_img, fnames, label_func=get_float_labels,
                                     ds_tfms=get_transforms(), size=224, bs=bs)
data.normalize(imagenet_stats)
```
The labels come out fine as strings ('24', etc.). But if I try to convert them to floats using a label function, I get 0 classes.
My label function is a slight modification of the original:
```python
np.random.seed(2)
pat = r'/([^/]+)_\d+\.jpg$'
pat = re.compile(pat)
def get_float_labels(fn): return float(pat.search(str(fn)).group(1))
```
Another weird thing: I was at least able to get the strings converted to floats until I updated the fastai library today. Not sure if something changed to cause this…
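As a sanity check, the pattern itself extracts floats fine once the quote is closed and the dot escaped; here's a standalone sketch with a made-up filename:

```python
import re

# Hypothetical filename layout: the label is the part before the last underscore,
# e.g. /data/train/24_001.jpg -> label "24"
pat = re.compile(r'/([^/]+)_\d+\.jpg$')

def get_float_labels(fn):
    return float(pat.search(str(fn)).group(1))

print(get_float_labels('/data/train/24_001.jpg'))  # 24.0
```

If the DataBunch still reports 0 classes, the usual fix (if I remember the fastai v1 API correctly) is to build the data with the data block API and pass `label_cls=FloatList` to `label_from_func`, so fastai treats the labels as regression targets rather than categories.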