Lesson 1 official topic

The short answer is no. When you train a model it will find the easiest path to correctly answering the question, and if the answer is always True it will learn that very quickly. You can run into this type of problem with very imbalanced classes as well. For example, if you have 10 images of Cats and 500 of Dogs, the model will quickly learn that the answer is usually Dog and may not generalize well. There are some techniques you can use to counteract this, for example applying a much bigger ‘weight’ to the Cat class when the loss is computed, which increases the ‘penalty’ (larger loss) when the model predicts Dog and the real answer is Cat. An analogy is that the Dog questions are worth 1 point on a test and the Cat questions are worth 100 points. You can always add anything for the ‘negative’ class when training, but if it is not representative of the ‘negative’ examples the model would experience ‘in real life’, then your model will likely not perform well when faced with negative classes in real life.
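
In PyTorch, for example, the class ‘weight’ is just an argument to the loss function. A minimal sketch (the 50x weight and the class order are assumptions you would tune for your own data):

import torch
import torch.nn as nn

# Assumed class order: index 0 = Cat (rare), index 1 = Dog (common).
# Mistakes on Cats now contribute 50x more to the loss.
class_weights = torch.tensor([50.0, 1.0])
loss_func = nn.CrossEntropyLoss(weight=class_weights)

# Toy batch: logits for two images, true labels [Cat, Dog].
logits = torch.tensor([[0.2, 2.0], [1.5, 0.1]])
targets = torch.tensor([0, 1])
print(loss_func(logits, targets))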

4 Likes

Typically, you would need a negative class dataset. Depending on your problem, it might be easy to create a synthetic negative class.

Basically, you need some way to calculate a “distance” between samples and then set a threshold to determine positive/negative.
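
As a minimal sketch of that idea (assuming you already have some embedding model; the random tensors and the 0.5 threshold here are placeholders):

import torch
import torch.nn.functional as F

# Stand-ins for embeddings of known-positive samples.
positive_embs = torch.randn(100, 64)
center = positive_embs.mean(dim=0)   # positive cluster center

def is_positive(emb, threshold=0.5):
    # Cosine distance from the sample to the positive cluster center.
    dist = 1 - F.cosine_similarity(emb, center, dim=0)
    return dist.item() < threshold

print(is_positive(torch.randn(64)))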

However, that’s a very general answer. It would help to know the specifics of your problem: which domain (vision, tabular, etc.), what counts as a positive class, and so on. It might be possible to use pre-trained models and/or a Siamese architecture depending on the task at hand.

2 Likes

Yes, this would definitely be helpful. There are potential ways around this with synthetic data, or by reframing the question you’re asking the model. For example, instead of doing a binary classification on a whole image, you could turn it into a semantic segmentation problem: if you only have pictures of Dogs (and no non-Dog pictures), you could train the model to predict a ‘Dog’/‘Not Dog’ label for each pixel. This definitely increases the complexity of the labelling effort, but it’s an example of how you can reframe the question you’re asking your model to get it to work. I’ve had to do this in the past on an early-phase proof-of-concept project where I did not otherwise have a sufficient dataset.
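
In fastai, that reframing could look roughly like this (a sketch only - the folder layout, mask naming, and hyperparameters are made up for illustration):

from fastai.vision.all import *

# Assumed layout: data/images/*.jpg with masks at data/masks/<name>.png,
# where each mask pixel is 0 (Not Dog) or 1 (Dog).
path = Path('data')
dls = SegmentationDataLoaders.from_label_func(
    path, bs=8,
    fnames=get_image_files(path/'images'),
    label_func=lambda f: path/'masks'/f'{f.stem}.png',
    codes=['Not Dog', 'Dog'])
learn = unet_learner(dls, resnet34)
learn.fine_tune(3)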

3 Likes

Hi everyone, I attended the live session and, as recommended, I have:

  • Watched the lesson 1 video again,
  • Read chapter 1 of the fastai book,
  • Run the code that was taught in class (bird vs forest).

Now for my own dataset, I used images of zucchini and cucumber to train a classification model.
When I pass an image of a zucchini, the model correctly predicts the label as zucchini, but when I print the probability it shows a low value.

Can someone help me understand why the probability is low but the prediction is correct?
This is the Kaggle notebook that I created for this exercise.

The link to the Kaggle notebook doesn’t work for me. Could you check it? I can take a look.

1 Like

I saw the same with my custom categories. Pretty sure the categories end up in alphabetical order - so the correct probability for zucchini would be probs[1] and not probs[0]. Someone quicker than me will no doubt tell us how to show this properly by using some kind of indexer.

(edited after looking in Chapter 2)

Something like:
pred,pred_idx,probs = learn.predict(PILImage.create('metro.png'))

print(f"This is a: {pred}.")
print(f"Probability: {probs[pred_idx]:.04f}")

6 Likes

If you defined the classes/categories in the order ['cucumber', 'zucchini'] then zucchini would be probs[1]… I had the same issue; I just flipped the indexing, but yes, it threw me off for a bit. @brismith’s answer makes sense.

2 Likes

To add to the great responses above, there is an alternative approach: instead of generating negative examples, you can view this as an outlier detection problem. For example, you can train an autoencoder to embed your “positive” data into a reduced-dimension vector. You will have some kind of cluster in the embedding space corresponding to the positive class. The embeddings for negative data should fall further away from the cluster center, so you can classify all the data points far from the cluster center as negatives.
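
A rough PyTorch sketch of that autoencoder idea (dimensions and the threshold are placeholders):

import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    def __init__(self, in_dim=784, emb_dim=16):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU(),
                                     nn.Linear(128, emb_dim))
        self.decoder = nn.Sequential(nn.Linear(emb_dim, 128), nn.ReLU(),
                                     nn.Linear(128, in_dim))
    def forward(self, x):
        z = self.encoder(x)        # embed into the reduced space
        return self.decoder(z), z  # reconstruction plus embedding

# After training on positives only (e.g. with an MSE reconstruction loss):
# center = model.encoder(positive_data).mean(dim=0)
# is_negative = (model.encoder(x) - center).norm(dim=-1) > threshold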

Depending on your particular problem, one or both of these approaches may be applicable. If you have a particular task in mind, share it, and people who have worked on similar problems will likely share their experience.

2 Likes

I think you made the same mistake I always make - forgetting to make it public! Click ‘edit’, then the share arrow, and set it to public.

2 Likes

Thanks Jeremy. I have made it public and updated the URL now.

Thanks @brismith, your suggestion worked; now I’m getting the correct probability.
@mike.moloch, I didn’t explicitly set the order of classes anywhere in the notebook. It seems the order is picked up alphabetically.

4 Likes

This still doesn’t work for me, btw, but glad you seem to have found the source of the error, in any case!

1 Like

If you want to get all the labels aligned to the probabilities, you can use dls.vocab to get the vocabulary from the dataloader. This has a method o2i for converting a vocab name to its corresponding index.

For example

print(f"Probability it's a zucchini : {probs[dls.vocab.o2i['zucchini']]:0.2%}")

Or if you want to print all the labels and probabilities in order of descending likelihood

for prob, label in sorted(zip(probs, dls.vocab), reverse=True):
    print(f'{label}: {prob:0.2%}')

8 Likes

Great - I see it now :slight_smile:

1 Like

That’s a bit too advanced a topic for lesson 1 IMO. Feel free to ask it again in lesson 7 though!

3 Likes

Many thanks for your thoughtful comment. Rather than deleting, I’ll put my thoughts here in a reply since it’s useful to set some boundaries around our forum discussions.

In general, I’d like to avoid politically-oriented discussions on these forums, since they can take over everything else. Whilst it would be nice if mask-wearing was not a political discussion, that’s not the case… masks have become very politicised unfortunately. Therefore, let’s avoid discussing it here.

7 Likes

I’m having difficulty with learn.export(). I’ve tried it on the ‘Is it a bird?’ model on both my own machine and a Kaggle server. I get an error: TypeError: can't pickle _thread.lock objects. I’ve used this command successfully in prior versions of the course. I’ve gone back to some of those notebooks, and now the command fails with the same error. I’ve looked through the documentation to see if the signature of the function has changed, and that doesn’t seem to be the case.

Can anyone suggest what has changed or more likely what I am doing wrong?

EDIT: Several further observations after the learn.export() error:

  1. Even though there is an error, the file export.pkl is created.
  2. learn.show_results() now fails with: ValueError: This DataLoader does not contain any batches
  3. nvidia-smi indicates that the error has not released the memory from the process, which must be shut down with kill -9 PID
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/tmp/ipykernel_33/1466232964.py in <module>
----> 1 learn.export()

/opt/conda/lib/python3.7/site-packages/fastai/learner.py in export(self, fname, pickle_module, pickle_protocol)
    376         #To avoid the warning that come from PyTorch about model not being checked
    377         warnings.simplefilter("ignore")
--> 378         torch.save(self, self.path/fname, pickle_module=pickle_module, pickle_protocol=pickle_protocol)
    379     self.create_opt()
    380     if state is not None: self.opt.load_state_dict(state)

/opt/conda/lib/python3.7/site-packages/torch/serialization.py in save(obj, f, pickle_module, pickle_protocol, _use_new_zipfile_serialization)
    377         if _use_new_zipfile_serialization:
    378             with _open_zipfile_writer(opened_file) as opened_zipfile:
--> 379                 _save(obj, opened_zipfile, pickle_module, pickle_protocol)
    380                 return
    381         _legacy_save(obj, opened_file, pickle_module, pickle_protocol)

/opt/conda/lib/python3.7/site-packages/torch/serialization.py in _save(obj, zip_file, pickle_module, pickle_protocol)
    482     pickler = pickle_module.Pickler(data_buf, protocol=pickle_protocol)
    483     pickler.persistent_id = persistent_id
--> 484     pickler.dump(obj)
    485     data_value = data_buf.getvalue()
    486     zip_file.write_record('data.pkl', data_value, len(data_value))

TypeError: can't pickle _thread.lock objects

I can report that I tried this on the bird Kaggle kernel and get the same error; the fastai version is 2.6.2.

On my local setup the fastai version is 2.6.0 (the 4/25/22 Paperspace container) and I don’t get this error.

EDIT: I upgraded my local fastai to 2.6.2 and now I’m getting the same error when doing learn.export().

Thanks @mike.moloch. So it’s not just me.
I would also add that even though export.pkl is created, it fails to load with load_image('export.pkl'). The error is: UnidentifiedImageError: cannot identify image file 'export.pkl'
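
(For what it’s worth, load_image expects an image file, so it will always fail on a pickled learner - an exported model is normally reloaded with load_learner, e.g.:)

from fastai.vision.all import load_learner, PILImage

learn = load_learner('export.pkl')
# 'some_image.jpg' is a placeholder filename
pred, pred_idx, probs = learn.predict(PILImage.create('some_image.jpg'))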

1 Like