Lesson 1 official topic

Yes.

1 Like

Correct.

Sure, that's fine. Thanks for checking!

2 Likes

It's a Surface Studio Laptop. I'm really happy with it - great computer!

10 Likes

Also open source https://labelstud.io/

2 Likes

Except that I've got 4 new-ish kernels that (hopefully) are on their way to gold medals already! :wink:


11 Likes

One thing that I don't like: highly reflective screen.

1 Like

Admins, please delete or move this reply if it's not appropriate for this thread.

Jeremy wears a mask even when he's alone at the podium. I just want to say that I'm glad to see that someone still takes mask wearing seriously.

In Europe, since the onset of the war, people no longer seem to care about COVID and the associated precautions. And that's just stupid. We still have around 300 deaths per day, on average, in my country.

13 Likes

Completely agree.
I am in Belgium.
COVID has just "vanished".

4 Likes

I started my fastai adventure with v1. Thrilled to be back! In addition to re-engaging with deep learning, this time around I want to study the design of the fastai software more closely. While browsing through the source code, I've discovered so many interesting techniques, not just specific to Python but to programming in general. It's a real pleasure to review such beautiful code. @jeremy, @sgugger and everyone else involved deserve a ton of credit!

An amazing example is L, a drop-in replacement for Python's list. I'm a long-time user of lispy languages, and L allows for a much more functional approach to Python programming. It really should be part of core Python. I plan on using it in my non-fastai Python coding.
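
For anyone curious, here is a minimal sketch of the kind of functional style L enables (assuming fastcore is installed; the values and methods shown are just a few I find handy, not an exhaustive tour):

```python
from fastcore.foundation import L

xs = L(1, 2, 3, 4, 5)
xs.map(lambda x: x * 2)      # (#5) [2,4,6,8,10]
xs.filter(lambda x: x % 2)   # (#3) [1,3,5]
xs[0, 2, 4]                  # multi-index in one go: (#3) [1,3,5]
```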

3 Likes

Hi all,

I have a general question, please.

Is it possible to use the fastai library to train a model if you only have, for example, a positively labeled dataset (and no negative dataset)?
But then still use that trained model to test any similar data (which could be either positive or negative)?

Is the above possible? Any other pertinent information would be appreciated.

Thanks in advance.

Kind Regards,
Zakia

1 Like

I don't think so.

But, you could augment your dataset by adding in examples (images, text, whatever) with a negative label and train away. If you are concerned about the model seeing things in the wild that it wouldn't see during training, you can structure the problem as a regression task instead of a classification task and tune the threshold for the positive class.
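
To make the threshold-tuning part concrete, here is a hedged sketch shown on softmax outputs from a two-class learner (the dataloader name and the 0.8 cutoff are placeholders, not from this thread):

```python
# Scores for the positive class from a trained fastai learner
probs, _ = learn.get_preds(dl=test_dl)   # probs: (N, 2) softmax outputs
pos_scores = probs[:, 1]                 # assumes column 1 is the positive class

# Instead of taking the argmax, pick a cutoff you tune on validation data
threshold = 0.8
is_positive = pos_scores > threshold
```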

3 Likes

The short answer is no. When you train a model, it will find the easiest path to correctly answering the question, and if the answer is always True it will learn that very quickly.

You can run into this type of problem as well if you have very imbalanced classes. Ex: if you have 10 images of cats and 500 of dogs, the model will quickly learn that the answer is usually 'dog' and may not generalize well. There are some techniques you can use to counteract this, for example applying a much bigger 'weight' to the cat class when the loss is computed, which increases the 'penalty' (larger loss) when the model predicts 'dog' and the real answer is a cat. An analogy is that the dog questions are worth 1 point on a test and the cat questions are worth 100 points.

You can always add something for the 'negative' class when training, but if it is not representative of the 'negative' examples the model would experience 'in real life', then it's likely your model will not perform well when faced with negative classes in real life.
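
As a concrete (and hedged) illustration of the weighting idea in fastai: you can pass class weights into the loss function. The numbers below just mirror the 10-cats-vs-500-dogs example, and the ['cat', 'dog'] ordering is an assumption you would verify against dls.vocab:

```python
import torch
from fastai.vision.all import *

# Up-weight the rare class so misclassifying a cat costs much more than
# misclassifying a dog. Weight order must match dls.vocab, assumed here
# to be ['cat', 'dog'].
class_weights = torch.tensor([50.0, 1.0])
loss_func = CrossEntropyLossFlat(weight=class_weights)

# learn = vision_learner(dls, resnet18, loss_func=loss_func, metrics=error_rate)
# learn.fine_tune(3)
```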

4 Likes

Typically, you would need a negative class dataset. Depending on your problem, it might be easy to create a synthetic negative class.

Basically, you need some way to calculate a "distance" between samples and then set a threshold to determine positive/negative.

However, that's a very general answer. It would help to know the specifics of your problem: which domain (vision, tabular, etc.), what a positive class is, and so on. It might be possible to use pre-trained models and/or a Siamese architecture depending on the task at hand.
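
For what it's worth, here is a minimal sketch of the "distance plus threshold" idea using a generic pretrained backbone. The backbone choice, the cosine distance, and the 0.5 cutoff are all illustrative assumptions, not a recommendation for any specific problem:

```python
import torch
import torch.nn.functional as F
from torchvision.models import resnet18, ResNet18_Weights

# Pretrained backbone used purely as a feature extractor
backbone = resnet18(weights=ResNet18_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()
backbone.eval()

@torch.no_grad()
def embed(batch):
    # batch: (N, 3, H, W) images, already resized and normalized for ImageNet
    return F.normalize(backbone(batch), dim=1)

def looks_positive(sample_emb, positive_centroid, threshold=0.5):
    # Cosine distance to the centroid of known-positive embeddings;
    # anything too far away is treated as negative.
    dist = 1 - sample_emb @ positive_centroid
    return dist < threshold
```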

2 Likes

Yes, this would definitely be helpful. There are potential ways around this, with synthetic data or by reframing the question you're asking the model. For example, instead of doing binary classification on an image, you could turn it into a semantic segmentation problem: if you only have pictures of dogs (versus non-dog pictures), you could train the model to predict the label of each pixel as a 'dog'/'not dog' classification. This definitely increases the complexity of the labelling effort, but it's an example of how you can reframe the question you're asking your model to get it to work. This is something I've had to do in the past on an early-phase proof-of-concept project where I did not otherwise have a sufficient dataset.
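
If it helps, here is a rough sketch of what that reframing could look like in fastai. The directory layout, mask naming, and codes below are hypothetical placeholders for whatever your labelled masks actually look like:

```python
from fastai.vision.all import *

path = Path('data/dogs_segmentation')   # hypothetical layout: images/ and masks/
codes = ['not_dog', 'dog']              # per-pixel classes

dls = SegmentationDataLoaders.from_label_func(
    path,
    get_image_files(path/'images'),
    label_func=lambda o: path/'masks'/f'{o.stem}.png',  # assumed mask naming
    codes=codes,
    bs=8,
)

learn = unet_learner(dls, resnet34)
learn.fine_tune(3)
```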

3 Likes

Hi everyone, I attended the live session and, as recommended, I have:

  • Watched the lesson 1 video again,
  • Read chapter 1 of the fastai book,
  • Run the code that was taught in the class (bird vs forest).

Now, for my own dataset, I used images of zucchini and cucumber to train a classification model.
When I pass an image of a zucchini, the model correctly predicts the label as zucchini, but when I print the probability it shows a low value.

Can someone help me understand why the probability is low but the prediction is correct?
This is the Kaggle notebook that I created for this exercise.

The link to the Kaggle notebook doesn't work for me. Could you check it? I can take a look.

1 Like

I saw the same with my custom categories. Pretty sure the categories end up in alphabetical order, so the correct probability for zucchini would be probs[1] and not probs[0]. Someone quicker than me will no doubt tell us how to show this properly by using some kind of indexer.

(edited after looking in Chapter 2)

Something like:

```python
pred, pred_idx, probs = learn.predict(PILImage.create('metro.png'))

print(f"This is a: {pred}.")
# index probs with pred_idx so the value shown matches the predicted category
print(f"Probability: {probs[pred_idx]:.04f}")
```

6 Likes