Wiki: Lesson 1

shoof · January 23, 2018, 9:17pm

I think they are moved to the fastai repo.

However, I don’t see the folded table of contents in the notebook from lesson 1. Am I missing something?

tech9connect · January 24, 2018, 7:30am

Hi,
I have just started the fastai Part 1 v2 course and have finished watching the first video. How are the files divided into train/valid etc.? How can I learn more about these terms and how to divide the files accordingly?

samching · January 24, 2018, 2:17pm

Interested to find out more about this too!

@alessa / @lukebyrne do you guys have any insights? I saw a post [Wiki: Lesson 1] which talked briefly about this, but I’m not sure if this was ever resolved.

Reason I’m asking is that the ImageClassifierData.from_paths method takes the following args:

            trn_name: a name of the folder that contains training images.
            val_name:  a name of the folder that contains validation images.
            test_name:  a name of the folder that contains test images.

Any insights into the train/valid/test split required to feed into this method will be really helpful.
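For reference, here is a minimal sketch of a sanity check you could run on a data folder before passing it to this method. It assumes the layout used by the lesson 1 dogs and cats data, where the training and validation folders each hold one subfolder per class. The helper name `summarize_split` and the folder names `train` and `valid` are assumptions for illustration, not part of the fastai API.

```python
from pathlib import Path

def summarize_split(root, split_names=("train", "valid")):
    """Count files per class subfolder for each split folder under root.

    Hypothetical helper: the folder names are assumptions, so pass the
    same names you intend to use for trn_name / val_name.
    """
    summary = {}
    for split in split_names:
        split_dir = Path(root) / split
        if not split_dir.is_dir():
            raise FileNotFoundError(f"missing split folder: {split_dir}")
        # One entry per class subfolder, mapping class name -> file count.
        summary[split] = {
            class_dir.name: sum(1 for f in class_dir.iterdir() if f.is_file())
            for class_dir in sorted(split_dir.iterdir())
            if class_dir.is_dir()
        }
    return summary
```

Calling `summarize_split("data/dogscats")` (the path is only an example) returns a nested dict of counts, which makes it easy to spot a missing class folder or an empty validation set.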

Thanks!

alessa · January 24, 2018, 2:45pm

You can find more details on Stack Overflow.

Usually the dogs and cats examples have only training and validation datasets, where the training dataset is 12500 files per class and the validation dataset is 1000 files per class (~7% of the data).

What you need to pay attention to when you build your datasets is to cut sample files from the training dataset and move them into the validation dataset (instead of just copying and pasting them), so that no file ends up in both sets.
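A minimal sketch of that "cut and move" step, assuming one subfolder per class inside the training folder. The function name `move_to_validation`, the default `fraction`, and the `seed` are assumptions for illustration; this is not a fastai utility.

```python
import random
import shutil
from pathlib import Path

def move_to_validation(train_dir, valid_dir, fraction=0.1, seed=42):
    """Move a random fraction of each class's files from train_dir to valid_dir.

    Files are moved, not copied, so the two sets never overlap.
    Returns a dict mapping class name -> number of files moved.
    """
    train_dir, valid_dir = Path(train_dir), Path(valid_dir)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    moved = {}
    for class_dir in sorted(p for p in train_dir.iterdir() if p.is_dir()):
        files = sorted(f for f in class_dir.iterdir() if f.is_file())
        n_valid = int(len(files) * fraction)
        target = valid_dir / class_dir.name
        target.mkdir(parents=True, exist_ok=True)
        for f in rng.sample(files, n_valid):
            shutil.move(str(f), str(target / f.name))
        moved[class_dir.name] = n_valid
    return moved
```

Sampling per class keeps the class balance of the validation set the same as that of the training set.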

If you enter a Kaggle competition, for example, you will also have a test dataset (with no labels/classes). In this case, you can train your model using cross-validation. At the end, you can put all of your files in the training dataset (no more validation set) and retrain; the result will be your final weights.

alessa · January 24, 2018, 2:49pm

Here is a short video by Andrew Ng (Coursera) on how to split the data.

[60% training, 20% validation, 20% test]
You can change these params and check how it affects your final model performance.
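As a rough sketch of such a split, the function below shuffles a list of items (for example file names) and cuts it into three parts. The name `three_way_split` and its parameters are assumptions for illustration; the 60/20/20 defaults simply mirror the ratios quoted above.

```python
import random

def three_way_split(items, train_frac=0.6, valid_frac=0.2, seed=0):
    """Shuffle items and split them into (train, valid, test) lists.

    Whatever is left after the train and valid portions becomes the test set.
    """
    items = list(items)
    random.Random(seed).shuffle(items)  # fixed seed for a reproducible split
    n_train = round(len(items) * train_frac)
    n_valid = round(len(items) * valid_frac)
    train = items[:n_train]
    valid = items[n_train:n_train + n_valid]
    test = items[n_train + n_valid:]
    return train, valid, test
```

Changing `train_frac` and `valid_frac` is an easy way to experiment with different ratios and compare the resulting model performance.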

samching · January 24, 2018, 6:27pm

Thanks @alessa! This was helpful for a general train/validation/test split. I was wondering more specifically:
do you know of any specific setup requirements for the train/validation/test split for fastai’s method?

Thanks again.

Matthew · January 25, 2018, 10:53am

When choosing a learning rate with the LR finder, you can plot a vertical line to ensure you choose the correct x-coordinate of the point you’re interested in. Otherwise it can be difficult to interpret values on the x-axis, since they’re in log scale.

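import matplotlib.pyplot as plt
# Assumes `learn` is an existing learner on which the LR finder has already been run.
learn.sched.plot()  # loss vs. learning rate, with a log-scaled x-axis
plt.axvline(x=1.6e-2, color="red");  # vertical marker at the candidate learning rate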

ecdrid · January 25, 2018, 12:43pm

Also, adding %matplotlib notebook seems awesome (you keep editing the same plot until you create a new one).

gnavink · January 25, 2018, 4:05pm

The Kaggle CatsDogs Redux Kernel competition asks us to report the probability that each image is a dog, hence my interest in calculating the dog probability.

jeremy · January 25, 2018, 6:49pm

Good idea

Matthew · January 25, 2018, 6:51pm

Thanks. I invite you to the LR finder plots thread.

jeremy · January 25, 2018, 9:46pm

I thought someone had created a timeline for this lesson, but I can’t find it - am I imagining things? @hiromi @EricPB where did we get to with this for the new version of the video? Sorry for my poor memory!

hiromi · January 25, 2018, 10:17pm

I believe there was one for the original lesson 1 video, but I don’t recall one for the re-taped version. I can certainly create one

jeremy · January 25, 2018, 10:18pm

That would be quite wonderful! I’ve nearly finished the new course web site and suddenly discovered we don’t have a timeline!

jmoney · January 25, 2018, 10:23pm

@jeremy, question about V2 vs. Machine Learning For Coders: I’ve had significant dev experience and want to complete one of these, do you have a suggestion of which course to take? Is Machine Learning For Coders ready for the public? I’ve started both of the first videos of the respective courses. Thanks.

EricPB · January 25, 2018, 10:24pm

Master @Jeremy,

You indeed found the nasty secret in the “Video Timelines for Part 1 V2”: none exists so far for Lesson 1, but Lessons 2 to 7 are covered with the help of your humble servants here (hiromi was super efficient/fast at fixing my mistakes on L7).

I will work on it tomorrow/this weekend.

Did anyone mention that he/she was looking for your notebook on the Favorita competition, including predictions and submission, now that it’s over?

jeremy · January 25, 2018, 10:24pm

They’re both ready. Perhaps read the experiences of other students on the forum and see what you think. They’re both worth doing.

jeremy · January 25, 2018, 10:26pm

You and @hiromi are both very kind

I’m rather embarrassed that I never got around to creating a groceries model that I’m actually happy with. But I’ll endeavor to dig up my notebook after I get this course out…

hiromi · January 25, 2018, 10:28pm

@EricPB, I’m at a hackathon tonight so I’ll see how far I can get during my breaks. You can make it prettier for me when you get a chance

EricPB · January 25, 2018, 10:34pm

I will do that @hiromi, things are a bit hectic here in Stockholm with family and the Recruit Visitor Forecast Kaggle competition.
But I’m sure you and I can build a basic Lesson 1 Video Timeline that we can improve as it goes, just like you made it happen for Lesson 7.

@jeremy: we don’t expect your Favorita notebook to reach Top 3 positions like Rossmann, just that some of us had trouble using the Fastai library to move from Training (Check !) to Predicting/Submitting a CSV (Fail !).