Wiki thread: lesson 1


Hi all. I had a play using v1 instead and was able to get everything to work. The short of it is that a bunch of functions from the old version need to be copied and pasted over, and the feather loading is slightly different. Nothing else major that I came across had to change.

I have condensed the lesson 1 and 2 notebooks into a single gist that works with the current version of fastai. Hope it helps:

I am not sure why these functions were removed from the repo entirely, but there is a new tabular section for NN which might be worth taking a look at. It would be interesting to hear from @jeremy what his plan is for this course, and in particular whether things are in for a shake-up now that v1 is out.

(Jeremy Howard (Admin)) #66

Thanks for doing this! I would love to see a fastai v1 compatible version of the whole course. If there are important bits of missing functionality, I’d be happy to discuss ways to make them work. I’d like to find a more integrated way of doing things overall - fastai v1 is much more carefully designed than 0.7, so hopefully we can find neat ways of incorporating all the functionality required.

(This will require a community effort however - it’s not something I have time to do myself at the moment.)


I was secretly hoping you would have run a course this year with v1 or will be soon, and would update accordingly :slight_smile: As mentioned, there is just a handful of helper functions required (at least for the random forest portion of the course), so I think it would not be hard to keep it working / alive.

Integrating it with the new structure (which looks quite impressive!) I can’t comment on, but I’m hoping to play with the new features in the coming weeks. I have found the random forest portion of the course fascinating though (such a good insight despite already having been exposed to them previously) and it would be great to keep the simple functionality of them alive.

(Harsh Jain) #68

Can someone please help me download the data for lesson 1 of machine learning?
Kaggle is asking for my phone number, and I am in India, so the SMS PIN can't reach me.
Please help.

(Joseph Catanzarite) #69

Hi @harrshjain – All 5 of the notebooks (and the associated datasets) for the ML lessons are available on Kaggle, if that helps.

(Harsh Jain) #70

Wow, you really saved the day!
Thank you, sir!

(Harsh Jain) #71

I am getting this error in Jupyter notebook, although I have already installed all the fastai packages and updated them. What should I do?


You want to use v0.7 of fastai, not v1. Make sure you have the right version of the libraries installed.
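
One way to confirm which version is actually installed, as a minimal sketch: it relies on the standard-library `importlib.metadata` (Python 3.8 or later), and the helper names `installed_version` and `is_legacy_fastai` are made up for illustration.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def is_legacy_fastai(version):
    """True when the version string belongs to the 0.7 series recommended above."""
    return version is not None and version.startswith("0.7")


if __name__ == "__main__":
    v = installed_version("fastai")
    print("fastai version:", v)
    if not is_legacy_fastai(v):
        print("This is not a 0.7 release; the lesson notebooks may fail on import.")
```

Running it in the same environment (kernel) as the notebook matters, since a different environment may have a different version installed.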



When doing the initial processing of a dataframe, is it better to run add_datepart on all columns of dtype ‘datetime64’?

I’ve come up with the following loop, which runs add_datepart() on a column only if it has a datetime dtype:

columns = list(df_raw)
n_columns = len(columns)
for n in range(n_columns):
    if df_raw[columns[n]].dtype == '<M8[ns]':
        add_datepart(df_raw, columns[n])

Do you think this is good?
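
For comparison, here is a sketch of the same idea that lets pandas pick out the datetime columns via `select_dtypes`. The function `expand_date` is a hypothetical stand-in, not fastai's `add_datepart`, so the example runs without fastai; `expand_all_dates` is likewise an illustrative name.

```python
import pandas as pd


def expand_date(df, col):
    """Hypothetical stand-in for add_datepart: add a few date parts, drop the original."""
    for part in ("year", "month", "day", "dayofweek"):
        df[f"{col}_{part}"] = getattr(df[col].dt, part)
    df.drop(columns=col, inplace=True)


def expand_all_dates(df, expand=expand_date):
    """Apply `expand` to every datetime64 column and return the names processed."""
    # Take the list up front: `expand` adds and drops columns while we loop.
    date_cols = list(df.select_dtypes(include="datetime").columns)
    for col in date_cols:
        expand(df, col)
    return date_cols


df = pd.DataFrame({
    "saledate": pd.to_datetime(["2011-11-16", "2004-03-26"]),
    "price": [66000, 57000],
})
processed = expand_all_dates(df)
print(processed, list(df.columns))
```

Assuming `add_datepart(df, column_name)` is called the way the loop above calls it, passing `expand=add_datepart` should apply the real function instead of the stand-in.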



I would like to know how one can add one's own functions to the fastai library so that they are available to all notebooks.

For example, I have written the following small snippet to convert every datetime column into categories:

columns = list(df_raw)
n_columns = len(columns)
for n in range(n_columns):
    if df_raw[columns[n]].dtype == '<M8[ns]':
        add_datepart(df_raw, columns[n])

I’d like to save it somewhere so I can use it in future notebooks. I guess I can just write a python file, but I don’t know where to save it. Also, I’m afraid that it will be overwritten whenever I git pull. Does anyone have any advice to give me on this?
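
One possible approach, sketched below, is to keep helpers in a module of your own that lives outside the cloned repo, so a git pull cannot touch it, and to put that directory on Python's import path. The module name `my_helpers`, the function `datetime_columns`, and the directory are all hypothetical; a temporary directory stands in for a real folder so the example is self-contained.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Assumption: a directory of your own, outside the cloned fastai repo.
# A temporary directory stands in for something like ~/my_ml_helpers here.
helpers_dir = Path(tempfile.mkdtemp())

# In practice you would write this file once in an editor.
(helpers_dir / "my_helpers.py").write_text(
    "def datetime_columns(dtypes):\n"
    "    # Return the names whose dtype string marks a datetime64 column.\n"
    "    return [name for name, dt in dtypes.items()\n"
    "            if str(dt).startswith('datetime64')]\n"
)

# Make the directory importable for this session.
if str(helpers_dir) not in sys.path:
    sys.path.insert(0, str(helpers_dir))
importlib.invalidate_caches()

import my_helpers

found = my_helpers.datetime_columns({"saledate": "datetime64[ns]", "price": "int64"})
print(found)
```

Setting the `PYTHONPATH` environment variable to that directory before starting Jupyter avoids repeating the `sys.path` lines in every notebook.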



I’d like to know how best to handle the order of categories after running train_cats(). In lesson 1, Jeremy rearranges the order of the categories in ‘UsageBand’.

Do we have to look at each category created and update their order if it is wrong? It seems like a slow process to do this for each category column.

Does anyone have any experience with this?
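
If only a few columns have a meaningful order, one option is to list those orders once in a dictionary and apply them in a loop, leaving every other column alone. The sketch below uses plain pandas; the toy data, the `orders` dictionary, and the order chosen for ‘UsageBand’ are assumptions for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "UsageBand": ["Low", "High", "Medium", None, "Low"],
    "Grade": ["B", "A", "C", "A", "B"],
})

# Assumption: you know the meaningful order for these columns only.
# Columns without a natural order can be left as they are.
orders = {"UsageBand": ["High", "Medium", "Low"]}

for col, order in orders.items():
    df[col] = pd.Categorical(df[col], categories=order, ordered=True)

# Codes follow the declared order; missing values get -1.
codes = df["UsageBand"].cat.codes.tolist()
print(codes)
```

The integer codes follow the position of each value in the declared list, which is what a model sees once the categories are converted to numbers.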

(Ben Lebovitz) #76

Based on a fork of this and the work @mrbruce did above, I got Lesson 1 working in a Kaggle kernel.

The kernel is here:

The one little tweak I had to make to @mrbruce’s work was changing is_string_dtype to pd.api.types.is_string_dtype.
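
As a small illustration of that tweak, the fully qualified call works with nothing but `import pandas as pd` in scope. The toy dataframe and column names are made up.

```python
import pandas as pd

df = pd.DataFrame({"state": ["Alabama", "Texas"], "price": [66000, 57000]})

# Fully qualified call, so no bare `is_string_dtype` name needs to be imported.
string_cols = [c for c in df.columns if pd.api.types.is_string_dtype(df[c])]
print(string_cols)
```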

Also, I might be doing something wrong, as in some spots I’m getting pretty different results from what others were getting. I’m going to go through the lesson again with this working and see what’s what.