About the Intro to Machine Learning (2018) category

fastai1 · January 12, 2019, 3:50pm

I get a really low oob score of 0.4716 and an rmse of 0.027 on the red wine quality dataset. Why is my oob_score so low? Does it mean that I am overfitting?

I have trouble knowing how to handle small datasets. This one has only 1599 records. As soon as I create a validation set of size 30, my validation r^2 is 0.11 while the training r^2 is about 0.90! So I don’t use a validation set; instead I use oob_score_. Now r^2 is 0.90 but the oob_score_ is 0.47.

This is my model without a validation set:

m = RandomForestRegressor(n_estimators=80, max_features=0.5, n_jobs=-1, oob_score=True)
m.fit(df_trn, y)
print_score(m)

This is my model with validation set:

m = RandomForestRegressor(n_estimators=80, max_features=0.5, n_jobs=-1, oob_score=True)
m.fit(X_train, y_train)
print_score(m)
[0.0393104889825399, 0.08282278268007248, 0.9278235796906971, 0.11669103577122453, 0.4711981193537419]

Could someone tell me how to interpret these big differences in the results, and what I am doing wrong?
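
A note on reading these numbers, assuming `print_score` is the helper from the course notebooks: the list would then be [training RMSE, validation RMSE, training R², validation R², OOB R²]. Under that assumption, 0.93 is the score on rows the forest was fitted on, which is usually optimistic, while the OOB score for each row only uses trees that did not see that row. A gap between the two is normal for random forests, and an R² measured on 30 rows can swing a lot from one split to another. The sketch below is only an illustration on synthetic data (not the wine dataset); `make_regression` and its settings are stand-ins chosen for the example, and k-fold cross-validation is one common option for small datasets, not something this thread prescribes.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for a small tabular dataset (assumption: NOT the wine data).
X, y = make_regression(n_samples=300, n_features=10, n_informative=5,
                       noise=30.0, random_state=0)

# Same hyperparameters as in the post, plus a fixed seed for repeatability.
m = RandomForestRegressor(n_estimators=80, max_features=0.5,
                          oob_score=True, random_state=0)
m.fit(X, y)

train_r2 = m.score(X, y)   # R^2 on rows the trees were trained on (optimistic)
oob_r2 = m.oob_score_      # R^2 using, per row, only trees that never saw it

# 5-fold cross-validation: every row is used for validation exactly once,
# which is less noisy than a single 30-row hold-out set.
cv_r2 = cross_val_score(
    RandomForestRegressor(n_estimators=80, max_features=0.5, random_state=0),
    X, y, cv=5, scoring="r2",
)

print(f"train R^2: {train_r2:.3f}")
print(f"OOB R^2:   {oob_r2:.3f}")
print(f"CV R^2:    mean {cv_r2.mean():.3f}, std {cv_r2.std():.3f}")
```

If the OOB and cross-validated scores roughly agree with each other and both sit well below the training score, that suggests the training score is the misleading number rather than the OOB one.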

ashirwad · January 19, 2019, 1:50am

Convert the training and test data in the same way if you don’t want the error to appear. Whenever you encode the categorical variables, apply the same method to both the training and the test data.
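
A minimal sketch of what "the same method" can look like with plain pandas: learn the category list from the training data, then reuse exactly that list for the test data so each label maps to the same integer code. The column name `color` and the toy values are made up for illustration, and this is not the fastai helper itself.

```python
import pandas as pd

# Toy data (hypothetical column and values, for illustration only).
train = pd.DataFrame({"color": ["red", "white", "red", "rose"]})
test = pd.DataFrame({"color": ["white", "red", "sparkling"]})

# Learn the categories from the training data only.
train["color"] = train["color"].astype("category")
cats = train["color"].cat.categories   # sorted: red, rose, white

# Reuse the training categories for the test data.
# Labels never seen in training become missing (code -1).
test["color"] = pd.Categorical(test["color"], categories=cats)

train_codes = train["color"].cat.codes.tolist()
test_codes = test["color"].cat.codes.tolist()

print(list(cats))     # ['red', 'rose', 'white']
print(train_codes)    # [0, 2, 0, 1]
print(test_codes)     # [2, 0, -1]
```

Encoding the two frames independently would instead give each its own category order, so the same integer could mean different labels in training and test.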

fastai1 · January 20, 2019, 9:27am

Thank you, I finally managed to make it work.

Anne · February 15, 2019, 11:08am

Hello everybody

Edit: I finally found the thread which tells you how to (re)install fastai 0.7 and the dependencies from Colab!
Seems to work

andrew77 · February 18, 2019, 1:58am

I’m using Google Colab too.

Do I need to run

!pip install fastai==0.7.0

every time I start a new notebook/chapter? Thanks

Anne · February 18, 2019, 10:56am

Yes, I think so. I have to do this too, even when I reopen the same notebook the next day.

Well, while you run the fastai ‘downgrade’ in Colab, you can get another coffee…

IMO it is worth it though, it’s really excellent
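
A small sketch for the top of a notebook, assuming (as far as I know) that a new Colab session starts without the packages installed in the previous one: check which version is present before paying for the reinstall. `needs_install` is a hypothetical helper written for this example, not part of fastai or Colab.

```python
import importlib.metadata as md


def needs_install(package: str, wanted: str) -> bool:
    """Return True if `package` is missing or installed at another version."""
    try:
        return md.version(package) != wanted
    except md.PackageNotFoundError:
        return True


if needs_install("fastai", "0.7.0"):
    # In a notebook cell you would then run:  !pip install fastai==0.7.0
    print("fastai 0.7.0 not found in this session; reinstall needed")
else:
    print("fastai 0.7.0 already installed")
```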

andrew77 · February 19, 2019, 5:56am

@Anne
BTW, did you encounter crashes when you were doing the Lesson 3 grocery store example?

df_all.unit_sales = np.log1p(np.clip(df_all.unit_sales, 0, None)) 
add_datepart(df_all, 'date')
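
For anyone unsure what those two lines do, here is a toy sketch on three rows. It shows the clip and log1p step with NumPy, and uses plain pandas `.dt` accessors as a stand-in for a few of the columns a date-part helper would add. The values and the column subset are made up for the example, and it does not reproduce `add_datepart` itself or explain the crash.

```python
import numpy as np
import pandas as pd

# Toy frame (hypothetical values), including one negative sales figure.
df = pd.DataFrame({
    "date": pd.to_datetime(["2017-08-14", "2017-08-15", "2017-08-16"]),
    "unit_sales": [3.0, -2.0, 0.0],
})

# Clip negatives to 0, then take log(1 + x) so that 0 stays 0.
clipped = np.clip(df["unit_sales"], 0, None)
df["unit_sales"] = np.log1p(clipped)

# Plain-pandas stand-in for some date-part columns.
df["Year"] = df["date"].dt.year
df["Month"] = df["date"].dt.month
df["Dayofweek"] = df["date"].dt.dayofweek   # Monday == 0

# expm1 undoes log1p, recovering the clipped values.
recovered = np.expm1(df["unit_sales"])

print(df)
print(recovered.tolist())
```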

Anne · February 19, 2019, 9:07am

Hello Andrew, which notebook is it in, and whereabouts in the notebook?

Cheers from Norway

andrew77 · February 19, 2019, 9:26am

@Anne, there’s no notebook for this. It was discussed in the first part of Lesson 3 (Grocery): https://youtu.be/YSFG_W8JxBo?t=1590

MahdiRezaei · March 26, 2019, 4:00pm

Hi all

In the ML course (2017), lesson 3, in rf_interpretation:

I tried plotting the hierarchical clustering without first filtering by feature importance (finding the most effective independent features). I mean that instead of keeping just the 30 important columns (first image), I used all columns (second image).

Then I got this error: “Distance matrix ‘X’ must be symmetric”.

Why is that?

Maybe some column has a problem, or there are too many columns.

Stack trace: https://github.com/mahdirezaey/as/blob/master/forums.ipynb
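
One possible cause, offered as a guess rather than a diagnosis of the notebook above: if a column is constant, its correlation with every other column is NaN, and because NaN never compares equal to itself, SciPy's symmetry check on the distance matrix fails. The sketch below reproduces that on random data. It uses Pearson correlation via `np.corrcoef` for simplicity (the mechanism should be the same for a rank correlation), and `condensed_distance` is a helper name invented for this example.

```python
import numpy as np
from scipy.cluster import hierarchy as hc
from scipy.spatial.distance import squareform

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
X[:, 2] = 1.0   # make one column constant on purpose


def condensed_distance(data):
    """Turn a column-correlation matrix into a condensed distance vector."""
    with np.errstate(invalid="ignore", divide="ignore"):
        corr = np.corrcoef(data, rowvar=False)   # NaN for constant columns
    dist = 1 - np.round(corr, 4)
    dist = (dist + dist.T) / 2       # remove floating-point asymmetry
    np.fill_diagonal(dist, 0.0)      # squareform expects a zero diagonal
    return squareform(dist)          # raises ValueError if NaNs remain


# With the constant column present, the validity check fails.
try:
    condensed_distance(X)
    failed = False
except ValueError as err:
    failed = True
    print("error:", err)

# Dropping zero-variance columns first lets the clustering run.
keep = X.std(axis=0) > 0
z = hc.linkage(condensed_distance(X[:, keep]), method="average")
print(z.shape)
```

Checking the full data for constant or all-missing columns before computing correlations would confirm or rule out this explanation.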

anujsharma · March 28, 2019, 5:09pm

Thanks for the solution

dev_indigo · April 7, 2019, 2:33pm

What are the minimum system specifications required for running advanced machine learning algorithms, and how can one integrate a command-line tool?
For example, say I want a dataset to differentiate between hotel and non-hotel images for a hotel website like https://timbu.com/

psychevisions · April 7, 2019, 8:39pm

Could someone help me understand the difference between a generative and a discriminative algorithm, keeping in mind that I am a beginner?
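
A rough sketch of the distinction, using two standard scikit-learn models as examples of my own choosing: Gaussian naive Bayes is generative (it models what the features of each class look like, then applies Bayes' rule), while logistic regression is discriminative (it models only the boundary between classes). The data is synthetic and the blob centers are arbitrary.

```python
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB

# Two well-separated synthetic classes (arbitrary centers, illustration only).
X, y = make_blobs(n_samples=200, centers=[[-3, -3], [3, 3]],
                  cluster_std=1.0, random_state=0)

# Generative: learns a per-class description of the features.
gen = GaussianNB().fit(X, y)
class_means = gen.theta_          # one mean vector per class, shape (2, 2)

# Discriminative: learns only how to separate the classes.
disc = LogisticRegression().fit(X, y)
boundary_weights = disc.coef_     # one weight per feature, shape (1, 2)

gen_acc = gen.score(X, y)
disc_acc = disc.score(X, y)

print("per-class feature means (generative):", class_means)
print("decision-boundary weights (discriminative):", boundary_weights)
print("accuracy:", gen_acc, disc_acc)
```

Both models can classify these points, but only the generative one holds a description of each class that could, in principle, be used to produce new examples of it.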

VinodSaratchandran · April 15, 2019, 8:25am

This infographic would give you a simple understanding of Machine Learning https://www.fingent.com/blog/machine-learning-deciphering-the-most-disruptive-innovation-infographic

coe557 · May 23, 2019, 11:53am

Hi!

I am new here. I have been reading around this website for the past few days, and some questions have arisen that I can’t find answers to anywhere.

1- Is this the right place to post about the new Machine Learning for Coders course (released at the end of 2018)? I read somewhere that Jeremy said he wasn’t going to share with us the ML for Coders forum that the Master’s students used. I am unclear about this, though, because I also remember reading that we share the same forum as the students who were in the classes, but that may apply only to deep learning.

2- I am trying to do both Machine Learning for Coders (end of 2018) and Deep Learning for Coders (released 2019). Everything for the new version of the DL course seems clear, so I have already set up GCP to work through the lessons. I have been looking for resources on making GCP work with the Machine Learning course and haven’t found any. My guess is that I can just add the Jupyter notebook files to the same instance I use for the fast.ai v3 DL course. I don’t have to create another instance, right?

3- The most up-to-date Jupyter notebook material for the ML 2018 course for coders, which I would get from GitHub, would be this, right? However, I only see 5 lesson notebooks on GitHub, while the main page lists 12 video lessons…

All this may seem very straightforward to you, but I am very new to this. Any help would be appreciated!

Harling · July 8, 2019, 3:25pm

Hi Jeremy, after signing up on Crestle, I am only able to find a server for fastai v3, which does not have the contents for Introduction to ML (Random Forest). Are the contents still available and, if so, under which server? Also, Crestle has changed quite a bit since you made that video.
Anyway, much thanks for your lessons!

MichaelHendriks · October 1, 2019, 4:58am

I am looking for some good online machine learning courses. I hope this forum will help me find some good ideas.

p2327 · October 9, 2019, 6:52am

Hello

I am new here

Should I start with this course for fast.ai?

brycer · November 14, 2019, 4:07am

Should I work through Intro to Machine Learning before taking Practical Deep Learning for Coders?

stephan.kuo · November 26, 2019, 3:32am

It’s not required. I finished the DL course without taking the ML course.
Welcome to the community!