Another treat! Early access to Intro To Machine Learning videos

jeremy · February 13, 2018, 1:42am

You’re doing great! Here’s the thing to think about: regularization penalizes coeffs that are larger. By using NB features, we don’t have to use such large coeffs to get the same result, compared to using plain binary features.

Once you’ve understood that, you’ll soon realize that NB-SVM still isn’t ideal - since we’d really like a zero coeff to represent our prior expectation as to the behavior of that feature. At that point, we can start to talk about the extension I made to NB-SVM which is the current state of the art in linear models for sentiment analysis! (Which no-one has written up yet - so if you get to the point you understand this bit, you can be the first person to put it down in writing… You’re well on the way to being there.)
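To make the point about coefficient size concrete, here is a minimal sketch. It assumes the usual NB-SVM recipe: compute a smoothed log-count ratio r from binary term features, then fit a regularized logistic regression on the features scaled by r. The toy data and the helper name log_count_ratio are illustrative and not taken from the lesson notebook, and the sketch does not cover the unpublished extension mentioned above.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy binary term-document matrix: rows are documents, columns are terms.
x = np.array([
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 1, 0, 1],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
])
y = np.array([1, 1, 0, 0, 1, 0])  # 1 = positive, 0 = negative


def log_count_ratio(x, y):
    """Smoothed log ratio of how often each term appears in each class."""
    p = (x[y == 1].sum(axis=0) + 1) / ((y == 1).sum() + 1)
    q = (x[y == 0].sum(axis=0) + 1) / ((y == 0).sum() + 1)
    return np.log(p / q)


r = log_count_ratio(x, y)
x_nb = x * r  # NB features: each binary feature scaled by its log-count ratio

# Same regularization strength for both models, so the penalty is comparable.
plain = LogisticRegression(C=1.0).fit(x, y)
nb = LogisticRegression(C=1.0).fit(x_nb, y)

print("log-count ratio r:", r)
print("coefs on plain binary features:", plain.coef_)
print("coefs on NB features:", nb.coef_)
```

Terms that appear equally often in both classes get r = 0, so they contribute nothing before the model learns anything; comparing the two printed coefficient vectors shows how the scaling changes what the regularized model has to learn.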

Brad_S · February 13, 2018, 3:25am

Is there a way to follow you on Twitter without being on Twitter? RSS to email or some such?
You know… just while I’m spending all my time on here learning like crazy and already being distracted by so much to read.

nileshgarg · February 14, 2018, 6:37am

How did you manage to solve this?

ecdrid · February 14, 2018, 6:38am

It’s not an error…
It always happens no matter what I do…

nileshgarg · February 14, 2018, 6:48am

You can try adding
C:\ProgramData\Anaconda3\Library\bin\graphviz to the user variables section of the environment variables settings.

nileshgarg · February 14, 2018, 7:02am

When I try to execute

fi = rf_feat_importance(m, df_trn); fi[:10]

in the Feature Importance section of Notebook 2, “lesson2-rf_interpretation”, I get:

ValueError: arrays must all be same length.
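pandas raises this kind of ValueError when a DataFrame is built from columns of different lengths. One possible cause, which is an assumption since the notebook is not shown, is that df_trn has a different number of columns from the data the model m was fitted on. The sketch below uses a hypothetical stand-in named feat_importance (not the course helper itself) that pairs column names with m.feature_importances_ and checks the two lengths first.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

rng = np.random.RandomState(0)
df_trn = pd.DataFrame(rng.rand(50, 3), columns=["a", "b", "c"])
y = df_trn["a"] * 2 + rng.rand(50) * 0.1

# The model is fitted on only two of the three columns.
m = RandomForestRegressor(n_estimators=10, random_state=0)
m.fit(df_trn[["a", "b"]], y)


def feat_importance(m, df):
    """Hypothetical stand-in: pair each column with its importance."""
    n_cols, n_imp = len(df.columns), len(m.feature_importances_)
    if n_cols != n_imp:
        raise ValueError(
            f"DataFrame has {n_cols} columns but the model has {n_imp} importances"
        )
    fi = pd.DataFrame({"cols": df.columns, "imp": m.feature_importances_})
    return fi.sort_values("imp", ascending=False)


# Works: same columns the model was fitted on.
fi = feat_importance(m, df_trn[["a", "b"]])
print(fi)

# Fails with a clear message: one extra column.
try:
    feat_importance(m, df_trn)
except ValueError as e:
    print("Mismatch:", e)
```

Comparing len(df_trn.columns) with len(m.feature_importances_) in the notebook is a quick way to see whether this is what is happening.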

ecdrid · February 14, 2018, 8:03am

Your full notebook will help…

chamin · February 14, 2018, 9:00am

Hi all,
I’m attempting the Kaggle competition[1] “House Prices: Advanced Regression Techniques”. I followed pretty much the same approach as in the lessons and managed to obtain a score of 0.94 with RandomForestRegressor. My next step was to use the test data set (this competition provides training and test data separately). I did pretty much the same things to the test data before calling predict(). When I apply predict() to the test set, I get the following error: “ValueError: could not convert string to float: ‘Normal’”

I appreciate your help in resolving this issue.
Thanks !

[1] https://www.kaggle.com/c/house-prices-advanced-regression-techniques

ecdrid · February 14, 2018, 11:51am

We need to apply to the test set all the preprocessing steps that you did for the training set…

Like train_cats, proc_df, etc…

Also, in this particular dataset,
proc_df will not work as we want…

(Check the columns you trained your model on against the columns you are testing it with…)
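The error message suggests that a string column (one containing the value ‘Normal’) reached predict() without being converted to numbers. Below is a minimal sketch in plain pandas, not the course helpers, of encoding the test set with the categories learned from the training set. The toy rows and the helper name encode are illustrative assumptions.

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

train = pd.DataFrame({
    "SaleCondition": ["Normal", "Abnorml", "Normal", "Partial"],
    "LotArea": [8450, 9600, 11250, 9550],
    "SalePrice": [208500, 181500, 223500, 140000],
})
test = pd.DataFrame({
    "SaleCondition": ["Normal", "Partial", "Family"],  # "Family" is unseen in train
    "LotArea": [11622, 14267, 13830],
})

# Learn the category list for every non-numeric column from the training set only.
cat_cols = [c for c in test.columns if not pd.api.types.is_numeric_dtype(train[c])]
categories = {c: pd.Categorical(train[c]).categories for c in cat_cols}


def encode(df):
    """Replace string columns with integer codes based on the training categories."""
    df = df.copy()
    for c, cats in categories.items():
        # Values unseen in training get code -1; adding 1 maps them to 0.
        df[c] = pd.Categorical(df[c], categories=cats).codes + 1
    return df


X_train = encode(train.drop(columns="SalePrice"))
X_test = encode(test)[X_train.columns]  # same columns, same order as training

m = RandomForestRegressor(n_estimators=10, random_state=0)
m.fit(X_train, train["SalePrice"])
preds = m.predict(X_test)
print(X_test)
print(preds)
```

The key design choice is that the category-to-code mapping comes from the training data alone, so the same string always maps to the same number in both sets.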

Brad_S · February 14, 2018, 1:14pm

Does it make sense now to have a ML part on the forum (like DL part1 , DL part2)?
It seems like ML could have at least 4 wiki style posts for topic based questions, rather than them all coming as comments here

ecdrid · February 14, 2018, 1:15pm

It’s already there but isn’t public…

Brad_S · February 14, 2018, 2:06pm

oh… wishing I was in on that

ahmadarib · February 15, 2018, 1:15am

Me too, man. Perhaps we should enroll in the USF master’s program next academic year, haha.

Anyway, has anyone tried a Random Forest Classifier using fastai?
Or is there any resource to read, any implementation, any sample notebook I could look at?
I just finished a Random Forest Regressor for my own problem. Now I have a classification problem on structured data and want to give a Random Forest Classifier a shot, but I’ve lost direction, haha. Please advise.

Brad_S · February 15, 2018, 3:41am

No. But if you’ve figured out fastai with the regressor and need to move to a classifier, then along with shift-tab on the fastai classifier function, you can look at the sklearn docs on classifiers.
http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html

If you’re stuck on specific code or calls, post it here (or on Stack Overflow, etc.).
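As a starting point, here is a minimal sketch of sklearn’s RandomForestClassifier on synthetic data. It assumes your structured data has already been turned into numbers the same way as for the regressor; the hyperparameter values are only examples, not recommendations.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for an already preprocessed structured dataset.
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5, random_state=0
)
X_trn, X_val, y_trn, y_val = train_test_split(
    X, y, test_size=0.2, random_state=0
)

m = RandomForestClassifier(
    n_estimators=40,
    min_samples_leaf=3,
    max_features=0.5,
    oob_score=True,
    random_state=0,
)
m.fit(X_trn, y_trn)

val_acc = m.score(X_val, y_val)    # mean accuracy on the validation set
proba = m.predict_proba(X_val)     # one probability column per class

print("validation accuracy:", val_acc)
print("OOB accuracy:", m.oob_score_)
print("first rows of class probabilities:\n", proba[:3])
```

Compared with the regressor, the main differences are that score() reports accuracy rather than R² and that predict_proba() is available for class probabilities.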

Brad_S · February 15, 2018, 7:10am

Is this a Windows thing? I’m getting it on another data set. The slow method works

ecdrid · February 15, 2018, 8:03am

Yes, it’s on Windows, and the best part is that sometimes you see this kind of warning and sometimes nothing…

I haven’t dug further, as I can’t interpret the warning…

Brad_S · February 15, 2018, 8:11am

@jeremy - do you know if anyone has had the parallel tree fastai implementation succeed under Windows?
thanks

sebastian · February 15, 2018, 10:00am

http://docs.python-guide.org/en/latest/writing/gotchas/ has a nice overview of common gotchas that can be confusing to Python newcomers.

Jed · February 15, 2018, 11:04am

Where can I find the notebook used in Lesson 3?
The Lesson 2 and 3 notebooks on GitHub use the bulldozers data set, while Lesson 3 on YouTube focuses on the groceries data.

Could you please update github with the missing notebook?

Thanks
Jed

Jdemlow · February 16, 2018, 4:24pm

I have made it to Lesson 8, and everything in the fastai repo has worked perfectly up to this point. I have a 64-bit Windows 10 machine. I have read on many forums that this may be something that will work once PyTorch comes out with a Windows version. I have a Paperspace account, where this works fine, but I would love to have it work locally on my machine.

I have come across this issue: DLL Load Failed. It has to do with torch_imports and the fastai.io import.

What I have tried:

Reinstalling the fastai to see if that is the problem using

I also have spaCy downloaded, but I am not sure if that’s the answer, because I don’t know how to use the spaCy environment.

There was a similar question to this on an installation page, and these were the two answers. I am unsure how to set PYTHONPATH=/path to the fastai directory
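For reference, one way to get the effect of PYTHONPATH from inside a notebook is to add the directory to sys.path before importing. This is only a sketch: FASTAI_DIR is a placeholder because the real location is not given here, add_to_sys_path is a made-up helper name, and it may well not resolve a DLL load failure, which can have other causes.

```python
import sys

# Placeholder only: replace with the actual location of the fastai directory.
FASTAI_DIR = "/path/to/the/fastai/directory"


def add_to_sys_path(directory):
    """Make `directory` importable in this session, without duplicating it."""
    if directory not in sys.path:
        sys.path.insert(0, directory)
    return sys.path


add_to_sys_path(FASTAI_DIR)
```

This affects only the running Python process; setting PYTHONPATH as an environment variable before starting Jupyter has the same effect for every session.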

I have also created the custom AI kernel to see if that would work.

If there is any advice on where to go from here it would be greatly appreciated.