Another treat! Early access to Intro To Machine Learning videos

We’ll be officially releasing a new course (tentatively) called Machine Learning For Coders soon (it’s not up on the website yet). It’s being recorded with the masters students at MSAN. I’ve decided to share the videos with you all as well, since those of you who haven’t done any ML before might find this a helpful additional resource. It uses the same fastai library as our deep learning course, so your existing repo and AWS instance will work fine.

Here are the videos - I’ll update this thread as they become available:

Please use this thread for any questions about the Machine Learning videos, or related ML topics.


Thank you so much!


The day just keeps getting better! Thanks @jeremy! :smiley:


That’s bonus after bonus, @jeremy! Thank you for all this :slight_smile:



Thank you Jeremy. This is great. Are the notebooks also available?

Aaah, just found it.


Ha, I was actually about to ask about this after seeing the ml1 directory and the notebooks in the repository. It’s great that you’re sharing this, will be very helpful, thanks!


Thanks @jeremy, what topics do you cover in this course? Having an outline would be nice :wink:

One treat each day, you’re spoiling us!
Thanks a lot, it will be useful to see some of the basics again! I have a colleague who would be VERY interested in watching this: can we maybe watch these together?

### Awesome Blogs Explaining Decision Trees :slight_smile:


Hope they’re useful!


Two doubts I have after watching the first video:

1) When we impute missing values, is it worthwhile to split the columns into groups (like holidays and working days) and impute with the group median rather than the column median? Or does assuming that a particular grouping matters introduce unnecessary bias?

2) When we impute every missing categorical variable with zero, aren’t we skewing the data away from its original distribution? Why don’t we impute with the most common value?


Thank you so much!! :smiley:

Thank you! Going to go through this this afternoon!

You could always go deeper into the data, identify “local groups”, and assign group medians to the missing values. There’s nothing wrong with this strategy; it’s just one way of doing it, and you can always use cross-validation to see what works.
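A minimal pandas sketch of the group-median idea (the column names and numbers here are made up for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "is_holiday": [0, 0, 0, 1, 1, 1],
    "sales":      [10.0, 12.0, np.nan, 100.0, np.nan, 120.0],
})

# Column-wide median mixes holidays and working days together
col_median = df["sales"].median()   # 56.0 here

# Group median: fill each missing value with the median of its own group
df["sales_imputed"] = (
    df.groupby("is_holiday")["sales"]
      .transform(lambda s: s.fillna(s.median()))
)
# Working-day gap gets 11.0, holiday gap gets 110.0, instead of 56.0 for both
```

Whether the extra structure helps (or just adds bias) is exactly what cross-validation can tell you.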

Whether you are really skewing the data depends on the kind of model you use and how it treats the imputed value.

For example, with tree-based models, imputing with -1 is a commonly used strategy, and one intuition for why it works is that the model treats all the missing values as a separate level of their own. It’s as if one faulty machine on a large production line failed to record this variable: the values are missing from your dataset, and the model treats them all as originating from that one unit.

However, if you use models built on linear equations (linear models / SVMs / neural networks), a mean- or median-based approach is preferred, since the optimization is much more sensitive to the imputed values. Because the mean / median doesn’t shift the centre of the feature’s distribution, you can safely take this approach.


Thanks for the clarifications. Yes, I will try testing various approaches on the data and use cross-validation to see what works best.

Can you clarify how a 0-based approach doesn’t change the distribution of the feature?

Hey @jeremy, I just noticed that at 17:00 you explain how to retrieve datasets from Kaggle and upload them to your deep learning instance. I find that process a bit overwhelming for something that should be simple to do.
If you’re interested, I’ve created a library to download Kaggle datasets automatically from code.

For the library, if you don’t have a Kaggle login/password pair (e.g. if you registered via Google OAuth), you can create one by logging out and clicking “password recovery” from there.

Hope it helps somehow :slight_smile:


Apologies for not being very clear.

You are right that the univariate distribution of the feature does change when you introduce a zero. However, with approaches like matrix factorization, the absence of a feature in a row is equivalent to a zero: the data formats of libsvm and libFM treat the two cases the same way. These algorithms factorize the matrix into sub-matrices, from which latent factors are obtained.

Hence, if you use a factorization approach, you are good to go with zero-based imputation. To reiterate: the downstream algorithm has a say in your imputation strategy.
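To make the “missing equals zero” point concrete, here is a small numpy sketch: a matrix where 0 encodes an unobserved entry, factorized into latent factors via truncated SVD (the matrix values are made up, and real factorization libraries would only fit on the observed entries, but the sparse-format convention is the same):

```python
import numpy as np

# A user-item style matrix: 0 plays the role of "not observed",
# exactly how libsvm/libFM sparse formats treat absent entries
R = np.array([[5.0, 3.0, 0.0],
              [4.0, 0.0, 1.0],
              [0.0, 2.0, 5.0]])

# Rank-2 factorization via truncated SVD: R ≈ U @ V
U_full, sv, Vt = np.linalg.svd(R, full_matrices=False)
k = 2
U = U_full[:, :k] * sv[:k]   # latent row (user) factors, scaled
V = Vt[:k, :]                # latent column (item) factors
R_hat = U @ V                # dense reconstruction, fills the "zeros" too
```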


Matrix factorization?

Awesome, thank you! I’m particularly looking forward to your Lesson 2 on Random Forest interpretation.


@jeremy, you keep taking chances sharing things with us early. Thanks for believing in us!
