Wiki / Lesson Thread: Lesson 6

(melissa.fabros) #1

This is a forum wiki thread, so you all can edit this post to add/change/organize info to help make it better! To edit, click on the little pencil icon at the bottom of this post.

<<< Wiki: Lesson 5 | Wiki: Lesson 7 >>>

Lesson resources

Course Notes: under construction

Random forest interpretation techniques & review

  • Confidence based on tree variance
  • Feature importance
  • Removing redundant features
  • Partial dependence
  • Tree interpreter
  • Extrapolation
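
As a quick reference for the first technique above: a per-row confidence estimate can be read off the spread of the individual trees' predictions. A minimal sketch with scikit-learn (the synthetic dataset and hyperparameters here are illustrative, not from the lesson):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Synthetic regression data standing in for a real dataset
X, y = make_regression(n_samples=500, n_features=6, noise=10.0, random_state=42)

rf = RandomForestRegressor(n_estimators=40, random_state=42).fit(X, y)

# Each tree in the forest makes its own prediction for each row;
# the spread (standard deviation) across trees is a rough confidence measure.
all_preds = np.stack([t.predict(X) for t in rf.estimators_])  # (n_trees, n_rows)
pred_mean = all_preds.mean(axis=0)  # the forest's usual prediction
pred_std = all_preds.std(axis=0)    # high std = trees disagree = low confidence

print(pred_mean[:3])
print(pred_std[:3])
```

Rows where `pred_std` is large are ones the forest is less sure about, which is often where the model is extrapolating or the data is sparse.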

Lesson 5 wiki
About the Intro to Machine Learning category
(Jeremy Howard (Admin)) #2

I’ve just added the lesson video to the top post (currently uploading - will be available in ~30 mins).

(Deena Liz John) #3

@jeremy Could you please share the slides you showed us in this lecture? The one on ML applications in different industries.

(Jeremy Howard (Admin)) #4

Thanks for the reminder. I’ve added it to git in the ‘ppt’ folder.

(antoine mercier) #5

There is something in this lesson that I would like to clarify.

In Jeremy's 2012 paper, "Designing Great Data Products", there is a link at the end to a YouTube video: Jeremy Howard - From Predictive Modelling to Optimization: The Next Frontier. Around minute 12:03, Jeremy says:

“One of the big insights I want you to take away from this is … really what you want is data that tells you about causality not correlation. … Generally speaking, you do not have data about causality, you’ve got data about business as usual.”

Then he goes on to explain how he convinced his client to conduct randomized experiments in order to collect data about causality.

But in Lesson 6, I don't believe Jeremy talks about conducting randomized experiments to collect data about causality.

So my question is: has the invention of partial dependence plots replaced the need for conducting randomized experiments?
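
For context on the question: a partial dependence plot just averages the model's predictions while artificially setting one feature to each value on a grid, so it can only surface relationships the model learned from observational ("business as usual") data. A minimal hand-rolled sketch (synthetic data and hyperparameters are illustrative):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=300, n_features=4, random_state=0)
model = RandomForestRegressor(n_estimators=20, random_state=0).fit(X, y)

def partial_dependence(model, X, feature, grid):
    """For each grid value, force that feature to the value for ALL rows,
    predict, and average. The curve is the model's average response."""
    pd_values = []
    for v in grid:
        X_mod = X.copy()
        X_mod[:, feature] = v       # intervene on the feature in the data...
        pd_values.append(model.predict(X_mod).mean())  # ...but not in the world
    return np.array(pd_values)

grid = np.linspace(X[:, 0].min(), X[:, 0].max(), 10)
pd_curve = partial_dependence(model, X, 0, grid)
print(pd_curve)
```

Note that the intervention happens only on the model's inputs, not on the process that generated the data, so the curve reflects correlation as the model sees it rather than a causal effect.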

(Jeremy Howard (Admin)) #6

Unfortunately not.