Coloring with Random Forests

I wrote a blog post on a visual representation of random forests. This is an alternative perspective on how random forests work.

11 Likes

This is so cool :slight_smile: Since I think it’s great, I hope you don’t mind if I provide some feedback which I think might make it even better…

Specifically: it would be worth spending some time cleaning up the text, since the content is too amazing to have text with little problems in it… For instance, looking at the first couple of paragraphs:

Random forest are typically described using trees.

  • ‘forest’ should be plural. Also, I don’t think this sentence is a strong introduction to the post. If there are some folks in class you know to be strong writers, perhaps ask them to help you create a compelling first paragraph?

To explore this representation, lets collect some data

  • “lets” should be “let’s”

Below we record the location and color of many points. These points are plotted below.

  • I found this confusing. What are these points? Are they generated synthetically? Is there some structure they represent? Why are you showing them to us? (I figured out after reading further and looking more closely that they are probably from synthetically generated data designed to have a particular structure.)

For the data we have been given, the random forest draws partitions as shown below.

  • This actually shows a tree partitioning, not a random forest partitioning.
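
To make the tree-versus-forest distinction concrete, here is a minimal sketch in plain Python. It is not the code from the blog post: the data (`true_color`, three colored regions in the unit square), the helper names, and the parameters are all assumptions made for illustration, and the "forest" here uses bootstrap sampling only (no per-split feature subsampling), so it is a simplified stand-in for a real random forest. The point is just that one tree draws a handful of axis-aligned cuts, while a forest overlays the cuts of many trees and votes.

```python
import random
from collections import Counter

def true_color(p):
    """Assumed synthetic structure: blue on the left, red/green on the right."""
    x, y = p
    if x < 0.5:
        return "blue"
    return "red" if y < 0.5 else "green"

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(points, labels):
    """Return the (feature, threshold) with the lowest weighted Gini, or None."""
    best, best_score, n = None, gini(labels), len(points)
    for f in (0, 1):  # 0 = x, 1 = y
        values = sorted(set(p[f] for p in points))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [l for p, l in zip(points, labels) if p[f] <= t]
            right = [l for p, l in zip(points, labels) if p[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score - 1e-12:
                best, best_score = (f, t), score
    return best

def build_tree(points, labels, depth):
    """Internal nodes are (feature, threshold, left, right); leaves are colors."""
    split = best_split(points, labels) if depth > 0 else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # majority color
    f, t = split
    left = [(p, l) for p, l in zip(points, labels) if p[f] <= t]
    right = [(p, l) for p, l in zip(points, labels) if p[f] > t]
    return (f, t,
            build_tree([p for p, _ in left], [l for _, l in left], depth - 1),
            build_tree([p for p, _ in right], [l for _, l in right], depth - 1))

def predict(tree, p):
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if p[f] <= t else right
    return tree

def cuts(tree):
    """List every (feature, threshold) cut a tree draws."""
    if not isinstance(tree, tuple):
        return []
    f, t, left, right = tree
    return [(f, t)] + cuts(left) + cuts(right)

def build_forest(points, labels, n_trees, depth, rng):
    """Each tree is trained on a bootstrap resample of the points."""
    n, forest = len(points), []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(build_tree([points[i] for i in idx],
                                 [labels[i] for i in idx], depth))
    return forest

def forest_predict(forest, p):
    """Majority vote over the trees."""
    return Counter(predict(t, p) for t in forest).most_common(1)[0][0]

rng = random.Random(0)
points = [(rng.random(), rng.random()) for _ in range(120)]
labels = [true_color(p) for p in points]

tree = build_tree(points, labels, depth=3)
forest = build_forest(points, labels, n_trees=15, depth=3, rng=rng)

tree_cuts = cuts(tree)
forest_cuts = [c for t in forest for c in cuts(t)]
print(len(tree_cuts), "cuts drawn by the single tree")
print(len(set(forest_cuts)), "distinct cuts across the forest")
print("forest vote at (0.25, 0.25):", forest_predict(forest, (0.25, 0.25)))
```

A plot of `tree_cuts` alone would be a tree partitioning; a forest partitioning would have to show the overlay of `forest_cuts` together with the voted color in each resulting cell.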

Anyway, as I mentioned, this is fantastic content, and I’d love to share it regardless of how much time you can find to polish the prose. So just let me know when you consider it “done”. And thanks a lot for sharing!

3 Likes

Very cool! The plots made me think of the different channels of an image. Maybe it’s been done before, but do you think there is potential for a (very simple) novel alternative to CNNs using this perspective? If you partition each of the RGB channels of, say, a 20 x 20 pixel image separately using random forests, it feels like there is a way to develop a random forest model that can take the spatial information of each channel and classify the image. Just thinking out loud.
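
As a rough sketch of what "partitioning each channel separately" could look like: the code below fits a small set of bagged regression trees per channel, mapping a pixel's (row, col) position to its intensity. Everything here is an assumption for illustration (the synthetic 20 x 20 image, the function names, the depth and tree counts), and it only covers the partitioning step, not how the resulting partitions would be turned into an image classifier.

```python
import random

SIZE = 20  # 20 x 20 image, as in the idea above

def sse(values):
    """Sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def fit_tree(pixels, depth):
    """pixels: list of ((row, col), intensity).
    Returns a leaf mean, or an internal node (axis, cut, low, high)."""
    values = [v for _, v in pixels]
    best, best_err = None, sse(values)
    if depth > 0:
        for axis in (0, 1):  # 0 = row, 1 = column
            for cut in range(1, SIZE):
                low = [v for pos, v in pixels if pos[axis] < cut]
                high = [v for pos, v in pixels if pos[axis] >= cut]
                if not low or not high:
                    continue
                err = sse(low) + sse(high)
                if err < best_err - 1e-9:
                    best, best_err = (axis, cut), err
    if best is None:
        return sum(values) / len(values)
    axis, cut = best
    return (axis, cut,
            fit_tree([p for p in pixels if p[0][axis] < cut], depth - 1),
            fit_tree([p for p in pixels if p[0][axis] >= cut], depth - 1))

def predict(tree, pos):
    while isinstance(tree, tuple):
        axis, cut, low, high = tree
        tree = low if pos[axis] < cut else high
    return tree

def fit_channel(channel, n_trees, depth, rng):
    """Bagged trees for one channel; channel[row][col] is in [0, 1]."""
    pixels = [((r, c), channel[r][c]) for r in range(SIZE) for c in range(SIZE)]
    return [fit_tree([rng.choice(pixels) for _ in pixels], depth)
            for _ in range(n_trees)]

def reconstruct(forest, pos):
    """Average the trees' piecewise-constant predictions at one pixel."""
    return sum(predict(t, pos) for t in forest) / len(forest)

rng = random.Random(0)
image = {  # synthetic channels, chosen only to have obvious spatial structure
    "R": [[1.0 if c < 10 else 0.0 for c in range(SIZE)] for r in range(SIZE)],
    "G": [[1.0 if r < 10 else 0.0 for c in range(SIZE)] for r in range(SIZE)],
    "B": [[(r + c) / (2 * (SIZE - 1)) for c in range(SIZE)] for r in range(SIZE)],
}
forests = {name: fit_channel(ch, n_trees=10, depth=3, rng=rng)
           for name, ch in image.items()}

for name, forest in forests.items():
    print(name, round(reconstruct(forest, (5, 5)), 2),
          round(reconstruct(forest, (15, 15)), 2))
```

Each channel ends up described by a set of rectangular regions with a typical intensity, which is the kind of spatial summary a downstream classifier might use as features.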

Thanks for the feedback. I fixed the issues that you noted. I tried to note that the data was generated with a specific structure, and that the random forests would be able to identify it. I would call it “done” at this point.

1 Like

What’s your twitter handle, so I can give you due credit and people can find you? (If you don’t have one, you should probably create one - and note that people will look up your past tweets, so if you have a twitter handle that hasn’t been posting data science or coding topics, you may want to create a separate one for that purpose and make it the one that potential employers/clients can find).

1 Like

My twitter handle is @Tyler_V_White. Here’s a link to the tweet.

1 Like

Fantastic!!!

And here’s my tweet! :smiley: https://twitter.com/jeremyphoward/status/933383689496440832

super cool visualization Tyler!

i retweeted