Coloring with Random Forests

I wrote a blog post on a visual representation of random forests. This is an alternative perspective on how random forests work.


This is so cool :slight_smile: Since I think it’s great, I hope you don’t mind if I provide some feedback which I think might make it even better…

Specifically: it would be worth spending some time cleaning up the text, since the content is too amazing to have text with little problems in it… For instance, looking at the first couple of paragraphs:

Random forest are typically described using trees.

  • ‘forest’ should be plural. Also, I don’t think this sentence is a strong introduction to the post. If there are some folks in class you know to be strong writers, perhaps ask them to help you create a compelling first paragraph?

To explore this representation, lets collect some data

  • “lets” should be “let’s”

Below we record the location and color of many points. These points are plotted below.

  • I found this confusing. What are these points? Are they generated synthetically? Is there some structure they represent? Why are you showing them to us? (I figured out after reading further and looking more closely that they are probably from synthetically generated data designed to have a particular structure)

For the data we have been given, the random forest draws partitions as shown below.

  • This actually shows a tree partitioning, not a random forest partitioning.
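To make the tree-versus-forest distinction concrete, here is a minimal sketch in plain Python. It is not the code from the blog post: the data (points in the unit square, colored by which side of a diagonal they fall on), the helper names (`fit_tree`, `fit_forest`, and so on), and the settings (depth 4, 25 trees) are all assumptions made for illustration. A single tree colors each axis-aligned rectangle with one color; the forest takes a majority vote over many trees, each fit on a bootstrap sample and restricted to one randomly chosen axis per split.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def fit_tree(points, labels, rng=None, depth=0, max_depth=4):
    """Fit one tree; returns a label (leaf) or (axis, threshold, left, right)."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth == max_depth or len(set(labels)) == 1:
        return majority  # leaf: the whole rectangle gets one color
    # A forest tree considers a random subset of features at each split
    # (here one of the two axes); a plain tree considers both.
    axes = (0, 1) if rng is None else (rng.randrange(2),)
    best = None
    for axis in axes:
        for threshold in sorted({p[axis] for p in points}):
            left = [lab for p, lab in zip(points, labels) if p[axis] < threshold]
            right = [lab for p, lab in zip(points, labels) if p[axis] >= threshold]
            if not left or not right:
                continue
            score = len(left) * gini(left) + len(right) * gini(right)
            if best is None or score < best[0]:
                best = (score, axis, threshold)
    if best is None:
        return majority
    _, axis, threshold = best
    lo = [(p, lab) for p, lab in zip(points, labels) if p[axis] < threshold]
    hi = [(p, lab) for p, lab in zip(points, labels) if p[axis] >= threshold]
    return (axis, threshold,
            fit_tree([p for p, _ in lo], [lab for _, lab in lo], rng, depth + 1, max_depth),
            fit_tree([p for p, _ in hi], [lab for _, lab in hi], rng, depth + 1, max_depth))

def predict_tree(tree, point):
    """Walk down the splits until a leaf label (a string) is reached."""
    while isinstance(tree, tuple):
        axis, threshold, left, right = tree
        tree = left if point[axis] < threshold else right
    return tree

def fit_forest(points, labels, n_trees=25, seed=0):
    """Fit each tree on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(points)) for _ in points]
        forest.append(fit_tree([points[i] for i in idx], [labels[i] for i in idx], rng))
    return forest

def predict_forest(forest, point):
    """Majority vote over the trees' predictions."""
    return Counter(predict_tree(t, point) for t in forest).most_common(1)[0][0]

# Hypothetical synthetic data: red below the diagonal x + y = 1, blue above.
data_rng = random.Random(42)
points = [(data_rng.random(), data_rng.random()) for _ in range(200)]
labels = ["red" if x + y < 1 else "blue" for x, y in points]

tree = fit_tree(points, labels)      # one tree: a single set of rectangles
forest = fit_forest(points, labels)  # many trees: overlapping rectangles, voted

print(predict_tree(tree, (0.05, 0.05)), predict_forest(forest, (0.05, 0.05)))
```

Plotting `predict_tree` over a grid would show one staircase of rectangles, while `predict_forest` would blend many such staircases, which is the picture a "random forest partitioning" caption should describe.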

Anyway, as I mentioned, this is fantastic content, and I’d love to share it regardless of how much time you can find to polish the prose. So just let me know when you consider it “done”. And thanks a lot for sharing!


Very cool! The plots made me think of the different channels of an image. Maybe it’s been done before, but do you think there is potential for a (very simple) novel alternative to CNNs using this perspective? If you partition each of the RGB channels of, say, a 20 x 20 pixel image separately using random forests, it feels like there is a way to develop a random forests model that can take the spatial information of each layer and classify the image. Just thinking out loud.

Thanks for the feedback. I fixed the issues that you noted. I tried to note that the data was generated with a specific structure, and that the random forests would be able to identify it. I would call it “done” at this point.


What’s your Twitter handle, so I can give you due credit and people can find you? (If you don’t have one, you should probably create one. Note that people will look up your past tweets, so if you have a Twitter handle that hasn’t been posting data science or coding topics, you may want to create a separate one for that purpose and make it the one that potential employers/clients can find.)


My Twitter handle is @Tyler_V_White. Here’s a link to the tweet.



And here’s my tweet! :smiley:

super cool visualization Tyler!

i retweeted