I wrote a blog post on a visual representation of random forests. This is an alternative perspective on how random forests work.
This is so cool! Since I think it's great, I hope you don't mind if I provide some feedback which I think might make it even better...
Specifically: it would be worth spending some time cleaning up the text, since the content is too amazing to have text with little problems in it... For instance, looking at the first couple of paragraphs:
Random forest are typically described using trees.
- "forest" should be plural. Also, I don't think this sentence is a strong introduction to the post. If there are some folks in class you know to be strong writers, perhaps ask them to help you create a compelling first paragraph?
To explore this representation, lets collect some data
- "lets" should be "let's"
Below we record the location and color of many points. These points are plotted below.
- I found this confusing. What are these points? Are they generated synthetically? Is there some structure they represent? Why are you showing them to us? (I figured out after reading further and looking more closely that they are probably from synthetically generated data designed to have a particular structure)
For the data we have been given, the random forest draws partitions as shown below.
- This actually shows a tree partitioning, not a random forest partitioning.
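To make that last distinction concrete, here is a minimal sketch using only the Python standard library. It is not the code from the blog post: the diagonal red/blue data, the helper names (`build_tree`, `build_forest`, `forest_votes`) and the settings (120 points, depth 3, 15 trees) are all assumptions chosen for illustration. It also only does bagging (bootstrap samples plus voting) and skips the per-split feature subsampling that a full random forest uses. A single tree gives one fixed set of axis-aligned rectangles; the forest overlays many such partitions and lets them vote.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(points, labels):
    """Return the (feature, threshold) that lowers weighted Gini the most, or None."""
    best, best_score, n = None, gini(labels), len(points)
    for f in (0, 1):  # feature 0 is x, feature 1 is y
        values = sorted(set(p[f] for p in points))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # midpoint, so both sides are non-empty
            left = [l for p, l in zip(points, labels) if p[f] <= t]
            right = [l for p, l in zip(points, labels) if p[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(points, labels, depth):
    """Internal nodes are (feature, threshold, left, right); leaves are labels."""
    split = best_split(points, labels) if depth > 0 else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    f, t = split
    left = [(p, l) for p, l in zip(points, labels) if p[f] <= t]
    right = [(p, l) for p, l in zip(points, labels) if p[f] > t]
    return (f, t,
            build_tree([p for p, _ in left], [l for _, l in left], depth - 1),
            build_tree([p for p, _ in right], [l for _, l in right], depth - 1))

def predict_tree(node, p):
    """Walk down the rectangles of one tree until a leaf label is reached."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if p[f] <= t else right
    return node

def build_forest(points, labels, n_trees, depth, rng):
    """Bagging: each tree is trained on a bootstrap sample of the data."""
    n, forest = len(points), []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(build_tree([points[i] for i in idx],
                                 [labels[i] for i in idx], depth))
    return forest

def forest_votes(forest, p):
    """Count how many trees vote for each label at point p."""
    return Counter(predict_tree(tree, p) for tree in forest)

# Hypothetical synthetic data: points in the unit square, red above the diagonal.
rng = random.Random(0)
points = [(rng.random(), rng.random()) for _ in range(120)]
labels = ["red" if x + y > 1 else "blue" for x, y in points]

tree = build_tree(points, labels, depth=3)
forest = build_forest(points, labels, n_trees=15, depth=3, rng=rng)

probe = (0.5, 0.55)  # a point close to the true boundary
print("single tree says:", predict_tree(tree, probe))
print("forest votes:", dict(forest_votes(forest, probe)))
```

Near the boundary the single tree can only give one hard answer, while the forest's vote counts can be split, which is what makes a forest partition plot look different from a single tree's.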
Anyway, as I mentioned, this is fantastic content, and I'd love to share it regardless of how much time you can find to polish the prose. So just let me know when you consider it "done". And thanks a lot for sharing!
Very cool! The plots made me think of the different channels of an image. Maybe it's been done before, but do you think there is potential for a (very simple) novel alternative to CNNs using this perspective? If you partition each of the RGB channels of, say, a 20 x 20 pixel image separately using random forests, it feels like there is a way to develop a random forests model that can take the spatial information of each layer and classify the image. Just thinking out loud.
Thanks for the feedback. I fixed the issues that you noted. I tried to note that the data was generated with a specific structure, and that the random forests would be able to identify it. I would call it "done" at this point.
What's your twitter handle, so I can give you due credit and people can find you? (If you don't have one, you should probably create one - and note that people will look up your past tweets, so if you have a twitter handle that hasn't been posting data science or coding topics, you may want to create a separate one for that purpose and make it the one that potential employers/clients can find).
My twitter handle is @Tyler_V_White. Here's a link to the tweet.
Fantastic!!!
And here's my tweet! x.com
Super cool visualization, Tyler!
I retweeted.