This was my very first machine learning project. The Kaggle competition “Predicting Bike Rental Demand” is about predicting the number of rented city bikes depending on time and weather. I chose this project because, as in the first lesson of “Machine Learning for Coders: Introduction to Random Forests”, the task is to predict a numerical value, and accuracy is measured with the RMSLE (root mean squared logarithmic error). So I could transfer the steps shown there and solve the task with a Random Forest Regressor.
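For readers who have not met the metric before, here is a minimal sketch of how an RMSLE can be computed. It assumes the usual definition with log(1 + x), so that zero rentals are allowed. The function name `rmsle` is my own choice for illustration and is not taken from the lesson or the competition.

```python
import numpy as np


def rmsle(y_true, y_pred):
    """Root mean squared logarithmic error, using log(1 + x)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Compare the values on a log scale, so relative errors count
    # rather than absolute ones.
    log_diff = np.log1p(y_pred) - np.log1p(y_true)
    return float(np.sqrt(np.mean(log_diff ** 2)))
```

Because the error is taken on a log scale, being off by 10 bikes matters a lot when only 5 were rented and very little when 500 were rented.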
At first, due to my inexperience, I split the training set into a training half and a test half as in the lesson, although the competition already provides a separate test set. As a result, I trained on only half of the training data, but I could calculate the score myself thanks to the actual y-values in the held-out half. For a first attempt, I am very satisfied with an RMSLE of about 0.35.
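The split described above can be sketched roughly like this. It is not my original notebook: the data here is synthetic, the column choices (`hour`, `temp`) and all parameter values are assumptions for illustration, and the resulting score has nothing to do with the 0.35 mentioned above.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_log_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the competition data (assumption, not the real set).
rng = np.random.default_rng(0)
n = 1000
hour = rng.integers(0, 24, size=n)
temp = rng.uniform(0, 35, size=n)
X = np.column_stack([hour, temp])
# Non-negative rental counts that peak around midday and on warm days.
y = rng.poisson(20 + (60 - 5 * np.abs(12 - hour)) + 2 * temp)

# Hold out half of the labelled data so a score can be computed locally.
X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.5, random_state=0
)

model = RandomForestRegressor(n_estimators=50, random_state=0)
model.fit(X_train, y_train)
pred = model.predict(X_valid)

# RMSLE is the square root of scikit-learn's mean squared log error.
score = float(np.sqrt(mean_squared_log_error(y_valid, pred)))
print(f"validation RMSLE: {score:.3f}")
```

Holding out a part of the labelled data is what makes a local score possible, at the price of training on less data. A smaller `test_size` would have left more rows for training.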
In the second run, I used the data sets as intended and fixed what were mostly formal errors. However, since the competition has already closed, I will not get a score back for it.
Much more important to me than the result is what I learned in the process:
Be persistent. Talk to your Styrofoam duck, it’s your best friend. Many problems are solved by simply looking at code and data. It’s not witchcraft, just a lot of puzzle work.