Can you comment on real-time applications of Random Forests? In my experience they tend to be too slow for real-time (latency-bound) use cases, like a real recommender system. A neural network is much faster when run on the right hardware.
The only other options I've found that are good from a performance perspective are XGBoost and CatBoost (boosted decision trees).
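To make the latency point concrete, here's a rough, self-contained sketch (not from the posts above) of how one might time single-request prediction for the two kinds of models, using scikit-learn's GradientBoostingClassifier as a stand-in for XGBoost/CatBoost. All sizes and tree counts are illustrative assumptions, not benchmarks.

```python
# Rough latency sketch: time one-row predictions, as a recommender serving
# path would see them. Model sizes here are arbitrary illustrative choices.
import time

from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

models = {
    "RandomForest (300 trees)": RandomForestClassifier(n_estimators=300, random_state=0),
    "GradientBoosting (300 trees)": GradientBoostingClassifier(n_estimators=300, random_state=0),
}

for name, model in models.items():
    model.fit(X, y)
    row = X[:1]  # a single incoming request
    start = time.perf_counter()
    for _ in range(100):
        model.predict(row)
    per_call_ms = (time.perf_counter() - start) / 100 * 1000
    print(f"{name}: ~{per_call_ms:.2f} ms per single-row predict")
```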
Yes, but my training set only has outcomes for sales that either happened or didn't, so basically yes/no. What I'd really like to know is whether all the conditions are very favorable for making this big sale. I guess my outcome variable will have to be yes or no for my test set as well.
I agree. Try all of them. There's an argument I recommend changing in Random Forest; I'm not sure if it's there in XGBoost. Try setting the class_weight argument to "balanced" to deal with the class imbalance. That's what I use. In addition, the F1 score is better for evaluating the model.
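A minimal sketch of that suggestion, assuming a scikit-learn workflow with a binary (sale / no sale) target; the synthetic data below is just a stand-in for your real, imbalanced training set. The predict_proba line at the end is how you'd get a "how favorable are the conditions" score rather than a hard yes/no.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Imbalanced toy data: roughly 5% positives, mimicking a rare "big sale" class.
X, y = make_classification(n_samples=2000, weights=[0.95, 0.05], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

# class_weight="balanced" reweights classes inversely to their frequency.
clf = RandomForestClassifier(n_estimators=200, class_weight="balanced", random_state=0)
clf.fit(X_train, y_train)

# F1 is a more informative summary than accuracy when positives are rare.
print("F1:", f1_score(y_test, clf.predict(X_test)))

# Probability of a sale for each row, usable as a "favorability" score to rank on.
print("P(sale) for first test rows:", clf.predict_proba(X_test[:5])[:, 1])
```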
Have you looked at the current Abstraction and Reasoning Challenge competition on Kaggle, which asks whether a computer can learn complex, abstract tasks from just a few examples? Can you share some thoughts on it?
Jeremy, I heard that you won every Kaggle competition for 5 years straight. Is this true? Do you have any favorite stories of Kaggle competitions you were involved in?
@ilovescience I'm giving it a try and seeing what happens. If it doesn't work, I'll either join the GPU competition or try to get the best of both worlds, i.e., data augmentation with fastai2 and training with TensorFlow.