We’re starting to see some great blog posts now, but I’d love to see more kernels! If you have a kernel that’s getting a few votes, please share it with us too. And if you’ve got a kernel you’d like help improving, share that as well so we can all pitch in with ideas.
Hi! This is the kernel I put up for the Text Normalisation Challenge. Feedback on what I should keep in mind when writing kernels in the future is welcome!
Hi, this is my kernel for understanding transfer learning in the context of the Iceberg challenge. Looking forward to a great discussion on it.
Hi! I played around with XGBoost modeling and parameter tuning for the Porto Seguro competition. Please see my kernel here: https://www.kaggle.com/mashavasilenko/porto-seguro-xgb-modeling-and-parameters-tuning
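For anyone who wants a quick picture of what that involves, here is a minimal sketch (the file layout follows the competition data, but the parameter values are illustrative starting points, not the tuned ones from the kernel):

```python
# A minimal sketch, not the kernel's exact code; parameter values are
# illustrative starting points rather than tuned results.
import pandas as pd
import xgboost as xgb
from sklearn.model_selection import train_test_split

train = pd.read_csv("train.csv")              # Porto Seguro training data
y = train["target"]
X = train.drop(["id", "target"], axis=1)

X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.2, random_state=0)
dtrain = xgb.DMatrix(X_tr, label=y_tr)
dval = xgb.DMatrix(X_val, label=y_val)

params = {
    "objective": "binary:logistic",
    "eval_metric": "auc",
    "eta": 0.05,               # learning rate
    "max_depth": 5,
    "subsample": 0.8,          # row sampling per tree
    "colsample_bytree": 0.8,   # column sampling per tree
}
# early stopping on the validation set picks the number of rounds for us
model = xgb.train(params, dtrain, num_boost_round=1000,
                  evals=[(dval, "val")], early_stopping_rounds=50)
```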
@mvasilenko What was the LB score that you were getting from this?
And you did undersampling to balance the classes. I am not sure how much that would matter, but oversampling is usually recommended instead, since undersampling discards training data.
@groverpr Hi! Yes, I know that, but I was running the notebook on my local machine and oversampling was taking too much time, so I decided to do undersampling first. I did try oversampling for RF, but I wouldn’t say it gave me as much of an advantage as parameter tuning.
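For anyone following this exchange, here is a minimal sketch of the two approaches in plain pandas (`df` and the `target` column name are placeholders, not code from either kernel):

```python
# A minimal sketch of random over/undersampling for a binary target;
# `df` and the "target" column are placeholders, not the kernel's code.
import pandas as pd

def oversample(df, target="target", seed=0):
    pos = df[df[target] == 1]
    neg = df[df[target] == 0]
    # duplicate minority rows (sampling with replacement) until classes match
    pos_up = pos.sample(len(neg), replace=True, random_state=seed)
    return pd.concat([neg, pos_up]).sample(frac=1, random_state=seed)

def undersample(df, target="target", seed=0):
    pos = df[df[target] == 1]
    neg = df[df[target] == 0]
    # drop majority rows until classes match: faster, but discards data
    neg_down = neg.sample(len(pos), random_state=seed)
    return pd.concat([pos, neg_down]).sample(frac=1, random_state=seed)
```

This makes the trade-off concrete: undersampling is cheap because the resulting frame is small, while oversampling keeps all the majority-class information at the cost of a much bigger training set.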
It’s an old one and didn’t gather many votes, but here is what I tried. It could be useful for someone who needs a head start on the Statoil competition.
Here is a Kaggle kernel I wrote for the text normalization competition. It was greatly inspired by two other kernels, one of them being Neerja’s.
Any feedback would be great!
Sharing a basic kernel on data cleaning and merging multiple tables for the new Kaggle competition:
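In case the pattern itself is useful, the cleaning-and-merging step usually boils down to something like this (a minimal pandas sketch; the file names and the `id` key are hypothetical, not the competition’s actual tables):

```python
# Minimal sketch of cleaning plus a multi-table merge; file names and the
# "id" key are hypothetical stand-ins for the competition's real tables.
import pandas as pd

main = pd.read_csv("train.csv")
extra = pd.read_csv("extra_info.csv")

# basic cleaning: drop exact duplicates, fill missing numeric values
main = main.drop_duplicates()
main = main.fillna(main.median(numeric_only=True))

# left join keeps every training row even when the lookup has no match
merged = main.merge(extra, on="id", how="left")
```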
Hi, this is my kernel on Text Normalization. I am hoping to build these functions into packages, so any ideas or suggestions on that would be really welcome. Thanks!
Here’s an EDA kernel I made for the Kaggle survey data. It wasn’t for a competition, but was good practice pulling out a story from the data and communicating insights. https://www.kaggle.com/smcnish71/what-should-job-seekers-do-to-get-a-job
This is terrific - I really like the way you’ve told a story here, and also you’ve presented it nicely. Good use of the pipe operator in R too.
Your observation that job seekers think Matlab is more important than people actually in jobs do was cute…
Here is my kernel: an initial EDA for the Mercari competition.
Here is my kernel for Quora Question Pairs. I am currently using Logistic Regression, and will try Random Forest later.
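For anyone who wants a concrete starting point, a baseline along those lines might look like this (a rough sketch assuming TF-IDF features, which may not be what the kernel actually uses; the column names follow the competition’s train.csv):

```python
# Rough sketch of a Logistic Regression baseline for Quora Question Pairs;
# TF-IDF on the concatenated pair is an assumption, not the kernel's method.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import log_loss

train = pd.read_csv("train.csv")
text = train["question1"].fillna("") + " " + train["question2"].fillna("")

vec = TfidfVectorizer(max_features=50000, ngram_range=(1, 2))
X = vec.fit_transform(text)
y = train["is_duplicate"]

X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.2, random_state=0)
clf = LogisticRegression(max_iter=1000)
clf.fit(X_tr, y_tr)
print("validation log loss:", log_loss(y_val, clf.predict_proba(X_val)))
```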
If I had known this would happen, I wouldn’t have put up my passport photo.
Thanks for sharing! Funnily enough, without knowing about this I had luckily already finished and added Adagrad, RMSprop, and Adam.
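For reference, the three update rules are short enough to sketch in plain numpy (the learning rates and eps are common defaults, not tuned values; the state arrays persist across steps, and t is the 1-based step count):

```python
# Sketches of the Adagrad, RMSprop, and Adam update rules in plain numpy.
# Hyperparameter values are common defaults; state persists across steps.
import numpy as np

def adagrad_step(w, g, cache, lr=0.01, eps=1e-8):
    cache += g ** 2                          # running sum of squared gradients
    w -= lr * g / (np.sqrt(cache) + eps)
    return w, cache

def rmsprop_step(w, g, cache, lr=0.001, decay=0.9, eps=1e-8):
    cache = decay * cache + (1 - decay) * g ** 2   # exponential moving average
    w -= lr * g / (np.sqrt(cache) + eps)
    return w, cache

def adam_step(w, g, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    m = b1 * m + (1 - b1) * g                # first moment (mean of gradients)
    v = b2 * v + (1 - b2) * g ** 2           # second moment (squared gradients)
    m_hat = m / (1 - b1 ** t)                # bias correction for early steps
    v_hat = v / (1 - b2 ** t)
    w -= lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v
```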
For most Kaggle competitions, I have been using some variant of the gradient boosting algorithm without fully understanding it (just knowing how to tune and run the models and take predictions works too :D).
So, to gain a better understanding, I tried to write the code from scratch (on top of the DecisionTree from RF) and also wrote a blog post (a draft, not public yet) explaining what I understood. It could be useful to anyone looking to understand the same thing. Those who already know it, please correct me if I am wrong somewhere.
Kaggle kernel with code - https://www.kaggle.com/grroverpr/gradient-boosting-simplified/
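The core idea fits in a few lines. Here is a minimal sketch for regression with squared-error loss, built on sklearn’s DecisionTreeRegressor (a simplified version of the approach, not the kernel’s exact code):

```python
# Minimal gradient boosting for regression with squared-error loss; for that
# loss the negative gradient is just the residual, so each tree fits y - pred.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

class SimpleGBM:
    def __init__(self, n_estimators=100, lr=0.1, max_depth=3):
        self.n_estimators = n_estimators
        self.lr = lr
        self.max_depth = max_depth
        self.trees = []

    def fit(self, X, y):
        self.base = np.mean(y)                     # start from the mean
        pred = np.full(len(y), self.base)
        for _ in range(self.n_estimators):
            resid = y - pred                       # current residuals
            tree = DecisionTreeRegressor(max_depth=self.max_depth)
            tree.fit(X, resid)
            pred += self.lr * tree.predict(X)      # shrink each tree's step
            self.trees.append(tree)
        return self

    def predict(self, X):
        pred = np.full(X.shape[0], self.base)
        for tree in self.trees:
            pred += self.lr * tree.predict(X)
        return pred
```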
Very cool! I like your approach and it’s really cool that you have an implementation behind it. I do think there’s a lot you can do to improve the writing and explanation - be sure to have a strong writer go through it carefully with you to help with the prose, and think about how to use your code to help with your explanation.