For anyone who is somewhat new to data science, I’d strongly suggest checking out this wonderful new paper, Good Enough Practices in Scientific Computing. All the practices recommended there will save you a lot of time and trouble as you continue to work on your projects over the coming weeks.
It’s also critical that you fully understand the nature of overfitting and underfitting in machine learning - a couple of excellent resources for this are:
Kaggle has some very thoughtfully designed processes to ensure that you avoid (or at least are aware of) overfitting. Take a look at the Kaggle member FAQ to learn about these important tools and processes.
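If it helps to see underfitting and overfitting in numbers, here is a minimal sketch. It is not taken from any of the resources above: the synthetic sine data, the noise level, the polynomial degrees, and the helper names `mse` and `compare_degrees` are all assumptions chosen purely for illustration. It fits polynomials of increasing degree to a small noisy sample and reports the error on the training points next to the error on held-out points the model never saw.

```python
import numpy as np
from numpy.polynomial import Polynomial


def mse(model, x, y):
    """Mean squared error of a fitted polynomial on the points (x, y)."""
    return float(np.mean((model(x) - y) ** 2))


def compare_degrees(degrees=(1, 3, 9), n_train=15, n_holdout=200,
                    noise=0.2, seed=0):
    """Fit one polynomial per degree and return {degree: (train_mse, holdout_mse)}.

    The data is synthetic (a noisy sine wave), an assumption made only so
    the example is self-contained.
    """
    rng = np.random.default_rng(seed)

    def truth(x):
        return np.sin(2 * np.pi * x)

    # Small training sample, larger held-out sample from the same process.
    x_train = rng.uniform(0, 1, n_train)
    x_hold = rng.uniform(0, 1, n_holdout)
    y_train = truth(x_train) + rng.normal(0, noise, n_train)
    y_hold = truth(x_hold) + rng.normal(0, noise, n_holdout)

    results = {}
    for degree in degrees:
        # Least-squares polynomial fit using the training points only.
        model = Polynomial.fit(x_train, y_train, degree)
        results[degree] = (mse(model, x_train, y_train),
                           mse(model, x_hold, y_hold))
    return results


if __name__ == "__main__":
    for degree, (train_err, hold_err) in compare_degrees().items():
        print(f"degree {degree}: train MSE {train_err:.4f}, "
              f"held-out MSE {hold_err:.4f}")
```

The training error can only go down as the degree goes up, so a low training error by itself tells you very little. A model that is too simple tends to do poorly on both sets (underfitting), while a model that is too flexible tends to look great on the training points and noticeably worse on the held-out ones (overfitting). Checking against data you did not fit on is what reveals the difference.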
If you have any other suggested resources that may help the community with general machine learning and data science issues, or have any questions, please post below!