Hi all!
I wrote a blog post about a new package that I created called parfit!
The package is a more flexible alternative to GridSearchCV (also for sklearn machine learning models), developed with speed (parallel processing) and ease of use (one function call to run) in mind, and with built-in visualizations of the results.
The key advantages of this package are as follows:
- Flexibly choose the validation set you wish to score on (no need to cross-validate on the training set; useful for time series data, e.g. the grocery Kaggle competition, and many other real-world problems)
- Flexibly choose your scoring metric (any function that compares two vectors should work, though it is intended for sklearn metrics)
- Optionally (on by default) plot the scores over the grid of hyper-parameters entered (visualize scores that vary over 1-3 parameters)
- Automatically return the best model (with its associated hyper-parameters) and that model's score
- Do all of this with a single function call (or split it up into multiple component function calls if you wish); see the rough usage sketch below
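To make that workflow concrete, here is a minimal sketch of what the single-call usage might look like. The entry point `bestFit` and its keyword arguments (`metric`, `showPlot`) are assumptions based on the feature list above, so please check the package docs for the actual API:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import ParameterGrid, train_test_split
from parfit import bestFit  # assumed entry point; see the package docs

# Toy data with an explicit validation split (no cross-validation needed)
X, y = make_classification(n_samples=1000, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)

# Hyper-parameter grid to search over
grid = ParameterGrid({
    'min_samples_leaf': [1, 5, 10],
    'max_features': ['sqrt', 'log2'],
    'n_estimators': [100],
})

# One call: fit all models in parallel, score each on the validation set,
# plot the scores over the grid, and return the best model and its score
best_model, best_score, all_models, all_scores = bestFit(
    RandomForestClassifier, grid,
    X_train, y_train, X_val, y_val,
    metric=roc_auc_score,  # any sklearn-style metric comparing two vectors
    showPlot=True,         # assumed flag for the built-in score plot
)
print(best_model, best_score)
```

The same steps should also be available as separate component calls (fit, score, plot, pick best) if you want finer control.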
Since this is my first time creating a package and writing about it, I have some humble requests:
I would absolutely love and appreciate it if anyone would be willing to (A) give me some feedback on my first blog post (a 5-minute read) before I publish it,
(B) test out my package and see (i) if it is helpful for you or (ii) if you can break it,
and (C) contribute to the package!
Thank you so much for any input any of you can provide!