Hi all!
I wrote a blog post about a new package that I created called parfit!
The package is a more flexible alternative to GridSearchCV (also for sklearn machine learning models), developed with speed (parallel processing) and ease of use (one function call to run) in mind, and with built-in visualizations of the results.
The key advantages of this package are as follows:
- Flexibly choose the validation set you wish to score on (no need to cross-validate on the training set; useful for time series data, e.g. the grocery Kaggle competition, and many other real-world problems)
- Flexibly choose your scoring metric (any function that compares two vectors should work, though it is intended for sklearn metrics)
- Optionally (on by default) plot the scores over the grid of hyper-parameters entered (visualize scores that vary over 1-3 parameters)
- Automatically return the best model (with its associated hyper-parameters) and that model's score
- Do all of this with a single function call (or split it up into multiple component function calls if you wish); see the rough usage sketch below
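To make that workflow concrete, here is a minimal sketch of what the single-call usage might look like. The entry point `bestFit` and its keyword arguments (`metric`, `showPlot`) are assumptions based on the feature list above, so please check the package docs for the actual API:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import ParameterGrid, train_test_split
from parfit import bestFit  # assumed entry point; see the package docs

# Toy data with an explicit validation split (no cross-validation needed)
X, y = make_classification(n_samples=1000, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)

# Hyper-parameter grid to search over
grid = ParameterGrid({
    'min_samples_leaf': [1, 5, 10],
    'max_features': ['sqrt', 'log2'],
    'n_estimators': [100],
})

# One call: fit all models in parallel, score each on the validation set,
# plot the scores over the grid, and return the best model and its score
best_model, best_score, all_models, all_scores = bestFit(
    RandomForestClassifier, grid,
    X_train, y_train, X_val, y_val,
    metric=roc_auc_score,  # any sklearn-style metric comparing two vectors
    showPlot=True,         # assumed flag for the built-in score plot
)
print(best_model, best_score)
```

The same steps should also be available as separate component calls (fit, score, plot, pick best) if you want finer control.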
Since this is my first time creating a package and writing about it, I have some humble requests:
I would absolutely love and appreciate it if anyone would be willing to (A) give me some feedback on my first blog post (a 5-minute read) before I publish it,
(B) test out my package and see (i) if it is helpful for you or (ii) if you can break it,
and (C) contribute to the package!
Thank you so much for any input any of you can provide!