Tuning hyperparameters in NLP with a small dataset

I’m working with a small dataset on a multi-class problem, and I want to know how best to find the right bptt, batch size, etc. for this specific project. I’ve seen this question asked a couple of times in other threads but haven’t seen any answers.

Thanks

Random search is usually a good option for finding good hyperparameters. How to do that in practice? I don’t know yet :slight_smile:
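
For what it's worth, here is a minimal sketch of what random search could look like in plain Python. Everything in it is an assumption for illustration: the candidate values for bptt and batch size, the learning-rate range, and especially `dummy_evaluate`, which is a made-up stand-in for "train the model with this config and return a validation score". With a small dataset the validation score will be noisy, so it may be worth averaging over a few seeds or folds inside that function.

```python
import math
import random

# Hypothetical search space: these names and values are illustrative only.
SEARCH_SPACE = {
    "bptt": [35, 50, 70, 100],
    "batch_size": [8, 16, 32, 64],
}
LR_RANGE = (1e-4, 1e-1)  # learning rate, sampled log-uniformly


def sample_config(rng):
    """Draw one random configuration from the search space."""
    config = {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}
    low, high = LR_RANGE
    # Sample the exponent uniformly so every order of magnitude is equally likely.
    config["lr"] = 10 ** rng.uniform(math.log10(low), math.log10(high))
    return config


def random_search(evaluate, n_trials=20, seed=0):
    """Evaluate n_trials random configs and return the best (config, score)."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(n_trials):
        config = sample_config(rng)
        score = evaluate(config)  # higher is better
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


def dummy_evaluate(config):
    """Placeholder objective. Replace with real training + validation scoring."""
    return -abs(config["bptt"] - 70) / 100 - abs(math.log10(config["lr"]) + 2)


if __name__ == "__main__":
    best_config, best_score = random_search(dummy_evaluate, n_trials=20)
    print(best_config, best_score)
```

To use it on a real project, you would swap `dummy_evaluate` for a function that trains with the given config and returns the validation metric, and adjust the search space to the ranges that make sense for your model.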

This looks pretty cool: https://ray.readthedocs.io/en/latest/tune.html. I will try it myself too…