Has anyone tested BoTorch and Ax with fastai?

I recently heard that Facebook released two new tools based on PyTorch. Do these tools overlap with fastai’s role? If you have tested them, I would like to know whether you think they are worth investing time in.


I’d be interested in combining Ax and BoTorch with fastai as well.

To answer some of the questions:

  • BoTorch implements a low-level Bayesian optimization framework (suitable for people familiar with Bayesian optimization)
  • Ax implements a high-level optimization framework (suitable for anybody who has something to optimize)

While they are built on PyTorch (which is used to run the algorithms on GPU and compute gradients), these tools deal with the optimization of problems with no known gradient, so they do not overlap with fastai.

However, they can be used to optimize the hyperparameters of a neural network. You would just list the hyperparameters and their potential values, and define your objective as the accuracy (or error) after 5 or 20 epochs.
Ax would then train your model for 5 (or 20) epochs many times, trying various parameter values in order to find a sweet spot. That is a slow process, but I would be curious to see someone try it on Imagewoof.
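
To make the "list the parameters, define a cost function, evaluate many times" loop concrete without depending on Ax, here is a minimal sketch. Note that it swaps in plain random search, not the Bayesian optimization that Ax and BoTorch actually perform, and `toy_cost` is a hypothetical stand-in for "train for a few epochs and return the validation error". The parameter dictionaries simply mimic the `name`/`type`/`bounds` layout used with Ax.

```python
import math
import random

def sample(parameters, rng):
    """Draw one configuration from Ax-style parameter dicts (uniform random sampling)."""
    config = {}
    for p in parameters:
        if p["type"] == "fixed":
            config[p["name"]] = p["value"]
        elif p["type"] == "choice":
            config[p["name"]] = rng.choice(p["values"])
        elif p["type"] == "range":
            low, high = p["bounds"]
            if p.get("log_scale", False):
                # sample uniformly in log space, then clamp against rounding error
                value = math.exp(rng.uniform(math.log(low), math.log(high)))
                value = min(max(value, low), high)
            else:
                value = rng.uniform(low, high)
            config[p["name"]] = value
        else:
            raise ValueError(f"unknown parameter type: {p['type']}")
    return config

def random_search(parameters, evaluation_function, total_trials=20, seed=0):
    """Evaluate `total_trials` random configurations and keep the one with the lowest cost."""
    rng = random.Random(seed)
    best_config, best_value = None, math.inf
    for _ in range(total_trials):
        config = sample(parameters, rng)
        value = evaluation_function(config)  # black box: no gradient needed
        if value < best_value:
            best_config, best_value = config, value
    return best_config, best_value

def toy_cost(config):
    """Hypothetical cost: lowest near learning_rate = 1e-2, small penalty without batch norm."""
    penalty = 0.0 if config["use_bn"] else 0.1
    return (math.log10(config["learning_rate"]) + 2) ** 2 + penalty

search_space = [
    {"name": "learning_rate", "type": "range", "bounds": [1e-5, 0.5], "log_scale": True},
    {"name": "use_bn", "type": "choice", "values": [True, False]},
]
best_config, best_value = random_search(search_space, toy_cost, total_trials=50)
print(best_config, best_value)
```

A Bayesian optimizer replaces the blind `sample` step with a model of the cost surface that proposes the next configuration to try, which matters when each evaluation means training a network.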

OK, I just tried Ax on a personal dataset and it worked great.

They have a small tutorial on using their code to find hyperparameters for a PyTorch model, and it is trivial to translate to fastai.

It was fairly easy to optimize boolean parameters (should I use batch norm?), numerical parameters (which learning rate to use?) and more general factors (which loss function to use?):

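# imports (assumes fastai v1; `data` is an existing tabular DataBunch)
from ax import optimize
from fastai.tabular import *
from fastai.utils.mod_display import progress_disabled_ctx

# the list of all parameters that will be optimized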
dropout_proba = {"name": "dropout_proba", "type": "range", "bounds": [0.0, 1.0]}
weight_decay = {"name": "weight_decay", "type": "range", "bounds": [1e-6, 1.0], "log_scale": True}
learning_rate = {"name": "learning_rate", "type": "range", "bounds": [1e-5, 0.5], "log_scale": True}
use_bn =  {"name": "use_bn", "type": "fixed", "value": True}
#loss_func = {"name": "loss_func", "type": "choice", "values": ["mae", "rmse"]}
loss_func = {"name": "loss_func", "type": "fixed", "value": "mae"} # not optimized as I want mae
parameters=[dropout_proba, weight_decay, learning_rate, use_bn, loss_func]

def evaluation_function(parameters):
    "the function that will be minimized during the optimization"
    # gets the hyperparameters for the trial
    dropout_proba = parameters["dropout_proba"]
    weight_decay = parameters["weight_decay"]
    use_bn = parameters["use_bn"]
    loss_func = mae if parameters["loss_func"] == "mae" else rmse
    learning_rate = parameters["learning_rate"]
    # builds a model with the given parameters
    learn = tabular_learner(data, layers=[200, 100], ps=dropout_proba, wd=weight_decay, use_bn=use_bn, loss_func=loss_func)
    # puts validation set away to avoid wasting time on it during training
    validation_set = learn.data.valid_dl
    learn.data.valid_dl = None
    # trains without displaying a progress bar
    with progress_disabled_ctx(learn) as learn: learn.fit_one_cycle(10, max_lr=learning_rate)
    # computes the error on the validation set
    learn.data.valid_dl = validation_set
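    # validate() returns [loss, *metrics]; take the mae metric so the score does not depend on loss_func
    error = float(learn.validate(metrics=[mae])[1])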
    print("mae:", error)
    return error

best_parameters, best_values, experiment, model = optimize(parameters=parameters, evaluation_function=evaluation_function, minimize=True, total_trials=20)

As I used a tabular model I also tried to use it for feature selection and, while getting optimal results took some tweaking of the default strategy, I’m happy with its decisions :slight_smile:
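
The post above does not say how the feature selection was set up. One possible encoding, shown here purely as an assumption, is to give each candidate column its own boolean choice parameter and rebuild the data from the columns a trial switches on. The helper names and the column names are hypothetical.

```python
def feature_selection_parameters(columns):
    """Build one boolean Ax-style choice parameter per candidate column."""
    return [
        {"name": f"use_{col}", "type": "choice", "values": [True, False]}
        for col in columns
    ]

def selected_columns(parameters, columns):
    """Return the columns whose flag is True in a trial's parameter dict."""
    return [col for col in columns if parameters[f"use_{col}"]]

candidate_columns = ["age", "income", "zip_code"]  # hypothetical column names
space = feature_selection_parameters(candidate_columns)

# what a single trial might propose
trial = {"use_age": True, "use_income": False, "use_zip_code": True}
print(selected_columns(trial, candidate_columns))  # ['age', 'zip_code']
```

Inside an evaluation function, the selected columns would be used to rebuild the tabular data before creating the learner. With many columns the search space grows quickly, which may be why the default strategy needed tweaking.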


@nestorDemeure Have you tried Optuna? Do you have any thoughts? I find it very powerful and easy to use.

Before choosing Optuna, I read the Ax and Optuna docs and came away with the impression that Optuna is easier. So I ended up learning Optuna instead of Ax (Ax was the best option I found after Optuna).

I have not tested it, but the API does look fairly straightforward, which is nice. I would be curious to see how the two compare in terms of results.