Hi guys,

I trained a tabular model, first using just fit_one_cycle.

```
learn = tabular_learner(dls, metrics=[accuracy])
learn.fit_one_cycle(2, cbs=[EarlyStoppingCallback(monitor='accuracy', min_delta=0.01, patience=2)])
```

Then I ran Bayesian optimization like this:

```
def fit_with(lr:float, wd:float, dp:float, n_layers:float, layer_1:float, layer_2:float, layer_3:float):
    print(lr, wd, dp)
    # The optimizer passes floats, so truncate them to get the layer sizes
    if int(n_layers) == 2:
        layers = [int(layer_1), int(layer_2)]
    elif int(n_layers) == 3:
        layers = [int(layer_1), int(layer_2), int(layer_3)]
    else:
        layers = [int(layer_1)]
    # Note: wd is passed as the layer dropout (ps) here, not as weight decay
    config = tabular_config(embed_p=float(dp),
                            ps=float(wd))
    learn = tabular_learner(dls, layers=layers, metrics=accuracy, config=config)
    # A comma enters both context managers; `and` would only enter the second one
    with learn.no_bar(), learn.no_logging():
        learn.fit(5, lr=float(lr))
    # validate() returns [loss, accuracy]
    acc = float(learn.validate()[1])
    return acc
# Search bounds for each hyperparameter
hps = {'lr': (1e-15, 1e-01),
       'wd': (4e-4, 0.4),
       'dp': (0.01, 0.5),
       'n_layers': (1,3),
       'layer_1': (50, 200),
       'layer_2': (100, 1000),
       'layer_3': (200, 2000)}
optim = BayesianOptimization(
    f = fit_with, # our fit function
    pbounds = hps, # our hyper parameters to tune
    verbose = 2, # 1 prints out when a maximum is observed, 0 for silent
    random_state=RANDOM_SEED
)
optim.maximize(n_iter=10, init_points=5)
opt = optim.max["params"]
layers = [int(opt["layer_1"]), int(opt["layer_2"]), int(opt["layer_3"])]
learn = tabular_learner(dls, layers=layers, metrics=[accuracy])
learn.fit(2, lr=float(opt["lr"]), config=opt, cbs=[EarlyStoppingCallback(monitor='accuracy', min_delta=0.01, patience=2)])
```
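For reference, here is a minimal sketch of how the tuned values could be decoded into a layer list the same way `fit_with` does it, so that the final model has the architecture the optimizer actually scored. The helper name `layers_from_params` and the example values are hypothetical and not part of the run above; the sketch only mirrors the `n_layers` branching and does not touch the learner or the dropout settings.

```python
def layers_from_params(params: dict) -> list:
    """Decode optimizer output into a layer list, mirroring fit_with."""
    n_layers = int(params["n_layers"])
    sizes = [int(params["layer_1"]), int(params["layer_2"]), int(params["layer_3"])]
    # fit_with builds 2 or 3 layers only when n_layers truncates to 2 or 3,
    # and falls back to a single layer otherwise
    if n_layers in (2, 3):
        return sizes[:n_layers]
    return sizes[:1]


# Made-up values in the same shape as optim.max["params"]
example = {"n_layers": 2.7, "layer_1": 120.4, "layer_2": 512.9, "layer_3": 1500.0}
print(layers_from_params(example))  # [120, 512]
```

With something like this, `layers_from_params(opt)` could replace the hard-coded three-element list, assuming the goal is to retrain the exact configuration that was evaluated during the search.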

However, the results are much worse than in the first attempt, which used only fit_one_cycle.

What could be the problem?

Thanks a lot!