How does metric optimization work during training?

Hello. I need a bit of clarification on how training a model works with respect to the metrics.

When you’re training the learner, what does it optimize for? Is it trying to minimize the valid_loss, or maximize/minimize the metric you’ve passed?

I would assume it’s trying to optimize the metric you’ve passed; in that case, if you want to pass multiple metrics to observe but only want it to optimize for one, how would you specify that?

If it’s just trying to minimize the valid_loss, then are metrics effectively just views used to interpret the model’s performance, without affecting the training process? It could just be that I had a weird experience, but interestingly, when I passed accuracy and AUROC as metrics the model maxed out at 87.5% accuracy, whereas when I passed FBeta it maxed out at 95% accuracy.
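
For context, the kind of setup I’m asking about looks roughly like this (a minimal sketch with synthetic placeholder data and a toy model, not my actual code; the metric names assume a recent fastai version). Several metrics are passed purely for per-epoch reporting, and only `loss_func` is what gets minimized:

```python
import torch
from torch.utils.data import TensorDataset
from fastai.data.core import DataLoaders
from fastai.learner import Learner
from fastai.losses import CrossEntropyLossFlat
from fastai.metrics import accuracy, RocAucBinary, FBeta

# Placeholder synthetic binary-classification data, just for illustration
X = torch.randn(400, 10)
y = (X.sum(dim=1) > 0).long()
dls = DataLoaders.from_dsets(TensorDataset(X[:320], y[:320]),
                             TensorDataset(X[320:], y[320:]), bs=32)

# Toy model standing in for whatever architecture is actually used
model = torch.nn.Sequential(torch.nn.Linear(10, 32), torch.nn.ReLU(),
                            torch.nn.Linear(32, 2))

# Metrics are only reported each epoch; loss_func is what training minimizes
learn = Learner(dls, model,
                loss_func=CrossEntropyLossFlat(),
                metrics=[accuracy, RocAucBinary(), FBeta(beta=1)])
learn.fit(3)
```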

Metrics = human-readable indicators of model performance; no weight updates (backpropagation, etc.) are driven by them.
Loss function = what the model actually minimizes to adjust its weights; this is what dictates backpropagation.
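
To make the distinction concrete, here is a minimal plain-PyTorch sketch (not fastai’s internal code): gradients flow only from the loss, while a metric such as accuracy is computed under `torch.no_grad()` purely for reporting.

```python
import torch

model = torch.nn.Linear(10, 2)                      # toy model
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_func = torch.nn.CrossEntropyLoss()

xb, yb = torch.randn(32, 10), torch.randint(0, 2, (32,))

preds = model(xb)
loss = loss_func(preds, yb)   # loss: gradients flow from here...
loss.backward()               # ...and drive the weight update
opt.step()
opt.zero_grad()

with torch.no_grad():         # metric: computed only to report progress
    acc = (preds.argmax(dim=1) == yb).float().mean()
print(f"loss={loss.item():.3f}  accuracy={acc.item():.3f}")
```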

If you passed FBeta as a loss function, that’s understandable. If it was a metric instead (i.e. accuracy and FBeta), it could have just been some interesting training, especially if you are doing anything randomized (splits, augmentation, etc.) and not setting the seeds for reproducibility.
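
If you want to rule out that randomness, this is the kind of seeding I mean (a sketch assuming fastai’s `set_seed`; the seed value is arbitrary):

```python
from fastai.torch_core import set_seed

# Fix the Python, NumPy and PyTorch RNGs (and cuDNN determinism) before
# creating the splits, DataLoaders and model, so runs are comparable
set_seed(42, reproducible=True)
```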

In my case, the sklearn evaluation results of the trained model vary significantly depending on the metric I pass during training, e.g. F1Score.

Of course only the loss function affects gradient descent, but I don’t know whether there is another callback, like a model checkpointer, that uses the metric to select the trained model in order to prevent overfitting?
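
To make concrete what I mean by a checkpointer: something like fastai’s `SaveModelCallback`, where the monitored quantity can be the loss or one of the metrics (a sketch; I’m assuming the monitor string has to match the column name printed in the training log, e.g. `'f1_score'`):

```python
from fastai.callback.tracker import SaveModelCallback

# Default: keep the weights from the epoch with the lowest valid_loss
cb_loss = SaveModelCallback(monitor='valid_loss')

# Alternative: keep the weights from the epoch with the best metric value
cb_f1 = SaveModelCallback(monitor='f1_score')

# learn.fit(10, cbs=[cb_f1])   # only with such a callback would the metric
#                              # influence which weights you end up with
```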

I compared BalancedAccuracy and Accuracy: the recall on the weak class is much higher when using Accuracy (with an InceptionTime model).

Any idea why this happens?

P.S.: This is actually contrary to the intuition that F1 and BalancedAccuracy would focus more on balancing the weak classes.
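
For reference, this is roughly how I check the per-class recall independently of whichever metric was passed during training (a sketch; `learn` stands for the trained Learner):

```python
from sklearn.metrics import classification_report

# Predictions on the validation set; the metric passed to the Learner
# plays no role here, only the trained weights do
preds, targs = learn.get_preds()
y_pred = preds.argmax(dim=1).numpy()
y_true = targs.numpy()

print(classification_report(y_true, y_pred, digits=3))
```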