
# F1 Score as metric

**ajan1019**(Azarudeen) #22

@wyquek I tried to use `Fbeta_binary` and got the error below:

`NameError: name 'clas' is not defined`

**wyquek**(魏璎珞) #23

try this

```
# assumes fastai v1; `Callback` comes in with `from fastai.text import *`,
# or explicitly: from fastai.callback import Callback
from dataclasses import dataclass

@dataclass
class Fbeta_binary(Callback):
    "Computes the fbeta between preds and targets for single-label classification"
    beta2: int = 2      # this is beta (it gets squared below); beta2=1 gives plain F1
    eps: float = 1e-9   # guards against division by zero
    clas: int = 1       # the class treated as "positive"

    def on_epoch_begin(self, **kwargs):
        # reset the running counts at the start of every epoch
        self.TP = 0
        self.total_y_pred = 0
        self.total_y_true = 0

    def on_batch_end(self, last_output, last_target, **kwargs):
        y_pred = last_output.argmax(dim=1)
        y_true = last_target.float()
        self.TP += ((y_pred == self.clas) * (y_true == self.clas)).float().sum()
        self.total_y_pred += (y_pred == self.clas).float().sum()
        self.total_y_true += (y_true == self.clas).float().sum()

    def on_epoch_end(self, **kwargs):
        beta2 = self.beta2 ** 2
        prec = self.TP / (self.total_y_pred + self.eps)
        rec = self.TP / (self.total_y_true + self.eps)
        # F_beta = (1 + beta^2) * P * R / (beta^2 * P + R)
        res = (prec * rec) / (prec * beta2 + rec + self.eps) * (1 + beta2)
        self.metric = res
```

If you want F1 for `label 1`:

```
learn = text_classifier_learner(data_clas, drop_mult=0.5)
learn.load_encoder('fine_tuned_enc')
learn.metrics=[accuracy, Fbeta_binary(beta2=1,clas = 1)]
```

Or if you want F1 for `label 0`:

```
learn = text_classifier_learner(data_clas, drop_mult=0.5)
learn.load_encoder('fine_tuned_enc')
learn.metrics=[accuracy, Fbeta_binary(beta2=1,clas = 0)]
```

Or if you want F1 for both label 1 and label 0:

```
learn = text_classifier_learner(data_clas, drop_mult=0.5)
learn.load_encoder('fine_tuned_enc')
f1_label1 = Fbeta_binary(beta2=1, clas=1)
f1_label0 = Fbeta_binary(beta2=1, clas=0)
learn.metrics = [accuracy, f1_label1, f1_label0]
```

Here’s a notebook example.

I think there are lots of metrics, such as the ones mentioned in this PR, that forum members could help fastai build, but they most probably have to be written as callbacks; the general shape is sketched below.
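A minimal sketch of that pattern, using the same fastai v1 hooks as the class above (the `AvgAbsError` name is made up purely for illustration):

```
from dataclasses import dataclass
from fastai.callback import Callback  # fastai v1-era import; adjust to your version

@dataclass
class AvgAbsError(Callback):
    "Sketch: accumulate a statistic over batches, reduce it at epoch end."
    def on_epoch_begin(self, **kwargs):
        self.total, self.count = 0.0, 0                 # reset every epoch
    def on_batch_end(self, last_output, last_target, **kwargs):
        preds = last_output.argmax(dim=1).float()
        self.total += (preds - last_target.float()).abs().sum().item()
        self.count += last_target.size(0)
    def on_epoch_end(self, **kwargs):
        self.metric = self.total / max(self.count, 1)   # picked up by the recorder
```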

**nikhil_no_1**(Nikhil Utane) #25

Basic question: how does loss minimization happen in the case of multiple metrics?

I am working on a text classification problem which shows high accuracy but does not predict correctly when I run inference.

I suspect it is because I am using accuracy (the default) as the metric. I have now changed to Fbeta_binary and am checking.

**wyquek**(魏璎珞) #26

My understanding is that minimizing the loss leads to a better metric, but only up to a point; beyond that, the metric starts to get worse as the NN overfits. It's possible that for one metric, say F1, the over-fitting starts at epoch 25, while for another, say accuracy, it starts at epoch 15. So you can't train the NN such that both metrics are at their best, except by coincidence. Most likely you have to choose one.
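One practical consequence: if you care about a particular metric, checkpoint on that metric rather than on the loss. A sketch with fastai v1's `SaveModelCallback` (the `monitor` string must match one of your metric column names; the numbers here are placeholders):

```
from fastai.callbacks import SaveModelCallback  # fastai v1

learn.fit_one_cycle(30, 2e-2,
                    callbacks=[SaveModelCallback(learn, every='improvement',
                                                 monitor='accuracy', name='best')])
learn.load('best')  # reload the weights from the best epoch for that metric
```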

**nikhil_no_1**(Nikhil Utane) #27

Is it just the number of epochs and over-fitting? Because that would mean the training happens in exactly the same way regardless of which metric is used.

My understanding is that if a different metric function is used, the calculated loss will be different, which would lead to a completely different path during training. Isn't that the case?

BTW, I used your approach. It works, but I don't see any output. I went back to having only accuracy, and even that didn't display the valid_loss and accuracy values which I get when I don't set `learn.metrics`:

```
learn = text_classifier_learner(data_clas, drop_mult=0.5)
learn.load_encoder('fine_tuned_enc')
f1_label1 = Fbeta_binary(beta2=1, clas=1)
f1_label0 = Fbeta_binary(beta2=1, clas=0)
learn.metrics = [accuracy]
#learn.metrics = [f1_label1, f1_label0]
learn.freeze()
learn.fit_one_cycle(1, 2e-2, moms=(0.8,0.7))

Total time: 09:27
epoch  train_loss  valid_loss  accuracy
1      0.145246
```

Any idea? Thanks.

**wyquek**(魏璎珞) #28

**Isn’t that the case?**

Yes, that’s what I meant as well.

**Any idea? Thanks.**

Hmm, weird. Not sure what's happening, to be honest.

**nikhil_no_1**(Nikhil Utane) #31

I used the IMDB notebook as a reference, which has the same sequence. It worked earlier, until I tried to set the metric to Fbeta_binary. And for some reason it is not working even after reverting and restarting the kernel.

**AbuFadl**(Abu Fadl) #32

That notebook uses the same variable for the LM and classifier learners, so be careful you are not mixing them up. Maybe clear the models and tmp directories; they persist after a kernel restart, but not after "Reset all runtimes" (on Colab, at least). A way to remove them programmatically is sketched below.
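Something along these lines should clear them (the paths are assumptions; adjust them to wherever your learner actually saves):

```
import shutil

# both directories are recreated on demand by fastai, so deleting is safe
shutil.rmtree('models', ignore_errors=True)   # saved model weights
shutil.rmtree('tmp', ignore_errors=True)      # cached tokenization artifacts
```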

**nikhil_no_1**(Nikhil Utane) #33

It doesn't anymore; perhaps that was an issue earlier.

This is my notebook.

This is my first attempt at a Kaggle competition, so I just want it to work first (I will adhere to the rules of the competition later).

I saw in another notebook that the threshold value is ~0.33. Perhaps everything has worked well and I just need to use a different threshold value when predicting?

```
IN: learn.predict("Why are men selective?")
OUT: (Category 0, tensor(0), tensor([0.6202, 0.3798]))
```
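If that is the case, one way to apply a custom cutoff instead of the default argmax is to threshold the returned probabilities yourself (a sketch; 0.33 is the value mentioned above and should really be tuned on the validation set):

```
# learn.predict returns (category, class index, class probabilities)
cat, idx, probs = learn.predict("Why are men selective?")
threshold = 0.33
pred = 1 if probs[1].item() > threshold else 0  # call it class 1 above the cutoff
```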

**GDB**(Gary Biggs) #34

I think I'm missing something significant in this thread. I wanted to surface an F1 metric for my NLP binary classifier. I fiddled with `Fbeta_binary`, but it didn't work on the first try. I was going to invest some real effort in figuring out how to implement it in my notebook but, just for chuckles and grins, I decided to try this:

```
learn = RNN_Learner(md, TextModel(to_gpu(m)), opt_fn=opt_fn)
learn.reg_fn = partial(seq2seq_reg, alpha=2, beta=1)
learn.clip=.25
learn.metrics = [accuracy, f1]
```

Results:

```
epoch trn_loss val_loss accuracy f1
14 0.211136 0.232183 0.912444 0.857092
```

These results were from a binary classifier I built using the `News Headlines Dataset For Sarcasm Detection` Kaggle dataset and the `fwd_wt103.h5` pre-trained model.
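Note that the snippet above assumes `f1` is already in scope; in the old (pre-v1) fastai it presumably came from the metrics module, something like:

```
from fastai.metrics import f1  # old fastai (0.7-era); an assumption based on this thread
```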

**AbuFadl**(Abu Fadl) #35

I believe the current fastai (v1.0.x) does not have f1, though it existed in the old fastai.

**GDB**(Gary Biggs) #36

Thanks, Abu. It seems odd that such a common metric was not carried over from the old version to the latest.

**sgugger**#37

FYI, SvenBecker added a lot of new metrics and in passing renamed `Fbeta_binary` to `FBeta` (there are more options than just binary). This will be in v1.0.39 and onward.

How do we get f1 scores for our validation set?
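With v1.0.39 and later, something like the following should report F1 on the validation set (a sketch based on the defaults quoted in the next post; `beta=1` turns F-beta into plain F1):

```
from fastai.metrics import FBeta

f1 = FBeta(beta=1, n_classes=2, average='binary')
learn.metrics = [accuracy, f1]
learn.validate()   # returns [valid_loss, accuracy, f1] on the validation set
```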

**AbuFadl**(Abu Fadl) #40

I see `RuntimeError: Expected object of backend CUDA but got backend CPU for argument #2 'other'` when using `FBeta` (fastai 1.0.39) on a Colab GPU. Is this a bug?
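For context, this PyTorch error generally just means two tensors in a single op ended up on different devices; a minimal reproduction, unrelated to fastai internals:

```
import torch

a = torch.zeros(3).cuda()  # lives on the GPU
b = torch.zeros(3)         # lives on the CPU
a == b  # RuntimeError: Expected object of backend CUDA but got backend CPU ...
```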

**SBecker**(Sven Becker) #41

Not sure about this. Did it work with the old Fbeta_binary? Can you link the notebook/code?

Currently the calculation of FBeta, as well as some other metrics, relies on the computation of the confusion matrix (it makes it easier to perform different averaging approaches). To initialize the matrix, the number of classes has to be specified. The defaults are `FBeta(beta=2, n_classes=2, average="binary")`, where the *average* values are in line with the ones used by *sklearn* (except for 'samples').
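To illustrate how the different averaging modes fall out of the confusion matrix (a plain-PyTorch sketch of the idea, not the library's actual code):

```
import torch

def fbeta_from_cm(cm, beta=2, average='binary', eps=1e-9):
    "cm[i, j] = number of samples with true class i predicted as class j."
    tp = cm.diag().float()
    prec = tp / (cm.sum(dim=0).float() + eps)  # column sums = predicted counts
    rec = tp / (cm.sum(dim=1).float() + eps)   # row sums = actual counts
    b2 = beta ** 2
    f = (1 + b2) * prec * rec / (b2 * prec + rec + eps)
    if average == 'binary':
        return f[1]        # score for the positive class only
    if average == 'macro':
        return f.mean()    # unweighted mean over classes
    if average == 'micro':
        # for single-label tasks, micro precision == recall == F-beta == accuracy
        return tp.sum() / (cm.sum().float() + eps)
    raise ValueError(average)

cm = torch.tensor([[50, 10], [5, 35]])  # toy 2-class confusion matrix
print(fbeta_from_cm(cm, beta=2, average='binary'))
```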