F1 Score as metric

That notebook uses the same variable for the language-model learner and the classifier learner, so be careful you are not mixing them up. You might also clear the models and tmp directories (they persist after a kernel restart, but not after "Reset all runtimes", on Colab at least).
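If it helps, here is a minimal cleanup sketch, assuming the default fastai v1 layout where the learner saves under a data folder path with models and tmp subfolders (path is a placeholder; point it at your own data folder):

from pathlib import Path
import shutil

path = Path('data/my_dataset')  # hypothetical data folder
# Remove cached models and tokenization artifacts so stale files from a
# previous run cannot leak into the next one.
shutil.rmtree(path/'models', ignore_errors=True)
shutil.rmtree(path/'tmp', ignore_errors=True)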

It isn't anymore; perhaps that was an issue earlier.
This is my notebook.
This is my first attempt at a Kaggle competition, so I just want it to work first (I will adhere to the competition rules later).
I saw in another notebook that the threshold value is ~0.33. Perhaps everything has worked well and I just need to use a different threshold value when predicting?

IN: learn.predict("Why are men selective?")
OUT: (Category 0, tensor(0), tensor([0.6202, 0.3798]))
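Something like this is what I have in mind, a sketch assuming the class order is [0, 1] as in the output above and that 0.33 is the threshold to try:

cat, idx, probs = learn.predict("Why are men selective?")
# probs is e.g. tensor([0.6202, 0.3798]); predict class 1 whenever its
# probability clears the threshold instead of taking the argmax.
pred = 1 if probs[1] > 0.33 else 0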

I think I’m missing something significant in this thread. I wanted to surface an F1 metric for my NLP binary classifier. I fiddled with Fbeta_binary but it didn’t work on the first try. I was going to invest some real effort into figuring out how to implement it in my notebook but, just for chuckles and grins, I decided to try this:

learn = RNN_Learner(md, TextModel(to_gpu(m)), opt_fn=opt_fn)
learn.reg_fn = partial(seq2seq_reg, alpha=2, beta=1)
learn.clip=.25
learn.metrics = [accuracy, f1]

Results:

epoch      trn_loss   val_loss   accuracy   f1
14         0.211136    0.232183   0.912444   0.857092

These results were from a binary classifier I built using the News Headlines Dataset For Sarcasm Detection Kaggle dataset and the fwd_wt103.h5 pre-trained model.

1 Like

I believe the current fastai (v1.0.x) does not have f1, but it existed in the old fastai.

Thanks, Abu. It seems odd that such a common metric was not carried over from the old version to the latest.

FYI, SvenBecker added a lot of new metrics and, in passing, renamed Fbeta_binary to FBeta (there are more options than just binary). This will be in v1.0.39 and onward.

3 Likes

Great! So, in the next version (1.0.39+), the ‘normal’ F1 = FBeta()?

More like FBeta(beta=1) since the default for beta is 2.
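Something like this, assuming you already have a fastai v1 Learner called learn (the metric can also be passed via the metrics argument of the learner constructor):

from fastai.metrics import accuracy, FBeta

f1 = FBeta(beta=1)              # F1 is just F-beta with beta=1
learn.metrics = [accuracy, f1]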

2 Likes

I see RuntimeError: Expected object of backend CUDA but got backend CPU for argument #2 'other' when using FBeta (fastai 1.0.39) on colab gpu. Is this a bug?

Not sure about this. Did it work with the old Fbeta_binary? Can you link the notebook/code?

Currently the calculation of FBeta, as well as some other metrics, relies on the computation of the confusion matrix (it makes it easier to perform different averaging approaches). To initialize the matrix, the number of classes has to be specified. The defaults are FBeta(beta=2, n_classes=2, average="binary"), where the average values are in line with the ones used by sklearn (except for ‘samples’).

Ref.: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.f1_score.html#sklearn.metrics.f1_score
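To make the relationship concrete, here is a standalone sketch (not the library code) of how the binary F-beta score falls out of the confusion-matrix counts; preds and targs are assumed to be 1-D tensors of predicted and true class indices:

import torch

def fbeta_binary_sketch(preds, targs, beta=2, pos_label=1, eps=1e-9):
    # Count the confusion-matrix cells that matter for the positive class.
    tp = ((preds == pos_label) & (targs == pos_label)).sum().float()
    fp = ((preds == pos_label) & (targs != pos_label)).sum().float()
    fn = ((preds != pos_label) & (targs == pos_label)).sum().float()
    prec = tp / (tp + fp + eps)
    rec  = tp / (tp + fn + eps)
    beta2 = beta ** 2
    # F-beta = (1 + beta^2) * precision * recall / (beta^2 * precision + recall)
    return (1 + beta2) * prec * rec / (beta2 * prec + rec + eps)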

Yes, it worked with Fbeta_binary but failed using FBeta(beta=1). I have 2 classes: 0 and 1.
These are the relevant error lines:

 1 learnc0.fit_one_cycle(1, 2e-2, moms=(0.8,0.7))   [my code]

/usr/local/lib/python3.6/dist-packages/fastai/train.py in fit_one_cycle(learn, cyc_len, max_lr, moms, div_factor, pct_start, wd, callbacks, **kwargs)
     20     callbacks.append(OneCycleScheduler(learn, max_lr, moms=moms, div_factor=div_factor,
     21                                         pct_start=pct_start, **kwargs))
---> 22     learn.fit(cyc_len, max_lr, wd=wd, callbacks=callbacks)
....
/usr/local/lib/python3.6/dist-packages/fastai/basic_train.py in fit(self, epochs, lr, wd, callbacks)
    170         callbacks = [cb(self) for cb in self.callback_fns] + listify(callbacks)
    171         fit(epochs, self.model, self.loss_func, opt=self.opt, data=self.data, metrics=self.metrics,
--> 172             callbacks=self.callbacks+callbacks)
....
/usr/local/lib/python3.6/dist-packages/fastai/basic_train.py in fit(epochs, model, loss_func, opt, data, callbacks, metrics)
     92     except Exception as e:
     93         exception = e
---> 94         raise e
...
/usr/local/lib/python3.6/dist-packages/fastai/basic_train.py in fit(epochs, model, loss_func, opt, data, callbacks, metrics)
     87             if not data.empty_val:
     88                 val_loss = validate(model, data.valid_dl, loss_func=loss_func,
---> 89                                        cb_handler=cb_handler, pbar=pbar)
...
/usr/local/lib/python3.6/dist-packages/fastai/basic_train.py in validate(model, dl, loss_func, cb_handler, pbar, average, n_batch)
     52             if not is_listy(yb): yb = [yb]
     53             nums.append(yb[0].shape[0])
---> 54             if cb_handler and cb_handler.on_batch_end(val_losses[-1]): break
...
/usr/local/lib/python3.6/dist-packages/fastai/callback.py in on_batch_end(self, loss)
    237         "Handle end of processing one batch with `loss`."
    238         self.state_dict['last_loss'] = loss
--> 239         stop = np.any(self('batch_end', not self.state_dict['train']))
...
/usr/local/lib/python3.6/dist-packages/fastai/callback.py in __call__(self, cb_name, call_mets, **kwargs)
    185     def __call__(self, cb_name, call_mets=True, **kwargs)->None:
    186         "Call through to all of the `CallbakHandler` functions."
--> 187         if call_mets: [getattr(met, f'on_{cb_name}')(**self.state_dict, **kwargs) for met in self.metrics]
...
/usr/local/lib/python3.6/dist-packages/fastai/callback.py in <listcomp>(.0)
    185     def __call__(self, cb_name, call_mets=True, **kwargs)->None:
    186         "Call through to all of the `CallbakHandler` functions."
--> 187         if call_mets: [getattr(met, f'on_{cb_name}')(**self.state_dict, **kwargs) for met in self.metrics]
...
/usr/local/lib/python3.6/dist-packages/fastai/metrics.py in on_batch_end(self, last_output, last_target, **kwargs)
    116     def on_batch_end(self, last_output:Tensor, last_target:Tensor, **kwargs):
    117         preds = last_output.argmax(-1).view(-1)
--> 118         cm = ((preds==self.x[:, None]) & (last_target==self.x[:, None, None])).sum(dim=2, dtype=torch.float32)

RuntimeError: Expected object of backend CUDA but got backend CPU for argument #2 'other'
1 Like

OK, fixed it. A pull request (#1416) has been submitted, @sgugger.
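For anyone hitting the same trace before updating: it points to the metric's class-index tensor (self.x) living on the CPU while the predictions and targets are on the GPU. A minimal sketch of the general pattern (not necessarily what the PR does) is to build that tensor on the batch's device:

import torch

def batch_confusion_matrix(preds, targs, n_classes):
    # Create the class-index tensor on the same device as the batch; a CPU
    # tensor compared against CUDA tensors raises the backend error above.
    x = torch.arange(n_classes, device=preds.device)
    # cm[i, j] = number of samples with target class i predicted as class j
    return ((preds == x[:, None]) & (targs == x[:, None, None])).sum(dim=2, dtype=torch.float32)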

3 Likes

I am also having the same problem. My dataset has 5 classes; it works fine with accuracy but shows the same error with the fbeta metric. Have you got it fixed?

Hmmmmm… it seems like this issue is still persisting. Using fbeta as a metric for a multi-class classification problem gives me the following error:
The size of tensor a (24) must match the size of tensor b (16) at non-singleton dimension 1

Yes, I also get a similar error.

I also get The size of tensor a (2) must match the size of tensor b (64) at non-singleton dimension 1 when using the fbeta metric (fastai 1.0.45) for binary classification.
Edit: I should have used FBeta() for binary.

2 Likes

Has anyone solved this issue? I am trying to use
learn = tabular_learner(data, layers=[50,10], ps=[0.1,0.1], emb_drop=0.1,
metrics=[accuracy, fbeta])

Try metrics=[FBeta(beta=1)]. Also, you may want to check out the docs.
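Adapting the call above, a sketch of what that looks like, assuming data is the DataBunch from earlier and you want F1 (beta=1):

from fastai.tabular import tabular_learner
from fastai.metrics import accuracy, FBeta

f1 = FBeta(beta=1)   # single-label F1
learn = tabular_learner(data, layers=[50,10], ps=[0.1,0.1], emb_drop=0.1,
                        metrics=[accuracy, f1])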

1 Like

So just to clarify, fbeta is a metric intended for multiclass+multilabel, so we’d need to adapt it for single label multiclass, not just for binary classification, right?

fbeta is for multi-label, FBeta is for single-label and can handle multi-class (check the various possible modes).
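In other words, roughly (a sketch; the thresh value in the multi-label case is just a starting point and should be tuned):

from functools import partial
from fastai.metrics import fbeta, FBeta

# Single-label (exactly one class per example): use the FBeta class.
f1_single = FBeta(beta=1)

# Multi-label (several labels per example, sigmoid outputs): use the fbeta
# function, which thresholds the activations per label.
f1_multi = partial(fbeta, beta=1, thresh=0.2)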

3 Likes