Difference between printed metrics and actual metrics

Hi,

I’m running the following code:

from fastai.text import *  # fastai v1; data_task_specific_head is my pre-built text DataBunch

f_score = FBeta(average='macro', beta=1)
learn_tsh = text_classifier_learner(data_task_specific_head, AWD_LSTM, metrics=[accuracy, f_score])
learn_tsh.fit_one_cycle(1, max_lr=1e-2, moms=(0.95,0.85))

Which produces this output:

Total time: 00:47

epoch  train_loss  valid_loss  accuracy  f_beta    time
0      0.636780    0.601293    0.674651  0.485052  00:47

If I check directly on the validation data (note that the validation loss and accuracy match the printed values):

results = learn_tsh.validate(learn_tsh.data.valid_dl)
results

[0.60129267, tensor(0.6747), tensor(0.5832)]

So why is the printed value (table output) for FBeta different from the actual results, when it’s correct for the “built-in” things like accuracy and validation loss? I got a bit lost in the code trying to follow what gets printed where :confused:
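To rule out a mistake on my end, the score can also be computed directly from the raw predictions. A minimal sketch, assuming sklearn is installed (learn_tsh is the learner from above):

from sklearn.metrics import fbeta_score

preds, targs = learn_tsh.get_preds(ds_type=DatasetType.Valid)
# this should agree with the validate() value (0.5832), not the printed 0.4851
print(fbeta_score(targs.numpy(), preds.argmax(dim=1).numpy(), beta=1, average='macro'))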

My fastai library version is 1.0.52-1 (should be the latest as of this writing).

Upon further investigation, it seems that the values are only incorrect when the average parameter is explicitly set to something other than ‘micro’; leaving the parameter at its default also gives correct results.

f_score    = FBeta(average='macro', beta=1)  # wrong
f_blank    = FBeta()                         # correct
f_blank2   = FBeta(beta=1)                   # correct
f_micro    = FBeta(average='micro')          # correct
f_macro    = FBeta(average='macro')          # wrong
f_weighted = FBeta(average='weighted')       # wrong

epoch  train_loss  valid_loss  accuracy  f_beta    f_beta    f_beta    f_beta    f_beta    f_beta    time
0      0.629249    0.604950    0.668663  0.545243  0.847554  0.782152  0.668663  0.547606  0.645194  00:52

(The six f_beta columns correspond, in order, to f_score, f_blank, f_blank2, f_micro, f_macro and f_weighted.)

results
[0.6049503,
tensor(0.6687),
tensor(0.5816),
tensor(0.8476),
tensor(0.7822),
tensor(0.6687),
tensor(0.5674),
tensor(0.6617)]

Comparing the two outputs, exactly the three metrics with average='macro' or average='weighted' disagree (0.5452 vs 0.5816, 0.5476 vs 0.5674, 0.6452 vs 0.6617), while everything else matches. My guess would be that there’s something wrong in the callback calls in validate, which screws up the metric somehow. I’ll look into it when I have a bit of time.
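One pattern that would be consistent with this (just a hypothesis on my part, not verified against the fastai source): micro-averaged F1 reduces to plain accuracy for single-label classification, so it survives being computed batch by batch, whereas macro and weighted F1 do not decompose over batches. A minimal sketch with sklearn showing the effect:

# Purely an illustration, not the fastai internals: averaging macro-F1
# over batches differs from macro-F1 over the whole set, while
# micro-F1 always equals accuracy for single-label data.
import numpy as np
from sklearn.metrics import accuracy_score, fbeta_score

rng = np.random.RandomState(0)
y_true = rng.randint(0, 3, size=1000)
y_pred = rng.randint(0, 3, size=1000)

whole = fbeta_score(y_true, y_pred, beta=1, average='macro')
batched = np.mean([fbeta_score(y_true[i:i+64], y_pred[i:i+64], beta=1, average='macro')
                   for i in range(0, 1000, 64)])
print(whole, batched)  # these generally differ

micro = fbeta_score(y_true, y_pred, beta=1, average='micro')
print(micro, accuracy_score(y_true, y_pred))  # these are always equal

Note that in my table above the f_micro column (0.668663) is exactly equal to accuracy, which fits this picture.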