Setting up Multi-label classification using ImageDataBunch.from_csv

#1

From what I understand, ImageDataBunch.from_csv() will accept multi-label inputs if the classes are simply space-separated in the labels column of the CSV.
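
For concreteness, this is roughly what my setup looks like (the path, folder, suffix, and tag names here are made up):

from fastai.vision import *   # fastai v1.0.x

# labels.csv: filename in the first column, space-separated tags in the second, e.g.
#   image_name,tags
#   train_0,haze primary
#   train_1,agriculture clear water
path = Path('data/amazon')
data = ImageDataBunch.from_csv(path, folder='train', suffix='.jpg',
                               ds_tfms=get_transforms(), size=64)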

However, I have done this for my dataset so that there are multiple labels per image, and now my loss appears to be incorrect, giving negative values.

e.g.

epoch train_loss valid_loss accuracy
1 -2.241073 -5.858467 0.473379

Is the negative loss due to the loss function itself? If so, how can I change it within fastai such that it properly handles a multi-label dataset?

In summary, I would like to know:

  1. Is fastai supposed to handle multi-label datasets “under the hood” without me explicitly changing anything? If so, why is it showing a negative loss?

  2. What loss function(s) are appropriate for multi-label datasets and how should I implement them in fastai?

(Anish Dalal) #2

I’m also working on a Multi-label classification problem.

Regarding your 1st Question:

  • Yes, fastai should be able to handle the dataset automatically. It actually creates an object of type ImageMultiDataset that should contain all the various labels and classes.

  • I'm not completely sure why a negative loss is showing. Are you explicitly passing the "sep" keyword argument when calling from_csv()? Looking at the source code, I believe you need to set sep in order for ImageMultiDataset to be used (see the sketch at the end of this post).

Regarding your 2nd Question:

  • The ImageMultiDataset uses PyTorch's binary_cross_entropy_with_logits for multi-label problems.
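
Something along these lines (untested sketch; the path, folder, and suffix are placeholders):

data = ImageDataBunch.from_csv(path, folder='train', suffix='.jpg', sep=' ',
                               ds_tfms=get_transforms(), size=64)
# with sep set, a learner built on this data should get
# binary_cross_entropy_with_logits as learn.loss_func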

I hope this helps. :slightly_smiling_face:

#3

Thank you, setting sep did seem to change my loss function to binary cross-entropy with logits, as you said.

However, now I am receiving a different error when trying to use learn.fit_one_cycle():
RuntimeError: Expected object of scalar type Long but got scalar type Float for argument #2 'target'

How did the targets change to floats? Shouldn’t they be just integers from 0 to N-1 representing N classes?

I am using fp16 training, if that makes any difference. I don’t know for sure if fp16 works with multi-label datasets because this is my first time trying.

Full traceback:

Traceback (most recent call last):
File "hptrain-amazon-64.py", line 91, in <module>
HP_types, def_vals, test_vals, timestamp)
File "/home/nyc1/platform-hyperparams/scripts/hputils3.py", line 406, in train_main
ds_size=ds_size)
File "/home/nyc1/platform-hyperparams/scripts/hputils3.py", line 321, in train_all_models
num_eps=num_eps)
File "/home/nyc1/platform-hyperparams/scripts/hputils3.py", line 261, in train_one_model
num_eps=num_eps)
File "/home/nyc1/platform-hyperparams/scripts/hputils3.py", line 158, in train_pipeline
learn.fit_one_cycle(5, lr, wd=wd)
File "/home/adrian/anaconda3/envs/fastai11/lib/python3.7/site-packages/fastai/train.py", line 19, in fit_one_cycle
learn.fit(cyc_len, max_lr, wd=wd, callbacks=callbacks)
File "/home/adrian/anaconda3/envs/fastai11/lib/python3.7/site-packages/fastai/basic_train.py", line 161, in fit
callbacks=self.callbacks+callbacks)
File "/home/adrian/anaconda3/envs/fastai11/lib/python3.7/site-packages/fastai/basic_train.py", line 93, in fit
raise e
File "/home/adrian/anaconda3/envs/fastai11/lib/python3.7/site-packages/fastai/basic_train.py", line 83, in fit
loss = loss_batch(model, xb, yb, loss_func, opt, cb_handler)
File "/home/adrian/anaconda3/envs/fastai11/lib/python3.7/site-packages/fastai/basic_train.py", line 22, in loss_batch
loss = loss_func(out, *yb)
File "/home/adrian/anaconda3/envs/fastai11/lib/python3.7/site-packages/torch/nn/functional.py", line 1530, in nll_loss
return torch._C._nn.nll_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
RuntimeError: Expected object of scalar type Long but got scalar type Float for argument #2 'target'

(Edwin) #4

Try changing the metric. I ran into a similar issue with a multi-label dataset after setting the sep param in the data. I think accuracy and error_rate only work when the prediction is a single class.
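
For example, if your fastai version ships it, accuracy_thresh thresholds each class prediction independently, so it works with one-hot multi-label targets (a sketch, with data and arch as in your own code):

learn = create_cnn(data, arch, metrics=[accuracy_thresh])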

#5

I attempted to use Fbeta and instead got this error:

File "/home/nyc1/platform-hyperparams/scripts/hputils3.py", line 155, in train_pipeline
learn.fit_one_cycle(5, lr, wd=wd)
File "/home/adrian/anaconda3/envs/fastai11/lib/python3.7/site-packages/fastai/train.py", line 19, in fit_one_cycle
learn.fit(cyc_len, max_lr, wd=wd, callbacks=callbacks)
File "/home/adrian/anaconda3/envs/fastai11/lib/python3.7/site-packages/fastai/basic_train.py", line 161, in fit
callbacks=self.callbacks+callbacks)
File "/home/adrian/anaconda3/envs/fastai11/lib/python3.7/site-packages/fastai/basic_train.py", line 93, in fit
raise e
File "/home/adrian/anaconda3/envs/fastai11/lib/python3.7/site-packages/fastai/basic_train.py", line 88, in fit
cb_handler=cb_handler, pbar=pbar)
File "/home/adrian/anaconda3/envs/fastai11/lib/python3.7/site-packages/fastai/basic_train.py", line 54, in validate
if cb_handler and cb_handler.on_batch_end(val_losses[-1]): break
File "/home/adrian/anaconda3/envs/fastai11/lib/python3.7/site-packages/fastai/callback.py", line 238, in on_batch_end
stop = np.any(self('batch_end', not self.state_dict['train']))
File "/home/adrian/anaconda3/envs/fastai11/lib/python3.7/site-packages/fastai/callback.py", line 186, in __call__
if call_mets: [getattr(met, f'on_{cb_name}')(**self.state_dict, **kwargs) for met in self.metrics]
File "/home/adrian/anaconda3/envs/fastai11/lib/python3.7/site-packages/fastai/callback.py", line 186, in <listcomp>
if call_mets: [getattr(met, f'on_{cb_name}')(**self.state_dict, **kwargs) for met in self.metrics]
File "/home/adrian/anaconda3/envs/fastai11/lib/python3.7/site-packages/fastai/callback.py", line 270, in on_batch_end
self.val += last_target.size(0) * self.func(last_output, last_target).detach().item()
AttributeError: 'Fbeta' object has no attribute 'detach'

This is on fastai v1.0.15.

#6

Fbeta shouldn't normally end up in that part of the code. Could you paste your code? In particular, did you instantiate the class (with parentheses)?

#7

The only thing that I did was change the metrics from [accuracy] to [Fbeta]:
learn = create_cnn(*other_args, metrics=[Fbeta])

#8

Like I said, you didn't instantiate the class. You should use Fbeta() if you want all the default args, Fbeta(beta=something) for another value of beta, etc.
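
For instance, keeping your other arguments as they are:

learn = create_cnn(*other_args, metrics=[Fbeta()])        # all default args
learn = create_cnn(*other_args, metrics=[Fbeta(beta=2)])  # a different beta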

#9

Once the class is instantiated, would I still pass it as a metric to a learner?

#10

Yes indeed.
Note that we just discovered a bug in Fbeta and removed it; as of v1.0.17, you should use the function fbeta instead.

#11

I modified the code to use the default arguments:
create_cnn(metrics=[fbeta()])
but I am still receiving an error:
TypeError: fbeta() missing 2 required positional arguments: 'y_pred' and 'y_true'

Shouldn't y_pred and y_true only come into play once the learner starts training? Why would it ask for them before doing anything?

This is in v1.0.18.

EDIT: I tried changing it back so it has no parentheses:
create_cnn(metrics=[fbeta])

This runs for an epoch and then throws the following error:
File "/home/adrian/anaconda3/envs/fastai11/lib/python3.7/site-packages/fastai/train.py", line 22, in fit_one_cycle
learn.fit(cyc_len, max_lr, wd=wd, callbacks=callbacks)
File "/home/adrian/anaconda3/envs/fastai11/lib/python3.7/site-packages/fastai/basic_train.py", line 162, in fit
callbacks=self.callbacks+callbacks)
File "/home/adrian/anaconda3/envs/fastai11/lib/python3.7/site-packages/fastai/basic_train.py", line 94, in fit
raise e
File "/home/adrian/anaconda3/envs/fastai11/lib/python3.7/site-packages/fastai/basic_train.py", line 84, in fit
loss = loss_batch(model, xb, yb, loss_func, opt, cb_handler)
File "/home/adrian/anaconda3/envs/fastai11/lib/python3.7/site-packages/fastai/basic_train.py", line 22, in loss_batch
loss = loss_func(out, *yb)
File "/home/adrian/anaconda3/envs/fastai11/lib/python3.7/site-packages/torch/nn/functional.py", line 1530, in nll_loss
return torch._C._nn.nll_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
RuntimeError: Expected object of scalar type Long but got scalar type Float for argument #2 'target'

So the targets are still of type Float. I have confirmed that learn.loss_func is indeed BCE with logits, yet the program attempts to use nll_loss. Why is this?

#12

Sorry, I was unclear: fbeta is now a function, like accuracy, so it doesn't take parentheses. The previous one was a class; that's why it needed to be instantiated.

#13

OK. So then the following:
create_cnn(metrics=[fbeta])
is the correct usage?

What about the issue with the loss function? I printed out learn.loss_func and received:

<function binary_cross_entropy_with_logits at 0x7f7169c6eea0>

which means it should be using BCE with logits, as expected. However, I still received the following error:

return torch._C._nn.nll_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
RuntimeError: Expected object of scalar type Long but got scalar type Float for argument #2 'target'

which suggests that it may be using the wrong loss function?

EDIT: I attempted to manually force the learner to use BCE as the loss function:

learn = create_cnn(data, arch, path=PATH,
                   metrics=[fbeta],
                   ps=[p1, p2],
                   pretrained=True,
                   callback_fns=[ShowGraph])
learn.loss_func = nn.BCEWithLogitsLoss

but I got this error instead:

File "hptrain-amazon-64.py", line 84, in <module>
learn.lr_find()
File "/home/adrian/anaconda3/envs/fastai11/lib/python3.7/site-packages/fastai/train.py", line 30, in lr_find
learn.fit(a, start_lr, callbacks=[cb], **kwargs)
File "/home/adrian/anaconda3/envs/fastai11/lib/python3.7/site-packages/fastai/basic_train.py", line 162, in fit
callbacks=self.callbacks+callbacks)
File "/home/adrian/anaconda3/envs/fastai11/lib/python3.7/site-packages/fastai/basic_train.py", line 94, in fit
raise e
File "/home/adrian/anaconda3/envs/fastai11/lib/python3.7/site-packages/fastai/basic_train.py", line 84, in fit
loss = loss_batch(model, xb, yb, loss_func, opt, cb_handler)
File "/home/adrian/anaconda3/envs/fastai11/lib/python3.7/site-packages/fastai/basic_train.py", line 22, in loss_batch
loss = loss_func(out, *yb)
File "/home/adrian/anaconda3/envs/fastai11/lib/python3.7/site-packages/torch/nn/modules/loss.py", line 568, in __init__
super(BCEWithLogitsLoss, self).__init__(size_average, reduce, reduction)
File "/home/adrian/anaconda3/envs/fastai11/lib/python3.7/site-packages/torch/nn/modules/loss.py", line 15, in __init__
self.reduction = _Reduction.legacy_get_string(size_average, reduce)
File "/home/adrian/anaconda3/envs/fastai11/lib/python3.7/site-packages/torch/nn/functional.py", line 48, in legacy_get_string
if size_average and reduce:
RuntimeError: bool value of Tensor with more than one value is ambiguous

#14

Again, nn.BCEWithLogitsLoss is a class, so you need to instantiate it to get a loss function :wink: .
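
Concretely, the last line of your snippet becomes:

learn.loss_func = nn.BCEWithLogitsLoss()  # the parentheses create a module instance, which is callable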

#15

Yes, I had just realized that :sweat_smile:

However, it still gives me the same error as when I wasn't forcing it manually:

return torch._C._nn.nll_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
RuntimeError: Expected object of scalar type Long but got scalar type Float for argument #2 'target'

Even multilabel margin loss gives the same kind of error:

File "/home/adrian/anaconda3/envs/fastai11/lib/python3.7/site-packages/torch/nn/functional.py", line 1863, in multilabel_margin_loss
return torch._C._nn.multilabel_margin_loss(input, target, reduction)
RuntimeError: Expected object of scalar type Long but got scalar type Float for argument #2 'target'

#16

What is weird is that you end up in nll_loss when we want binary cross-entropy with logits (which expects the target as a float, so that part is normal). What is learn.loss_func after you get the error?
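
For reference, a minimal standalone PyTorch sketch of the target dtype each loss expects (the tag count of 17 is arbitrary):

import torch
import torch.nn.functional as F

logits = torch.randn(4, 17)                       # batch of 4 images, 17 possible tags
multi_hot = torch.randint(0, 2, (4, 17)).float()  # multi-label target: float multi-hot
loss = F.binary_cross_entropy_with_logits(logits, multi_hot)  # fine: float targets expected

class_idx = torch.randint(0, 17, (4,))            # single-label target: long class indices
loss = F.nll_loss(F.log_softmax(logits, dim=1), class_idx)    # fine: long targets expected
# F.nll_loss(F.log_softmax(logits, dim=1), multi_hot)
# -> RuntimeError: Expected object of scalar type Long but got scalar type Float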

#17

It definitely prints out BCE loss:

<function binary_cross_entropy_with_logits at 0x7f7169c6eea0>

#18

OK, I think I have found the problem. I used this code to try to debug:

try:
    learn.fit_one_cycle(5, lr, wd=wd)
except RuntimeError:
    print("Your loss function was \n")
    print(learn.loss_func)
    raise

This prints out

Your loss function was

<function nll_loss at 0x7f61a0e14ae8>

which is the incorrect loss function, so it must have been changed somewhere in my code. I will look into it more. Thank you for your help!

#19

Any progress, @clarkeaa13?

(Antonio de Perio) #20

Any luck here, anyone? I'm running into similar issues.
