Where is loss being set?

From what I can tell, the loss is initially set in the DataBunch class, which picks up the value at some point when train_ds is created.


returns the loss function

<function torch.nn.functional.cross_entropy(input, target, weight=None, size_average=None, ignore_index=-100, reduce=None, reduction='elementwise_mean')>

But I don’t see anywhere that it is actually passed in. The closest thing I can find is this line in DataBunch, which is the parent class of ImageDataBunch:

    def loss_func(self)->Dataset: return getattr(self.train_ds, 'loss_func', F.nll_loss)

So what this is saying is that if self.train_ds has a loss_func attribute, it should use it, but if it doesn’t, it should fall back to F.nll_loss, which would look like this:

<function torch.nn.functional.nll_loss(input, target, weight=None, size_average=None, ignore_index=-100, reduce=None, reduction='elementwise_mean')>
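To make the fallback behaviour concrete, here is a minimal sketch of the same getattr-with-default pattern. The two loss functions and both dataset classes are hypothetical stand-ins (not the real PyTorch functions or fastai classes), so it runs without any dependencies.

```python
# Placeholder stand-ins for F.cross_entropy and F.nll_loss (assumptions, not the real functions).
def cross_entropy():
    pass

def nll_loss():
    pass

class PlainDataset:
    """Hypothetical dataset that never sets a loss_func attribute."""

class ClassificationDataset:
    """Hypothetical dataset that sets loss_func in __init__."""
    def __init__(self):
        self.loss_func = cross_entropy

def lookup_loss(train_ds):
    # Same pattern as the DataBunch line above:
    # use the dataset's loss_func if present, otherwise the default.
    return getattr(train_ds, "loss_func", nll_loss)

print(lookup_loss(ClassificationDataset()).__name__)  # cross_entropy
print(lookup_loss(PlainDataset()).__name__)           # nll_loss
```

So the only way to get something other than the default out of this lookup is for the dataset object itself to carry a loss_func attribute.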

So at some point, self.train_ds is acquiring the loss_func attribute of cross entropy, but I don’t see anywhere that does that.

I am wondering if maybe it comes from here:

    def __getattr__(self,k):
        "Passthrough access to wrapped dataset attributes."
        return getattr(self.ds, k)

from DatasetTfm, but that is just a hunch, because it only passes through attributes that already exist on the wrapped dataset.
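A small sketch of why the passthrough alone cannot be the origin of the loss: __getattr__ is only called when normal attribute lookup fails, and it can only return what the wrapped object already has. InnerDataset and Wrapper are made-up names for illustration, and the loss is just a placeholder string.

```python
class InnerDataset:
    """Hypothetical wrapped dataset that already carries a loss_func."""
    def __init__(self):
        self.loss_func = "cross_entropy"  # placeholder value, not a real loss function

class Wrapper:
    """Hypothetical wrapper using the same passthrough as DatasetTfm."""
    def __init__(self, ds):
        self.ds = ds

    def __getattr__(self, k):
        # Only reached when `k` is not found on the wrapper itself.
        return getattr(self.ds, k)

wrapped = Wrapper(InnerDataset())
print(wrapped.loss_func)                         # cross_entropy (forwarded from the inner dataset)
print(getattr(wrapped, "missing", "fallback"))   # fallback (inner dataset has no such attribute)
```

The wrapper forwards the attribute but never creates it, so the wrapped dataset must be setting loss_func somewhere.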

After the loss is set here though, it keeps using it going through model creation. Because the DataBunch has a loss already, the ConvLearner will just use that loss as defined here in the Learner code:

    self.loss_func = ifnone(self.loss_func, self.data.loss_func)

So if ConvLearner.loss_func is None, then it will use the ConvLearner.data.loss_func which comes back to the original question. Where does the loss initially get set to cross entropy?
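For reference, ifnone is a tiny helper. The version below is a sketch of how I understand it to behave (return the first argument unless it is None), not a copy of the library source, and the string values are placeholders.

```python
def ifnone(a, b):
    "Return `a` unless it is None, in which case return `b`."
    return b if a is None else a

# No loss given to the learner: fall back to the one from the data.
print(ifnone(None, "data.loss_func"))       # data.loss_func

# Loss given explicitly: the explicit one wins.
print(ifnone("my_loss", "data.loss_func"))  # my_loss

# Unlike `a or b`, only None triggers the fallback; other falsy values are kept.
print(ifnone(0, 5))                         # 0
```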


I found out where cross entropy is being set! It is in ImageClassificationDataset.

So ImageDataBunch.from_name_re() calls ImageDataBunch.from_name_func() which calls ImageDataBunch.from_lists()

Here is the code of ImageDataBunch.from_lists():

    def from_lists(cls, path:PathOrStr, fnames:FilePathList, labels:Collection[str], valid_pct:int=0.2, test:str=None, **kwargs):
        classes = uniqueify(labels)
        train,valid = random_split(valid_pct, fnames, labels)
        datasets = [ImageClassificationDataset(*train, classes),
                    ImageClassificationDataset(*valid, classes)]
        if test: datasets.append(ImageClassificationDataset.from_single_folder(Path(path)/test, classes=classes))
        return cls.create(*datasets, path=path, **kwargs)

The actual loss_func gets set inside of ImageClassificationDataset()

Here is the init code for that class:

    def __init__(self, fns:FilePathList, labels:ImgLabels, classes:Optional[Classes]=None):
        self.classes = ifnone(classes, list(set(labels)))
        self.class2idx = {v:k for k,v in enumerate(self.classes)}
        y = np.array([self.class2idx[o] for o in labels], dtype=np.int64)
        super().__init__(fns, y)
        self.loss_func = F.cross_entropy

Finally tracked down where all of this starts! This is the line:

    self.loss_func = F.cross_entropy

So that sets the loss_func attribute for both train_ds and valid_ds, then when the DataBunch is created, it looks at train_ds to see if a loss function is already set (if it isn’t, it will use F.nll_loss). Since train_ds already has F.cross_entropy set, it will use that.

Next, when the ConvLearner is built, it checks to see if a loss function is set. If it isn’t, it looks to the DataBunch to tell it what loss function to use. I definitely didn’t expect that when I started digging into it!


There’s a reason for looking into the DataBunch for the loss function. In the case of a multi-label classification problem (in which a single input image can have multiple labels), ImageDataBunch.from_csv (or from_folder) creates an ImageMultiDataset, which sets the loss to F.binary_cross_entropy_with_logits, the appropriate loss function in this case.
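To illustrate why the two problem types need different losses, here is a dependency-free sketch of both formulas for a single example. These are simplified re-implementations written for this post (assuming mean reduction over classes for the binary version), not the PyTorch functions themselves.

```python
import math

def cross_entropy(logits, target):
    # Single-label case: softmax over all classes, then the negative
    # log-probability of the one correct class index.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]

def bce_with_logits(logits, targets):
    # Multi-label case: each class gets its own sigmoid and its own
    # binary cross entropy term; the terms are averaged over classes.
    terms = [
        max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z)))  # numerically stable form
        for z, t in zip(logits, targets)
    ]
    return sum(terms) / len(terms)

logits = [2.0, -1.0, 0.5]

# Exactly one correct class (index 0).
print(cross_entropy(logits, 0))

# Classes 0 and 2 are both present in the image.
print(bce_with_logits(logits, [1.0, 0.0, 1.0]))
```

Cross entropy forces the classes to compete for a single label, while the binary version scores each label independently, which is what you want when an image can carry several labels at once.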


That makes sense. So if you wanted to change the loss, would you build the DataBunch and then set something like data.loss_func = <your new loss function> before building the learner? Or does it make more sense to create a new Dataset class that defines the loss the way you want? The first way seems easier if it works, but I’m not sure whether you would miss some things by doing that.

See below for what the docs have to say:

the kwargs will be passed on to Learner, so you can put here anything that Learner will accept (metrics, loss_func, opt_func…)
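In other words, the simplest route is to pass loss_func when creating the learner. Below is a rough sketch of that kwargs forwarding using stand-in classes; Learner, FakeData, create_learner and the string "losses" are all hypothetical placeholders, not the fastai API.

```python
class FakeData:
    """Hypothetical stand-in for a DataBunch that exposes a default loss."""
    loss_func = "cross_entropy"

class Learner:
    """Hypothetical stand-in for a learner that accepts an optional loss_func."""
    def __init__(self, data, model, loss_func=None, metrics=None):
        self.data, self.model = data, model
        # An explicit loss wins; otherwise fall back to the one from the data.
        self.loss_func = data.loss_func if loss_func is None else loss_func
        self.metrics = [] if metrics is None else metrics

def create_learner(data, arch, **kwargs):
    # Hypothetical factory: anything in kwargs is handed straight to Learner.
    model = f"model built from {arch}"
    return Learner(data, model, **kwargs)

default_learn = create_learner(FakeData(), "resnet34")
custom_learn = create_learner(FakeData(), "resnet34", loss_func="my_custom_loss")

print(default_learn.loss_func)  # cross_entropy
print(custom_learn.loss_func)   # my_custom_loss
```

With this route the data object is left untouched, so nothing else that reads the default loss from it is affected.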



In case somebody goes looking for this again, the code has changed a bit due to the introduction of the data_block api.

The loss_func now gets set in the ItemLists in data_block.py:


class CategoryList sets it to CrossEntropyFlat()
class MultiCategoryList sets it to BCEWithLogitsFlat()
class FloatList (for regression problems) sets it to MSELossFlat()

The type of ItemList can be set manually. If it is not specified, it is inferred automatically from the datatype of the labels and/or the presence of a separator (for multi-label problems) given in the labeling function (.label_from_xxx).

This is done in the get_label_cls() method in data_block.py:

    def get_label_cls(self, labels, label_cls:Callable=None, sep:str=None, **kwargs):
        "Return `label_cls` or guess one from the first element of `labels`."
        if label_cls is not None:               return label_cls
        if self.label_cls is not None:          return self.label_cls
        it = index_row(labels,0)
        if sep is not None:                     return MultiCategoryList
        if isinstance(it, (float, np.float32)): return FloatList
        if isinstance(try_int(it), (str,numbers.Integral)):  return CategoryList
        if isinstance(it, Collection):          return MultiCategoryList
        return self.__class__
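Here is a simplified, standalone sketch of that inference, combined with the default losses listed above. It returns class names as strings instead of the real classes, leaves out the np.float32 check so it runs without numpy, uses a cut-down try_int written for this post, and returns "ItemList" as a placeholder where the original falls back to self.__class__.

```python
import numbers
from collections.abc import Collection

# Default loss per label class, as listed earlier in this post.
DEFAULT_LOSS = {
    "CategoryList": "CrossEntropyFlat",
    "MultiCategoryList": "BCEWithLogitsFlat",
    "FloatList": "MSELossFlat",
}

def try_int(o):
    # Simplified stand-in: convert to int if possible, otherwise return unchanged.
    try:
        return int(o)
    except (TypeError, ValueError):
        return o

def guess_label_kind(first_label, sep=None):
    # Mirrors the order of the checks in get_label_cls().
    if sep is not None:
        return "MultiCategoryList"
    if isinstance(first_label, float):
        return "FloatList"
    if isinstance(try_int(first_label), (str, numbers.Integral)):
        return "CategoryList"
    if isinstance(first_label, Collection):
        return "MultiCategoryList"
    return "ItemList"  # placeholder for the `self.__class__` fallback

for label, sep in [("cat", None), (3, None), (0.75, None), (["cat", "dog"], None), ("cat;dog", ";")]:
    kind = guess_label_kind(label, sep)
    print(repr(label), "->", kind, "->", DEFAULT_LOSS.get(kind))
```

Note that the separator check comes first, so a string label with a separator given is treated as multi-label before the string check for CategoryList is ever reached.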