BCE_Loss for Image Classification

Hello! I am a bit confused about the implementation of the BCE_Loss() function in Lesson 9 for image classification. If I understand it correctly, we remove the background class and apply binary cross entropy over the actual classes only, so the last filter, which corresponds to the background, is also ignored. I don't understand how this last filter learns anything, or why we need it at all, if it is never used in the loss function. Can someone explain this to me? Thank you!
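
To make my understanding concrete, here is a minimal, library-free sketch of what I think the loss does. This is not the lesson's actual code: the function names are hypothetical, it handles a single example with plain Python lists instead of tensors, and it assumes the background is the last index.

```python
import math

def one_hot(label, num_classes):
    """One-hot vector of length num_classes + 1; the last slot is background."""
    vec = [0.0] * (num_classes + 1)
    vec[label] = 1.0
    return vec

def bce_with_logits(logit, target):
    """Numerically stable binary cross entropy for a single logit."""
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))

def bce_loss_without_background(logits, label, num_classes):
    """Sum BCE over the real classes only, then divide by num_classes."""
    target = one_hot(label, num_classes)[:-1]  # drop the background target
    kept = logits[:-1]                         # drop the background logit
    total = sum(bce_with_logits(x, t) for x, t in zip(kept, target))
    return total / num_classes

# Two predictions that differ only in the last (background) logit.
loss_a = bce_loss_without_background([0.5, -1.0, 2.0, 0.0], label=1, num_classes=3)
loss_b = bce_loss_without_background([0.5, -1.0, 2.0, 7.0], label=1, num_classes=3)
print(loss_a == loss_b)
```

In this sketch the last logit never enters the loss, so changing it leaves the result unchanged, and a background label simply becomes an all-zero target for the remaining classes. That is the part I would like to have explained.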