Part 2 Lesson 9 wiki

jeremy · March 31, 2018, 5:52pm

Yup this is confusing! And quite possibly I’m doing it in a sub-optimal way…

Let’s first discuss the loss function. We want to 1-hot encode the target, but if the target is ‘background’, then we want all-zeros. So in the loss function we use a regular 1-hot encoding function with one extra column for background, then remove that last column, which leaves background rows as all-zeros. There are of course other ways we could do this - if anyone has a suggestion for something that is more concise and/or easier to understand, please say so.
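Here is a rough sketch of that encoding trick, in case it helps to see it written out. It is not the actual lesson code: the function name `one_hot_without_background` is made up for illustration, it uses NumPy rather than PyTorch to stay self-contained, and it assumes background is the last class index (i.e. equal to `num_classes`), which is what makes dropping the last column work.

```python
import numpy as np

def one_hot_without_background(targets, num_classes):
    """1-hot encode class ids, mapping background to an all-zeros row.

    Assumption: ids 0..num_classes-1 are real classes and
    id == num_classes means background.
    """
    targets = np.asarray(targets)
    # Regular 1-hot encoding over num_classes + 1 columns
    # (the extra, last column is background).
    full = np.eye(num_classes + 1, dtype=np.float32)[targets]
    # Remove the last column, so background rows become all zeros.
    return full[:, :-1]

# Three real classes (0, 1, 2); id 3 is background.
targets = [0, 2, 3]
encoded = one_hot_without_background(targets, num_classes=3)
print(encoded)
```

The same `[:, :-1]` slicing should work on a PyTorch tensor too, so the idea carries over directly.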

As for why we add one to the number of classes in the convolutional output - well frankly I can’t remember! I suspect it is a redundant holdover from when I used softmax (which is what I did when I started on this). My guess is that you could remove it and get just as good results (if not better). If you try this, please let us know how you go!