I’m currently playing about with a couple of binary classification problems (the outputs are either disease or no disease) and am finding that one-hot encoding the outputs - e.g. disease [0 1] and no disease [1 0] - is giving better results than a single output of disease (0) vs no disease (1).
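To be concrete about the two label formats I mean (a minimal numpy sketch, using the mapping from my description above - disease as scalar 0 and one-hot [0 1]):

```python
import numpy as np

# Scalar labels, as in the single-output setup: disease = 0, no disease = 1
labels = np.array([0, 1, 1, 0])

# One-hot version per the mapping above: disease -> [0, 1], no disease -> [1, 0].
# Because disease is scalar 0 but one-hot row index 1, we flip with (1 - labels)
# before indexing into the identity matrix.
one_hot = np.eye(2)[1 - labels]

print(one_hot)
# disease rows come out as [0., 1.], no-disease rows as [1., 0.]
```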
Do other people find this?
I remember when I first started playing about with NNs (too long ago to count now …), I used MATLAB, and that is the approach its examples have always taken - e.g.
However, most of the binary classification examples I’ve followed in Keras and PyTorch use the 0/1 approach with a single dense sigmoid neuron. Is there a reason for this that I’ve missed?
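For what it’s worth, my understanding is that the two heads are mathematically equivalent in the binary case: a two-unit softmax where one logit is pinned at 0 produces exactly the same class probability as a single sigmoid unit on the other logit, so any difference in results presumably comes from parameterisation/optimisation rather than the model class. A quick numpy check of that identity (the value of `z` here is arbitrary, just for illustration):

```python
import numpy as np

def sigmoid(z):
    """Single-output head: P(class 1) from one logit."""
    return 1.0 / (1.0 + np.exp(-z))

def softmax(logits):
    """Two-output head: probabilities over both classes."""
    e = np.exp(logits - logits.max())  # subtract max for numerical stability
    return e / e.sum()

z = 1.7  # arbitrary example logit

# Two-class softmax over logits [0, z]: P(class 1) = e^z / (1 + e^z),
# which is algebraically identical to sigmoid(z).
p_softmax = softmax(np.array([0.0, z]))[1]
p_sigmoid = sigmoid(z)

print(p_softmax, p_sigmoid)  # the two probabilities match
```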