Sigmoid vs softmax

I have a question about softmax vs sigmoid. In the lecture (lesson 3), it is mentioned that softmax is better for binary classification, while sigmoid is better for multi-label.
But when I look around, other sources seem to say the opposite:

Am I missing anything here? Thanks.


You are confusing multi-label and multi-class classification.

Multi-class means you choose exactly one of a number of mutually exclusive classes. A good example is dog breed classification: each image is assigned exactly one breed, so a dog cannot be two breeds at the same time.

The planet competition in the course, on the other hand, is an example of multi-label classification. One image can have two or more labels at the same time. It is perfectly OK for a single photograph to contain both “Agriculture” and “River” while it is also “Cloudy”.

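As Jeremy explains in his lecture, softmax is extremely good at picking a single label: its outputs sum to 1 across all classes, so raising the probability of one class lowers the others. Sigmoid, by contrast, scores each label independently as a yes/no decision between two classes. So when you effectively train a separate classifier for each label, you may end up with positive predictions for several different labels on the same image.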
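To make the difference concrete, here is a minimal NumPy sketch. The label names and logit values are made up for illustration (they are not from the planet dataset or the course notebooks), and the 0.5 threshold is just an assumed cutoff. It only shows that softmax forces the outputs to compete while sigmoid treats each label on its own.

```python
import numpy as np

def softmax(logits):
    # Subtract the max for numerical stability; the result is unchanged.
    exps = np.exp(logits - np.max(logits))
    return exps / exps.sum()

def sigmoid(logits):
    # Applied element-wise: each label gets its own independent probability.
    return 1.0 / (1.0 + np.exp(-logits))

# Hypothetical labels and raw model outputs (logits) for one image.
labels = ["agriculture", "river", "cloudy", "road"]
logits = np.array([2.0, 1.5, 3.0, -2.0])

softmax_probs = softmax(logits)   # sums to 1 across all labels
sigmoid_probs = sigmoid(logits)   # each value in (0, 1), no constraint on the sum

# Multi-class: pick the single most likely class.
predicted_class = labels[int(np.argmax(softmax_probs))]

# Multi-label: keep every label whose own probability passes the threshold.
threshold = 0.5  # assumed cutoff, tune for your problem
predicted_labels = [name for name, p in zip(labels, sigmoid_probs) if p > threshold]

print("softmax:", softmax_probs.round(3), "->", predicted_class)
print("sigmoid:", sigmoid_probs.round(3), "->", predicted_labels)
```

With softmax only one label can win, whereas the sigmoid outputs let several labels pass the threshold at once, which is what you want for the planet-style data.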


Thanks for your reply. That’s clear.