@naveenperera @yat626

The threes need not necessarily be greater than 0, or sevens need not be less than zero. We can interchange them, or even set the threshold as any other number.

The idea is to create a mathematical distinction.

Our loss function is a function that will have a higher value whenever the network gives a wrong prediction - for example, in this case, if the network predicts the value of a three as less than zero, the loss would be higher in value, and whenever the network gives a correct prediction (for example, if it predicts the three as greater than zero, or seven as less than zero), the loss would be lower.

Now, we use an algorithm, called gradient descent, that is designed to MINIMIZE the loss function, which basically means, adjust the weights to give the most correct predictions ( because correct predictions give lower loss function value).

So basically, we randomly initialize the weights, and later on, adjust them according to our criteria. Initially the neural network knows nothing about our data, so it would give horrible predictions, but we will continuously improve it by adjusting our weights, such that the loss functions is reduced (or in other words, the network gives better prediction). (This is also a great point to understand why we need a loss function - the loss function acts as the mathematical function that can be used to understand how well the neural network is working. It is a mathematical way to tell the network, how well the predictions are, and when we give the Neural Network the task to minimize the loss function, we are in fact, making it âlearnâ and give better predictions)

To make it clearer, even if you would have set the threshold in the loss function such that the threes gave a value less than zero and sevens gave a value greater than zero, the gradient descent would adjust the weights accordingly, so that the loss function value would be minimum (ofcourse, meaning, that weights would be adjusted by the gradient descent algorithm such that the best predictions are given, which now, are negative for our threes and positives for our sevens.)

You can even try setting the threshold as a non zero value, and find out it still works.

Hope it makes it clearer

Cheers