Lesson 4 - Official Topic

Antoine.C · April 15, 2020, 1:24pm

I am going to add to the other answers (e.g. @ram_cse), catering perhaps to the more mathematically inclined, but I hope this will be helpful to others.

I am splitting my answer into two parts. This first part is on notation.

Let’s forget for a moment neural nets. The conventional notation in math textbooks it to write a fuction as

y=f(x)

where x is the variable with respect to which the typical math exercise asks to run the optimization. See for example the cell in the fastbook notebook 04_mnist_basics.ipynb with:

def f(x): return x**2
plot_function(f, 'x', 'x**2')
plt.scatter(-1.5, f(-1.5), color='red');

(the function plot_function is defined in fastbook/utils.py).

Note that diagrams in subsequent cells show a graph with parameter along the horizontal axis, rather than x.

In training a neural net, on the other hand, the (loss) function to optimize involves two very different types of variables:

weights (and biases): let’s collectively denote them with w;
data samples: let’s collectively denote them with x.

Thus, writing explicitly the dependence of the loss function on its weights and the data gives

y=f(w,x)

But in the training phase, the optimization is with respect to w (the weights), not x (the data). That is, x remains fixed. (Alternatively, we could decide to incorporate the dependence on x in the symbol f, so that the function that we are interested in optimizing would simply be written y=f(w), but it is helpful to keep track of the data that we are using, so we shall keep the explicit dependence on x in the notation as well.)

Revisiting the earlier example in fastbook notebook 04_mnsit_basics.ipynb, we now obtain:

def f(w): return w**2
plot_function(f, 'w', 'y')
plt.scatter(-1.5, f(-1.5), color='red');