Ian Goodfellow: Input/Parameter linearity in DNNs

Around 18:20 in https://www.youtube.com/watch?v=CIfsB_EYsVI, Ian Goodfellow says: “The mapping from the input of the model to the output of the model is close to being linear, or piecewise linear with relatively few pieces. The mapping from the parameters of the model to the output is highly nonlinear. So the parameters have highly nonlinear interactions, and that's what makes training much harder. That's why optimising parameters is much harder than optimising inputs.”
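To see the asymmetry concretely, here is a small worked example of my own (it is not from the talk): strip out the activations for a moment and look at a purely linear depth-L network.

```latex
% My own toy illustration, not from the talk.
f(x) = W_L W_{L-1} \cdots W_1 \, x
% For fixed weights, f is linear in the input x. For a fixed input,
% f is a degree-L polynomial in the weights, because entries of
% different layers multiply. In the scalar case f(x) = w_2 w_1 x:
% scaling both weights by t scales the output by t^2, not by t.
% Putting ReLUs back between the layers makes f piecewise linear
% in x, while the cross-layer products in the parameters remain.
```

That cross-layer product is, as far as I can tell, the "highly nonlinear interaction" between parameters the quote is pointing at.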

I don't understand how the inputs are linear but the parameters are nonlinear. I can sort of grasp the inputs being linear, since the activations are monotonic functions, but not the parameters. Can someone please explain this? He goes on to show an image to illustrate it as well.
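Here is a quick numerical check (a minimal sketch of my own, not the experiment shown in the video; the network shapes, random seed, and threshold are arbitrary choices): evaluate a tiny ReLU net along a straight line in input space with the weights frozen, and along a straight line in parameter space with the input frozen. The first curve is piecewise linear with only a few kinks; the second is curved almost everywhere, because weights from different layers multiply.

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.standard_normal((8, 2))   # first-layer weights: 2 inputs -> 8 hidden
w2 = rng.standard_normal(8)        # second-layer weights: 8 hidden -> 1 output

def f(x, W1, w2):
    # Tiny 2-layer net with a scalar output: w2 . relu(W1 @ x)
    return w2 @ np.maximum(W1 @ x, 0.0)

# Random endpoints for the input slice and a random parameter direction.
x0, x1 = rng.standard_normal(2), rng.standard_normal(2)
V1, v2 = rng.standard_normal((8, 2)), rng.standard_normal(8)

ts = np.linspace(0.0, 1.0, 201)
# Slice 1: move the INPUT along a line, parameters fixed.
input_curve = np.array([f((1 - t) * x0 + t * x1, W1, w2) for t in ts])
# Slice 2: move the PARAMETERS along a line, input fixed.
param_curve = np.array([f(x0, W1 + t * V1, w2 + t * v2) for t in ts])

def frac_curved(curve):
    # On a piecewise-linear curve, second differences vanish except at
    # the kinks; report the fraction of points where they do not.
    return np.mean(np.abs(np.diff(curve, 2)) > 1e-9)

print("input slice, fraction of curved points:", frac_curved(input_curve))
print("param slice, fraction of curved points:", frac_curved(param_curve))
```

With this seed, the input slice should report curvature at only a handful of sample points (where a ReLU unit flips on or off), while the parameter slice reports it at nearly every point, since the t * V1 and t * v2 terms multiply and produce a t^2 contribution.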

I think this is explained very poorly. I hope someone here can explain it more clearly, but for now I don't think this information is essential to understanding the rest. What I took from it is that he tried to show that a single layer is nonlinear, but that when we stack several of them together we get an almost linear mapping, which is the predictor function. I'm not sure whether I understood this correctly, so hopefully someone can correct me.