Lesson 10 Discussion & Wiki (2019)

Ah you caught me! :open_mouth: No we didn’t.

But… we did show that conv is just a matrix multiply, with some tied weights and zeros, and we’ve already done that from scratch; so I figured we don’t gain much doing conv from scratch too. And it would be soooooo slooooow.

But for folks still feeling a little unsure about what a conv does - you absolutely should write it yourself! :smiley:
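If you do, here's one possible shape for it - a minimal single-channel sketch with no padding or stride (my code, not the course notebooks), checked against PyTorch, plus the "it's just a matrix multiply" view via `unfold`:

```python
import torch
import torch.nn.functional as F

def conv2d_naive(x, w):
    # Single-channel 2D "conv" (really cross-correlation), no padding, stride 1.
    kh, kw = w.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = torch.zeros(oh, ow)
    for i in range(oh):
        for j in range(ow):
            out[i, j] = (x[i:i+kh, j:j+kw] * w).sum()
    return out

x, w = torch.randn(5, 5), torch.randn(3, 3)
res = conv2d_naive(x, w)

# The same thing as one matrix multiply: unfold the input into patch columns,
# then every patch is multiplied by the same kernel (the "tied weights"; the
# zeros appear if you instead unroll the kernel into a big sparse matrix).
patches = F.unfold(x[None, None], kernel_size=3)        # (1, 9, 9)
res_mm = (w.view(1, -1) @ patches).view(3, 3)

assert torch.allclose(res, res_mm, atol=1e-5)
assert torch.allclose(res, F.conv2d(x[None, None], w[None, None])[0, 0], atol=1e-5)
```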

2 Likes

It’s fine to have a negative class for a binary problem (NLP, vision, or anything else), since that’s simply a sigmoid activation and we don’t have this same issue.

But we don’t have a negative class for multi-class NLP problems IIRC…

1 Like

Yes, if you know you have one and exactly one class represented in each data item, then softmax is best, since you’re helping the model by giving it one less thing to learn.
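To make that concrete, a tiny illustration with made-up logits:

```python
import torch

logits = torch.tensor([2.0, -1.0])     # scores for two classes

# softmax: probabilities sum to 1, so the model is forced to pick exactly one class
print(logits.softmax(dim=0))           # tensor([0.9526, 0.0474])

# independent sigmoids: each class gets its own yes/no, so "neither" or "both"
# are representable -- what you want for multi-label (or binary with one logit)
print(torch.sigmoid(logits))           # tensor([0.8808, 0.2689])
```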

nano doesn’t really do enough to be useful. I wouldn’t suggest spending time learning it. Use vim or emacs. Emacs is a little easier to get started with, although vim is better for manipulating datasets (there are emacs extensions to help there, though).

1 Like

Yes that’s what I was using. It’s pretty basic but it’s ok.

It’s negligible. But you can check for yourself - use %timeit to see how long an if statement takes in python. Then compare that to the number of batches we do to train a model, and see what you think.
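For example, in a notebook cell (timings vary by machine, but a branch like this is on the order of tens of nanoseconds):

```python
# In an IPython/Jupyter cell -- time a trivial branch, then weigh it against
# the number of batches in a training run:
x = 1
%timeit if x > 0: pass
```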

1 Like

I’ll make a post on that tomorrow.

2 Likes

Other resources to help with understanding convolutions and building them from scratch:

3 Likes

As Jeremy said in lesson 8, it’s supposed to keep us busy until the next course.

1 Like

Is there a specific reason why we continue using standard deviation in our convnet model, after Jeremy explains that mean absolute deviation is often better? Or does this statement not apply at all to Batchnorm etc. somehow? (Maybe that is one of those “try blah and see” experiments? :wink: )

Because that’s what everyone has always done, so I made something I knew would work to show in class. :slight_smile: It would be interesting to try abs instead. My guess is it would work about equally well. Let me know what you find if you try it!
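For anyone who wants to run that experiment, a minimal sketch of the two normalizations side by side (function names are mine, not from the notebooks; swapping the abs version into the conv net's normalization layers would be the real test):

```python
import torch

def norm_std(x, eps=1e-5):
    return (x - x.mean()) / (x.std() + eps)

def norm_mad(x, eps=1e-5):
    # normalize by mean absolute deviation instead of standard deviation
    mad = (x - x.mean()).abs().mean()
    return (x - x.mean()) / (mad + eps)

x = torch.randn(64, 32) * 3 + 1
print(norm_std(x).std())   # ~1.0
print(norm_mad(x).std())   # ~1.25 -- for Gaussian data, MAD is about 0.8 * std
```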

4 Likes

Why is the Runner not part of the Learner? If the intention is to keep the Learner free of code, then why not just have the Runner as a member of the Learner, like model, data, and loss? It seems weird to call runner.fit(1, learn) rather than just learn.fit(1).

Look at 09b :wink:
We thought of it later, but the Runner will ultimately be merged into the Learner.
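Roughly the shape that merge takes - a stripped-down sketch with no callbacks or validation, assuming a `data.train_dl` attribute, and not the actual notebook code:

```python
class Learner():
    def __init__(self, model, opt, loss_func, data):
        self.model, self.opt, self.loss_func, self.data = model, opt, loss_func, data

    def fit(self, epochs):
        # the old Runner.fit loop folded in, so the call site becomes
        # learn.fit(1) instead of run.fit(1, learn)
        for _ in range(epochs):
            self.model.train()
            for xb, yb in self.data.train_dl:
                loss = self.loss_func(self.model(xb), yb)
                loss.backward()
                self.opt.step()
                self.opt.zero_grad()
```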

5 Likes

and you can write a few more great blogs to make it easier for others to understand :slight_smile:

1 Like

I wasn’t clear in my question, but what if we are using a non-convnet architecture? Would there be a channel dimension then?

Just to add one more: https://www.coursera.org/learn/convolutional-neural-networks/home/week/1

Yes. :slight_smile:
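That is, the data still arrives with a channel dimension; a non-conv (fully connected) net typically just flattens it away. Illustrative shapes:

```python
import torch

xb = torch.randn(64, 1, 28, 28)   # (batch, channels, height, width) -- channel dim is still there
flat = xb.view(64, -1)            # (64, 784): what a fully connected net would consume
```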

If you don’t feel ready to take on Vim or Emacs yet, I suggest you download VS Code as an intermediate step. As Jeremy mentioned, you can hover over or right-click on, say, an object inherited by a class, and it will take you to the source code.

Note that it’s important to be in the correct Python environment for VS Code: you have to select an interpreter (see instructions here). Folders are also important. I can show you what I know so far - text me some times when you’re available.


I liked Jeremy’s mantra from lesson 10:

“activations are things we calculate
parameters are things we learn”

(corrected - thanks @Kaspar)
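In PyTorch terms (a throwaway example, not from the lesson):

```python
import torch
from torch import nn

lin = nn.Linear(4, 2)
x = torch.randn(3, 4)

# parameters: things we learn (the optimizer updates these)
print([p.shape for p in lin.parameters()])   # [torch.Size([2, 4]), torch.Size([2])]

# activations: things we calculate (they depend on the current input)
a = lin(x)
print(a.shape)                               # torch.Size([3, 2])
```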

2 Likes

We are going to learn about audio… How about Jeremy showing how to win this competition :smile:

7 Likes

It will probably be a good homework exercise after the next lesson :slight_smile:

2 Likes