For me, reading notes is an important part of the learning experience. In previous courses, students made great summaries of the lectures, which made it much easier to recall things. I want to continue writing these notes in part 2 as well, and I thought it might be a good idea to gather all of them in one place. So I suggest that people share their notes here instead of in the discussion thread.
I love the freedom that writing in markdown provides. It is clean and at the same time allows a lot of complex formatting, including LaTeX equations and images. It will be my default tool for writing my notes during the class. I hope they will be useful for those who want to quickly recap a topic we've seen. I did a review this morning, but it is still a little chaotic, especially for the notes from the Jupyter parts. It is still a work in progress.
@Lankinen: sorry to trouble you, especially after all that hard work, but could you please move your notes somewhere that's not Medium? We don't need lesson notes for the unreleased course to be actually hidden, but I don't want them to be promoted/shared until after the MOOC is out in June. Medium does a lot of cross-promoting, which can cause problems if folks not in the course start seeing them and asking about stuff they don't yet have access to.
GitHub, Google Docs, or threads on the forum are all good options.
There is already a thread for collaborative lesson notes.
It's a wiki as well. It would be great if we could add all the different versions of notes there and collaborate on a really good one, as Jeremy suggested.
@Lankinen FYI you could share notes as an 'unlisted' draft on Medium too. This isn't promoted, and is accessible only to those who have the link to the post. You can later publish it publicly.
Unlisted stories will not appear in the home feed, profile page, tag page, or search on Medium, nor will they appear in notifications or email digests.
To share an unlisted story, simply share the URL of the post. Unlisted stories are not password protected, and anyone who has the link will be able to view the post.
Thank you for pointing that out; I hadn't had a chance to see the thread. Collaborative notes are a great idea, and I'll try to help with those again in this part. From now on I will publish my notes there.
I am looking for help with the markdown used in this forum. I want to write in vim and use pandoc to convert from '.md', but also post here parts of what I create. If I wanted to have a footnote in my post, is there a way to do that here? We have the icons in this reply box, but is there more that can be used? In markdown, [^1] is a footnote, but it does not seem to work here.
In short, is there a cheatsheet for the markdown features used in this forum, as opposed to markdown used elsewhere?
Thanks
From googling, my impression is that Discourse (the forum software) uses CommonMark to implement markdown, so maybe try here? https://commonmark.org/help/
Here's a quite literal transcription of Lesson 8 in a Jupyter Notebook, including slides and code:
I made this transcription as an experiment to see how much I would learn doing it, but at least to me it is a pretty inefficient way to learn - especially when transcribing everything. Next week, I'll just make a summary with the core concepts, so I have more time to experiment and focus on the most important things :).
Hope this is useful to at least some people.
Here are some of my notes for Lesson 8, in case anyone finds them useful. I looked at the list of subjects we covered in part-1, and wrote down some of the ones I hear of a lot and have a 'sort-of' idea about, but would have trouble explaining exactly if asked. It was actually a very good exercise, and clarified a lot of topics. I may add to this if I do more; note-taking order is newest topic on top.
I had difficulty searching for specific topics in the part-1 v3 course; Hiromi's notes were very useful for finding the relevant lecture sections.
Adam builds upon RMSProp, which builds upon momentum.
RMSProp takes the exponentially-weighted moving average (EWMA) of the squared gradients.
the result (this EWMA) will be large if the gradients are volatile or consistently large, and small if they are consistently small.
Adam takes RMSProp and divides the step by the square root of this EWMA of the squared gradients (the squared terms).
so if the gradients are consistently small and non-volatile, the update will be larger.
Adam is an adaptive optimizer.
Adam keeps track of the EWMA of the squared gradients (RMSProp) and the EWMA of the steps (momentum); it divides the EWMA of the previous steps by the square root of the EWMA of the squared terms (gradients), and also uses ratios as in momentum.
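A minimal sketch of those update rules, in case it helps (the variable names, default hyperparameters, and the omission of Adam's bias-correction step are my own choices, not from the lesson notebooks):

```python
# minimal sketch of the RMSProp and Adam update rules described above
def rmsprop_step(p, grad, sqr_avg, lr=1e-3, beta2=0.99, eps=1e-8):
    sqr_avg = beta2 * sqr_avg + (1 - beta2) * grad**2   # EWMA of the squared gradients
    # dividing by its square root gives small steps where gradients have been
    # large or volatile, and larger steps where they have been consistently small
    p = p - lr * grad / (sqr_avg**0.5 + eps)
    return p, sqr_avg

def adam_step(p, grad, step_avg, sqr_avg, lr=1e-3, beta1=0.9, beta2=0.99, eps=1e-8):
    step_avg = beta1 * step_avg + (1 - beta1) * grad      # EWMA of the steps (momentum)
    sqr_avg  = beta2 * sqr_avg  + (1 - beta2) * grad**2   # EWMA of the squared gradients (RMSProp)
    p = p - lr * step_avg / (sqr_avg**0.5 + eps)          # divide momentum term by sqrt of the squared-gradient EWMA
    return p, step_avg, sqr_avg
```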
momentum adds stability to NN training by having the updates move more in the same direction.
this effectively creates an exponentially-weighted moving average, because all previous updates are still inside the previous update, but each gets multiplied by a smaller factor at every step.
momentum is a weighted average of previous updates.
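A matching sketch of plain momentum, mostly to show why it ends up being an exponentially-weighted moving average (names and defaults are mine):

```python
# minimal sketch of SGD with momentum
def momentum_step(p, grad, prev_step, lr=1e-3, beta=0.9):
    # every earlier gradient is still inside prev_step, but it gets multiplied
    # by beta once more each step, so the step is an exponentially-weighted
    # moving average of the previous updates
    step = beta * prev_step + grad
    p = p - lr * step
    return p, step
```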
any function can be approximated arbitrarily closely by a series of matmuls (affine fns) followed by nonlinearities.
this is the entire foundation & framework of deep learning: if you have a big enough matrix to multiply by, or enough of them (a function that is just a sequence of matmuls and nonlinearities can approximate anything), then you just need a way to find the particular values of the weight matrices in your matmuls that solve your problem. We know how to find the values of parameters: gradient descent. So that's actually it.
parameters = SGD(affine-fn → nonlinear-fn)
parameters are the values of the matrices (the NN) that solve your problem via their sequence of affine-fn → nonlinearity.
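A tiny sketch of that building block, just to make the "affine fn followed by a nonlinearity" idea concrete (shapes and names are illustrative only):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.)

def two_layer_net(x, w1, b1, w2, b2):
    h = relu(x @ w1 + b1)   # matmul + bias (affine fn), then a nonlinearity
    return h @ w2 + b2      # final affine fn
```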
an embedding just means to look something up in an array.
multiplying by a one-hot encoded matrix is identical to an array lookup: an embedding uses an array lookup to do a matmul by a one-hot encoded matrix without ever actually creating it.
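A quick sketch of that equivalence (a made-up 5×3 embedding matrix, not from the lesson):

```python
import numpy as np

emb = np.random.randn(5, 3)    # 5 items, 3-dim embedding matrix
idx = 2
one_hot = np.eye(5)[idx]       # one-hot encoded row for item 2
# the matmul by the one-hot row and the plain array lookup give the same result
assert np.allclose(one_hot @ emb, emb[idx])
```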
weight decay: a value scaling the sum of squares of the parameters, which is then added to the loss function.
purpose: penalize complexity. This is done by adding the sum of squared parameter values to the loss function, and tuning this added term by multiplying it by a number: the wd hyperparameter.
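A sketch of that "sum of squared parameters added to the loss" idea (the base loss, parameter list, and wd value here are placeholders; inputs are assumed to be numpy arrays):

```python
import numpy as np

def loss_with_wd(preds, targets, params, wd=1e-2):
    mse = ((preds - targets) ** 2).mean()       # the base loss
    l2  = sum((p ** 2).sum() for p in params)   # sum of squared parameter values
    return mse + wd * l2                        # wd scales the complexity penalty
```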
I wanted to write about lessons on my personal blog (my domain). Is this cool with the fast.ai team?
I won't do full lessons but rather interesting/challenging parts from the lectures (e.g. the matmul section, einsum, etc. from lesson 8). With proper referencing of fast.ai, of course…
In case Jeremy doesn't notice this, I would say that it is OK. At least previously, people have published this kind of independent tutorial without problems, as long as you don't rely too much on the material. I'm very interested in reading these blogs, so I hope you add a link as soon as you publish something.
Please do - but don't link to unreleased materials of course, and try to focus on individual bits that are still useful to people who don't have access to the full course yet.
Would it be fine, though, if I want to write about the PyTorch nn.Module that was explained in the lesson? I found the refactoring part quite neat, and it may be useful for people who do not have a strong software background (like myself).