Lesson 8 (2019) discussion & wiki

Thanks. I think they will matter. This link has some more ideas on the same topic. I suggest reading it and all the associated links.

However, the best way to double check is to carry out your own experiments with your data.

Hope this helps.

I am so happy you shared this tool!!! Soooo convenient and works great. Thank you :slight_smile:

I finally worked through broadcasting by hand. The key to understanding it is knowing along which dimension to expand the row you are broadcasting and along which dimension to sum the final result.

In regular matmul, after multiplying rows by columns we sum across columns (dim=1). In broadcasting, we expand the row with a trailing column axis (dim=-1) and sum across rows (dim=0). Both seem counterintuitive at first, but they make sense because the result has to come out as a single row of the output. Here are the steps if you do it by hand:

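  1. Expand row i along the column dimension, i.e. a[i,:,None] (the same as a[i].unsqueeze(-1)), so that shape (n,) becomes (n,1)
  2. Do an element-wise (broadcast) multiplication between the expanded row i and matrix b
  3. Sum on dim=0 (row axis)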

Efficiency comes from getting rid of the for-loop and doing all three steps in one go.
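The three steps above can be sketched roughly as follows. This is a minimal illustration written with NumPy rather than PyTorch (an assumption made so that it runs without a GPU library; the indexing `a[i, :, None]` behaves the same way in PyTorch, where `axis=` is spelled `dim=`). The function name `matmul_broadcast` and the sample matrices are made up for the example.

```python
import numpy as np

def matmul_broadcast(a, b):
    """Matrix multiply one output row at a time using broadcasting."""
    m, n = a.shape
    n2, p = b.shape
    assert n == n2, "inner dimensions must match"
    c = np.zeros((m, p))
    for i in range(m):
        row = a[i, :, None]       # step 1: shape (n,) -> (n, 1)
        prod = row * b            # step 2: broadcast against (n, p), element-wise
        c[i] = prod.sum(axis=0)   # step 3: sum over the row axis -> shape (p,)
    return c

# Hypothetical sample data, only for checking the result
a = np.arange(6.0).reshape(2, 3)
b = np.arange(12.0).reshape(3, 4)
c = matmul_broadcast(a, b)

# The same idea with the remaining loop removed as well:
# (m, n, 1) * (1, n, p) -> (m, n, p), then sum over the shared axis n
c_no_loop = (a[:, :, None] * b[None, :, :]).sum(axis=1)
```

Both `c` and `c_no_loop` should agree with `a @ b`. The no-loop variant builds an intermediate array of shape (m, n, p), so it trades memory for the removed loop.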

TLDR: use test-driven development and your chances of being frustrated and stuck with a hard-to-understand bug will go down dramatically.

I want to share my experience with reproducing the notebook for lesson 8. When I got to the step with a class-based 2-layer model with manual backpropagation, I couldn’t make it work. The gradients did not match the ones calculated by PyTorch’s autograd. I even copied Jeremy’s code verbatim into my notebook, and it still didn’t work. I think I spent 2-3 hours trying to understand the source of the problem. I grew increasingly frustrated, and I made myself stop fighting it.

The next day I decided to take a more careful approach. I remembered all the articles that say how good test-driven development is and how much easier it makes developing complex applications. So I fired up my code editor (I use PyCharm and Visual Studio Code, but this time it was PyCharm), set up a new project using Cookiecutter, and started developing the model from lesson 8 by first writing a test and then writing code to make it pass (I used pytest as the testing framework). I did this one class at a time, for Lin, ReLU, MSE and, finally, Model. For each one, I separately tested initialization (where applicable), the forward pass and then the backward pass. Slowly but surely all my tests were passing, and by the end of the day I had it all working. Now I can import all these classes into a Jupyter notebook and use them, which is fantastic! Moreover, I am also confident that they all work as expected.
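To give a flavour of what such tests might look like, here is a minimal sketch. It is not the code from my project or from the lesson notebook: the `Lin` class below is a simplified NumPy version that returns gradients explicitly, and the backward pass is checked against a finite-difference gradient instead of PyTorch’s autograd. The helper names (`numeric_grad`, `make_layer`) are made up for the example.

```python
import numpy as np

class Lin:
    """Linear layer with a hand-written backward pass (simplified sketch)."""
    def __init__(self, w, b):
        self.w, self.b = w, b

    def __call__(self, inp):
        self.inp = inp
        return inp @ self.w + self.b

    def backward(self, out_grad):
        # Gradients of the loss w.r.t. weights, bias and input
        self.w_grad = self.inp.T @ out_grad
        self.b_grad = out_grad.sum(axis=0)
        return out_grad @ self.w.T


def numeric_grad(f, x, eps=1e-6):
    """Central finite-difference gradient of scalar f() w.r.t. array x.

    x is perturbed in place, so f must read the same array object.
    """
    grad = np.zeros_like(x)
    for idx in np.ndindex(*x.shape):
        old = x[idx]
        x[idx] = old + eps
        up = f()
        x[idx] = old - eps
        down = f()
        x[idx] = old
        grad[idx] = (up - down) / (2 * eps)
    return grad


def make_layer():
    rng = np.random.default_rng(0)
    x = rng.normal(size=(5, 3))
    lin = Lin(rng.normal(size=(3, 2)), rng.normal(size=2))
    return x, lin


def test_forward_shape():
    x, lin = make_layer()
    assert lin(x).shape == (5, 2)


def test_backward_matches_numeric_gradient():
    x, lin = make_layer()
    loss = lambda: (lin(x) ** 2).sum()   # toy loss: sum of squared outputs
    out = lin(x)
    inp_grad = lin.backward(2 * out)     # d(loss)/d(out) = 2 * out
    assert np.allclose(lin.w_grad, numeric_grad(loss, lin.w), atol=1e-4)
    assert np.allclose(lin.b_grad, numeric_grad(loss, lin.b), atol=1e-4)
    assert np.allclose(inp_grad, numeric_grad(loss, x), atol=1e-4)
```

Saved in a file such as `test_lin.py` (a hypothetical name), these functions would be collected and run by `pytest`. Keeping one small test per layer and per pass makes it obvious which piece is wrong when a gradient does not match.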

Now I am coming to appreciate TDD, and I think I will stick with it until the end of the course, mainly because the software we are building in these lessons will only be more complicated.

P.S. Please share your experiences with TDD. Do you use it? If so, what advice can you give for testing deep learning code specifically? Maybe you have some tricks related to testing?

Definitely an original approach, I would say :). Any chance you could share your code? I think it could be interesting to see how you did it. I like TDD for web development when I know exactly what to build, but I haven’t used it yet for anything related to deep learning.

And if you’re feeling like doing some TDD, fastai needs your help to write tests. So pick an area/API you’d like to learn and write tests to deepen your understanding and then contribute those tests to the fastai test suite! Please see: Improving/Expanding Functional Tests, but instead of trying to write tests based on fastai “needs” as that thread indicates, write tests that meet your needs.

And the best way to learn how to write tests for DL is to study the existing fastai tests. Plus, you will find a lot of tips here: https://docs.fast.ai/dev/test.html.

I feel the slides would look more coherent with a single color theme, so I created one based on the fastai logo. You can download the slide deck and take the theme from it.

ppt link: https://drive.google.com/file/d/1WrclmCJQo53RjbQ6Ww99ELExmrq97lOw/view?usp=sharing

I created the color theme from this slide (original and recolored versions; images not shown).

That’s very thoughtful of you. I just looked thru the sample deck, and it’s too much green for me!.. :wink:

I tried a more blue-gray color this time. I remember that somewhere on Twitter people asked why there was so much orange-red, and your reply was that it came from the PowerPoint default. The “design idea” feature of PowerPoint is handy, but it is weird that it does not let you edit the background color; the only way is to change the color theme instead.

https://drive.google.com/file/d/1WrclmCJQo53RjbQ6Ww99ELExmrq97lOw/view?usp=sharing

Much better! I’m still not sure whether I’ll actually use it (since I’m not sure I love our logo color :wink: ) but I might give it a go on the next ppt I try.

This video is short, yet it seems to have some useful details regarding autograd’s implementation: https://www.youtube.com/watch?v=MswxJw-8PvE (hope I’m not off topic here)

Appreciate that. :grinning: I didn’t expect that you wouldn’t like the logo, though. How was the logo created in the first place?

About 5 mins before I was to give a presentation I thought I should put a logo on the title slide. So I went to a free logo generator, put something together, and pasted it into my slide.

Then for the Tensorflow Dev Summit they wanted a higher res logo so I spent $10 at Fiverr getting it converted to vector format.

I am a complete neophyte when it comes to software development (which means this part has been a kick in the rear :slight_smile:). Is there any recommended reading you’d highlight on test-driven development? I started googling articles, but wanted to see if you had some “best of” stashed away.

Either way - thanks for this write-up!

I am sure there are some designers or artists in the fastai community. Or you should make one with a GAN. :wink:

I strongly suggest the first part of the book “Clean Architectures in Python” (and it’s free)! You can also read the famous “Obey the Testing Goat” (but it’s very heavily focused around web development and Django, so it might be hard and feel a little irrelevant to a beginner, I felt that way while reading it).

I think my main suggestion would be to start a project and say to yourself that you would try to strictly stick to TDD for this project, and follow through. At first that would be painfully slow but the more you do it the easier it gets. Another benefit, every time you make a test pass, you get a small dopamine release which is a very pleasant feeling, and you will be getting it consistently by the virtue of following TDD.

I exclusively developed with TDD for about 10 years and loved it. However now that I’ve moved to a “jupyter-first” dev approach I do things slightly differently. It still kinda looks like TDD if you squint a bit - but the “tests” look more like experiments and are more iterative, because they’re happening in the notebook as I develop.

I used to write all my tests up-front, but now I iterate the design a lot more in the notebook. I’m finding I’m ending up with APIs I’m happier with this way, and get them working more quickly.

Hello,

What is the rule of thumb for initializing RNN/LSTM/GRU layers? Should it be the same as for Linear/CNN layers, i.e. mean=0 and std=1?

Thank you in advance.

What is a “ranked” tensor?

Is that the number of dimensions of a tensor?

Is the lowest ranked tensor a “rank 2” tensor?

For example: https://youtu.be/4u8FxNEDUeg?t=3631 here the first two tensors are described as “rank 2” and the last one is a “rank 3”.

Also does asking questions here instead of googling it myself help contribute to the forums?

May I suggest a global glossary of terms page for the forums or docs.fast.ai?

The rank signifies how many dimensions a tensor has. A rank 2 tensor, for example, is a traditional matrix (rows by columns). A rank 1 tensor would be a vector, a rank 0 a scalar, etc. Does that help?
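A quick way to check this yourself is sketched below. It uses NumPy arrays as a stand-in for tensors (an assumption for the sake of a dependency-light example; PyTorch tensors expose the same information through `.ndim` and `.dim()`), and the variable names are just illustrative.

```python
import numpy as np

scalar = np.array(3.0)           # rank 0: a single number, shape ()
vector = np.array([1.0, 2.0])    # rank 1: shape (2,)
matrix = np.ones((2, 3))         # rank 2: rows by columns, shape (2, 3)
cube = np.ones((2, 3, 4))        # rank 3: shape (2, 3, 4)

# The rank is the number of axes, i.e. the length of the shape
ranks = [t.ndim for t in (scalar, vector, matrix, cube)]
print(ranks)  # [0, 1, 2, 3]
```

So the lowest rank is 0, not 2; the tensors in the video just happen to start at rank 2.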