Plain PyTorch implementation of fast.ai notebooks

Very interesting topic! Count me in. I tried to replicate the tabular module in pure Python: reverse tabular module. I think this module is currently less challenging than the others (vision, text). Now I am trying to dig deeper into vision and the optimizer.

@srmsoumya The fast.ai library is quite modular, and you can use any piece of its code with your custom models. Jeremy will definitely explain some of the internals, and would encourage us to understand the structure, choose a problem (classification, recommendation, LM, sentiment analysis, etc.), and solve it with the fast.ai library.

I like this idea. It’s already been done by Jeremy: basically the whole documentation at docs.fast.ai is generated from notebooks.
Every function and method is there. Have a look.

Glad you found it useful. When I was initially writing it, I really didn’t know where I was going, but I definitely learned a lot about where the loss function is determined in the new fastai library. I usually start a new post whenever I find something I don’t have a good grasp on; if I am able to solve it or can ask a well-thought-out question, I post it.

I usually learn more when I am explaining what I have learned. I have also searched for errors in the forums and ended up on posts that I had started and forgotten about, so these posts can save your future self some headaches too.

I implemented SGDR and Snapshot Ensembling in PyTorch while working on one of my personal projects.
You can find the code here: https://github.com/jayeshsaita/Speech-Commands-Recognition
The code is documented and easy to understand.
Also, if anyone is interested in the project and wants to know more, I have written a blog post about it. You can find it here: https://towardsdatascience.com/ok-google-how-to-do-speech-recognition-f77b5d7cbe0b
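
For readers unfamiliar with the two techniques, here is a minimal pure-Python sketch of the ideas. It is not the code from the repository above. SGDR anneals the learning rate along a cosine curve and resets it at the start of each cycle; snapshot ensembling keeps a copy of the model from the end of each cycle and averages the snapshots' predictions. The names sgdr_lr and average_snapshots, and all default values, are made up for illustration.

```python
import math


def sgdr_lr(step, lr_max=0.1, lr_min=0.0, cycle_len=10, cycle_mult=1):
    """Cosine-annealed learning rate with warm restarts (SGDR).

    `step` counts from 0. Each cycle lasts `cycle_len` steps and the
    length is multiplied by `cycle_mult` (an integer >= 1) after every
    restart. Returns (lr, is_cycle_end); the flag marks the last step
    of a cycle, which is when a snapshot would be saved.
    """
    t_cur, t_i = step, cycle_len
    # Walk forward through completed cycles to find the current one.
    while t_cur >= t_i:
        t_cur -= t_i
        t_i *= cycle_mult
    cos_term = 1 + math.cos(math.pi * t_cur / t_i)
    lr = lr_min + 0.5 * (lr_max - lr_min) * cos_term
    return lr, t_cur == t_i - 1


def average_snapshots(snapshot_probs):
    """Average per-class probabilities predicted by several snapshots."""
    n = len(snapshot_probs)
    return [sum(col) / n for col in zip(*snapshot_probs)]


if __name__ == "__main__":
    for step in range(0, 21, 5):
        lr, is_end = sgdr_lr(step)
        print(step, round(lr, 4), is_end)
    print(average_snapshots([[0.2, 0.8], [0.4, 0.6]]))
```

In a real training loop the learning rate would be written into the optimizer at every step, and the model weights would be saved whenever the flag is true.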

I’m doing the same thing myself on different parts of the library. Like @noskill said, the official docs already have all the notebooks created, and there are over 30 of them. I think one way to help with the docs at this point is to go through the notebooks and look for errors (I’m sure there will be some).

I think at this point re-implementing in PyTorch is mostly for our own understanding, and it’s an absolutely necessary step if we want to get a solid grasp of these tools. One can always dig deeper than PyTorch if re-writing in PyTorch is not enough :wink: If we find something that’s missing from the library, we’ll do pull requests to add/update.

What do you think?

Great discussion! Here are some ways that you can learn a lot about the library, whilst also contributing to the community:

  • Pick a class, function, or method and write tests for it. For instance, here are the tests for fastai.core. Adding tests for anything without good test coverage is a great way to really understand that part of the library deeply, and to have in-depth conversations with the dev team about the reasoning behind decisions in the code.
  • Document something that is currently undocumented. You can find such items by looking for the “new methods” section in any doc notebook. Here’s a search that lists them.
  • Add an example of use to the docs for something that doesn’t currently have one. We’d like everything in the docs to soon include an actual piece of working code demonstrating it. Currently we’ve largely only provided working examples for things higher up the abstraction ladder.

Agreed, good test coverage and thorough documentation would be really helpful. I guess the library will get a lot of contributors during the course.

Yeah! Totally agree on this.

I would love to write tests for the library. However, I have no experience with it :smiley:. I will search, but I would appreciate it if someone could point me to some useful resources.

Then, when I finish, I do a PR on GitHub? I have no experience doing a PR either, so I’m sorry in advance if I bother you too much @jeremy.

Thank you

That’s great! Here’s a starting point for you:

https://docs-dev.fast.ai/test.html

Easiest is to read the source for some existing tests, and play around with them to see how they work. And read the pytest docs or a tutorial.
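
To give a feel for what a pytest test looks like: pytest collects functions whose names start with test_ from files named test_*.py, and a plain assert statement is all a check needs. The sketch below is generic and is not taken from the fastai test suite; the helper normalize is hypothetical and exists only so there is something to test.

```python
def normalize(xs):
    """Hypothetical helper: shift and scale values to zero mean, unit std."""
    n = len(xs)
    mean = sum(xs) / n
    std = (sum((x - mean) ** 2 for x in xs) / n) ** 0.5
    return [(x - mean) / std for x in xs]


def test_normalize_has_zero_mean_and_unit_std():
    out = normalize([1.0, 2.0, 3.0, 4.0])
    mean = sum(out) / len(out)
    std = (sum((x - mean) ** 2 for x in out) / len(out)) ** 0.5
    # Compare floats with a tolerance rather than exact equality.
    assert abs(mean) < 1e-6
    assert abs(std - 1.0) < 1e-6
```

Saved as, say, test_normalize.py, this can be run with `pytest test_normalize.py`; a failing assert is reported together with the values involved.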

That’s right. It’s wonderfully easy if you install hub:

https://hub.github.com/

And it’s certainly no bother to help folks wanting to help me! :smiley:

I’ve moved this to the ‘advanced’ category.

I have no experience doing any of this but I’m willing to roll my sleeves up and learn while helping!!

That’s an awesome attitude! Just yell if you need any help. :slight_smile:

@shoof @jayeshsaita @cedric @devforfu I really like the idea of moving some of the key parts of the fastai library into plain PyTorch, and, even more interestingly, extending some of the functionality. We could create both fastai-independent and dependent versions. If we enhance something sufficiently, we should create a pull request with a dependent version. I’m definitely up to help with a few key parts of this. If we get at least two people to volunteer, then we can split up responsibilities and check in on progress every now and then.

That’s an interesting idea. However, I propose that we first make sure we really understand how the library works: helping with tests, digging into PyTorch, building custom torch.utils.data.Dataset classes, applications, meaningful PRs, etc. Then, I believe, we’ll have enough expertise to make our own forks.

Of course, this is only my point of view, based on some experience with building training loops and data loaders. Others are probably already quite fluent with fastai and PyTorch. Anyway, I would be glad to participate in fastai development, its forks, or inspired-by libraries.

Also this way you’ll get a lot of help from me and the team here! :slight_smile: We’ll review your code, make suggestions, and give you support if you get stuck.

Yes, agreed, that’s the main benefit I would say! :smile: I am sure that the fastai team would be a great help for anyone who wants to build something within the fastai and PyTorch ecosystem.

I want to learn the internals of this version of fastai.
I went through the fastai_V1 code internals and tried implementing the learner and imagedataloader classes in PyTorch/Python.
I want to contribute to this initiative and be part of code reviews and idea discussions.

Hi @jeremy, thanks for the very practical advice. It is very useful for beginners like me. I followed all the instructions to set up the developer install and ran the pytest command. It looks like test_image_data in tests/test_vision_train.py is failing.

Here is what I get:

After a little experimenting in the notebook, I think the line assert abs(d.mean()-0.2)<0.1 is the reason the test is failing. So I tried to check the real mean of the first images in the dataset. See the image below, where I try to replicate the test case.

In particular, the output of cell 31 suggests that the mean of the image minus 0.2 is 0.1181, whereas the test asserts it to be less than 0.1.

I think this may be the reason it is failing.

*This is my first time messing around with testing and contributing to an open source library. A little guidance for the first time would be much appreciated. In case this is not really an issue, many apologies.

@jeremy A small update: if I change the assert condition mentioned above to assert abs(d.mean()-0.2) < 0.12, all the tests pass. Why 0.12? Because the mean of the image minus 0.2 is 0.1181. Image below of all the tests passing.
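
The arithmetic behind the two thresholds can be sketched in a few lines. The reported gap between the mean and 0.2 is 0.1181; the post does not say on which side of 0.2 the mean falls, so the value 0.3181 below is an assumption chosen only to reproduce that gap, and within_tolerance is a hypothetical helper rather than anything from the fastai tests.

```python
def within_tolerance(value, target, tol):
    """True when `value` lies strictly within `tol` of `target`."""
    return abs(value - target) < tol


# Assumed mean, picked so that the gap to 0.2 is about 0.1181.
observed_mean = 0.3181

print(within_tolerance(observed_mean, 0.2, 0.10))  # original threshold
print(within_tolerance(observed_mean, 0.2, 0.12))  # loosened threshold
```

The first check fails and the second passes, which matches the behaviour described above. Whether 0.12 is the right threshold is a separate question, since widening a tolerance until a test passes can hide whatever caused the mean to shift.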
