Update on fastai2 progress and next steps

Yes I feel the same way. It shouldn’t be a big job to add either AFAICT.

2 Likes

Hi Jeremy -
I can’t tell you how much I look forward to the new course and your book (which I just pre-ordered)! I am especially sensitive to the need for a course that will work on a Windows platform using Colab Jupyter notebooks. Although I thoroughly enjoyed V3 as far as I could get, I had to abandon it because the notebooks just didn’t work in that environment. I also attempted to run the exercises at http://dev.fast.ai/ and couldn’t get even the first one to work.
I’m happy to pitch in and give feedback to help in this process. How can I help?

1 Like

I’d look at Walk with fastai2 if Colab is a requirement; it’s geared around v2 and takes a Colab-specific outlook. That said, things should get better now that WSL2 supports GPUs, so Windows users with a decent GPU can follow along with the course.

Perhaps, since so many people use Colab, a big * should be put in front of the course bits that need widgets. There are also workarounds for getting a native Jupyter server running inside the Colab environment specifically (so it behaves just like any other server).

(See here: Platform: Colab ✅)

These should alleviate some of the frustration

Otherwise (Jeremy may add to this too), knowing where and how things break would help us sort out how to fix them :slight_smile: (if you can compile a list), though IIRC there is no Windows-specific support for fastai2 at the moment.

2 Likes

Yes, I will start contributing the where-and-how descriptions. I hate writing those kinds of posts because they feel like whining about something I should be (and am) really appreciative of.

4 Likes

Please search the forums for fastai v2 on Windows. I can say it currently works without any problems (I have been able to run it from the very beginning, with only a few problems caused by PyTorch versions and fastai v2 on Windows). For the Colab part, I suggest you look at muellerzr’s code examples and videos.

4 Likes

Given that I had already looked into it (about 3 months ago) and there seems to be tremendous interest in fastai TPU support, I can restart these efforts in the coming weeks then. I had run out of GCP credits but I should be able to get access soon.

Here is the repository with my (messy) work. I will update this repository with my progress as I dive back into this project during these upcoming weeks (just waiting to get funding for GCP use).

4 Likes

I have been working on an anonymous form for my study group, and just thought something similar might be useful if we wanted to identify where people are getting stuck in contributing. It’s not the cleanest data, but it can offer a different perspective sometimes.

2 Likes

TL;DR: great explanation, Jeremy + more docs + easier contribution + make fastai able to use new SOTA PyTorch stuff + make reusing bits of fastai easier + make muellerzr’s course the “official tutorial” until your course is out.

I was worried about fastai2’s direction and future. Thanks, Jeremy, for the clarifying post :slight_smile: I’m relatively new to fastai (I started your course just 1 year ago) but I’ve been using fastai2 for 6-7 months. Also, I’m not a big contributor (one PR that isn’t merged :sweat: plus a couple of reported bugs).

First of all, having a BDFL can be a good thing; Python and Blender are good examples. However, the community has to understand core decisions. If not, it’s very difficult to stay engaged in development when you feel your voice will be ignored. Also, I know you already said it, but I can’t stress enough how important clear guides on how to contribute to fastai are.

I agree with wgpubs on all points except coding style. I like the fastai style. For me, the bigger problems are the quirks in fastai2/fastcore (sometimes quite obscure) and the steep learning curve. The latter can be partially solved with clear documentation. Regarding the quirks: sometimes the fastai internals lean so heavily on fastai-specific style that only Sylvain or you really control them. For example, L lists are :top: but I don’t understand why we have a Tuple class. There is also a lot of dark magic and hidden assumptions that make it difficult to adapt fastai to your domain; see the case of object detection / instance segmentation.
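For context on why I like L so much, here is a minimal sketch (assuming a recent fastcore; the methods shown are just the ones I use most, not the full API):

```python
# A tiny tour of fastcore's L: list-like, but with mapping, filtering
# and fancy indexing built in. Only the most common methods are shown.
from fastcore.foundation import L

t = L(range(10))                          # L of 0..9, prints with its length
evens = t.filter(lambda o: o % 2 == 0)    # L(0, 2, 4, 6, 8)
doubled = evens.map(lambda o: o * 2)      # L(0, 4, 8, 12, 16)
picked = t[[1, 3, 5]]                     # index with a list of positions
print(doubled, picked)
```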

Make muellerzr’s courses first-class citizens (pinned to the top of the fastai-v2 category) for the time being, until the official courses are public. Almost all common problems are resolved by him :smiley:. Right now, I think they are the best courses to start learning fastai2 with.

Here are my $0.02 about other things:

  • fastai2’s extensibility is amazing IF it fits your problem: for example, adding things to the training loop (TPUs included), new optimizers, etc. However, if the library wasn’t thought out for your problem, it’s a nightmare because of the internals. Object detection, instance segmentation, etc. are the best examples.
  • One or two people can only do so much. I feel that fastai could become a reference in areas like tabular data, audio and time series. However, these aren’t first-class citizens, even though there are great people working on them. Why? I suppose they fall outside the initial scope of fastai, even though those people are trying hard to make AI approachable.
  • TL;DR: the fancy new PyTorch stuff developed by others should work without modifications. As others have said, fastai does things in its own style, and usually I think that’s better than the standard way. For me, the best example is the optimizers: I could implement QHM by just writing the new step proposed in the paper, without redoing/copying all the SGD boilerplate (see the sketch after this list). However, this comes with a penalty: you can’t reuse PyTorch stuff :confused:, so fastai has to keep catching up with SOTA advances that are implemented in PyTorch but not in fastai. I suggest making wrappers for the common PyTorch pieces like dataloaders, optimizers and loss functions to avoid this.
  • Increase fastai usage outside fastai. Make it terribly simple to integrate bits of fastai into, or alongside, other libraries: for example, fastai DataLoaders + the PyTorch Lightning trainer + fastai plotting capabilities. If this is going to be improved, perfect!
  • New contributors: fastai dark magic + too many custom classes + hidden assumptions + setting things at runtime instead of as class/instance attributes make it difficult to contribute. Documentation will help, but I feel that sometimes the code itself should be easier to understand. Knowing fastcore (typedispatch, pipelines, etc.) plus jumping to the class definition should be enough; if it isn’t, the fastai internals are a dead wall: you can build on top, but you can’t modify/fix fastai.
    For the last two months I was working on an object detection problem. Inspired by muellerzr’s course, I gave it a shot. Soon enough I realized (from reading forum posts) that fastai wasn’t designed for this, but I used Pipeline (I love it!) + transforms for preprocessing, cleaning, … the data. I faced so many bugs that it was discouraging. Using the standard approach (dataloader + transforms + learner) there were no bugs, but using only the transforms, things didn’t work. Why? Because the bbox transforms assume things like passing your TensorBBox through PointScaler, some transforms lost the img_sz metadata or modified it in place, some bugs only appear when you use a transform in isolation but not inside a DataBlock’s transforms, etc. I wanted to fix them but I didn’t know how. Most of them I couldn’t track down, and setting things at runtime didn’t help. For example, where the hell does bbox.get_meta('img_sz') come from???
  • Small contributors: small contributors can fix bugs and add small features/QoL improvements, taking some of the burden off the core devs. They won’t push the project forward by themselves, but they handle the low-hanging features/bugs so core devs can focus on harder/bigger things. However, to attract them, the library needs an easier path to contributing: when I hit a bug, it should be easy to fix it (if you know fastcore) and make a PR.
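To make the optimizer point concrete, here is a rough sketch of how a QHM update could be composed with fastai2’s Optimizer by writing only the new step. The callback convention (stat callbacks return a dict of per-parameter state, steppers mutate p in place) is how I understand fastai.optimizer to work; qh_avg_grad, qhm_step and qh_nu are my own illustrative names, not fastai API, so treat this as a sketch rather than a tested recipe:

```python
# Rough sketch of a QHM optimizer built from fastai2's Optimizer by only
# writing the new update rule. Assumes the stat/stepper callback convention
# described above; names are illustrative.
from functools import partial
import torch
from fastai.optimizer import Optimizer

def qh_avg_grad(p, mom, grad_avg=None, **kwargs):
    "Keep an exponential moving average of gradients as per-parameter state."
    if grad_avg is None: grad_avg = torch.zeros_like(p.grad.data)
    grad_avg.mul_(mom).add_(p.grad.data, alpha=1 - mom)
    return {'grad_avg': grad_avg}

def qhm_step(p, lr, qh_nu, grad_avg, **kwargs):
    "QHM: p <- p - lr * ((1 - nu) * grad + nu * grad_avg)"
    p.data.add_(p.grad.data, alpha=-lr * (1 - qh_nu))
    p.data.add_(grad_avg,    alpha=-lr * qh_nu)

QHM = partial(Optimizer, cbs=[qh_avg_grad, qhm_step], mom=0.999, qh_nu=0.7)
# learn = Learner(dls, model, opt_func=QHM)  # usage, if the conventions above hold
```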

Sorry for the long post :wink:

12 Likes

That’s a good point; maybe it’s worth keeping the forums as the main knowledge base. I do find it invaluable when a year-old solution turns out to solve my problem.

Not sure if it’s workable in practice, but GitHub tags could be limited to necessary bug fixes/feature suggestions, while the forums could remain a help centre for user issues (maybe with some additional help-related tags…)

1 Like

Unless @jeremy says otherwise, I have the WWF2 vision thread pinned until course release (and the tabular one, soon the text one too since it will be running as well), as I think you’re right that this is a good idea.

5 Likes

Hi,
I really appreciate the work of @jeremy and all of fastai. It has been an amazing journey, but I am very curious about the future of the ML course, which has stayed on v0.7. I don’t see it being updated; I tried to use it a few times and encountered lots of problems (on my laptop, AWS, Paperspace, etc.), so I finally gave up. I don’t have any other comments about the ML, DL1 and DL2 courses and how they are taught, other than that I loved them when I did them a year or so ago.
Looking forward to fastai2 and the book with impatience!

Yes, the forums are great for discussing issues, especially when the user doesn’t know whether an error is the result of a bug or of doing something wrong. I have found multiple bugs without knowing they were bugs in the first place; I would discuss them on the fastai-v2 chat and Sylvain would fix them immediately.

Hi Jeremy, thanks a lot for all the wonderful work you’re doing.

One question, will you continue to use Swift for Tensorflow, or more specifically continue to work on SwiftAI, or is that now paused?

3 Likes

I can give my 2 cents from my experience using fastai. The documentation makes accessing the high-level API very easy, but it doesn’t help much with the middle and lower layers. @jeremy has already mentioned the addition of Kaggle datasets, training loops, etc., but I think the documentation should do a better job of explaining the layered API explicitly.

1 Like

Well, just to add to “good first issue” (though maybe not needed): I just found https://goodfirstissue.dev/language/python — how do we add fastai there, so it can be listed? :slight_smile:

Is the new course only the deep learning courses, or will there be a new “Intro To Machine Learning” course as well?

Also - wholeheartedly agree with everyone here. This library, courses, etc. have been awesome.

I personally like how the library is coded. I get why universal coding standards are useful in a large organization, for consistency across many teams from many countries, in large projects with significant turnover, new hires, internal transfers, etc. That said, that isn’t really the situation with this library, so many of the benefits of a universal coding standard don’t apply. I am finding it super easy to dive in with how it is coded.

2 Likes

My understanding is that there will be no “Intro to Machine Learning” course, because this course covers many introductory ML concepts like linear regression, decision trees/random forests, etc.

1 Like

One can’t ask for a better assurance of quality and continuity than this from an open-source project’s BDFL!

Thank you @jeremy.

4 Likes

Sylvain and I aren’t working on that at the moment. Hopefully the s4tf team will keep working on it though!

This promise of continuous development is the most important thing, I think: as a contributor, you can keep contributing without worrying that your efforts will be wasted, and as a user, you can believe that the problems we have now will eventually be solved. :pray::heart::pray::heart:

IMHO, a design that will definitely work is to have docs like PyTorch’s, where every function has a little usage example below the listed args explanation. It is clean and useful. :smiling_face_with_three_hearts: And it is friendly to PyTorch users, so we might pick up more users from there. :kissing_smiling_eyes: Maybe we could just mimic the PyTorch docs first and then think about what we can improve beyond them. :thinking:

Seconding this. Especially the naming conventions: I still don’t know what o, _o, or ctx are :crazy_face:, even though I hack a lot on Datasets and show_batch.

I think I can conclude the discussion about this with something we can all agree on, at least: “Jeremy as BDFL, and we need more Sylvains.” :joy:

Exactly what I think about fastai. When things don’t fit, I can hack around it thanks to the high degree of modularity, and submit a PR where I’d like a change. :sunglasses: But sometimes I can only get messy workarounds after a lot of effort; mostly those problems reside in the DataBlock / Transform / Datasets layer… :weary:

AFAIK, most of the loss functions are wrapped versions of the PyTorch ones in layers.py. And you can also pass torch.utils.data.DataLoader instances to DataLoaders, so you can fall back to PyTorch’s dataloader 😘
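To make the fallback concrete, here is a minimal sketch, assuming DataLoaders accepts the loaders positionally as described above (the tensors and batch sizes are made up for illustration, and device placement / show_batch may need extra care with raw torch DataLoaders):

```python
# Wrap plain PyTorch DataLoaders so they can be used with a fastai Learner.
import torch
from torch.utils.data import TensorDataset, DataLoader
from fastai.data.core import DataLoaders

x, y = torch.randn(100, 10), torch.randint(0, 2, (100,))
train_dl = DataLoader(TensorDataset(x[:80], y[:80]), batch_size=16, shuffle=True)
valid_dl = DataLoader(TensorDataset(x[80:], y[80:]), batch_size=32)

dls = DataLoaders(train_dl, valid_dl)     # fastai wrapper around the two loaders
# learn = Learner(dls, model, loss_func=torch.nn.functional.cross_entropy)
```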