Update on fastai2 progress and next steps

I’d look at Walk with fastai2 if Colab is a requirement; it’s geared around v2 and takes a Colab-specific outlook. That said, this should get better now that WSL2 supports GPUs, so Windows users with a decent GPU can follow the course along.

Perhaps, since so many people use it, a big * should be put in front of the course bits that need widgets. However, there are also workarounds for getting a native Jupyter server running in the Colab environment specifically (so it then behaves just like any other server).

(See here: Platform: Colab ✅)

These should alleviate some of the frustration.

Otherwise (Jeremy may add to this too), knowing where and how things break would help us sort out how to fix them :slight_smile: (if you can compile a list). Though I believe there is no Windows-specific support for fastai2 at the moment, IIRC.

2 Likes

Yes, I will start contributing to the where-and-how descriptions. I hate doing those kinds of posts because they feel like whining about something I should be (and am) really appreciative of.

4 Likes

Please search the forums for fastai v2 on Windows. I can say it currently works without any problems (I have been able to run it from the very beginning, apart from a few problems with PyTorch versions and fastai v2 on Windows). For the Colab part, I suggest you look at muellerzr’s code examples and videos.

4 Likes

Given that I had already looked into it (about 3 months ago) and there seems to be tremendous interest in fastai TPU support, I can restart these efforts in the coming weeks. I had run out of GCP credits, but I should be able to get access soon.

Here is the repository with my (messy) work. I will update this repository with my progress as I dive back into this project during these upcoming weeks (just waiting to get funding for GCP use).

4 Likes

I have been working on an anonymous form for my study group, and just thought something similar might be useful if we wanted to identify where people are getting stuck in contributing. It’s not the cleanest data, but it can offer a different perspective sometimes.

2 Likes

TL;DR: great explanation, Jeremy + more docs + easier to contribute + make fastai able to use new SOTA PyTorch stuff + make reusing bits of fastai easier + make muellerzr’s course the “official tutorial” until your course is out.

I was worried about fastai2’s direction and future. Thanks, Jeremy, for the clarifying post :slight_smile: I’m relatively new to fastai (I just started 1 year ago with your course), but I’ve been using fastai2 for 6-7 months. Also, I’m not a big contributor (just 1 PR, which isn’t merged :sweat:, plus a couple of reported bugs).

First of all, having a BDFL could be a good thing; Python and Blender are good examples. However, the community has to understand core decisions; otherwise, it’s very difficult to stay engaged in development when you feel your voice will be ignored. Also, I know you already said it, but I can’t stress enough how important clear guides on contributing to fastai are.

I agree with wgpubs on all points except coding style. I like the fastai style. For me, the harder parts are the quirks in fastai2/fastcore (sometimes quite obscure) and the steep learning curve. The latter can be partially solved with clear documentation. Regarding the fastai quirks: sometimes the fastai internals are written in so much fastai style that only Sylvain or you can keep control of them. For example, L lists are :top:, but I don’t understand why we have the Tuple class. Also, there is a lot of dark magic going around, and assumptions that make it difficult to adapt fastai to your domain. See the case of object detection / instance segmentation.
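(For anyone who hasn’t tried L yet, a tiny sketch of why it’s so pleasant; this assumes nothing beyond fastcore being installed:)

```python
from fastcore.foundation import L

xs = L(1, 2, 3, 4)
xs.map(lambda o: o * 2)     # (#4) [2,4,6,8] -- map returns another L
xs.filter(lambda o: o > 2)  # (#2) [3,4]
xs + L(5)                   # concatenation keeps the L type
```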

Make muellerzr’s courses first-class citizens (pinned at the top of the fastai v2 category) for the time being (until the official courses are public). Almost all common problems are resolved by him :smiley:. Right now, I think they are the best courses to start with for fastai2.

Here are my $0.02 about other things:

  • fastai2’s extensibility is amazing IF it fits your problem: for example, adding things to the training loop (TPU included), new optimizers, etc. However, if the library wasn’t thought out for your problem, it’s a nightmare due to the internals. Object detection, instance segmentation, etc. are the best examples.
  • One or two people can only do so much. I feel that fastai could become a reference in some areas like tabular data, audio, and time series. However, these aren’t first-class citizens, although there are great people working on them. Why? I suppose they fall outside the initial scope of fastai, even though they try hard to make AI approachable.
  • TL;DR: new fancy PyTorch stuff developed by others must work without modifications. As others have said, fastai tries to do things in its own style, and usually I think that’s better than the standard way. For me, the best example is the optimizers: I could implement QHM by just writing the new step proposed in the paper, without redoing/copying all the SGD stuff (see the optimizer sketch after this list). However, this comes with a penalty: you can’t reuse PyTorch stuff :confused:. So fastai will always be catching up with SOTA advances that are implemented in PyTorch but not in fastai. I suggest making wrappers for the common PyTorch things like dataloaders, optimizers, and loss functions to avoid this.
  • Increase fastai usage outside fastai. Make it terribly simple to integrate bits of fastai into, or alongside, other libraries: for example, fastai DataLoaders + the PyTorch Lightning trainer + fastai plotting capabilities. If this gets improved, perfect!
  • New contributors: fastai dark magic + too many custom classes + assumptions + setting things at runtime instead of as class/instance attributes all make it difficult to contribute. Documentation will help, but I feel that sometimes the code itself should be easier to understand. Knowing fastcore (typedispatch, pipelines, etc.) + jumping to a class definition should be enough; if not, the fastai internals are a dead wall: you can build on top of them, but not modify/fix fastai.
    For the last two months, I was working on an object detection problem. Inspired by muellerzr’s course, I gave it a shot. Soon enough, I realized (from reading forum posts) that fastai wasn’t designed for this, but I used Pipeline (I love it!) + transforms for preprocessing, cleaning, … the data (see the Pipeline sketch after this list). I faced so many bugs that it was discouraging. Using the standard approach (dataloader + transforms + learner) there were no bugs, but using only the transforms, things didn’t work. Why? Because the bbox transforms assume things like passing your TensorBBox through PointScaler, some transforms lost the img_sz metadata or modified it in place, some bugs only appear when you use a transform in isolation rather than inside a DataBlock’s transforms, etc. I wanted to fix them, but I didn’t know how. Most of them I couldn’t track down, and setting things at runtime didn’t help. For example, where the hell does bbox.get_meta('img_sz') come from???
  • Small contributors: small contributors can fix bugs and add small features/QoL improvements, so they remove some of the burden from the core devs. They won’t push the project forward by themselves, but they handle the low-hanging features/bugs so core devs can focus on harder/bigger things. However, to have them, the library needs an easier path to contribution: when I face a bug, it should be easy to fix it (if you know fastcore) and make a PR.
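To illustrate the optimizer point above: a minimal, untested sketch of fastai2’s composable optimizer, where you only write the new step. qhm_step here is my hypothetical simplification of the paper’s update rule; average_grad is a stock fastai2 stepper that maintains the grad_avg state for us (depending on your install, the module may be fastai.optimizer or fastai2.optimizer):

```python
import torch
from fastai.optimizer import Optimizer, average_grad

def qhm_step(p, lr, nu, grad_avg, **kwargs):
    # QHM-style update: blend the raw gradient with its running average
    # (torch.lerp(g, g_avg, nu) == (1-nu)*g + nu*g_avg), then step like SGD
    p.data.add_(torch.lerp(p.grad.data, grad_avg, nu), alpha=-lr)

# average_grad runs first and fills in `grad_avg`; qhm_step only does the update
opt_func = lambda params, **kwargs: Optimizer(
    params, [average_grad, qhm_step], mom=0.999, nu=0.7, **kwargs)
```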
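And on the Pipeline praise above, this is roughly the pattern I mean: fastcore lets you chain Transforms with encodes/decodes into a reversible pipeline. A generic sketch with made-up names, nothing from fastai’s bbox transforms:

```python
from fastcore.transform import Transform, Pipeline

class Scale(Transform):
    "Hypothetical transform: divide on the way in, multiply back on decode."
    def __init__(self, factor): self.factor = factor
    def encodes(self, x: float): return x / self.factor
    def decodes(self, x: float): return x * self.factor

pipe = Pipeline([Scale(100.0)])
enc = pipe(50.0)        # 0.5
dec = pipe.decode(enc)  # back to 50.0
```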

Sorry for the long post :wink:

12 Likes

That’s a good point; maybe it’s worth continuing to keep the forums as the main knowledge base. I do find it invaluable when I come across a solution from a year ago that solves my problem.

Not sure if it’s workable in practice, but GitHub tags could be kept to necessary bug fixes/feature suggestions, while the forums could remain a help centre for user issues (maybe with some additional help-related tags…)

1 Like

Unless @jeremy says otherwise, I have the WWF2 vision thread pinned until the course release (and the tabular one too; soon the text one as well, since it will also be ongoing), as I think you’re right that this is a good idea.

5 Likes

Hi,
I really appreciate the work of @jeremy and all of fastai. It has been an amazing journey, but I am very curious about the future of the ML course, which has remained at v0.7. I don’t see it being updated, and I tried to use it a few times and encountered lots of problems (trying on my laptop, AWS, Paperspace, etc.), so I finally gave up on it. I don’t have any other comments about the ML, DL1, and DL2 courses and how they are taught, other than that I loved them when I did them a year or so ago.
Looking forward to fastai2 and the book with impatience!

Yes, the forums are great for discussing issues, especially when the user doesn’t know whether an error is the result of a bug or of doing something wrong. I have found multiple bugs without knowing they were bugs in the first place: I would discuss them on the fastai v2 chat and Sylvain would fix them immediately.

Hi Jeremy, thanks a lot for all the wonderful work you’re doing.

One question, will you continue to use Swift for Tensorflow, or more specifically continue to work on SwiftAI, or is that now paused?

3 Likes

I can give my 2 cents from my experience using fastai. The documentation makes accessing the high-level API very easy, but it doesn’t help much with the middle and lower layers. @jeremy has already mentioned the addition of Kaggle datasets, training loops, etc., but I think the documentation should do a better job of explaining the layered API explicitly.

1 Like

Well, just to add to “good first issue” (though maybe it’s not needed): I just found https://goodfirstissue.dev/language/python. How do we add fastai there, so it can be listed? :slight_smile:

Is the new course only the deep learning courses, or will there be a new “Intro To Machine Learning” course as well?

Also - wholeheartedly agree with everyone here. This library, courses, etc. have been awesome.

I personally like how the library is coded. I get why universal coding standards are useful in a large organization, to keep some consistency across many teams from many countries, in large projects with decent turnover, new hires, internal transfers, etc. That said, that isn’t really the situation with this library, so many of the benefits of a universal coding standard don’t apply. I am finding it super easy to dive into with how it is coded.

2 Likes

My understanding is that there will be no “Intro to Machine Learning” course, because the new course covers many introductory ML concepts like linear regression, decision trees/random forests, etc.

1 Like

One can’t ask for a better assurance of quality and continuity than this from an open-source project’s BDFL!

Thank you @jeremy.

4 Likes

Sylvain and I aren’t working on that at the moment. Hopefully the s4tf team will keep working on it though!

This promise of continued development is the most important thing, I think: as a contributor, one can keep contributing without worrying about efforts being wasted; as a user, one can trust that the problems we have now will eventually be solved. :pray::heart::pray::heart:

IMHO, a design that will definitely work is to have docs like PyTorch’s, where every function has a little example usage below the listed args explanation. It is clean and useful. :smiling_face_with_three_hearts: And it is friendly to PyTorch users, so we might attract more users from there. :kissing_smiling_eyes: Maybe we could just mimic the PyTorch docs first and then think about what we can improve beyond them. :thinking:
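Something like this, say (a hypothetical function, just to sketch the PyTorch-style layout):

```python
def scale_bbox(bbox, factor):
    """Scale a bounding box around its centre.

    Args:
        bbox (tuple): box as (x_min, y_min, x_max, y_max).
        factor (float): scaling factor; 1.0 leaves the box unchanged.

    Returns:
        tuple: the scaled (x_min, y_min, x_max, y_max).

    Example:
        >>> scale_bbox((0, 0, 10, 10), 2.0)
        (-5.0, -5.0, 15.0, 15.0)
    """
    cx, cy = (bbox[0] + bbox[2]) / 2, (bbox[1] + bbox[3]) / 2
    w, h = (bbox[2] - bbox[0]) * factor, (bbox[3] - bbox[1]) * factor
    return (cx - w/2, cy - h/2, cx + w/2, cy + h/2)
```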

Seconding this. Especially the naming conventions: I still don’t know what o, _o, or ctx are :crazy_face:, even though I hack a lot on Datasets and show_batch.

I think I can conclude the discussion about this with something we can all agree on, at least: “Jeremy as BDFL, and we need more Sylvains” :joy:

Exactly what I think about fastai. When things don’t fit, I can hack on it thanks to its high degree of modularity, and send a PR where I’d like changes. :sunglasses: But sometimes I can only get messy workarounds after lots of effort, and mostly those problems reside in the DataBlock / transforms / Datasets area… :weary:

AFAIK, most of the loss functions are wrapped versions of the PyTorch ones in layers.py. Also, you can pass a list of torch.utils.data.DataLoader objects to DataLoaders, so you can fall back to PyTorch’s dataloaders 😘
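For anyone looking for that pattern, a minimal sketch with toy data (assuming a recent install; the import path may be fastai2.data.core instead):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from fastai.data.core import DataLoaders

# toy PyTorch datasets
train_ds = TensorDataset(torch.randn(256, 10), torch.randint(0, 2, (256,)))
valid_ds = TensorDataset(torch.randn(64, 10), torch.randint(0, 2, (64,)))

# plain PyTorch dataloaders, wrapped so a fastai Learner can consume them
dls = DataLoaders(DataLoader(train_ds, batch_size=64, shuffle=True),
                  DataLoader(valid_ds, batch_size=128))
```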

Hi guys,
I just thought it could be a good idea to start studying Deep Learning with PyTorch, which was officially released 4 days ago, in the meantime until the release of Jeremy’s book and new course. If you are interested in studying it in a group, join this thread.

I second @jeremy’s mission to make AI more accessible. I’ve invested a significant amount of time in fastai2, and I’d like to share my experience so far and how it relates to that mission:

  • If you want to run a simple use case, fastai2 works pretty well, but to be honest it’s a matter of preference vs. other libraries: 10 lines of code vs. 20 lines of code?

  • If anything fails, you will probably need to dive into the fastai2 internals. Just one example: .predict(item) will fail with an obscure error if you have a metric such as AUC defined in your learner, because AUC needs more than one class, and metrics are called even when you call .predict() on a single item (see the sketch after this list).

  • I understand the naming and coding conventions inside the library may be peculiar, which is fine if its internals are hidden from you (although I’ve been forced to dive into them more than I care to admit); but regardless, naming conventions and inconsistencies leak out of the library pretty quickly:
    Why is it learn and not learner?
    TfmdLists?
    before_batch vs. after_batch (formerly begin_batch vs. after_batch)? Why not before vs. after, or begin vs. end? (I hope it’s not b/c it was meant to be beginning vs. end but beginning was cut à la learn.)
    In retrospect, one of the biggest improvements from fastai to fastai2 was the naming change from DataBunch to DataLoaders: it turned out a DataBunch was just (almost?) a bunch of DataLoaders! I wonder, for example, if there is a better name for TfmdLists, which, now that I think I understand it, is really a transformed dataset (indexed by a list).
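On the .predict point in this list, a workaround sketch that may help. This is my assumption, not an official fix: it simply strips the metrics before the single-item call and restores them afterwards (learn and item are assumed to already exist):

```python
# hypothetical workaround: temporarily remove metrics so .predict(item)
# doesn't trip over a metric (e.g. AUC) that needs more than one class
saved_metrics = learn.metrics
learn.metrics = []
pred = learn.predict(item)
learn.metrics = saved_metrics
```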

Please take the above as constructive criticism. I do not have time to become a contributor, and I just wanted to use the library, but I feel that with fastai2, if you want to use its power, you have to become an expert in its internals. I know it’s cheaper to criticize something than to contribute a fix; but given how idiosyncratic fastai2 is, I am not sure things such as naming conventions are open to change.

6 Likes