Really frustrating experience with fastai

I am not really sure where to go with this, but I am many many hours into trying to do some basic things and I wanted to vent some frustration that hopefully can turn into constructive criticism instead of me cursing at my computer and getting angry at fastai – for which I am genuinely extremely grateful. I have learned a lot, in a much shorter time frame than I’d have assumed possible.

I assume, like most people, I came to fastai through the courses (the online lectures). All the lectures are using an old version of the code, while all the conversation is around 1.0. So if anything goes wrong in any of the notebooks (or if you just try to install fastai and run the notebooks), you are immediately up against a lot of confusion and errors, since there are API-breaking changes between the current version of the library and the version to which everyone is introduced.

Like many others, I have been renting GPU time from Paperspace and have tried to migrate to Google Colab, which has been a nightmare. PyTorch and fastai are constantly fighting each other over dependencies.

I am a noob in machine learning and almost a complete neophyte to Python as well. But I am very proficient in Linux and software development in general, so it’s not purely a matter of me just being dumb.

Pretty much all the Colab material in these forums doesn’t specify whether it works with the new or the old versions of the lectures, and where it is specific, it is for a version of the course two versions ahead of what is in the lectures available online.

There is a v3 course with a one-line bash script that sets up Colab, which is great. But it doesn’t work for the lectures that are available to the public.

I did, however, try to follow the v3 lesson3-imdb notebook (which seems to be the analog of the v1 lesson4-imdb), which I am trying to adapt to my own data.

This brings me to my second main criticism of fastai – where the heck is the documentation or high level explanation of what is going on?

I have all of my data in a string… I cleaned it outside of Python in something I am more familiar with and have it all loaded into a single text file. There is no TextDataBunch.fromString, so I don’t know how to get my data into TextDataBunch.
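For anyone stuck at the same point, here is a minimal sketch of one possible workaround: split the one big string into documents and write them to a two-column CSV, which is a shape the fastai v1 factory methods appear to accept. The helper names (split_documents, write_text_csv), the blank-line document separator, the "unlabeled" placeholder label and the file name texts.csv are all assumptions made for illustration, and the commented-out fastai call at the end is untested here, so check it against the docs for your installed version.

```python
import csv
from pathlib import Path


def split_documents(raw_text):
    """Split one big string into documents on blank lines (an assumed convention)."""
    docs = []
    for chunk in raw_text.split("\n\n"):
        # Collapse internal newlines and repeated whitespace into single spaces.
        doc = " ".join(chunk.split())
        if doc:
            docs.append(doc)
    return docs


def write_text_csv(raw_text, csv_path, label="unlabeled"):
    """Write a (label, text) CSV with a header row; return the number of documents."""
    docs = split_documents(raw_text)
    with Path(csv_path).open("w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["label", "text"])
        for doc in docs:
            writer.writerow([label, doc])
    return len(docs)


# Assumed fastai v1 usage, not executed in this sketch:
# from fastai.text import TextLMDataBunch
# data_lm = TextLMDataBunch.from_csv(path, "texts.csv",
#                                    text_cols="text", label_cols="label")
```

Typical use would be something like write_text_csv(open("cleaned.txt", encoding="utf-8").read(), "texts.csv"), where cleaned.txt stands in for your own file.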

So I google around, and there is literally zero documentation outside of blog posts by students and the notebooks themselves on what TextDataBunch actually is. So it’s just like a wall – there is nothing else to read but the source code.

I have felt that way multiple times in this course, which admittedly isn’t that odd when learning something new. However, I distinctly get the feeling that if I am not able to put my data into the pipeline exactly as Jeremy has it, I don’t really have any tools to understand what to do next.

The new lecture says “this does both things for you” but gives no explanation of “what if you did one of those things already”, and this general theme plays out in all the lectures. That isn’t exactly a criticism; it’s also the thing that makes the course move so quickly. But it’s a problem when the only thing that exists to learn from is the lectures.

Anyway, in short: I really feel that documentation and high-level explanation of how to use the fastai library is either missing or severely lacking in SEO, because I cannot find it. I also think that releasing the lectures while simultaneously making API-breaking changes to the underlying library was a pretty big mistake and probably caused your MOOC students to spend way more time on sysadmin stuff than is desirable.

I say all the above with a lot of frustration, but I do want to reiterate that I am very thankful for this course; I just think it could be much easier for beginners without access to the live lectures to begin to grok all this stuff.

Having been through the earlier versions of the class not in real-time, I understand what you are feeling. Been there, bought the t-shirt. Part of the issue is just the speed at which deep learning is evolving. Another is that the library is constantly being updated, so forum search results may or may not apply to the code you are currently working with.

Version 1 documentation (https://docs.fast.ai/) helps a lot with the learning curve. Unfortunately, the public videos don’t use V1, and V1 has streamlined some things, so the issue you have with V0.7 may be a default setting in V1. As you have software dev experience, I might suggest watching the videos for conceptual understanding, but then building your notebooks from scratch with V1.
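Since so much of the confusion comes down to which version is actually installed, a quick check along these lines may save some time before searching the forums. This is only a sketch: the helper names installed_version and is_pre_v1 are made up for illustration, and it assumes Python 3.8 or newer (for importlib.metadata) and that the packages are published under the names fastai and torch.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def is_pre_v1(version):
    """True when the major version number is 0 (for example '0.7.0')."""
    major = version.split(".")[0]
    return major.isdigit() and int(major) == 0


for name in ("fastai", "torch"):
    version = installed_version(name)
    if version is None:
        print(f"{name}: not installed")
    elif name == "fastai" and is_pre_v1(version):
        print(f"{name} {version}: pre-1.0, matches the older public lectures")
    else:
        print(f"{name} {version}")
```

Running this at the top of a notebook tells you whether a given forum answer or notebook is even talking about the same library version you have.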

Thanks, that is actually what I have come to and what I am doing now.

It’s an incredibly frustrating experience but at least it feels like there is some solid ground beneath me when I make progress.

It’s also unfortunate timing; the part 1 v3 version of the course is going on right now, on the latest fastai version. However, you had to sign up beforehand in order to follow the course live, and it won’t be released as a MOOC immediately.

If you go to this link you’ll see the repo for the course v3, you can find the new notebooks in there. Hopefully you can watch the v2 lectures to understand the high-level concepts, and then go through the v3 notebooks with the help of the docs in order to understand how to implement them in the latest fastai version.

Hopefully subsequent editions of the course will be more backward compatible as the fastai library settles its API.

I understand the feeling. Jeremy is very much of the move fast and break things school of thought. We all get tremendous value from his heroic effort to build fastai, but sometimes it can be frustrating to keep up with a genius.

That said, there is a simple short term hack if you want to follow the v2 version of the lectures while working through the corresponding v2 notebooks. Just clone fastai, then in the root folder there’s a folder named “old”, which as its name suggests contains the pre-1.0 alpha version of the fastai library. Just create a symlink named “fastai” pointing to that from within courses/dl1 and then the v2 notebooks will work fine. There actually is a lot of material on the forums helping with the various quirks of the alpha version, so most problems can be solved with a little mucking around.
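The symlink step can be sketched in Python roughly as follows. Treat it as an illustration under assumptions: link_old_library is a made-up helper, and the default paths ("old" as the target, "courses/dl1" as the notebook folder) simply mirror the description above. Whether the link should point at the "old" folder itself or at a package folder inside it depends on your checkout, so look at the folder layout first and pass a different old_rel if needed.

```python
import os
from pathlib import Path


def link_old_library(repo_root, old_rel="old", notebooks_rel="courses/dl1",
                     link_name="fastai"):
    """Create notebooks_rel/link_name as a relative symlink to old_rel."""
    repo_root = Path(repo_root).resolve()
    target = repo_root / old_rel
    notebooks = repo_root / notebooks_rel
    if not target.is_dir():
        raise FileNotFoundError(f"missing library folder: {target}")
    if not notebooks.is_dir():
        raise FileNotFoundError(f"missing notebook folder: {notebooks}")

    link = notebooks / link_name
    if link.is_symlink():
        link.unlink()  # replace a stale link from an earlier attempt
    elif link.exists():
        raise FileExistsError(f"{link} exists and is not a symlink")

    # A relative target keeps the link valid if the whole clone is moved.
    rel_target = os.path.relpath(target, start=notebooks)
    os.symlink(rel_target, link, target_is_directory=True)
    return link
```

Calling link_old_library("/path/to/your/fastai/clone") from any working directory should then leave a "fastai" link next to the v2 notebooks, so that "import fastai" inside them resolves to the old code. On Windows, creating symlinks may require extra privileges.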