I am not really sure where to go with this, but I am many many hours into trying to do some basic things and I wanted to vent some frustration that hopefully can turn into constructive criticism instead of me cursing at my computer and getting angry at fastai – for which I am genuinely extremely grateful. I have learned a lot, in a much shorter time frame than I’d have assumed possible.
Like most people, I assume, I came to fastai through the courses (the online lectures). All the lectures use an old version of the code, while all the conversation is around 1.0. So if anything goes wrong in any of the notebooks (or if you just try to install fastai and run the notebooks), you are immediately up against a lot of confusion and errors, since there are API-breaking changes between the current version of the library and the version to which everyone is introduced.
Like many others, I have been renting GPU time from Paperspace and have tried to migrate to Google Colab, which has been a nightmare. PyTorch and fastai are constantly fighting each other over dependencies.
I am a noob in machine learning and almost a complete neophyte to Python as well. But I am very proficient in Linux and software development in general, so it's not purely a matter of me just being dumb.
Pretty much all the Colab material in these forums does not say whether it works with the new or the old versions of the lectures, and where it does say, it is for a version of the course two versions ahead of the lectures available online.
There is a v3 course with a one-line bash script that sets up Colab, which is great. But it doesn't work for the lectures that are available to the public.
I did, however, try to follow the v3 lesson3-imdb notebook (which seems to be the analog of the v1 lesson4-imdb), which I am trying to adapt to my own data.
This brings me to my second main criticism of fastai: where the heck is the documentation or high-level explanation of what is going on?
I have all of my data in a string… I cleaned it outside of Python, in something I am more familiar with, and have it all loaded into a single text file. There is no TextDataBunch.fromString, so I don't know how to get my data into a TextDataBunch.
So I google around, and there is literally zero documentation on what TextDataBunch actually is, outside of blog posts by students and the notebooks themselves. So it's just like a wall: there is nothing else to read but the source code.
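For anyone stuck at the same point, here is a minimal sketch of one possible workaround, not an official recipe. It uses only the Python standard library to cut one big string into documents and write them to train/valid CSV files with a single `text` column. The helper names (`split_documents`, `write_train_valid_csv`), the blank-line delimiter, and the 80/20 split are all my own assumptions, so adjust them to however your text file is laid out.

```python
import csv
import random
from pathlib import Path


def split_documents(raw_text, delimiter="\n\n"):
    """Split one big string into a list of non-empty documents.

    Assumption: documents are separated by blank lines.
    """
    return [doc.strip() for doc in raw_text.split(delimiter) if doc.strip()]


def write_train_valid_csv(docs, out_dir, valid_pct=0.2, seed=42):
    """Shuffle docs and write train.csv / valid.csv, each with a 'text' column.

    Returns (number of training docs, number of validation docs).
    """
    docs = list(docs)
    random.Random(seed).shuffle(docs)
    # Keep at least one document for validation.
    n_valid = max(1, int(len(docs) * valid_pct))
    valid, train = docs[:n_valid], docs[n_valid:]

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for name, rows in (("train.csv", train), ("valid.csv", valid)):
        with open(out_dir / name, "w", newline="", encoding="utf-8") as f:
            writer = csv.writer(f)
            writer.writerow(["text"])
            writer.writerows([doc] for doc in rows)
    return len(train), len(valid)


# Untested assumption: with fastai v1 and pandas installed, the two files
# could then be loaded along these lines:
#   import pandas as pd
#   from fastai.text import TextLMDataBunch
#   data_lm = TextLMDataBunch.from_df(
#       out_dir,
#       pd.read_csv(out_dir / "train.csv"),
#       pd.read_csv(out_dir / "valid.csv"),
#       text_cols="text",
#   )
```

Usage would be something like `docs = split_documents(open("my_data.txt", encoding="utf-8").read())` followed by `write_train_valid_csv(docs, "data")`, where `my_data.txt` and `data` are placeholder names. Going through CSV files is a deliberate choice here: it lets the cleaning stay outside of Python and only hands fastai a plain table at the very end.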
I have hit that kind of wall multiple times in this course, which admittedly isn't that odd when learning something new. However, I distinctly get the feeling that if I am not able to put my data into the pipeline exactly as Jeremy has it, I don't really have any tools to work out what to do next.
The new lecture says "this does both things for you" but gives no explanation of "what if you did one of those things already", and this general theme plays out in all the lectures. That isn't exactly a criticism; it's also the thing that makes the course move so quickly. But it's a problem when the lectures are the only thing that exists to learn from.
Anyway, in short: I really feel that documentation and high-level explanation of how to use the fastai library is either missing or severely lacking in SEO, because I cannot find it. I also think that releasing the lectures while simultaneously making API-breaking changes to the underlying library was a pretty big mistake, and it probably caused your MOOC students to spend way more time on sysadmin stuff than is desirable.
I say all the above with a lot of frustration, but I do want to reiterate that I am very thankful for this course. I just think it could be much easier for beginners without access to the live lectures to begin to grok all this stuff.