In 002_images.ipynb there is a very complex chain of ands, ors and nots (#1):
def get_image_files(c:Path, check_ext:bool=True)->FilePathList:
    [...]
    return [o for o in list(c.iterdir())
            if not o.name.startswith('.') and not o.is_dir()
            and (not check_ext or (o.suffix in image_extensions))]
My head was smoking a bit while trying to parse the last line.
Wouldn't this be more readable (#2):
if not o.name.startswith('.') and not o.is_dir()
   and not (check_ext and o.suffix not in image_extensions)
That in turn allows us to drop two nots (#3), although the version above is fine too: it consistently negates everything and has fewer parentheses:
if not (o.name.startswith('.') or o.is_dir()
        or (check_ext and o.suffix not in image_extensions))
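For anyone who wants to convince themselves that #1, #2 and #3 really are the same condition (it is just De Morgan applied twice), here is a minimal sketch that brute-forces the truth table. The function names, the boolean parameters standing in for the Path checks, and the tiny image_extensions set are all made up for illustration; they are not the notebook's actual definitions.

```python
from itertools import product

# Hypothetical stand-in for the notebook's image_extensions.
image_extensions = {'.jpg', '.png'}

# hidden  ~ o.name.startswith('.')
# is_dir  ~ o.is_dir()
# suffix  ~ o.suffix

def form1(hidden, is_dir, check_ext, suffix):
    # Original condition (#1)
    return (not hidden and not is_dir
            and (not check_ext or (suffix in image_extensions)))

def form2(hidden, is_dir, check_ext, suffix):
    # Everything negated consistently (#2)
    return (not hidden and not is_dir
            and not (check_ext and suffix not in image_extensions))

def form3(hidden, is_dir, check_ext, suffix):
    # A single outer not (#3)
    return not (hidden or is_dir
                or (check_ext and suffix not in image_extensions))

# Every combination of the three booleans, with one image suffix
# and one non-image suffix.
cases = list(product([False, True], [False, True], [False, True],
                     ['.jpg', '.txt']))
results = [(form1(*c), form2(*c), form3(*c)) for c in cases]
all_agree = all(a == b == c for a, b, c in results)
print(all_agree)
```

Since the three forms agree on every input, choosing between them is purely a readability question.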
I didn’t have time to post a message here yesterday, but the modules have been added in this commit.
I made a few changes this morning in this commit, then corrected bugs and added __all__ to each module that needs it in this commit.
Finally, in this commit I added five example notebooks to check that everything was working well (dogs and cats, cifar10, imdb classification, movie lens and rossmann).
As Jeremy explained, you shouldn’t touch the dev_nb anymore (except to add prose). Bug fixes should be done in the modules directly! You should also use a pip install -e of the new library to test those notebooks, to easily have the latest version installed.
One last commit about module developments for a while. Just added mixup that allows us to get very fast results on cifar10 (6 minutes for 94% accuracy).
After the recent rewrite of history you may get this error when running git pull (or a direct merge):
fatal: refusing to merge unrelated histories
The easiest way to fix this if you have forked fastai/fastai_v1 and you don’t have any branches that you want to keep, is to nuke your fork by following the delete option at the end of:
which, depending on when you last synced your forked repository with fastai/fastai_v1, may or may not create a gazillion conflicts. In my case it did, so I decided that redoing the fork was the easiest option.
Once you have resolved the conflicts, push to sync your fork:
git push --set-upstream origin master
If you don’t use a forked project but a direct fastai/fastai_v1 checkout, you can use --allow-unrelated-histories with git pull, or simply make a new checkout and copy your work files over to the fresh repo checkout.
The initial documentation website commit has been done - we’ve simply imported the standard jekyll documentation template at this point, and are now working on filling it in with our docs. So everything inside doc/ now is from the template. Once we’ve figured out what we need we can remove some of the redundant stuff.
I originally checked in the vendor/ directory for jekyll, but changed my mind after feedback from @stas, which is why I had to rewrite history - see the previous message from @stas if that causes any problems for you (should only have a problem if you have a fork).
Yay it looks like this doc template may just work nicely! I just manually popped in one page for testing, and it’s looking pretty good without even customizing anything much at all:
It made more sense to have DatasetBase and LabelDataset in data
I renamed all the data_from_* functions to something more consistent, {type}_data_from_*, so for instance image_data_from_folder, text_data_from_tokens or tabular_data_from_df.
Then I changed all the references to those functions in the example notebooks.
FYI, I started working on packaging and created a ‘package’ branch for it. It will be a slow start as I’m following a tutorial, but hopefully it’ll go faster once I figure it out. And Jeremy already created setup.py, so that’s a good start.
And we need to finalize the renaming anyway before we can upload anything to pypi and friends.