Does anyone know if there is a way to exclude images from a databunch on the fly, if the image file itself is missing? I created a databunch from a datablock, where the label metadata came from a csv file.
But the csv file turned out to include a few filenames for files that didn't exist, which caused my model to crash in the middle of training. Checking beforehand turns out to be slow (I used Path.is_file() from pathlib).
I used os.path.isfile and it isn't very fast - a couple of minutes for a million rows or so - but I feel this should be a one-time clean-up rather than something you hit during training. I'd be interested if there is a better way.
Yeah, I guess the question is whether it's better to clean up the data on the fly by adding something like:
if not path.is_file(): skip this one (and possibly warn)
to the DataLoader or perhaps the databunch creation step, or whether you are asking for trouble by trying to continue with a csv metadata file that doesn't match the file directory, and as you say it's therefore better to clean it all up beforehand, even if the process is slower. (I think I convinced myself while writing this that the pre-cleaning is better.)
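The one-time pre-cleaning pass could be sketched roughly like this (a minimal sketch, not fastai code; the labels.csv layout and the "filename" column name are assumptions about your metadata):

```python
import csv
from pathlib import Path

def drop_missing(csv_in, csv_out, img_dir, fname_col="filename"):
    """Copy csv_in to csv_out, keeping only rows whose image file exists.

    Returns the number of rows dropped, so you can warn about them."""
    img_dir = Path(img_dir)
    with open(csv_in, newline="") as fin, open(csv_out, "w", newline="") as fout:
        reader = csv.DictReader(fin)
        writer = csv.DictWriter(fout, fieldnames=reader.fieldnames)
        writer.writeheader()
        dropped = 0
        for row in reader:
            if (img_dir / row[fname_col]).is_file():
                writer.writerow(row)
            else:
                dropped += 1
    return dropped
```

You pay the is_file() cost exactly once, before building the databunch, and training never sees the bad rows.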
I used it similar to how the image regression one was done (I pointed to the image folder), IIRC. I don't have the code in front of me at the moment, though.
Same concept in the end, though. If you still can't get it, let me know and I'll try to find when I did it. Finally jumping back into the code now that the holidays are done!
Also @s.s.o is the dataset publicly available? I'd be interested in that for the study group
@muellerzr Currently the dataset is not public; we are still collecting. It's dental data (not my domain though). I'm trying to convince my colleagues to make it public.
ERROR: albumentations 0.1.12 has requirement imgaug<0.2.7,>=0.2.5, but you'll have imgaug 0.2.9 which is incompatible.
ERROR: gql 0.2.0 has requirement graphql-core<2,>=0.5.0, but you'll have graphql-core 2.2.1 which is incompatible.
Second, after running from fastai2.vision.all import *:
ImportError Traceback (most recent call last)
<ipython-input-1-533e7442bc6c> in <module>()
1 from fastai2.basics import *
2 from fastai2.callback.all import *
----> 3 from fastai2.vision.all import *
4 from fastai2.notebook.showdoc import *
5
7 frames
/usr/local/lib/python3.6/dist-packages/torchvision/transforms/functional.py in <module>()
3 import sys
4 import math
----> 5 from PIL import Image, ImageOps, ImageEnhance, PILLOW_VERSION
6 try:
7 import accimage
ImportError: cannot import name 'PILLOW_VERSION'
Today is Friday, 1/3/2020.
Is there any way to fix this? Thanks.
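For what it's worth: Pillow 7.0 (released 2020-01-02) removed the PILLOW_VERSION constant, while the torchvision build in that environment still imports it. Pinning Pillow below 7 (pip install "pillow<7.0.0", then restarting the runtime) should fix it; as a stopgap, a shim run before the fastai2/torchvision imports also works (a sketch, not an official fix):

```python
import PIL

# Pillow 7.0 dropped PILLOW_VERSION; older torchvision still does
# `from PIL import ..., PILLOW_VERSION`. Restoring the attribute on the
# PIL module before torchvision is imported lets that lookup succeed.
if not hasattr(PIL, "PILLOW_VERSION"):
    PIL.PILLOW_VERSION = PIL.__version__
```

Downgrading Pillow is the cleaner route; the shim is only useful when you can't change the installed packages.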
I have a fastai2 model producing promising results and I am wondering about a remark Jeremy made in class: that he did well by training a model on small images and then scaling up to larger images. Two questions:
How do you decide when it's time to scale up? Is there any useful indicator, or is it just a matter of how much time you have left? I haven't tried it yet, so I don't have a gut feeling for how much larger images will slow the training, or how long it will take to get the model back to the same level of accuracy with larger images; and
How big to go? Is there a limit either to the image size or to the size increment where you lose advantage from going any larger? The main advantage I see is that the center crop on the test images will be larger.
I know I can figure this out the hard way, but I'm hoping some of you who are much more experienced than I am will have words of wisdom.
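Not an answer to "when", but on "how big": the pattern from class steps the size up a few times rather than jumping straight to full resolution, retraining the same model at each stage. A framework-agnostic sketch of such a schedule (the doubling factor and the specific sizes are my assumptions, not a rule from the lesson):

```python
def resize_schedule(start=128, final=448, factor=2):
    """Yield image sizes from `start` up to `final`, multiplying by
    `factor` each step; each stage fine-tunes from the previous weights."""
    size = start
    while size < final:
        yield size
        size = min(size * factor, final)
    yield final

# e.g. rebuild the DataLoaders and fine-tune at each size in turn:
print(list(resize_schedule(128, 448)))  # [128, 256, 448]
```

The practical ceiling is usually the native resolution of your images (upscaling past it adds cost without new information) and what fits in GPU memory at a usable batch size.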
@sgugger just a minor tidbit I want to be sure of. In the most recent version you adjusted predict, and it seems we can no longer do something like the following:
learn.predict('image1.png')
Instead I need to make a path first (i.e. path_im = Path('image1.png')). Is this a permanent adjustment?
I have no idea what your dataset looks like, but as the error message should have warned you, the predict method expects one of the types encountered while processing your training/validation data.
This is permanent, yes.
It was an image in the local directory, and the warning was that it was not a Path type or an image in the dataset. We used to be able to just pass in a string for the file location and it would be converted to a path object (PathOrStr, IIRC). Got it, thanks!
Ah wait, I totally misread that. Looking at the lesson 2 example: pred_class,pred_idx,outputs = learn.predict(path/'black'/'00000021.jpg'), it still used a path. Sorry!