Lesson 1 In-Class Discussion ✅

I'm running the first lesson on CUDA 9.0 and torch 0.4 (I only have to specify the padding mode explicitly) in ImageDataBunch:

data = ImageDataBunch.from_csv(words_path,
                               ds_tfms=get_transforms(do_flip=False,
                                                      flip_vert=False,
                                                      max_rotate=0),
                               size=224,
                               padding_mode='zeros',
                               test=words_path/'test')
data.normalize(imagenet_stats);

not mentally ready to rebuild my PC :smile:


You can (could? the library moves fast) find them in data.valid_ds.x. In a current Kaggle comp I am plotting my filenames against images in a copy of ClassificationInterpretation.plot_top_losses, using self.data.valid_ds.x[idx], for exactly the purpose you describe: uncovering ground-truth label errors to make a better classifier.
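
A minimal sketch of that idea (fastai v1, assuming the usual lesson imports and an already-trained `learn`; using `.items[idx]` to recover the file path is my own assumption about the ItemList API, so treat this as a sketch rather than gospel):

interp = ClassificationInterpretation.from_learner(learn)
losses, idxs = interp.top_losses(9)
for loss, idx in zip(losses, idxs):
    idx = int(idx)
    img = interp.data.valid_ds.x[idx]          # the Image object at that index
    fname = interp.data.valid_ds.x.items[idx]  # the underlying file path
    print(f'{fname}: loss {float(loss):.2f}')  # eyeball these for wrong ground-truth labels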

5 Likes

(I am on GCP)
Is it ok to update pip:

jupyter@my-fastai-instance:~/tutorials/fastai$ pip list | grep torch
torch                              1.0.0.dev20181022
torchvision                        0.2.1            
You are using pip version 10.0.1, however version 18.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
jupyter@my-fastai-instance:~/tutorials/fastai$ 

Or, is that something that can be adjusted in a global requirements.txt file?

Thanks much for the tip.

What’s funny is that .x doesn’t come up in the intellisense so I had no idea it was there. It works though which is awesome :slight_smile:

interp.data.valid_ds.x[interp.top_losses(9)[1]]

… works great.

2 Likes

Re: learn.fit_one_cycle(3, max_lr=slice(1e-7,1e-5))

Is the slice(1e-7,1e-5) setting the min/max LR for one cycle, so that it goes from 1e-7 to 1e-5 and then back down to 1e-7 and then some?

1 Like

Here you go:

https://docs.fast.ai/basic_train.html#Discriminative-layer-training
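
Roughly what that slice does, as a sketch (assumes a trained `learn`; the exact spacing of the intermediate rates comes from fastai's lr_range helper, which I haven't quoted here):

learn.unfreeze()
learn.fit_one_cycle(3, max_lr=slice(1e-7, 1e-5))
# slice(1e-7, 1e-5) sets discriminative learning rates across the layer groups:
# the earliest group trains at 1e-7, the last group at 1e-5, and the groups in
# between get rates spread between those two values. The one-cycle schedule then
# moves each group's lr up towards its own maximum and back down over the cycle.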

5 Likes

I’m trying to figure out what is the recommended way to load a trained model purely for prediction. In my training notebook I can call:

learn.save('my-model')

which will save a my-model.pth file. How can I load this model? I was looking for a class-method on Learner that would instantiate it from the saved file. Does anyone have a recommendation here?
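
A hedged sketch of the pattern I'd expect to work (the paths and names below are placeholders, and create_cnn may be called cnn_learner in other 1.0.x releases): there isn't a Learner class-method that builds straight from the .pth, so you rebuild a learner with the same architecture and a DataBunch, then load the weights.

from fastai.vision import *

data = ImageDataBunch.from_folder(path, size=224).normalize(imagenet_stats)
learn = create_cnn(data, models.resnet34)
learn.load('my-model')                      # loads path/models/my-model.pth
pred_class, pred_idx, probs = learn.predict(data.valid_ds.x[0])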

When loading in my own image data, I got the following error from PIL:

IOError: image file is truncated (1 bytes not processed)

I found an answer on SO that advised changing a PIL attribute:

from PIL import ImageFile
ImageFile.LOAD_TRUNCATED_IMAGES = True

Doing this removed the error for me and I was able to train my learners without error going forward (with decent results), but since I don't really understand what I did, I'm not sure what else I may have changed by setting this value.

Has anyone else come across this error? Any suggestions on whether this is a viable solution or not? Any better alternatives? TIA!

2 Likes

Hi @jeremy, is there no precompute feature in fastai 1.0.x?

1 Like

My data is quite large; my Learner seems to be getting stuck and it's taking 9-10 hours for one epoch. Is there an efficient way to take only a sample of the dataset, or to train on one subset and then continue with another, without disturbing the proportion of classes?

A - I am not sure I understood your question. Anyway, when you use a pretrained model, fastai creates layers that customize that arch to your specific problem, based on your DataBunch. You can also change these layers.
C - Exactly that, can’t you find it?
D - check the SaveModelCallback which is in tracker.py.
E - All pytorch models + models defined in text/models, vision/models etc.
I - Yes, you can. There are many ways to do that. You could use a csv file to define what is train, valid or test (from_csv), or you could use from_lists; see the sketch after this list. There are several ways.
J - Yes, you can.
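
A quick sketch for point I (fastai v1 factory methods, assuming the usual fastai.vision imports; the file and variable names are placeholders, not from this thread):

data = ImageDataBunch.from_csv(path, csv_labels='labels.csv',
                               valid_pct=0.2, size=224)
# or build it from explicit lists of filenames and labels
data = ImageDataBunch.from_lists(path, fnames, labels,
                                 valid_pct=0.2, size=224)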

1 Like

Are you using all your gpu RAM?

fit_one_cycle is a training strategy that helps you achieve better results in fewer epochs. The idea is to start from a very small lr and linearly increase it up to the lr you specified, then linearly decrease it again. There is more than one discussion on the forum about this, and there is also a very good notebook from @sgugger on GitHub. You should check it out.
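
If you want to see the schedule for yourself, a small sketch (assumes a `learn` object that has just run fit_one_cycle):

learn.fit_one_cycle(3, max_lr=1e-3)
learn.recorder.plot_lr(show_moms=True)   # lr ramps up then anneals back down; momentum does the inverse
learn.recorder.plot_losses()             # training and validation loss over the run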

2 Likes

Yes. All of 16 Gigs

I can divide a folder, but I don't want to disturb the proportion of classes. Can we take a sample from get_image_files or ImageDataBunch? I'm not sure if they allow taking a proportionate sample.
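
Not a built-in as far as I know, but one way to take a class-proportionate sample is to sample a fixed fraction per label and then build the DataBunch from the result (a sketch, assuming the usual fastai.vision imports; `fnames` and `labels` stand for your full file list and its labels):

import pandas as pd

df = pd.DataFrame({'fname': fnames, 'label': labels})
sample = df.groupby('label', group_keys=False).apply(
    lambda g: g.sample(frac=0.1, random_state=42))   # 10% of every class
data = ImageDataBunch.from_df(path, sample, fn_col='fname', label_col='label',
                              size=224, bs=bs)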

1 Like

What exactly happens behind the scenes when one calls

ImageDataBunch.from_name_re(path_img, fnames, pat, ds_tfms=get_transforms(), size=224, bs=bs)

My attempt: I am not sure, but this is my current understanding. Here is a short description (help me correct it):

1. As soon as we call ImageDataBunch, the class gets instantiated. It inherits from DataBunch (a kind of base class for many others too), which sets up the basics we need: train_dl, valid_dl and test_dl on the suitable device (GPU/CPU); tfms, which at this level are the DataLoader transforms (e.g. normalizing the images); and path, which is where models get saved. Each of these DataLoaders (train_dl is the one for the training set, handling batching etc.) calls PyTorch's DataLoader to get things done.
2. The ds_tfms param collects all the transforms we want applied to the datasets. It goes through transform_datasets, which wraps each dataset and applies the passed tfms via apply_tfms: the DatasetTfm wrapper holds the original dataset (data.train_ds.ds) and a transformer named tfm, which is applied to the original images, and the result is what you see as data.train_ds.
3. The regex is matched against the filenames, and the result is passed along through other factory methods of the same class: from_name_func, then from_lists, and finally create.

So in short, one simple line abstracts a hell of a lot of things and makes it look easy, which is why it's so important to understand what exactly it does!
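
As a tiny illustration of the regex step, with a made-up pattern and filename in the usual lesson-1 style (assumptions, not from this thread):

import re

pat = r'/([^/]+)_\d+.jpg$'
fname = 'images/great_pyrenees_83.jpg'
label = re.search(pat, fname).group(1)   # -> 'great_pyrenees'
# from_name_re does essentially this per filename, then hands the labelled
# filenames on through from_name_func, from_lists and finally create.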

And here's what the numbers on the plot_top_losses interpretation plot mean:
f'{classes[self.pred_class[idx]]}/{classes[t[1]]} / {self.losses[idx]:.2f} / {self.probs[idx][t[1]]:.2f}')

(So primarily the numbers mean: the class predicted by the model, the actual class, the loss associated with that prediction, and the last one, which I'm not sure about yet, seems to be the predicted probability for the actual class.)
(Also, the probability values are rounded to 2 decimals, so a displayed 1 isn't necessarily exactly 1.)

Thanks

Anyone want to map the main function calls in a PDF like someone did last time? (It would be very helpful.)

Just discovered the second lecture's pre-class notebook on CamVid image segmentation.
Sweet!

Edit: You can add this to see the call stack. (Warning: it will generate a huge output for every function call made…)

5 Likes

Would like to join as well, in case a group is being formed.

Maybe I'm getting ahead of myself. Once I've trained a model, how do I run simple inference, or even deploy it (I know Flask is involved)? I just did a horse breed classifier based on lesson 1, and I'm curious how to use it in a real app if I want to. I scanned through the docs but didn't see it.

@prajjwal1 How large is it, and are you sure you're using your GPU? You can check with the nvidia-smi command.

Francisco it looks great! But I get an error in:

download_images(path/file, dest, max_pics=200)


NameError Traceback (most recent call last)
in
----> 1 download_images(path/file, dest, max_pics=200)

NameError: name 'download_images' is not defined

…which I would assume means I don’t have the latest fastai version - but I did update the library (twice, just to make sure). I’m on Salamander. Any suggestions?