Lesson 1 In-Class Discussion ✅

gstrack · January 27, 2019, 3:35am

I figured out why I wasn’t getting great results with resnet32. I had set the size of the data files to 24 instead of 224. Just a small typo but it made a huge difference in results.

epoch	train_loss	valid_loss	error_rate
1	2.749509	0.876414	0.163814
2	0.885891	0.314874	0.066015
3	0.395196	0.244511	0.056235
4	0.263029	0.226939	0.055012

wyquek · January 27, 2019, 3:51am

I was playing around with lesson 1 three weeks ago with 102 flowers too (not sure if it’s the same 102, because I just got the data from Kaggle), and a resnet18 got me a 0.0378 error rate. I did not try res50 or res32 though.

AlisonDavey · January 27, 2019, 9:36am

It is called fine-tuning because you are only changing the pre-trained weights a little.

At the beginning you load the weights that work well on ImageNet onto the model, then you retrain the head of the model, adjusting just the weights of this part, to your specific data with the desired number of classes.

Once this is working well, you unfreeze the whole model, and allow all the weights to change – with a small learning rate so they just change a little – hence fine-tuning.
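
The two stages described above can be illustrated with a toy model in plain Python. This is only a sketch under loud assumptions: the "body" is two made-up parameters (w, b) standing in for pretrained weights, the "head" is a single scalar, and the data, learning rates and step counts are invented for illustration. It is not fastai code; it just shows a frozen body while the head trains, followed by training everything with a much smaller learning rate.

```python
# Toy two-stage fine-tuning in pure Python (illustrative only, not fastai code).
# Model: pred = head * (w * x + b); (w, b) play the role of the pretrained body.

def mse(params, xs, ys):
    """Mean squared error of the toy model on the data."""
    head, w, b = params
    return sum((head * (w * x + b) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def grads(params, xs, ys):
    """Gradients of the MSE with respect to head, w and b."""
    head, w, b = params
    n = len(xs)
    g_head = g_w = g_b = 0.0
    for x, y in zip(xs, ys):
        r = head * (w * x + b) - y          # residual for this sample
        g_head += 2 * r * (w * x + b) / n
        g_w += 2 * r * head * x / n
        g_b += 2 * r * head / n
    return g_head, g_w, g_b

def train(params, xs, ys, lr, steps, train_body):
    """Plain gradient descent; the body is only updated when train_body is True."""
    head, w, b = params
    for _ in range(steps):
        g_head, g_w, g_b = grads((head, w, b), xs, ys)
        head -= lr * g_head
        if train_body:                      # "unfrozen": body weights may move
            w -= lr * g_w
            b -= lr * g_b
    return head, w, b

xs, ys = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]   # made-up data
pretrained = (0.0, 1.0, 0.5)                # new head starts at 0, body is "pretrained"

# Stage 1: body frozen, train only the head.
stage1 = train(pretrained, xs, ys, lr=0.05, steps=50, train_body=False)
# Stage 2: unfreeze and train everything with a much smaller learning rate.
stage2 = train(stage1, xs, ys, lr=0.001, steps=100, train_body=True)

print("after stage 1:", stage1, "loss", mse(stage1, xs, ys))
print("after stage 2:", stage2, "loss", mse(stage2, xs, ys))
```

In stage 1 the body values stay exactly where they started; in stage 2 they move only slightly, which is the "fine" in fine-tuning.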

raimanu-ds · January 27, 2019, 10:06am

Hi all,

Since yesterday, I have been trying to load an external dataset of images (Fashion MNIST) but was unsuccessful.

I have tried untar_data() but it didn’t work. It seems to load the dataset since the .gz file appeared in the data folder (left panel in my screenshot below) but I get an error.

"Downloaded file {fname} does not match checksum expected! Remove that file from {data_dir} and try your code again."

Should I be using another function? If so, which one?

Thanks for your help.
(P.S. I am using Google Colab)
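
For context on the checksum message quoted above: a check like this generally means the bytes on disk did not hash to the value the library expected, for example because the download was incomplete or the file is not the one the library knows about. The sketch below shows the general idea with Python's hashlib. The helper names and the choice of MD5 are assumptions for illustration, not fastai's actual implementation.

```python
import hashlib
import tempfile
from pathlib import Path

def file_md5(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_checksum(path, expected_hex):
    """Return True when the file's MD5 equals the expected hex digest."""
    return file_md5(path) == expected_hex.lower()

# Small self-contained demo on a temporary file (hypothetical data).
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "demo.bin"
    demo.write_bytes(b"hello")
    ok = matches_checksum(demo, hashlib.md5(b"hello").hexdigest())
    bad = matches_checksum(demo, "0" * 32)

print(ok, bad)
```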

Lngy00 · January 27, 2019, 2:43pm

Thanks!

mgp · January 27, 2019, 6:31pm

Thanks Alison. Didn’t realise we were changing them just a little from their current status. I thought it relaunched the learning from the pre-trained status.

rogerallen · January 28, 2019, 12:17am

2nd UPDATE & FIX: see Unofficial Setup thread (Local, AWS) for better setup instructions. It seems the instructions I was using were out of date. I’ll keep my post below in case someone else hits the same issue.

Trying out Lesson 1, I am hitting an error in the call data.show_batch(rows=3, figsize=(7,6)) and also when I call learn.fit_one_cycle(4). The stack looks like this for the first call:

RuntimeError: Traceback (most recent call last):
  File "/home/rallen/anaconda3/envs/fastaiv3/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 99, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/rallen/anaconda3/envs/fastaiv3/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 99, in <listcomp>
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/rallen/anaconda3/envs/fastaiv3/lib/python3.7/site-packages/fastai/data_block.py", line 486, in __getitem__
    x = x.apply_tfms(self.tfms, **self.tfmargs)
  File "/home/rallen/anaconda3/envs/fastaiv3/lib/python3.7/site-packages/fastai/vision/image.py", line 113, in apply_tfms
    else: x = tfm(x)
  File "/home/rallen/anaconda3/envs/fastaiv3/lib/python3.7/site-packages/fastai/vision/image.py", line 498, in __call__
    return self.tfm(x, *args, **{**self.resolved, **kwargs}) if self.do_run else x
  File "/home/rallen/anaconda3/envs/fastaiv3/lib/python3.7/site-packages/fastai/vision/image.py", line 445, in __call__
    if args: return self.calc(*args, **kwargs)
  File "/home/rallen/anaconda3/envs/fastaiv3/lib/python3.7/site-packages/fastai/vision/image.py", line 450, in calc
    if self._wrap: return getattr(x, self._wrap)(self.func, *args, **kwargs)
  File "/home/rallen/anaconda3/envs/fastaiv3/lib/python3.7/site-packages/fastai/vision/image.py", line 167, in coord
    self.flow = func(self.flow, *args, **kwargs)
  File "/home/rallen/anaconda3/envs/fastaiv3/lib/python3.7/site-packages/fastai/vision/transform.py", line 227, in symmetric_warp
    return _perspective_warp(c, targ_pts, invert)
  File "/home/rallen/anaconda3/envs/fastaiv3/lib/python3.7/site-packages/fastai/vision/transform.py", line 213, in _perspective_warp
    return _apply_perspective(c, _find_coeffs(_orig_pts, targ_pts))
  File "/home/rallen/anaconda3/envs/fastaiv3/lib/python3.7/site-packages/fastai/vision/transform.py", line 194, in _find_coeffs
    return torch.gesv(B,A)[0][:,0]
RuntimeError: B should have at least 2 dimensions, but has 1 dimensions instead

I do see other messages asking about this issue in the Developer chat and Verify_images error threads, but no resolution as yet, or just a suggestion that nightly pytorch should fix it. My versions are fastai=1.0.34 and pytorch=1.0.0.dev20190127.

UPDATE: If you do the default instructions, installing torchvision after pytorch-nightly, you will actually end up with pytorch 0.4.1 installed. Check this with import torch; print(torch.__version__)

The current Lesson 1 doc seems to only work with pytorch 0.4.1. If I follow the default instructions and don’t force-reinstall nightly, things work okay, modulo the fixes some folks above have done. Mainly, adding padding_mode='zeros' to ImageDataBunch.from_name_re() seems to be a way to get fastai to work with pytorch 0.4.1.

I hit this error because I forced pytorch 1.0 to be installed since installing pytorch nightly was the guidance given in the setup. Does fastai really want us to be running with 1.0.xxx or with 0.4.1?
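
Since the post above turns on which pytorch version actually ended up installed, here is a small sketch for classifying a version string such as the one printed by torch.__version__. The helper names are hypothetical, and the strings below are the two versions mentioned in the post; in a real session you would pass torch.__version__ instead.

```python
import re

def major_minor(version):
    """Extract (major, minor) from a version string like '1.0.0.dev20190127'."""
    m = re.match(r"(\d+)\.(\d+)", version)
    if m is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(m.group(1)), int(m.group(2))

def is_pytorch_1x_or_later(version):
    """True for 1.0 and later, False for the 0.4.x series."""
    return major_minor(version) >= (1, 0)

# The two versions mentioned in the post above.
for v in ("1.0.0.dev20190127", "0.4.1"):
    print(v, major_minor(v), is_pytorch_1x_or_later(v))
```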

RogerS49 · January 28, 2019, 9:36am

Karl, are the stats channels in RGB order? Thanks.

AlisonDavey · January 28, 2019, 9:52am

Yes, they are in RGB order. Here’s confirmation: https://pytorch.org/docs/stable/torchvision/models.html
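
To make the channel order concrete, here is a minimal sketch of per-channel normalization. It assumes the stats being discussed are the ImageNet mean and standard deviation listed on the linked torchvision page, and that pixel values are already scaled to [0, 1]; the function name is hypothetical.

```python
# ImageNet normalization stats, listed in (R, G, B) order.
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)

def normalize_pixel(rgb):
    """Normalize one (R, G, B) pixel with values in [0, 1], channel by channel."""
    return tuple((c - m) / s for c, m, s in zip(rgb, IMAGENET_MEAN, IMAGENET_STD))

mean_pixel = normalize_pixel(IMAGENET_MEAN)   # a pixel equal to the mean maps to zeros
red_pixel = normalize_pixel((1.0, 0.0, 0.0))  # pure red: only the first channel is positive

print(mean_pixel)
print(red_pixel)
```

If the channels were fed in BGR order instead, the first and last stats would be applied to the wrong channels.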

jhjensen · January 28, 2019, 9:54am

How can the error rate be only 10% after 1 epoch?

Bryce · January 29, 2019, 3:07am

When I followed the steps in the course, I found that my error rate was always higher than that of the model trained by the teacher, as shown below:
the teacher’s:
Snipaste_2019-01-29_11-04-30
mine:
Snipaste_2019-01-29_10-56-20
Also, when using resnet18 to train on MNIST_SAMPLE, the teacher needed only two epochs to reach an accuracy of 0.9946, while I needed 9 epochs to reach 0.996.
the teacher’s:
Snipaste_2019-01-29_11-05-43
mine:

Is this result normal?

charlybrown · January 29, 2019, 4:13pm

Has anybody had success using an EC2 instance? So far I’m very disappointed with the first notebook. It is riddled with errors that I have to keep resolving.

I’ve run into the issue: ValueError: padding_mode needs to be 'zeros' or 'border', but got reflection

when trying to run data.show_batch(rows=3, figsize=(7,6))

I can’t find any useful information on the rest of the forums.

charlybrown · January 29, 2019, 4:17pm

Found a solution. padding_mode='zeros' must be passed to the from_name_re function.

jf_94 · January 29, 2019, 9:10pm

Hey Bryce. Can you post the parameters you used when training the model? Also, I think it may be due to your GPU’s processing speed. For example, I’m using Colab (for now), and I need to change some of the parameters, mainly the batch size, to get a higher processing speed and a lower error rate.

Bryce · January 30, 2019, 2:58am

Thanks. The parameters I used were the same as in the tutorial. I am also using Colab. I modified the batch size as you suggested, and although the accuracy is a little worse than in the tutorial, it works.

mahirmhd · January 30, 2019, 3:27am

I’m trying to do the Kaggle digit recognizer challenge using fastai. Does anyone know how I can load the data into a DataBunch?

FabianLGB · January 30, 2019, 7:03am

Hello all

I’ve just started the course today and I’m working through Lesson 1. I’m now trying to create a model using my own images. My best error rate so far is around 0.3.

What are the general strategies to follow when your model isn’t performing that well?

My images are pictures of roofs, some with Solar PV and some without, which I pulled off a service that provides aerial photos and which I manually classified (here is the data if you are curious: http://www.dlgb.net/datasets/solar.tgz). The PV system is a relatively small feature of the images. Should I be doing something different with how the Image Bunch is created to make sure I preserve it?

I’m also not confident I’m selecting the learning rate properly. I noticed that in some cases my error rate gets worse after more epochs.

Any other ideas?

yogisin42 · January 30, 2019, 3:17pm

I am trying the State Farm distracted driver competition. I am using the lesson 1 notebook. My learn.recorder.plot() plots nothing.

shiv · January 31, 2019, 3:59am

I am seeing the error below while using ImageDataBunch.from_name_re. Can someone help me?

AttributeError Traceback (most recent call last)
in
----> 1 data = ImageDataBunch.from_name_re(path_img, fnames, pat, ds_tfms = get_transforms(),size=224)

~\AppData\Local\Continuum\anaconda3\envs\fastai-cpu\lib\site-packages\fastai\vision\data.py in from_name_re(cls, path, fnames, pat, valid_pct, **kwargs)
153 pat = re.compile(pat)
154 def _get_label(fn): return pat.search(str(fn)).group(1)
--> 155 return cls.from_name_func(path, fnames, _get_label, valid_pct=valid_pct, **kwargs)
156
157 @staticmethod

~\AppData\Local\Continuum\anaconda3\envs\fastai-cpu\lib\site-packages\fastai\vision\data.py in from_name_func(cls, path, fnames, label_func, valid_pct, **kwargs)
146 "Create from list of fnames in path with label_func."
147 src = ImageItemList(fnames, path=path).random_split_by_pct(valid_pct)
--> 148 return cls.create_from_ll(src.label_from_func(label_func), **kwargs)
149
150 @classmethod

~\AppData\Local\Continuum\anaconda3\envs\fastai-cpu\lib\site-packages\fastai\data_block.py in _inner(*args, **kwargs)
369 assert isinstance(fv, Callable)
370 def _inner(*args, **kwargs):
--> 371 self.train = ft(*args, **kwargs)
372 assert isinstance(self.train, LabelList)
373 self.valid = fv(*args, **kwargs)

~\AppData\Local\Continuum\anaconda3\envs\fastai-cpu\lib\site-packages\fastai\data_block.py in label_from_func(self, func, **kwargs)
229 def label_from_func(self, func:Callable, **kwargs)->'LabelList':
230 "Apply func to every input to get its label."
--> 231 return self.label_from_list([func(o) for o in self.items], **kwargs)
232
233 def label_from_folder(self, **kwargs)->'LabelList':

~\AppData\Local\Continuum\anaconda3\envs\fastai-cpu\lib\site-packages\fastai\data_block.py in <listcomp>(.0)
229 def label_from_func(self, func:Callable, **kwargs)->'LabelList':
230 "Apply func to every input to get its label."
--> 231 return self.label_from_list([func(o) for o in self.items], **kwargs)
232
233 def label_from_folder(self, **kwargs)->'LabelList':

~\AppData\Local\Continuum\anaconda3\envs\fastai-cpu\lib\site-packages\fastai\vision\data.py in _get_label(fn)
152 "Create from list of fnames in path with re expression pat."
153 pat = re.compile(pat)
--> 154 def _get_label(fn): return pat.search(str(fn)).group(1)
155 return cls.from_name_func(path, fnames, _get_label, valid_pct=valid_pct, **kwargs)
156

AttributeError: 'NoneType' object has no attribute 'group'
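
The last frame of the traceback shows that pat.search(str(fn)) returned None, meaning the regular expression did not match at least one filename. The paths in the traceback are Windows-style, so one plausible cause (an assumption, not confirmed in this thread) is a pattern that expects forward slashes. The sketch below lists which filenames fail to match; the patterns and filenames are illustrative assumptions, not taken from the post.

```python
import re

# Assumed pattern in the style of the lesson notebook: expects "/" before the label.
PAT = re.compile(r"/([^/]+)_\d+.jpg$")
# Hypothetical alternative that accepts either "/" or "\" as the separator.
ALT = re.compile(r"[\\/]([^\\/]+)_\d+\.jpg$")

def split_by_match(fnames, pat):
    """Return ({filename: label}, [filenames the pattern fails on])."""
    matched, unmatched = {}, []
    for fn in fnames:
        m = pat.search(str(fn))
        if m is None:
            unmatched.append(str(fn))   # these would raise the AttributeError
        else:
            matched[str(fn)] = m.group(1)
    return matched, unmatched

# Made-up filenames: POSIX-style, Windows-style, and a non-image file.
names = ["images/Abyssinian_1.jpg", r"images\Abyssinian_1.jpg", "images/notes.txt"]

matched, unmatched = split_by_match(names, PAT)
alt_matched, alt_unmatched = split_by_match(names, ALT)
print(unmatched)
print(alt_unmatched)
```

Running a check like this on fnames before building the data bunch shows whether the problem is the path separator, a stray non-image file, or the pattern itself.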

velaia · January 31, 2019, 6:11am

Hi Fabian, I’ve taken a quick look at your dataset and found that there seem to be quite a few labelling mistakes, e.g. the following ‘SOLAR’ roofs seem to actually have no solar installations, whereas others in the ‘NO_SOLAR’ folder could at least have solar-thermal panels:
‘SOLAR’ mistakes:

SOLAR HOUSE - EPSG3857_Date20150424_Lat-31.936264_Lon115.865270_Mpp0.075
SOLAR HOUSE - EPSG3857_Date20150424_Lat-31.935843_Lon115.864105_Mpp0.075
SOLAR HOUSE - EPSG3857_Date20100919_Lat-31.934967_Lon115.857689_Mpp0.075
SOLAR HOUSE - EPSG3857_Date20181222_Lat-31.935361_Lon115.859809_Mpp0.075

‘NO_SOLAR’ mistakes:

NO_SOLAR HOUSE - EPSG3857_Date20101213_Lat-31.934045_Lon115.856484_Mpp0.075
NO_SOLAR HOUSE - EPSG3857_Date20071101_Lat-31.936260_Lon115.865397_Mpp0.075
NO_SOLAR HOUSE - EPSG3857_Date20150424_Lat-31.935851_Lon115.863486_Mpp0.075

I would be very curious what results an improved dataset could achieve!