Lesson 3 Advanced Discussion ✅

fredguth · November 15, 2018, 6:12pm

I want to implement a NN from scratch using the MNIST dataset.
I am using ImageDataBunch to load the files, but for pedagogical reasons I want the 28x28 input to be flattened to (784,). How can I do that using the fastai loading features?

sam2 · November 15, 2018, 7:45pm

More of a Python question, but related to the fastai repo.
I have data = ImageDataBunch.from_folder(…)
Now I successfully get:
data.c >>> 2031

Where does data (an ImageDataBunch) inherit this property from?

dir(data) does not show c or even classes.
Its superclass (i.e. class DataBunch) also does not have this property or method.

I am just trying to learn how to work with this dynamic world of the data_block API.
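
One mechanism that produces exactly this behaviour in Python is attribute delegation through `__getattr__`: when normal lookup fails, the object forwards the request to something it wraps, so the attribute resolves even though `dir()` never lists it. Whether fastai's DataBunch does this is an assumption to check against the source; the classes below are hypothetical stand-ins, not fastai code.

```python
class Dataset:
    """Hypothetical stand-in for a labelled dataset (not a fastai class)."""

    def __init__(self, classes):
        self.classes = classes
        self.c = len(classes)  # number of classes


class Loader:
    """Hypothetical loader that forwards unknown attributes to its dataset."""

    def __init__(self, dataset):
        self.dataset = dataset

    def __getattr__(self, name):
        # Only called when normal attribute lookup has already failed.
        return getattr(self.dataset, name)


class Bunch:
    """Hypothetical bunch that forwards unknown attributes to its loader."""

    def __init__(self, train_dl):
        self.train_dl = train_dl

    def __getattr__(self, name):
        return getattr(self.train_dl, name)


data = Bunch(Loader(Dataset(["cat", "dog", "bird"])))
print(data.c)             # resolved two levels down, on the dataset
print("c" in dir(data))   # dir() only reports attributes found by normal lookup
```

If the library works this way, reading the `__getattr__` methods along the chain is the quickest way to find where a property such as `c` actually lives.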

mrandy · November 15, 2018, 8:13pm

Hi all.

Have spent quite some time but still can’t figure out why learn.get_preds throws an error that:
“index 68000 is out of bounds for axis 0 with size 68000”. Has anyone faced the same issue? Here is a gist for the notebook:

https://gist.github.com/a-safonau/7806c7aa923b42f0715b21f5a0838d6f

Here are the specs of the ImageDataBunch. They look the same as for the MNIST dataset (size of test x ≠ size of test y).

Cheers,
Andrei

mrandy · November 15, 2018, 8:21pm

If anyone faces the same issue: make sure the size of the validation set is ≥ the size of the test set. That fixed it for me.

gbecon · November 15, 2018, 10:07pm

learn.model = torch.nn.DataParallel(learn.model)

worked for me for ResNet models, but when I try the same with

learn = language_model_learner(data_lm, pretrained_model=URLs.WT103, drop_mult=0.3)

with week 4 notebook, I get an error when training:

'DataParallel' object has no attribute 'reset'

Any suggestions how to fix it?

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-13-31b16a16d38f> in <module>
----> 1 learn.fit_one_cycle(1, 2e-2, moms=(0.8,0.7))

~/.conda/envs/fastaiv1/lib/python3.6/site-packages/fastai/train.py in fit_one_cycle(learn, cyc_len, max_lr, moms, div_factor, pct_start, wd, callbacks, **kwargs)
     18     callbacks.append(OneCycleScheduler(learn, max_lr, moms=moms, div_factor=div_factor,
     19                                         pct_start=pct_start, **kwargs))
---> 20     learn.fit(cyc_len, max_lr, wd=wd, callbacks=callbacks)
     21 
     22 def lr_find(learn:Learner, start_lr:Floats=1e-7, end_lr:Floats=10, num_it:int=100, stop_div:bool=True, **kwargs:Any):

~/.conda/envs/fastaiv1/lib/python3.6/site-packages/fastai/basic_train.py in fit(self, epochs, lr, wd, callbacks)
    160         callbacks = [cb(self) for cb in self.callback_fns] + listify(callbacks)
    161         fit(epochs, self.model, self.loss_func, opt=self.opt, data=self.data, metrics=self.metrics,
--> 162             callbacks=self.callbacks+callbacks)
    163 
    164     def create_opt(self, lr:Floats, wd:Floats=0.)->None:

~/.conda/envs/fastaiv1/lib/python3.6/site-packages/fastai/basic_train.py in fit(epochs, model, loss_func, opt, data, callbacks, metrics)
     92     except Exception as e:
     93         exception = e
---> 94         raise e
     95     finally: cb_handler.on_train_end(exception)
     96 

~/.conda/envs/fastaiv1/lib/python3.6/site-packages/fastai/basic_train.py in fit(epochs, model, loss_func, opt, data, callbacks, metrics)
     78         for epoch in pbar:
     79             model.train()
---> 80             cb_handler.on_epoch_begin()
     81 
     82             for xb,yb in progress_bar(data.train_dl, parent=pbar):

~/.conda/envs/fastaiv1/lib/python3.6/site-packages/fastai/callback.py in on_epoch_begin(self)
    197         "Handle new epoch."
    198         self.state_dict['num_batch'] = 0
--> 199         self('epoch_begin')
    200 
    201     def on_batch_begin(self, xb:Tensor, yb:Tensor, train:bool=True)->None:

~/.conda/envs/fastaiv1/lib/python3.6/site-packages/fastai/callback.py in __call__(self, cb_name, call_mets, **kwargs)
    185         "Call through to all of the `CallbakHandler` functions."
    186         if call_mets: [getattr(met, f'on_{cb_name}')(**self.state_dict, **kwargs) for met in self.metrics]
--> 187         return [getattr(cb, f'on_{cb_name}')(**self.state_dict, **kwargs) for cb in self.callbacks]
    188 
    189     def on_train_begin(self, epochs:int, pbar:PBar, metrics:MetricFuncList)->None:

~/.conda/envs/fastaiv1/lib/python3.6/site-packages/fastai/callback.py in <listcomp>(.0)
    185         "Call through to all of the `CallbakHandler` functions."
    186         if call_mets: [getattr(met, f'on_{cb_name}')(**self.state_dict, **kwargs) for met in self.metrics]
--> 187         return [getattr(cb, f'on_{cb_name}')(**self.state_dict, **kwargs) for cb in self.callbacks]
    188 
    189     def on_train_begin(self, epochs:int, pbar:PBar, metrics:MetricFuncList)->None:

~/.conda/envs/fastaiv1/lib/python3.6/site-packages/fastai/callbacks/rnn.py in on_epoch_begin(self, **kwargs)
     16 
     17     def on_epoch_begin(self, **kwargs):
---> 18         self.learn.model.reset()
     19 
     20     def on_loss_begin(self, last_output:Tuple[Tensor,Tensor,Tensor], **kwargs):

~/.conda/envs/fastaiv1/lib/python3.6/site-packages/torch/nn/modules/module.py in __getattr__(self, name)
    523                 return modules[name]
    524         raise AttributeError("'{}' object has no attribute '{}'".format(
--> 525             type(self).__name__, name))
    526 
    527     def __setattr__(self, name, value):

AttributeError: 'DataParallel' object has no attribute 'reset'
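
The traceback shows the callback calling `reset()` on `learn.model`, which is now the `DataParallel` wrapper rather than the wrapped model. A minimal sketch of one possible workaround is a subclass that forwards `reset()` to the wrapped module. The names `ResettableDataParallel` and `TinyModel` are hypothetical, and this only addresses the AttributeError; whether a recurrent model's hidden state behaves correctly under `DataParallel` is a separate question that this sketch does not settle.

```python
import torch.nn as nn


class ResettableDataParallel(nn.DataParallel):
    """DataParallel variant that forwards reset() to the wrapped module."""

    def reset(self):
        # DataParallel keeps the original model in self.module.
        self.module.reset()


class TinyModel(nn.Module):
    """Hypothetical stand-in for a model that exposes reset()."""

    def __init__(self):
        super().__init__()
        self.lin = nn.Linear(4, 2)
        self.was_reset = False

    def reset(self):
        self.was_reset = True

    def forward(self, x):
        return self.lin(x)


model = ResettableDataParallel(TinyModel())
model.reset()  # no AttributeError: the call reaches TinyModel.reset
print(model.module.was_reset)
```

Under these assumptions the usage would be `learn.model = ResettableDataParallel(learn.model)` in place of the plain `torch.nn.DataParallel` wrapper.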

ademyanchuk · November 16, 2018, 2:30am

Maybe someone from the fastai team could help. It seems that test set creation in v1.0.24 is broken (if you add a test set with add_test or add_test_folder). The constructor of ImageItemList takes the length of y for the test set from the validation set.

My workaround was to create train and valid the standard way:
src = (ImageItemList.from_csv(path, 'train.csv', folder='train_combined', suffix='.png')
       .random_split_by_pct(0.2)
       .label_from_df(sep=' '))

And assign test manually like this:
test_item_list = ImageItemList.from_folder(path/'test_combined')
src.test = test_item_list.label_from_list(labels=['0']*len(test_item_list))

I hope this helps someone trying to predict on a test set.
Anyway, it would be nice to hear from the fastai developers what the correct way of preparing a test set is.

adpostma · November 16, 2018, 7:55am

You can convert the grey-scale image into a "pseudo" colour image by copying the single channel three times, and use the resulting three-channel image as input.

champs.jaideep · November 16, 2018, 8:54am

That doesn't give the look of a grey-scale image, I guess… I compared the actual grey image with the pseudo one… Could you please share your method?

adpostma · November 16, 2018, 7:52pm

Code snippet (I do not remember where I got this from):

h, w = grey_im.shape[:2]
pseudo_im = np.zeros((h, w, 3), dtype=np.uint8)
pseudo_im[:, :, 0] = grey_im
pseudo_im[:, :, 1] = grey_im
pseudo_im[:, :, 2] = grey_im

Grey and “pseudo” colour image look the same.
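
The same channel copy can also be written as a single NumPy call. This is a minimal sketch that assumes `grey_im` is a 2-D uint8 array; the toy 3x4 array below is made up purely for illustration.

```python
import numpy as np

# Toy 3x4 grey-scale image (hypothetical data, for illustration only).
grey_im = np.arange(12, dtype=np.uint8).reshape(3, 4)

# Stack three copies of the single channel along a new last axis.
pseudo_im = np.stack([grey_im] * 3, axis=-1)

print(pseudo_im.shape)
```

Because every channel holds identical values, the result renders as grey even though it is stored as a three-channel image.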

jeremy · November 16, 2018, 11:48pm

It works for me on an AWS EC2 P3, but not on SageMaker.

MicPie · November 17, 2018, 7:16am

You can incorporate the flatten operation with the torch view function in your NN forward function:
x = x.view(-1,784)

If you build your NN with nn.Sequential you can use the view function in a Lambda class.

I am currently going through some PyTorch tutorials myself to get going with the principles.
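
A minimal sketch of the `nn.Sequential` variant follows. The `Lambda` class here is a small hand-written wrapper (plain PyTorch has no `nn.Lambda`), and the layer sizes and batch shape are assumptions chosen for MNIST-sized inputs. Recent PyTorch versions also provide `nn.Flatten`, which does the same job without a custom class.

```python
import torch
import torch.nn as nn


class Lambda(nn.Module):
    """Wrap an arbitrary function so it can be used as a layer."""

    def __init__(self, func):
        super().__init__()
        self.func = func

    def forward(self, x):
        return self.func(x)


model = nn.Sequential(
    Lambda(lambda x: x.view(x.size(0), -1)),  # (N, 1, 28, 28) -> (N, 784)
    nn.Linear(784, 50),
    nn.ReLU(),
    nn.Linear(50, 10),
)

xb = torch.randn(16, 1, 28, 28)  # stand-in batch of MNIST-sized images
out = model(xb)
print(out.shape)
```

Flattening inside the model means the data loading can stay unchanged and still deliver 28x28 images.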

a_yasyrev · November 17, 2018, 7:44am

I am working with protein, version 1.0.24, and the test set is assigned without a problem.
My problem is with the dataset items: they are Image objects, so I can't access the item names and can't work out how to prepare the submission file…

ritika26 · November 17, 2018, 8:15am

I am also facing a similar issue with the data block API. I used the standard API instead and it is working fine:

data = ImageDataBunch.from_csv(path, folder='train-images', valid_pct=0.2, csv_labels='labels.csv',
                               ds_tfms=get_transforms(), size=256, bs=16, padding_mode='zeros', num_workers=0)

miwojc · November 17, 2018, 1:31pm

Did you try using the same random seed, for example random_seed(42), for the two APIs? Possibly the random split into train and valid is causing the error, due to some classes not being present in both train and valid…

ritika26 · November 17, 2018, 1:34pm

No, I have not tried a random seed. Let me try with one.

Thanks,
Ritika

jeremy · November 17, 2018, 8:25pm

If you grab the .items attr instead then you’ll find the filenames there.

keijik · November 18, 2018, 1:20am

Can somebody please tell me how to display the output of the unet?

I did a learn.predict on an img. I assume there is some helper function to map from the 32 code dimensions to color values?

fredguth · November 18, 2018, 2:28am

Print what you have in the output, so we can help.

keijik · November 18, 2018, 3:02am

Here is a snippet.

jeremy · November 18, 2018, 4:19am

You can use argmax as we discussed in class to get the class indexes from that.
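
A minimal sketch of that argmax step, assuming the prediction is an array of per-class scores shaped (classes, height, width). The random scores and the random colour palette below are hypothetical stand-ins, not real model output or the dataset's actual colour codes.

```python
import numpy as np

rng = np.random.default_rng(0)
n_classes, h, w = 32, 4, 6

# Stand-in for the model output: one score map per class.
output = rng.normal(size=(n_classes, h, w))

# For each pixel, pick the index of the highest-scoring class.
class_idx = output.argmax(axis=0)  # shape (h, w)

# Hypothetical palette: one RGB colour per class index.
palette = rng.integers(0, 256, size=(n_classes, 3), dtype=np.uint8)

# Indexing the palette with the class map yields a displayable RGB image.
colour_im = palette[class_idx]  # shape (h, w, 3)

print(class_idx.shape, colour_im.shape)
```

The resulting `colour_im` can be passed to any image display function; with a real dataset the palette would come from its class-to-colour mapping.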