Making predictions in v1


#22

Is there any guarantee about the order of these results with regard to the order of the input CSV? I ask because, as far as I can tell, the y value is always 0, since there is no place to tell the model what the truth is for the test data. So I’m merging the predictions with my CSV downstream.

I’m getting decent (~95%) accuracy on my validation, but 50% on my test data. It’s possible that I’m overfitting, but I suspect I’m doing data munging incorrectly.


(Ilia) #23

In v0.7 you could get the actual order of test samples from data.test_ds.fnames. In v1 I don’t see it, but I’m pretty sure [filepath.stem for filepath in path_of_your_test_folder.iterdir()], with path_of_your_test_folder being a pathlib Path, should give you the actual order of your test files.


#24

This is really helpful. It is quite surprising to me that the order is based on the sort order in the directory, rather than the order in the csv!


#25

Just looping back to confirm that mapping based on the sort order in the directory (as @tetelias suggested) rather than the CSV (as I had been doing previously) solved my problem. My test set predictions were 95% accurate, very similar to my validation set.
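The fix described above amounts to joining predictions to CSV rows by file name instead of by position. A minimal sketch of that idea follows; the column names, file stems, and values are made up for illustration, and the helper merge_by_name is hypothetical, not part of fastai.

```python
import csv
import io

def merge_by_name(csv_text, preds_by_name, key="id"):
    """Attach each prediction to the CSV row that has the same file stem."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Look up by name, so the row order in the CSV does not matter.
        rows.append({**row, "pred": preds_by_name[row[key]]})
    return rows

# Hypothetical data: the CSV lists files in a different order than the folder.
csv_text = "id,label\nb,1\na,0\nc,1\n"
preds_by_name = {"a": 0, "b": 1, "c": 1}  # keyed by file stem
merged = merge_by_name(csv_text, preds_by_name)
```

Because the lookup is keyed on the name, the result stays correct whatever order the predictions were produced in.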


#26

Thanks for the feedback. We’ll be looking at how to make the matching easier on the test set in future developments.


(Jeremy Howard (Admin)) #27

This is now: test_ds.x.


(Ilia) #28

In v1.0.5 installed from pip, data has test_dl but not test_ds. :)


#29

Same, using the master branch.

I loaded my data like this:

data = ImageDataBunch.from_csv(
    DATA_PATH,
    folder='processed_train',
    test=DATA_PATH/'processed_test',
    csv_labels='train.csv',
    sep=' ',
    suffix='.png',
    bs=32
)

I get a train_dl, train_ds, valid_dl, valid_ds, and a test_dl but no test_ds.



#30

Ahh, digging a little further it appears the test filepaths are available here: data.test_dl.dl.dataset.x

If I call get_preds on the test set like this:
test_preds, test_y = learn.get_preds(is_test=True)

Will the order of the images in learn.data.test_dl.dl.dataset.x match the order of the predictions in test_preds?

Thanks!


(Jeremy Howard (Admin)) #31

Should be. BTW some of those classes have __getattr__ defined so you can probably get rid of dl or dataset in your call.


#32

Yes, data.test_dl.dataset.x works in place of data.test_dl.dl.dataset.x.
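Assuming the order really does match, as suggested above, pairing file names with prediction rows could be sketched as follows. The helper pair_predictions and the sample paths and scores are hypothetical; in the thread's setup the paths would come from data.test_dl.dataset.x and the rows from something like test_preds.tolist().

```python
from pathlib import Path

def pair_predictions(paths, pred_rows):
    """Map each file stem to its prediction row, assuming both share one order."""
    if len(paths) != len(pred_rows):
        raise ValueError("paths and pred_rows must have the same length")
    return {Path(p).stem: row for p, row in zip(paths, pred_rows)}

# Hypothetical stand-ins for the test file paths and the prediction rows.
paths = ["processed_test/img_2.png", "processed_test/img_1.png"]
pred_rows = [[0.1, 0.9], [0.8, 0.2]]
by_stem = pair_predictions(paths, pred_rows)
```

The length check is a cheap guard: a mismatch means the two sequences cannot be aligned by position at all.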


#33

Thanks Jeremy, indeed I can get rid of ‘dl’ in my call.

Unfortunately get_preds() is giving me results I don’t think I understand. For example, when I call get_preds on the validation set I’m expecting it to return the predictions and known targets. However some of the targets it returns don’t match any of the labels from my csv file.

Steps

  1. Load data

data = ImageDataBunch.from_csv(DATA_PATH, folder='processed_train', test=DATA_PATH/'processed_test', csv_labels='train.csv', sep=' ', suffix='.png')

  2. Create learner and train

loss_fn = F.binary_cross_entropy_with_logits
learn = ConvLearner(data, tvm.resnet18, loss_fn=loss_fn, metrics=fbeta)
learn.fit_one_cycle(1, 0.01)

  3. Get predictions and targets from validation set

preds, targets = learn.get_preds()

  4. Inspect a target (28 classes from my train.csv)

This is an unusual combination so I wanted to have a look at this particular image. The above target should correspond to a label of ‘1 2 3 4’ in my train.csv…however this label doesn’t exist!

My train.csv looks like this (28 possible labels):
[Screenshot: train.csv]

Have I made some silly mistake? Is my expectation incorrect? Or is there an issue with what get_preds is returning or perhaps how the labels were read in from train.csv?


(RobG) #34

Note that there is also a nice holdout capability that can reduce the amount of duplicated code you may need to get validation and test predictions. Instead of data.test_dl and data.valid_dl you can use data.holdout(is_test=True) and data.holdout(is_test=False) respectively.


(Thomas) #35

I am unable to make predictions on my test data: I always get a CUDA out-of-memory error, regardless of batch size.
The progress bar reaches the end, so 100% of my test data passes through the model, but the error occurs when the output is computed.
I don’t have problems with the validation set.

out = learn.get_preds(is_test=True)
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-132-588810aa18f3> in <module>
----> 1 out = learn.get_preds(is_test=True)

~/fastai/fastai/basic_train.py in get_preds(self, is_test)
    175     def get_preds(self, is_test:bool=False) -> List[Tensor]:
    176         "Return predictions and targets on the valid or test set, depending on `is_test`."
--> 177         return get_preds(self.model, self.data.holdout(is_test), cb_handler=CallbackHandler(self.callbacks))
    178 
    179 @dataclass

~/fastai/fastai/basic_train.py in get_preds(model, dl, pbar, cb_handler)
     36 def get_preds(model:Model, dl:DataLoader, pbar:Optional[PBar]=None, cb_handler:Optional[CallbackHandler]=None) -> List[Tensor]:
     37     "Predict the output of the elements in the dataloader."
---> 38     return [torch.cat(o).cpu() for o in zip(*validate(model, dl, pbar=pbar, cb_handler=cb_handler, average=False))]
     39 
     40 def validate(model:Model, dl:DataLoader, loss_fn:OptLossFunc=None,

~/fastai/fastai/basic_train.py in validate(model, dl, loss_fn, metrics, cb_handler, pbar, average)
     47         for xb,yb in progress_bar(dl, parent=pbar, leave=(pbar is not None)):
     48             if cb_handler: xb, yb = cb_handler.on_batch_begin(xb, yb, train=False)
---> 49             val_metrics.append(loss_batch(model, xb, yb, loss_fn, cb_handler=cb_handler, metrics=metrics))
     50             if not is_listy(yb): yb = [yb]
     51             nums.append(yb[0].shape[0])

~/fastai/fastai/basic_train.py in loss_batch(model, xb, yb, loss_fn, opt, cb_handler, metrics)
     17     if not is_listy(xb): xb = [xb]
     18     if not is_listy(yb): yb = [yb]
---> 19     out = model(*xb)
     20     out = cb_handler.on_loss_begin(out)
     21 

~/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    475             result = self._slow_forward(*input, **kwargs)
    476         else:
--> 477             result = self.forward(*input, **kwargs)
    478         for hook in self._forward_hooks.values():
    479             hook_result = hook(self, input, result)

~/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/nn/modules/container.py in forward(self, input)
     90     def forward(self, input):
     91         for module in self._modules.values():
---> 92             input = module(input)
     93         return input
     94 

~/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    475             result = self._slow_forward(*input, **kwargs)
    476         else:
--> 477             result = self.forward(*input, **kwargs)
    478         for hook in self._forward_hooks.values():
    479             hook_result = hook(self, input, result)

~/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/nn/modules/conv.py in forward(self, input, output_size)
    724         return F.conv_transpose2d(
    725             input, self.weight, self.bias, self.stride, self.padding,
--> 726             output_padding, self.groups, self.dilation)
    727 
    728 

RuntimeError: CUDA error: out of memory

It is probably due to how I created my test dataset: SegmentationDataset expects x and y values, so I just called the constructor with (x, x), as I don’t have y values for the test data.

Then used:

def get_tfm_datasets(path, val_idxs, size):
    datasets = get_datasets(path, val_idxs)
    tfms = get_transforms(do_flip=True, max_rotate=4, max_lighting=0.2, max_warp=0.15)
    return transform_datasets(train_ds, valid_ds, test_ds=test_ds, tfms=tfms, tfm_y=True, size=size, padding_mode='border')
  
train_tds, _, _= get_tfm_datasets(PATH128, range(400), 128)

to get the transformed datasets.

Any idea?


(Thomas) #36

My GPU RAM is not being freed:

=== Software === 
python version  : 3.6.6
fastai version  : 1.0.6.dev0
torch version   : 1.0.0.dev20181015
nvidia driver   : 396.54
torch cuda ver  : 9.2.148
torch cuda is   : available
torch cudnn ver : 7104
torch cudnn is  : enabled

=== Hardware === 
nvidia gpus     : 1
torch available : 1
  - gpu0        : 8119MB | Quadro P4000

=== Environment === 
platform        : Linux-4.4.0-130-generic-x86_64-with-debian-stretch-sid
distro          : Ubuntu 16.04 Xenial Xerus
conda env       : fastai
python          : /home/paperspace/anaconda3/envs/fastai/bin/python
sys.path        : 
/home/paperspace/anaconda3/envs/fastai/lib/python36.zip
/home/paperspace/anaconda3/envs/fastai/lib/python3.6
/home/paperspace/anaconda3/envs/fastai/lib/python3.6/lib-dynload
/home/paperspace/anaconda3/envs/fastai/lib/python3.6/site-packages
/home/paperspace/fastai
/home/paperspace/anaconda3/envs/fastai/lib/python3.6/site-packages/IPython/extensions
/home/paperspace/.ipython

Wed Oct 17 08:21:36 2018       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 396.54                 Driver Version: 396.54                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro P4000        Off  | 00000000:00:05.0 Off |                  N/A |
| 46%   34C    P8     5W / 105W |   8105MiB /  8119MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      2019      C   ...rspace/anaconda3/envs/fastai/bin/python  8095MiB |
+-----------------------------------------------------------------------------+


(Jeremy Howard (Admin)) #37

get_preds isn’t likely to be a great option for segmentation - you don’t want all your images in memory at once! Try doing your work a batch at a time.
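A rough sketch of what working "a batch at a time" could look like, in plain Python so it runs without a GPU. The names iter_batches, predict_in_batches, and fake_model are hypothetical stand-ins, not fastai APIs; with a real PyTorch model the call would typically sit inside torch.no_grad(), with each batch's output moved to the CPU or reduced before the next batch runs.

```python
def iter_batches(items, bs):
    """Yield consecutive slices of `items` with at most `bs` elements each."""
    for start in range(0, len(items), bs):
        yield items[start:start + bs]

def predict_in_batches(model, items, bs, reduce_fn):
    """Run `model` on one batch at a time and keep only `reduce_fn(output)`."""
    results = []
    for batch in iter_batches(items, bs):
        outputs = model(batch)
        # Keep a small summary per item; `outputs` is rebound on the next
        # iteration, so only one batch of full-size outputs is alive at a time.
        results.extend(reduce_fn(o) for o in outputs)
    return results

# Stand-in "model": returns a fake 4-pixel mask for each input.
def fake_model(batch):
    return [[x, x, 0, 0] for x in batch]

coverage = predict_in_batches(fake_model, [1, 0, 1, 1, 0], bs=2, reduce_fn=sum)
```

The design choice is to reduce or save each batch's output immediately, instead of concatenating every full-size mask at the end as get_preds does in the traceback above.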


#38

I think you should look at your train_ds.classes. It’s possible that the 0 label corresponds to another number in your dataframe.
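To illustrate the point about classes: a multi-hot target is a set of positions in the class list, not the label values themselves. The sketch below uses a hypothetical class list sorted as strings; whether fastai orders the classes this way is an assumption made only for illustration, so check train_ds.classes for the real order.

```python
def decode_target(target, classes):
    """Return the class names whose positions are switched on in `target`."""
    return [classes[i] for i, flag in enumerate(target) if flag]

# Hypothetical class list: the 28 labels sorted as strings, not as numbers.
classes = sorted(str(i) for i in range(28))

# A target with positions 1, 2, 3 and 4 switched on.
target = [0, 1, 1, 1, 1] + [0] * 23
labels = decode_target(target, classes)
```

With this ordering, positions 1 to 4 decode to the labels '1', '10', '11', '12' rather than '1 2 3 4', which is the kind of mismatch described above.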


#39

Using an imaging dataset with a binary classifier and the default cross-entropy loss, I’m getting good classification (98% correct on a balanced set) on data that is visually distinguishable to a human (that is, I believe it’s possible to be this good).

The one odd thing to me is that in the test set, the log_prob is always one of two values:

learn = ConvLearner(data, 
                    tvm.resnet50, 
                    metrics=[accuracy, dice], 
                    callback_fns=ShowGraph)

learn.fit_one_cycle(1)

# Update all of the layers
learn.unfreeze()
learn.fit_one_cycle(24, slice(1e-4, 2e-2))#, pct_start=0.05)

# Predict on the test set
test_output = learn.get_preds(is_test=True)
log_probs, y = test_output

mpl.pyplot.hist(test_results['logprob'])

[Image: histogram of test_results['logprob']]

It seems like I must be doing something wrong - I would expect a distribution of values rather than just two values. Any pointers?


(Mikhail Ksenzov) #40

I also ran into complications with predictions. I am on fastai 1.0.28, working through the standard Kaggle dogs-vs-cats example with this directory structure:

$ ls datasets/dogs-vs-cats/

README.md sampleSubmission.csv test1 test1.zip train train.zip

I create my data bunch as:

data = ImageDataBunch.from_name_re(path=path, 
                               fnames=fnames, 
                               pat=r"/(dog|cat)\.\d+\.jpg$", 
                               ds_tfms=get_transforms(),
                               size=224, 
                               bs=BATCH_SIZE,
                               test='test1',
                               suffix='.jpg')

and test set seems to be empty:

>>> data.test_ds

LabelList
y: CategoryList (1 items)
[]...
Path: datasets/dogs-vs-cats/train
x: ImageItemList (1 items)
[]...
Path: datasets/dogs-vs-cats/train

Am I invoking the constructor incorrectly?


(Mikhail Ksenzov) #41

Also, to confirm: the test set does not have to conform to the same pat regexp, right? If so, do we need suffix at all, and is it used for the test set only (since train/validation should be governed by pat)?
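For reference, here is what the pat regexp from the post above does on its own, outside fastai. This is a sketch that assumes the pattern is applied with search semantics; the file paths are made-up examples and label_from_name is a hypothetical helper. It does not show whether fastai applies pat to test files.

```python
import re

# The pattern from the post: a label, a dot, digits, then the .jpg suffix.
PAT = re.compile(r"/(dog|cat)\.\d+\.jpg$")

def label_from_name(path):
    """Return the captured label, or None if the path does not match."""
    match = PAT.search(path)
    return match.group(1) if match else None

train_label = label_from_name("train/dog.123.jpg")  # name carries a label
test_label = label_from_name("test1/42.jpg")        # name has no label
```

A file name with no dog or cat prefix does not match, so no label can be extracted from it with this pattern.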