Beginner Question: How to predict on Test Set

navneetkrch · November 24, 2018, 3:46am

Hi guys,
Context: DogBreedClassifier
I am new to DL. I have trained my model and saved its weights.
Now I want to use this model to predict on the test folder and update my submission.csv file.
Thanks in advance

navneetkrch · November 25, 2018, 7:26am

navneetkrch · November 25, 2018, 7:43am

github.com

fastai/fastai/blob/13d1ac0af6de16b16cf156d981b5340ca9e00df4/fastai/vision/data.py

"`vision.data` manages data input pipeline - folders -> transform -> batch input. Includes support for classification, segmentation and bounding boxes"
from ..torch_core import *
from .image import *
from .transform import *
from ..data_block import *
from ..data_block import _df_to_fns_labels
from ..basic_data import *
from ..layers import CrossEntropyFlat
from concurrent.futures import ProcessPoolExecutor, as_completed
import PIL

__all__ = ['get_image_files', 'DatasetTfm', 'ImageClassificationDataset', 'ImageMultiDataset', 'ObjectDetectDataset',
           'SegmentationDataset', 'ImageClassificationBase', 'denormalize', 'get_annotations', 'ImageDataBunch', 'ImageFileList', 'normalize',
           'normalize_funcs', 'show_image_batch', 'transform_datasets', 'SplitDatasetsImage', 'channel_view',
           'mnist_stats', 'cifar_stats', 'imagenet_stats', 'download_images', 'verify_images', 'bb_pad_collate']

image_extensions = set(k for k,v in mimetypes.types_map.items() if v.startswith('image/'))

def get_image_files(c:PathOrStr, check_ext:bool=True, recurse=False)->FilePathList:
    "Return list of files in `c` that are images. `check_ext` will filter to `image_extensions`."

This file has been truncated.

def split_data_add_test_folder(self, test_folder:str='test', label:Any=None):
    "Add test set containing items from folder test_folder and an arbitrary label"
    items = ImageFileList.from_folder(self.path/test_folder)
    return self.add_test(items, label=label)

SplitData.add_test_folder = split_data_add_test_folder

We can add the test folder using this.

navneetkrch · November 25, 2018, 7:46am

File structure for my scenario:

path\
    train\
        clas1\
        clas2\
        ...
    valid\
        clas1\
        clas2\
        ...
    test\

Workaround:
path = 'data/train'
Also, when defining the data, I had to do this:
data = ImageDataBunch.from_folder(path, train='train', valid_pct=0.2, test='…/test', ds_tfms=tfms, size=224)
Note that I had to define test = '…/input' here;
with path set to 'data/', I somehow could not get test_ds loaded.
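For anyone puzzled by why a relative test path can work here: the helper quoted earlier builds the test location as `self.path/test_folder`, so the test argument is joined onto `path`. Below is a minimal standard-library sketch of that join. It is not fastai code; `resolve_test_dir` and the `'../test'` value are hypothetical, chosen only to illustrate a relative path that climbs out of `data/train` (the path in the post above is shown elided and may differ).

```python
import tempfile
from pathlib import Path

def resolve_test_dir(path, test):
    """Join `test` onto `path` and normalise any '..' segments (hypothetical helper)."""
    return (Path(path) / test).resolve()

# Build a throwaway layout: <root>/data/train and <root>/data/test
root = Path(tempfile.mkdtemp()).resolve()
(root / "data" / "train").mkdir(parents=True)
(root / "data" / "test").mkdir()

# Assumed relative test path: one level up from data/train, then into test
resolved = resolve_test_dir(root / "data" / "train", "../test")
found = resolved.is_dir()
print(resolved, found)
```

Printing the resolved path like this before building the data is a cheap way to check that the test folder is where you expect it to be.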

matthewarthur · February 9, 2019, 10:39am

Should I be able to mix different from_* sources in the same databunch() API call? I have a dataframe as input for my train/valid images and I want to use a folder for my test images. How should I approach this?

sgugger · February 9, 2019, 1:47pm

When you call add_test you can pass any ItemList, so you can create your train/valid sets with ImageItemList.from_df and then, when you add the test set, use:

.add_test(ImageItemList.from_folder(...))

or, even more directly, add_test_folder should work.

mmiakashs · March 17, 2019, 11:41am

Thanks for the explanation, it works. But I was not able to retrieve the test image file names.

sgugger · March 17, 2019, 2:36pm

The test image filenames will be in data.test_ds.x.items.
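To tie this back to the original question about submission.csv, here is a rough sketch of pairing those filenames with predictions. It uses only the standard library and assumes things that are not stated in this thread: that the probabilities come from something like `learn.get_preds(ds_type=DatasetType.Test)`, that class names come from `data.classes`, and that the submission wants an `id` column followed by one probability column per class. `write_submission` is a hypothetical helper, and the sample data is made up; check the competition's sample submission for the real layout.

```python
import csv
import tempfile
from pathlib import Path

def write_submission(filenames, probs, classes, out_path):
    """Write one row per test image: file stem as id, then one probability per class."""
    if len(filenames) != len(probs):
        raise ValueError("need exactly one probability row per test file")
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["id", *classes])
        for name, row in zip(filenames, probs):
            if len(row) != len(classes):
                raise ValueError("each row needs one probability per class")
            # Path(name).stem drops both the directory and the file extension
            writer.writerow([Path(name).stem, *(f"{p:.6f}" for p in row)])

# Made-up stand-ins for data.test_ds.x.items, the predictions and data.classes
filenames = ["test/a1.jpg", "test/b2.jpg"]
probs = [[0.9, 0.1], [0.25, 0.75]]
classes = ["beagle", "pug"]

out_path = Path(tempfile.mkdtemp()) / "submission.csv"
write_submission(filenames, probs, classes, out_path)

# Read the file back to confirm what was written
with open(out_path, newline="") as f:
    rows = list(csv.reader(f))
print(rows)
```

The order of the filenames and the order of the prediction rows must match; the length check only catches a mismatch in count, not in ordering.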

kapusta · May 26, 2020, 1:56pm

When using the .add_test(…) approach, is the test data normalized automatically the same way as train data or does that have to be specified when adding the test itemlist?

muellerzr · May 26, 2020, 2:09pm

Yes, it is normalized automatically: anything that is applied to the validation set is also applied to the test set here.
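As a toy illustration of that idea (not fastai's implementation), the sketch below fixes the normalization statistics once and applies the same function to every split. The per-channel values are the widely used ImageNet mean and standard deviation; `make_normalizer` and the sample pixels are hypothetical.

```python
# Per-channel ImageNet statistics commonly used with pretrained models
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)

def make_normalizer(mean, std):
    """Return a function that normalizes one RGB pixel with fixed statistics."""
    def normalize(pixel):
        return tuple((c - m) / s for c, m, s in zip(pixel, mean, std))
    return normalize

# The statistics are chosen once...
normalize = make_normalizer(IMAGENET_MEAN, IMAGENET_STD)

# ...and the same function is applied to train, valid and test alike
splits = {
    "train": [(0.485, 0.456, 0.406)],
    "valid": [(1.0, 1.0, 1.0)],
    "test": [(0.485, 0.456, 0.406)],
}
normalized = {
    name: [normalize(p) for p in pixels] for name, pixels in splits.items()
}
print(normalized["test"])
```

Because the test split goes through the same function as the other splits, a pixel equal to the mean maps to zero in every split.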