Custom ItemList, getting ForkingPickler broken pipe

charming · May 9, 2019, 2:05pm

Is there a good way to add a test set correctly? :grin:

etremblay · May 9, 2019, 2:25pm

If you are using MixedItemList, I could not make it work because it doesn't work like the other ItemLists.

What I did in my code was trick fastai into thinking the test set is actually the validation set, and then get predictions for the validation set.
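
A minimal sketch of the index bookkeeping behind that workaround, assuming the test rows are appended after the training rows so that their positions can serve as validation indices. The row names and the helper `build_valid_idx` are hypothetical. In fastai v1 such indices could be handed to `split_by_idx(valid_idx)` and the predictions read with `learn.get_preds(ds_type=DatasetType.Valid)`, but whether the original code did exactly this is an assumption.

```python
def build_valid_idx(n_train, n_test):
    """Positions of the test rows once they are appended after the training rows."""
    return list(range(n_train, n_train + n_test))

# Hypothetical stand-ins for the real training and test items.
train_rows = ['train_a', 'train_b', 'train_c']
test_rows = ['test_x', 'test_y']

# One combined list: training rows first, test rows last.
all_rows = train_rows + test_rows
valid_idx = build_valid_idx(len(train_rows), len(test_rows))

# The rows that would be treated as the "validation" set.
pseudo_valid = [all_rows[i] for i in valid_idx]
print(valid_idx, pseudo_valid)
```

Note that the training part stays non-empty here, and that any validation metrics computed on such a split are meaningless if the test rows only carry placeholder labels.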

charming · May 9, 2019, 2:42pm

Yes, thank you for your excellent work, which has helped me a lot. However, if I set the training set size to 0 in fastai, an error is reported :pensive:

herrmann · May 23, 2019, 8:59pm

Hopefully fixed in master now (thanks to @sgugger's quick response): https://github.com/fastai/fastai/pull/2107

etremblay · May 24, 2019, 2:24pm

Thanks @herrmann!

alpsayin · August 24, 2019, 9:39am

Hi Etienne,

Thanks a lot for your wonderful work and also for kindly sharing it with the rest of us (and special thanks to sgugger). I coincidentally found myself following the exact path you followed. I was wondering if you were able to use the learn.predict method after training?

I am using a model only with tabular and image data and successfully trained it. But no luck on predict. This was sort of an effort to compile a list of predictions on validation set and “plot_top_losses” manually for inspection

But the funny thing is that the error I'm getting is a KeyError from pd.Categorical, called by fastai's Categorify proc on the data. Below is what I think is causing the problem: the creation of a dataframe from two copies of TabularLine (fastai.tabular.data line 45). I tried to temporarily replace the TabularLine inside my MixedItem with a custom object to make those two lines "happy", but ended up creating more problems that I understand even less. I've included the full traceback from when I call predict.

Final note: I'm looking to see if I made some obvious mistake. Otherwise, I think the answer to my (and any) MixedItemList-related question is to wait for v2. Also, apologies for the long post.

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/fastai/tabular/data.py in process_one(self, item)
     44     def process_one(self, item):
     45         df = pd.DataFrame([item,item])
---> 46         for proc in self.procs: proc(df, test=True)

Partial code (proven to work with images only) and the full traceback are below:

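import copy

# `data` (the DataBunch) and `learn` (the trained Learner) come from earlier notebook cells.
predicted_list = []  # rows of [item, abs_error, truth, predicted]; filled by the commented-out lines
dup_valid_dl = copy.deepcopy(data.valid_dl)
for item in dup_valid_dl.dataset:
    # print(f'{item}')
    data_item, label = item
    # A MixedItem keeps its parts in .obj: image first, tabular row second.
    img_item, tab_item = data_item.obj[0], data_item.obj[1]
    print(f'Tab_Item:\t{tab_item}')
    print(f'Img_Item:\t{img_item}')
    print(f'Label:\t{label}')
    predicted = learn.predict(data_item)[0].data[0]  # this call raises the KeyError
    # print(f'{predicted}')
    # truth = label.data
    # error = truth-predicted
    # predicted_list.append([item, np.abs(error), truth, predicted])
    # print(f'{truth:.3f}-{predicted:.3f}={error:.3f}')
    # print(f'======================================================================================')
# Largest absolute error first.
predicted_list = sorted(predicted_list, key=lambda x: -x[1])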

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2656             try:
-> 2657                 return self._engine.get_loc(key)
   2658             except KeyError:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index_class_helper.pxi in pandas._libs.index.Int64Engine._check_type()

KeyError: 'port_id'

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
<ipython-input-175-383315a78990> in <module>()
     20     print(my_df)
     21 
---> 22     predicted = learn.predict( (data_item) )[0].data[0]
     23 #     print(f'{predicted}')
     24 #     truth = label.data

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/fastai/basic_train.py in predict(self, item, return_x, batch_first, with_dropout, **kwargs)
    372     def predict(self, item:ItemBase, return_x:bool=False, batch_first:bool=True, with_dropout:bool=False, **kwargs):
    373         "Return predicted class, label and probabilities for `item`."
--> 374         batch = self.data.one_item(item)
    375         res = self.pred_batch(batch=batch, with_dropout=with_dropout)
    376         raw_pred,x = grab_idx(res,0,batch_first=batch_first),batch[0]

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/fastai/basic_data.py in one_item(self, item, detach, denorm, cpu)
    178         "Get `item` into a batch. Optionally `detach` and `denorm`."
    179         ds = self.single_ds
--> 180         with ds.set_item(item):
    181             return self.one_batch(ds_type=DatasetType.Single, detach=detach, denorm=denorm, cpu=cpu)
    182 

~/anaconda3/envs/pytorch_p36/lib/python3.6/contextlib.py in __enter__(self)
     79     def __enter__(self):
     80         try:
---> 81             return next(self.gen)
     82         except StopIteration:
     83             raise RuntimeError("generator didn't yield") from None

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/fastai/data_block.py in set_item(self, item)
    606     def set_item(self,item):
    607         "For inference, will briefly replace the dataset with one that only contains `item`."
--> 608         self.item = self.x.process_one(item)
    609         yield None
    610         self.item = None

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/fastai/data_block.py in process_one(self, item, processor)
     88         if processor is not None: self.processor = processor
     89         self.processor = listify(self.processor)
---> 90         for p in self.processor: item = p.process_one(item)
     91         return item
     92 

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/fastai/data_block.py in process_one(self, item)
    754         res = []
    755         for procs, i in zip(self.procs, item):
--> 756             for p in procs: i = p.process_one(i)
    757             res.append(i)
    758         return res

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/fastai/tabular/data.py in process_one(self, item)
     44     def process_one(self, item):
     45         df = pd.DataFrame([item,item])
---> 46         for proc in self.procs: proc(df, test=True)
     47         if len(self.cat_names) != 0:
     48             codes = np.stack([c.cat.codes.values for n,c in df[self.cat_names].items()], 1).astype(np.int64) + 1

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/fastai/tabular/transform.py in __call__(self, df, test)
    122         "Apply the correct function to `df` depending on `test`."
    123         func = self.apply_test if test else self.apply_train
--> 124         func(df)
    125 
    126     def apply_train(self, df:DataFrame):

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/fastai/tabular/transform.py in apply_test(self, df)
    143         "Transform `self.cat_names` columns in categorical using the codes decided in `apply_train`."
    144         for n in self.cat_names:
--> 145             df.loc[:,n] = pd.Categorical(df[n], categories=self.categories[n], ordered=True)
    146 
    147 FillStrategy = IntEnum('FillStrategy', 'MEDIAN COMMON CONSTANT')

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/pandas/core/frame.py in __getitem__(self, key)
   2925             if self.columns.nlevels > 1:
   2926                 return self._getitem_multilevel(key)
-> 2927             indexer = self.columns.get_loc(key)
   2928             if is_integer(indexer):
   2929                 indexer = [indexer]

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2657                 return self._engine.get_loc(key)
   2658             except KeyError:
-> 2659                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   2660         indexer = self.get_indexer([key], method=method, tolerance=tolerance)
   2661         if indexer.ndim > 1 or indexer.size > 1:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index_class_helper.pxi in pandas._libs.index.Int64Engine._check_type()

KeyError: 'port_id'
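
One possible reading of the traceback above: the `Int64Engine` frames suggest that the DataFrame built by `pd.DataFrame([item,item])` ended up with integer column labels, so the lookup of `'port_id'` fails. pandas does this when the rows carry no column names, whereas rows passed as `pd.Series` keep their index as column names. The sketch below shows only this pandas behavior. The `weight` column, the values, and the category list are made up for illustration, and whether fastai's `process_one` expects a raw `pd.Series` here is an assumption, not something confirmed in this thread.

```python
import pandas as pd

# Hypothetical row values; only the 'port_id' name comes from the traceback.
named_row = pd.Series({'port_id': 'A', 'weight': 2.5})
positional_row = ['A', 2.5]  # same values, but no column names attached

# Rows given as Series: the Series index becomes the column labels.
df_named = pd.DataFrame([named_row, named_row])
print(list(df_named.columns))

# Rows given as plain sequences: pandas falls back to integer labels.
df_positional = pd.DataFrame([positional_row, positional_row])
print(list(df_positional.columns))

# Looking up a named column on the integer-labelled frame raises KeyError.
try:
    df_positional['port_id']
    lookup_error = None
except KeyError as err:
    lookup_error = err
print(repr(lookup_error))

# With named columns, the same kind of Categorical call as in apply_test works.
cat = pd.Categorical(df_named['port_id'], categories=['A', 'B'], ordered=True)
print(list(cat.codes))
```

If this reading is right, passing the original dataframe row for the tabular part, rather than an already processed item, might avoid the KeyError, but that has not been tested here.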

etremblay · August 27, 2019, 5:46pm

Hey @alpsayin, sorry, I haven't touched anything deep learning related since May. But from what I remember, I did not try to make predict work. You can see how I worked around the problem in my code here: https://github.com/EtienneT/fastai-petfinder/blob/master/Fastai%20PetFinder.ipynb. It's really not ideal, though.

Can’t wait to see what v2 will yield for this kind of scenario too.

jcatanza · March 7, 2020, 1:40am

??? If, in a Jupyter notebook, you enclose a block of code inside an `if name == 'main':` statement, you get a `NameError`.

NameError Traceback (most recent call last)
in

NameError: name 'name' is not defined
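
A likely explanation, offered as an assumption: the guard is normally spelled with double underscores, `if __name__ == '__main__':`, and forum markdown can swallow the underscores, leaving `name` and `main`. A bare `name` is not defined anywhere, which would produce exactly this NameError. The sketch below contrasts the two spellings; the function `main` is a hypothetical placeholder for the guarded code.

```python
def main():
    # Placeholder for the code that should only run in the main module.
    return "running as the main module"

# The bare identifier `name` is not a built-in, so evaluating it fails.
try:
    name == 'main'
    bare_name_error = None
except NameError as err:
    bare_name_error = str(err)
print(bare_name_error)

# `__name__` is always defined; it equals '__main__' in a script or notebook cell.
if __name__ == '__main__':
    print(main())
```

The double-underscore form is the one that matters for multiprocessing-related errors such as the broken pipe in this thread's title, since worker processes may re-import the main module on some platforms.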

xiaohao · June 25, 2020, 4:48am

Hi Elfayoumi,
I’m also wondering how to join several ImageLists in a MixedItemList. Is there a straightforward way to do that?
Thanks
