LabelList.to_csv() and to_df() not working for text data

Hello,

I am trying to export the validation set from my TextClasDataBunch to CSV using the LabelList.to_csv() method. Both this method and .to_df() fail with the following error:


AttributeError                            Traceback (most recent call last)
<ipython-input-37-8cee060745ec> in <module>
----> 1 df = data_clas.valid_ds.to_df()

~/anaconda3/envs/env/lib/python3.7/site-packages/fastai/data_block.py in to_df(self)
    574     def to_df(self)->None:
    575         "Create `pd.DataFrame` containing `items` from `self.x` and `self.y`."
--> 576         return pd.DataFrame(dict(x=self.x._relative_item_paths(), y=[str(o) for o in self.y]))
    577 
    578     def to_csv(self, dest:str)->None:

~/anaconda3/envs/env/lib/python3.7/site-packages/fastai/data_block.py in _relative_item_paths(self)
    118 
    119     def _relative_item_path(self, i): return self.items[i].relative_to(self.path)
--> 120     def _relative_item_paths(self):   return [self._relative_item_path(i) for i in range_of(self.items)]
    121 
    122     def use_partial_data(self, sample_pct:float=1.0, seed:int=None)->'ItemList':

~/anaconda3/envs/env/lib/python3.7/site-packages/fastai/data_block.py in <listcomp>(.0)
    118 
    119     def _relative_item_path(self, i): return self.items[i].relative_to(self.path)
--> 120     def _relative_item_paths(self):   return [self._relative_item_path(i) for i in range_of(self.items)]
    121 
    122     def use_partial_data(self, sample_pct:float=1.0, seed:int=None)->'ItemList':

~/anaconda3/envs/env/lib/python3.7/site-packages/fastai/data_block.py in _relative_item_path(self, i)
    117         return cls.from_df(df, path=path, cols=cols, **kwargs)
    118 
--> 119     def _relative_item_path(self, i): return self.items[i].relative_to(self.path)
    120     def _relative_item_paths(self):   return [self._relative_item_path(i) for i in range_of(self.items)]
    121 

AttributeError: 'numpy.ndarray' object has no attribute 'relative_to'

Another user already opened a thread about this in the part 1 forum, but it has received no replies in 9 days, so I am posting here to report the same problem. All packages are up to date, and I am working on a local machine running Ubuntu 18.04.

EDIT: sgugger replied in the linked thread:

The to_csv method is intended to save filenames, not anything else. There is no way to save your processed dataset to a csv file, but the save method of TextDataBunch saves your ids so you can access your processed dataset easily later.
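
For anyone who still needs the texts and labels in a CSV, here is a minimal workaround sketch that goes around to_csv() entirely and writes the pairs with the standard csv module. The helper name export_pairs_to_csv is hypothetical (it is not part of fastai), and the commented usage assumes each x item exposes its raw text as `.text`, which I have not verified against every fastai version.

```python
import csv
from typing import Iterable, Tuple


def export_pairs_to_csv(pairs: Iterable[Tuple[object, object]], dest: str) -> int:
    """Write (x, y) pairs to `dest` as a two-column CSV; return the row count."""
    n_rows = 0
    # newline="" lets the csv module control line endings itself.
    with open(dest, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["x", "y"])
        for x, y in pairs:
            # str() is applied to both sides, mirroring what to_df() does for y.
            writer.writerow([str(x), str(y)])
            n_rows += 1
    return n_rows


# Hypothetical usage with the objects from the post (assumption: each x item
# has a `.text` attribute holding the text; adjust if your version differs):
# ds = data_clas.valid_ds
# export_pairs_to_csv(((ds[i][0].text, ds[i][1]) for i in range(len(ds))), "valid.csv")
```

The helper only needs an iterable of pairs, so it does not depend on the items being paths, which is what to_df() requires according to the traceback above.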
