Issue with TextBlock.from_df - dataloaders only accepting one column name

Hi guys,
I’m updating an old model with fastai_v2 code and I’m having some trouble with the new TextBlock.

Following the tutorial, this piece of code runs just fine:

path2 = untar_data(URLs.IMDB_SAMPLE)
df = pd.read_csv(path2/'texts.csv')
df.head(1)

imdb_lm = DataBlock(blocks=TextBlock.from_df('text', is_lm=True),
                    get_x=ColReader('text'),
                    splitter=ColSplitter())

dls = imdb_lm.dataloaders(df, bs=64, seq_len=72)

But if I rename the text column to something else, I get an AttributeError when running dls = imdb_lm.dataloaders(df, bs=64, seq_len=72):

df.columns = ['label','blablabla','is_valid']
imdb_lm = DataBlock(blocks=TextBlock.from_df('blablabla', is_lm=True),
                    get_x=ColReader('blablabla'),
                    splitter=ColSplitter())

dls = imdb_lm.dataloaders(df, bs=64, seq_len=72)


AttributeError                            Traceback (most recent call last)
<ipython-input-170-502263c4ccb3> in <module>()
----> 1 dls = imdb_lm.dataloaders(df, bs=64, seq_len=72)
      2 dls.show_batch(max_n=2)

9 frames
/usr/local/lib/python3.6/dist-packages/pandas/core/generic.py in __getattr__(self, name)
   5272             if self._info_axis._can_hold_identifiers_and_holds_name(name):
   5273                 return self[name]
-> 5274             return object.__getattribute__(self, name)
   5275 
   5276     def __setattr__(self, name: str, value) -> None:

AttributeError: 'Series' object has no attribute 'blablabla'

I tried with another dataset and the same thing happened. If the column containing the texts isn’t named “text”, the dataloaders method raises an AttributeError.

Anyone know why this happens?


After tokenizing, your text will always be in the “text” column unless res_col_name is overridden. So your get_x should always read “text”, while your TextBlock can point to blablabla.
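
For anyone landing here later, here is a rough toy model of why the original code fails. It is plain Python, not fastai's actual implementation: `tokenize_rows` and `col_reader` are hypothetical stand-ins I made up for the tokenization step and for ColReader, and I'm assuming the tokenizer drops the source column and writes its output under the result column name (default "text").

```python
# Toy model only: these helpers are hypothetical stand-ins, NOT fastai code.

def tokenize_rows(rows, text_col, res_col_name="text"):
    """Replace `text_col` in each row with a token list stored under `res_col_name`."""
    out = []
    for row in rows:
        # Keep every column except the source text column.
        new_row = {k: v for k, v in row.items() if k != text_col}
        # Naive whitespace tokenization, just for illustration.
        new_row[res_col_name] = row[text_col].split()
        out.append(new_row)
    return out


def col_reader(col):
    """Return a getter that reads `col` from a row and fails if it is missing."""
    def _read(row):
        if col not in row:
            raise AttributeError(f"row has no attribute {col!r}")
        return row[col]
    return _read


rows = [{"label": "negative", "blablabla": "a dull movie", "is_valid": False}]
tokenized = tokenize_rows(rows, text_col="blablabla")

# Reading the result column works, whatever the source column was called.
tokens = col_reader("text")(tokenized[0])

# Reading the old column name fails, because it is gone after tokenization.
try:
    col_reader("blablabla")(tokenized[0])
    old_name_failed = False
except AttributeError:
    old_name_failed = True
```

In the snippet from the question, this corresponds to keeping TextBlock.from_df('blablabla', is_lm=True) and changing only get_x to ColReader('text').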


Ah it worked! Thank you @muellerzr.
I was puzzled because things were easier using the mid-level API ahahah.

Maybe something to add to the documentation in the text tutorial :wink:


For sure. I’m writing a tutorial for people interested in drug discovery. Every bit of info is valuable :vulcan_salute:t5:
