'col' parameter in LanguageModelData.from_dataframes API

What does 'col' represent in

LanguageModelData.from_dataframes(path, field, col, train_df, val_df, test_df=None, bs=64, bptt=70, **kwargs)

Thanks.

col refers to the name of the dataframe column that holds the text. It is used in the following line

text += text_field.preprocess(df[col].str.cat(sep=' ')) in the fastai/nlp.py file

Check lines 281, 282, 190, and 194.


This is how it works:

In [7]: import pandas as pd

In [8]: d=[{'a': 1, 'txt': 'foo'}, {'a':2, 'txt':'bar'}]

In [9]: df=pd.DataFrame.from_records(d)

In [10]: col='txt'

In [11]: df[col]
Out[11]: 
0    foo
1    bar
Name: txt, dtype: object

In [12]: df[col].str.cat(sep=' ')
Out[12]: 'foo bar'
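
To tie this back to the question, here is a minimal sketch of the same idea applied to separate train and validation dataframes. It assumes the text lives in a column with the same name in each dataframe. join_text_column and the sample data are hypothetical names made up for illustration; they are not part of fastai, and the sketch only mirrors the df[col].str.cat(sep=' ') step, not the preprocessing fastai applies afterwards.

```python
import pandas as pd


def join_text_column(df, col):
    """Hypothetical helper: join every value in df[col] into one string."""
    # Same pandas call as in the fastai line quoted above.
    return df[col].str.cat(sep=' ')


# Made-up sample data; only the 'txt' column holds text.
train_df = pd.DataFrame({'label': [0, 1], 'txt': ['the cat sat', 'on the mat']})
val_df = pd.DataFrame({'label': [1], 'txt': ['a dog barked']})

col = 'txt'  # the column name you would pass as `col`
train_text = join_text_column(train_df, col)
val_text = join_text_column(val_df, col)

print(train_text)  # the cat sat on the mat
print(val_text)    # a dog barked
```

So whatever you pass as col just needs to be the name of the text column; other columns such as label are ignored by this step.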