The class method TextDataBunch.from_csv() doesn’t give the ability to pass the CSV headers to it.
Suggestion: add a names parameter and pass it through to pd.read_csv().
It’s a small pull request but can be useful IMO.
We don’t want the factory methods to have too many parameters (note that there is an argument header, but I’m not sure it’s what you want). In this case, it’s easy to read the CSV as a dataframe with the right header, then use from_df.
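For anyone landing here later, a minimal sketch of that workaround. The sample rows and the column names label and text are made up for illustration, and the commented from_df call assumes the fastai v1 signature with placeholder path, train_df and valid_df, so check it against your installed version.

```python
import io
import pandas as pd

# Hypothetical header-less CSV: label in the first column, text in the second.
raw = io.StringIO("pos,I loved this film\nneg,Terrible plot and acting\n")

# names= supplies the column headers; header=None says the file has no header row.
df = pd.read_csv(raw, header=None, names=["label", "text"])

# With fastai v1 installed, the dataframes could then go to from_df, e.g.
# (assumed call, path/train_df/valid_df are placeholders):
# data = TextDataBunch.from_df(path, train_df, valid_df,
#                              text_cols="text", label_cols="label")
```

Naming the columns up front also means text_cols and label_cols can refer to them by name instead of by position.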
Sounds legit, thanks! (header indeed doesn’t meet my needs)
Also, this works well for me on v0.21:
import io
import pandas as pd
text = \
'''1,4.0,?,?,none,?
2,2.0,3.0,?,none,38
2,2.5,2.5,?,tc,39'''
buf = io.StringIO(text)
# prefix= works on older pandas such as v0.21; it was removed in pandas 2.0
df = pd.read_csv(buf, na_values=['?', 'none'], header=None, prefix='col_')
df
col_0 col_1 col_2 col_3 col_4 col_5
0 1 4.0 NaN NaN NaN NaN
1 2 2.0 3.0 NaN NaN 38.0
2 2 2.5 2.5 NaN tc 39.0
Another trick (if this still doesn’t work) would be to use add_prefix:
df
0 1 2 3 4 5
0 1 4.0 NaN NaN NaN NaN
1 2 2.0 3.0 NaN NaN 38.0
2 2 2.5 2.5 NaN tc 39.0
df = df.add_prefix('col_')
df
col_0 col_1 col_2 col_3 col_4 col_5
0 1 4.0 NaN NaN NaN NaN
1 2 2.0 3.0 NaN NaN 38.0
2 2 2.5 2.5 NaN tc 39.0
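Putting the add_prefix route together end to end, as a rough sketch reusing the sample data from above. As far as I know the prefix argument of read_csv was deprecated in pandas 1.4 and dropped in 2.0, so this variant is the one more likely to run on a current install.

```python
import io
import pandas as pd

text = """1,4.0,?,?,none,?
2,2.0,3.0,?,none,38
2,2.5,2.5,?,tc,39"""

# header=None gives integer column labels 0..5; '?' and 'none' are parsed as NaN.
df = pd.read_csv(io.StringIO(text), na_values=["?", "none"], header=None)

# add_prefix turns the integer labels into strings: col_0, col_1, ...
df = df.add_prefix("col_")
print(list(df.columns))
```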