Hi everyone,
I’m trying to reproduce some results from fastai v1 in fastai2. I have a dataframe with an 'is_valid' column, and in fastai v1 I used to do split_from_df(col='is_valid'). How should I do this in fastai2? Many thanks.
Hey @MichaelScofield, I can’t quite answer your question, as I’m not sure this exists yet (as far as I know, but I could be wrong!). To get more visibility, perhaps move this thread to the v2 subforum, since it’s more relevant there than in v1?
Done (you can hit the little pen next to the title of the thread eventually if you have a high enough trust ranking to do so)
Thank you.
What type of data? Vision, text, or tabular?
The from_df functionality seems to be in data specific files, such as: https://github.com/fastai/fastai_dev/blob/master/dev/09a_vision_data.ipynb
@sgugger pinging you in here as I looked too. Closest thing I can think of would be a FuncSplitter that looks at a column in the CSV?
Yes, for now. We’ll add the ColSplitter soon, but it’s not there yet.
I thought about FuncSplitter too and tried it out, but I don’t quite understand how it works or what object the function is applied to. From what I can see, it seems to apply the function to the image_name, not the data frame. Oh, I must have messed things up.
I really need your advice.
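To make the point above concrete: as far as I understand it, FuncSplitter calls your function once per item and puts the items for which it returns True into the validation set, so the function never sees the data frame itself. Below is a rough pure-Python sketch of that behaviour, not the fastai implementation. func_splitter_sketch, rows, and valid_names are hypothetical names made up for this illustration, and the list of dicts stands in for the data frame.

```python
# Hypothetical stand-in for FuncSplitter; NOT the fastai implementation.
def func_splitter_sketch(func):
    """Return a splitter that sends items where func(item) is truthy to validation."""
    def _split(items):
        valid_idx = [i for i, item in enumerate(items) if func(item)]
        valid_set = set(valid_idx)
        train_idx = [i for i in range(len(items)) if i not in valid_set]
        return train_idx, valid_idx
    return _split

# Stand-in for the data frame: one row per image, with an is_valid flag.
rows = [
    {"name": "a.png", "is_valid": False},
    {"name": "b.png", "is_valid": True},
    {"name": "c.png", "is_valid": False},
    {"name": "d.png", "is_valid": True},
]

# The function only ever receives one item (here a file name), so the
# lookup into the table has to be captured from outside the function.
valid_names = {r["name"] for r in rows if r["is_valid"]}
items = [r["name"] for r in rows]

train_idx, valid_idx = func_splitter_sketch(lambda name: name in valid_names)(items)
print(train_idx, valid_idx)  # [0, 2] [1, 3]
```

So if the items are file names, one workaround is to build the set of validation names from the is_valid column first and have the function test membership in that set.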
If you’re like me and you got to this post by searching “tabular split by column”, then you probably want ColSplitter() from the data.transforms module.
For example, the code I needed was:
# Calling the splitter on df returns the (train, valid) row indices
splitter = ColSplitter('is_valid')(df)
tabular_object = TabularPandas(
    df,
    procs=preprocessors,
    cont_names=continuous_vars,
    y_names=dep_var,
    splits=splitter
)
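In case it helps to see what ends up in splits: my understanding is that calling the splitter on the data frame gives back a pair of index lists, training rows first and validation rows second. Here is a minimal pure-Python sketch of that idea, assuming a boolean is_valid column. col_splitter_sketch and table are hypothetical names for this illustration (a dict of columns stands in for the data frame), and this is not the library code.

```python
# Hypothetical stand-in for ColSplitter; NOT the fastai implementation.
def col_splitter_sketch(col="is_valid"):
    """Return a splitter that reads a boolean column and yields (train, valid) indices."""
    def _split(table):
        flags = [bool(v) for v in table[col]]
        train_idx = [i for i, flag in enumerate(flags) if not flag]
        valid_idx = [i for i, flag in enumerate(flags) if flag]
        return train_idx, valid_idx
    return _split

# A dict of columns standing in for the data frame.
table = {
    "age": [23, 41, 35, 58, 19],
    "is_valid": [False, False, True, False, True],
}

splits = col_splitter_sketch("is_valid")(table)
print(splits)  # ([0, 1, 3], [2, 4])
```

Rows where is_valid is True go to validation and everything else goes to training, which is the same idea as split_from_df(col='is_valid') in v1.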
Hi @MichaelScofield @muellerzr, can you share how to add a test set? My dataset is set up the same way, with a column named is_valid.
If you have split yours into train, test, and valid, please let me know. My code only loads train and valid, and the test and valid rows both end up in a single valid databunch.
Thanks