I am trying to create a simple DataBunch from my TextClasDataBunch so that it matches the “Batching for classification” section of the notebook https://github.com/fastai/course-v3/blob/master/nbs/dl2/12_text.ipynb, which builds its data like this:
il = TextList.from_files(path, include=['train', 'test'])
sd = SplitData.split_by_func(il, partial(grandparent_splitter, valid_name='test'))
ll = label_by_func(sd, parent_labeler, proc_x = [proc_tok, proc_num], proc_y=proc_cat)
I don’t have files that fit this approach. Instead, I have two labelled pandas dataframes, one for the train set and one for the valid set, so I used TextClasDataBunch.from_df since it seemed appropriate. Everything I need is in my TextClasDataBunch object, but it is hidden away, whereas the example in the notebook ends up with a very simple pair of numericalized and labelled datasets.
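To make clear what I mean by the "simple" format, here is a minimal sketch in plain Python. It is not fastai code and not what the notebook does internally: `build_vocab` and `numericalize` are hypothetical helpers, whitespace splitting stands in for the notebook's tokenizer, and the toy lists stand in for the text and label columns of my two dataframes.

```python
from collections import Counter

def build_vocab(token_lists, unk="xxunk"):
    """Assign an integer id to every token; id 0 is reserved for unknown tokens."""
    counts = Counter(tok for toks in token_lists for tok in toks)
    itos = [unk] + [tok for tok, _ in counts.most_common()]
    return {tok: i for i, tok in enumerate(itos)}

def numericalize(texts, labels, stoi, label_to_id):
    """Turn raw texts and labels into parallel lists of token ids and label ids."""
    xs = [[stoi.get(tok, 0) for tok in text.lower().split()] for text in texts]
    ys = [label_to_id[label] for label in labels]
    return xs, ys

# Toy data standing in for the text/label columns of the train and valid dataframes
# (assumption: with pandas these would come from something like df["text"].tolist()).
train_texts, train_labels = ["good film", "bad film"], ["pos", "neg"]
valid_texts, valid_labels = ["good plot"], ["pos"]

# The vocabulary and label mapping are built from the training set only,
# then reused for the validation set.
stoi = build_vocab([t.lower().split() for t in train_texts])
label_to_id = {label: i for i, label in enumerate(sorted(set(train_labels)))}

train_x, train_y = numericalize(train_texts, train_labels, stoi, label_to_id)
valid_x, valid_y = numericalize(valid_texts, valid_labels, stoi, label_to_id)
```

The result is just two pairs of lists (token ids and label ids) for train and valid, which is the kind of structure I would like to end up with from my dataframes.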
This may not be needed, since the TextClasDataBunch has what I need, but I am unsure whether the simpler format will be required as I work through the notebooks with my own data. Is there a simpler approach to handling dataframes that would lead more naturally into the workflow in the notebook?