Datablock: Train set uses folders and test set uses csv?

Hi Everyone,

I am trying to do the State Farm Distracted Driver Detection Kaggle competition (https://www.kaggle.com/c/state-farm-distracted-driver-detection). I searched the forum, but couldn’t find an answer for my issue.

The train data is split by subfolders so I could use ImageDataBunch.from_folder.

The test data is split by csv, so I could use ImageDataBunch.from_csv.

However, how do I combine these two methods into one ImageDataBunch?

Any help would be appreciated. I apologize in advance if the above is a stupid question.

Cheers

3301x

Hope this works:

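from fastai.vision import *  # fastai v1

# labelled train/validation items come from the class subfolders
src = (ImageList.from_folder(…)
       .split_by_rand_pct()
       .label_from_folder())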

For the test images, use the add_test method on src: src.add_test(ImageList.from_csv(…)).
Then create the DataBunch by calling .databunch() on the result.

I’ve published starter code for that competition if you want to take a look:

Hi Sayakgis and Stefan,

I just wanted to thank you both for helping me out. In particular, Stefan, your notebook was fantastic.

Cheers

3301x