SOLVED: NOT splitting datablock


Does anyone know how I can NOT split my dataset? I am making a test-set dataloader. It has ground-truth data for evaluating the model. I can make the DataBlock and the DataLoader. However, the default is a 0.2 split. How can I remove this splitting?


You should do dls.test_dl(yourfnames, with_labels=True). I'm not entirely sure of your file structure, etc., so that’s the closest to a recommendation I can give at the moment :)

Thanks. My file structure is essentially a CSV, with a column for image filepaths, 4 columns for bounding box coordinates, and a column for the class number.

For my test set, I created a DataBlock (Image, Category) that takes a row, opens the image at that row's filepath, crops it to the bounding box given in the row, and returns this crop as the image.

I used a DataBlock because I was not sure how to modify the DataLoader effectively. Would this method work for a DataBlock, rather than the DataLoader?

I’m assuming then that your test setup isn’t the same as your training setup, so a new DataBlock is fine. I wrote a little nosplit function a while ago; use this as your splitter:

def nosplit(o): return L(int(i) for i in range(len(o))), L()

Literally splitter = nosplit

And then when calling .dataloaders() pass shuffle_train=False and drop_last=False.
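To make it concrete, here is a minimal sketch of what such a splitter returns. It assumes fastai accepts any pair of index collections from a splitter, so plain lists stand in for `L` and the snippet runs without fastai installed; `rows` is a hypothetical stand-in for the rows of the CSV.

```python
def nosplit(items):
    """Put every item index in the first (training) split and leave the second empty."""
    return list(range(len(items))), []

# Hypothetical stand-in for the rows of the test CSV.
rows = ["img_0.jpg", "img_1.jpg", "img_2.jpg"]

train_idxs, valid_idxs = nosplit(rows)
print(train_idxs, valid_idxs)  # [0, 1, 2] []
```

With the validation split empty, everything ends up in the first (train) dataloader, which is why shuffling and dropping the last batch need to be switched off there.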

Absolute worst case (not in front of a computer to test), make the splitter:

def nosplit(o): return L(int(i) for i in range(len(o))), L(int(i) for i in range(len(o)))

And then the dl you’d want to use is the .valid (so test_dls = newDataBlock.dataloaders().valid)
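A sketch of that fallback, with plain lists standing in for `L` (assuming fastai accepts any pair of index collections from a splitter) so it runs without fastai. The name `nosplit_both` and the `rows` list are hypothetical, used only for illustration.

```python
def nosplit_both(items):
    """Return every index for both splits, so the validation split covers the full set in order."""
    idxs = list(range(len(items)))
    # Return a separate copy for the second split so the two lists are independent.
    return idxs, list(idxs)

# Hypothetical stand-in for the rows of the test CSV.
rows = ["a.jpg", "b.jpg", "c.jpg", "d.jpg"]

train_idxs, valid_idxs = nosplit_both(rows)
print(valid_idxs == list(range(len(rows))))  # True
```

As far as I know, validation dataloaders are not shuffled and keep the last partial batch by default, so .valid should then go over every row once, in file order.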


That works. Thank you so much.