Thanks @sgugger for the answer. It seems that the custom dataset builder has an argument (`mode`) that, if not specified, does not generate the `ds`. However, things seem a little more complex than a default migration. Here is a summary of what I think it does:
BACKGROUND: This pipeline uses the `openslide` library to tile a very large image (a WSI, i.e. a whole-slide tissue histology image) and make a prediction in a weakly supervised way. By this the authors mean that they only have an overall label for the slide rather than a per-tile label. The workflow goes more or less like this:
- Perform predictions directly on all the generated tiles to get probabilities
- Sort the tiles by predicted probability, highest first
- Keep only the tiles with the greatest probability and generate a subset of images with the overall label (turning the problem into a fully supervised one)
- Perform training on this subset and update weights and optimizer
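
To make the steps above concrete, here is a minimal sketch in plain Python of the selection step as I understand it. It is not the authors' code: the function name `select_top_k_tiles`, the parameter `k`, and the idea that the selection is done per slide are my own assumptions, and the probabilities are made up.

```python
from collections import defaultdict

def select_top_k_tiles(tile_slide_ids, tile_probs, slide_labels, k=1):
    """Return (tile_index, slide_label) pairs for the k most probable tiles of each slide.

    Hypothetical helper: names and the per-slide grouping are assumptions.
    """
    # Group (probability, tile_index) pairs by the slide each tile came from.
    by_slide = defaultdict(list)
    for idx, (sid, prob) in enumerate(zip(tile_slide_ids, tile_probs)):
        by_slide[sid].append((prob, idx))

    subset = []
    for sid, scored in by_slide.items():
        # Highest probability first; ties are broken by the lower tile index.
        scored.sort(key=lambda t: (-t[0], t[1]))
        # Every kept tile inherits the overall (slide-level) label.
        for _, idx in scored[:k]:
            subset.append((idx, slide_labels[sid]))
    return subset

# Toy data: two slides with three tiles each.
slide_ids = [0, 0, 0, 1, 1, 1]
probs = [0.1, 0.9, 0.4, 0.3, 0.2, 0.8]   # would come from the inference pass
labels = {0: 1, 1: 0}                    # one overall label per slide

train_subset = select_top_k_tiles(slide_ids, probs, labels, k=2)
print(train_subset)
```

The resulting list of (tile index, label) pairs is what the training step would then consume as an ordinary supervised dataset, and the whole thing would be repeated every epoch with fresh probabilities.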
So, if I understood the code correctly, this pipeline performs a first prediction step that is not suitable for fastai2 `Dataloaders` to work out of the box. Am I right? Do you think there is any way to implement this kind of pipeline in fastai2? Maybe it is enough to give the dataloader a first prediction, and then the subsequent steps in the fastai2 pipeline will work?
Thanks 