[Proposal] VisionDataBunch class


(Tyler C Nisonoff) #1

In an experimental commit here*, I create a VisionDataBunch in vision/data.py that overrides #create and #labels_to_csv, rather than creating private functions and overwriting DataBunch’s versions of these on import.

The current implementation has the major drawback that once you import fastai.vision.data, you can no longer use DataBunch.create on non-image datasets.

Additionally, this seems cleaner and easier to reason about.
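
To make the subclass idea concrete, here is a minimal sketch using toy stand-ins rather than the real fastai classes. The constructor arguments, the `bs` default, and the assumption that `ds_tfms` is a (train, valid) pair of callables are all hypothetical and chosen only for illustration.

```python
class DataBunch:
    """Toy stand-in for a generic data container (not the fastai class)."""

    def __init__(self, train_ds, valid_ds, bs=64):
        self.train_ds, self.valid_ds, self.bs = train_ds, valid_ds, bs

    @classmethod
    def create(cls, train_ds, valid_ds, bs=64):
        # Generic construction: no knowledge of images or transforms.
        return cls(train_ds, valid_ds, bs=bs)


class VisionDataBunch(DataBunch):
    """Hypothetical subclass that adds transform handling in an override."""

    @classmethod
    def create(cls, train_ds, valid_ds, bs=64, ds_tfms=None):
        # Assumption: ds_tfms is a (train_transform, valid_transform) pair.
        if ds_tfms is not None:
            train_tfm, valid_tfm = ds_tfms
            train_ds = [train_tfm(x) for x in train_ds]
            valid_ds = [valid_tfm(x) for x in valid_ds]
        # Delegate the rest to the untouched base implementation.
        return super().create(train_ds, valid_ds, bs=bs)


# The base class keeps working for non-image data, unchanged by the subclass.
plain = DataBunch.create([1, 2], [3])
vision = VisionDataBunch.create([1, 2], [3], ds_tfms=(lambda x: x * 2, lambda x: x))
```

With this layout, anyone reading `DataBunch.create` sees only one definition, and the vision-specific behavior lives entirely in the subclass.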

If people think this is an appropriate change, I’m happy to modify the existing notebooks or create a new notebook for it, but I was hesitant to go through that process in case the change is rejected.

I’m interested to hear others’ thoughts!


#2

Why is that? The method is monkey-patched but its old behavior isn’t erased.
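
As a rough illustration of what "monkey-patched but not erased" can look like, here is a sketch with a toy class, not the actual fastai code. The names `_orig_create` and `_create_with_tfms`, and the (train, valid) shape of `ds_tfms`, are assumptions made for the example.

```python
class DataBunch:
    """Toy stand-in for a generic data container (not the fastai class)."""

    def __init__(self, train_ds, valid_ds, bs=64):
        self.train_ds, self.valid_ds, self.bs = train_ds, valid_ds, bs

    @classmethod
    def create(cls, train_ds, valid_ds, bs=64):
        return cls(train_ds, valid_ds, bs=bs)


# Keep a reference to the original function before replacing it.
_orig_create = DataBunch.create.__func__


def _create_with_tfms(cls, train_ds, valid_ds, ds_tfms=None, **kwargs):
    # Hypothetical extension: apply transforms only when they are given.
    if ds_tfms is not None:
        train_tfm, valid_tfm = ds_tfms
        train_ds = [train_tfm(x) for x in train_ds]
        valid_ds = [valid_tfm(x) for x in valid_ds]
    # Everything else goes through the original implementation.
    return _orig_create(cls, train_ds, valid_ds, **kwargs)


# The patch: the attribute is replaced, but the old behavior is still reachable.
DataBunch.create = classmethod(_create_with_tfms)

untransformed = DataBunch.create(["a", "b"], ["c"], bs=8)
transformed = DataBunch.create(["a", "b"], ["c"], ds_tfms=(str.upper, str.upper))
```

Calling the patched method without `ds_tfms` takes the same path as before the patch, which is why non-image datasets would still work in a setup like this.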


(Tyler C Nisonoff) #3

Ah, I see now that they’re essentially the same implementation, with the second one adding onto the first.

How about we just merge the ds_tfms keyword arg into the original DataBunch.create?

This way we don’t have to maintain two versions of the same code, and it’s a lot easier to reason about (I got tripped up debugging some vision code because I was looking at the wrong create method).


#4

There is a dependency reason for doing it that way: fastai.data shouldn’t depend on fastai.vision.data.