I’m trying to understand how best to work with fast.ai’s data API. As I understand it, the purpose of subclassing ItemBase is so that one can, for example, implement a plotting method for that item in the subclass. Similarly, subclassing ItemList allows implementing show_xys() and show_xyzs(). That’s very handy when using functions such as data.show_batch().
Unfortunately, that also means that I always have to recreate my dataset when I iterate on these methods. It would be much more convenient to have my data in one place, and my plotting functions in a separate class. Then I could substitute my plotting functions without having to reload my data.
I vaguely remember Jeremy saying that fast.ai v2 would make a lot more use of delegation. I guess this would be one such case, where I could simply substitute the delegate of a plotting function, for example.
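To make concrete what I mean by substituting a delegate, here is a rough sketch in plain Python. It does not use fast.ai at all, and it is not meant to reflect how v2 actually does it. The names Item, plotter, plot_v1 and plot_v2 are made up for illustration, and the "plot" functions just return strings so the example runs without matplotlib.

```python
class Item:
    """Holds the data only; knows nothing about how it is drawn."""

    # Hypothetical class-level delegate, shared by all instances.
    plotter = None

    def __init__(self, data):
        self.data = data

    def plot(self, *args, **kwargs):
        # The delegate is looked up on the class at call time,
        # so replacing it also affects items that already exist.
        return type(self).plotter(self.data, *args, **kwargs)


def plot_v1(data):
    return f"v1: {data}"


def plot_v2(data, prefix="v2"):
    return f"{prefix}: {data}"


# Stands in for an expensive dataset load that I only want to do once.
items = [Item(x) for x in (1, 2, 3)]

Item.plotter = staticmethod(plot_v1)
first = [it.plot() for it in items]

# Swap the delegate; the items themselves are untouched.
Item.plotter = staticmethod(plot_v2)
second = [it.plot() for it in items]
```

The point is that the data lives in the items, while the plotting behaviour can be replaced at any time without rebuilding them.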
Is my understanding of the limitation of the v1 API correct, or does it sound like I’m using it wrongly? If I have correctly described a limitation of the v1 data API, is this indeed something that you are trying to address in v2?
Edit: I have worked around this limitation with the following coding style:
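    import matplotlib.pyplot as plt
    from fastai.core import ItemBase  # fast.ai v1

    class MyItemBase(ItemBase):
        #...
        # Delegate to the module-level _plot, which is looked up at call time.
        def plot(self, *args, **kwargs): return _plot(self.data, *args, **kwargs)

    def _plot(data, *args, **kwargs):
        # ...
        plt.show()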
That way, I can just re-execute the cell, which redefines _plot, and the new version gets picked up by my existing MyItemBase instances, because plot() looks up _plot by name each time it is called. The same goes for show_xys etc. I do wonder, though, whether there is a more elegant way.