Hello, I recently went through chapter 11 of Fastbook (Data Munging with fastai’s Mid-Level API) and I am now trying to use it to create a dataloader for an image classification task. Unfortunately, there are some things I don’t seem to grasp.
I am looking to create a custom type that I could use for showing images along with their class as a title.
Below are the classes and functions I have been creating for that purpose.
def resized_image(fn:Path, sz=460):
    x = Image.open(fn).convert('RGB').resize((sz,sz))
    # convert image to a float tensor of shape (3, sz, sz) with values in [0, 1]
    return tensor(array(x)).permute(2,0,1).float()/255.
class CustomType(Tuple):
    def show(self, ctx=None, **kwargs):
        img, title = self
        return show_image(img, title=title, ctx=ctx)
class CustomTransform(Transform):
    def setups(self, files):
        # label each file from its name, e.g. 'cat_12.png' -> 'cat'
        self.labeller = using_attr(RegexLabeller(pat=r'(.*)_\d+.png$'), 'name')
        labels = list(map(self.labeller, files))
        # unique labels in first-seen order, plus the label -> index lookup
        self.vocab = list(dict.fromkeys(labels))
        self.o2i = {label:idx for idx,label in enumerate(self.vocab)}

    def encodes(self,o): return (resized_image(o), self.o2i[self.labeller(o)])
    def decodes(self,x): return CustomType((x[0], self.vocab[x[1]]))
The encoding part works fine: it outputs a tuple containing the image tensor and the class as an integer index.
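To make clear what I expect the label handling to do, here is a minimal sketch of the vocab / o2i round trip in plain Python. It uses `re` as a stand-in for `RegexLabeller` and made-up file names (`cat_1.png` and so on are hypothetical, not my real data), so it only illustrates the intended mapping, not the fastai pipeline itself.

```python
import re
from pathlib import Path

# Hypothetical file names following the <label>_<number>.png pattern
files = [Path("cat_1.png"), Path("dog_2.png"), Path("cat_3.png")]

pat = re.compile(r'(.*)_\d+.png$')

def label_of(fn):
    # Stand-in for using_attr(RegexLabeller(...), 'name'):
    # apply the regex to the file name only and keep the first group
    return pat.search(fn.name).group(1)

labels = [label_of(f) for f in files]
vocab = list(dict.fromkeys(labels))                    # unique labels, first-seen order
o2i = {label: idx for idx, label in enumerate(vocab)}  # label -> integer index

encoded = [o2i[label_of(f)] for f in files]  # what encodes should produce for the label
decoded = [vocab[i] for i in encoded]        # what decodes should map back to

print(vocab, encoded, decoded)
```

With plain Python integers the decode step is just a list lookup, which is what I was expecting to happen inside `decodes` as well.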
When I try to decode the result, I get the following error:
TypeError: only integer tensors of a single element can be converted to an index
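As far as I can tell, the message itself comes from PyTorch rather than from fastai: it appears when a tensor that is not a single integer element is used as an index into a Python list. The snippet below is a small sketch that reproduces it outside of my transform, with a hypothetical two-item vocab. It does not show where such a tensor comes from in my pipeline, only that `self.vocab[x[1]]` would fail like this if `x[1]` were not a single integer.

```python
import torch

vocab = ["cat", "dog"]  # hypothetical vocab, for illustration only

# A 0-dim integer tensor can be used directly as a list index
single = vocab[torch.tensor(1)]

# A tensor with more than one element cannot
try:
    vocab[torch.tensor([0, 1])]
    message = None
except TypeError as e:
    message = str(e)

print(single)
print(message)
```

So my guess is that whatever reaches `decodes` as `x[1]` is not the single class index I assumed.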
Should I be encoding my data differently? I don’t understand what is going wrong here.
CustomType itself works fine if I construct it directly, like this:
img = resized_image(files[0])
CustomType((img, 'test_title')).show()