Hi All,
I’ve been trying to understand fastai’s transform and pipeline APIs by experimenting with my own code.
Here I have created a simple Transform class and block and tried to create a DataBlock and DataLoaders from them. The goal is to see how the encodes and decodes functions of the Transform work.
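To make clear what I expect from an encodes/decodes pair, here is a toy sketch in plain Python. It is not fastai's implementation: ToyTransform and StemLength are hypothetical names, and the type check is only my rough understanding of how dispatch on the annotated argument type behaves (apply the method when the input matches the annotation, otherwise pass the input through unchanged).

```python
import inspect
from pathlib import Path


class ToyTransform:
    """Toy stand-in for a transform with paired encodes/decodes (NOT fastai's Transform)."""

    def _apply(self, method, x):
        # Look at the annotation of the first argument of encodes/decodes.
        params = list(inspect.signature(method).parameters.values())
        expected = params[0].annotation
        # Apply the method only when the input matches the annotated type;
        # anything else is returned untouched.
        if expected is inspect.Parameter.empty or isinstance(x, expected):
            return method(x)
        return x

    def __call__(self, x):
        return self._apply(self.encodes, x)

    def decode(self, x):
        return self._apply(self.decodes, x)


class StemLength(ToyTransform):
    """Hypothetical example: encode a path as the length of its file stem."""

    def encodes(self, x: Path):
        return len(x.stem)

    def decodes(self, x: int):
        # The decoded value is only a displayable stand-in, not the original path.
        return "x" * x


tfm = StemLength()
encoded = tfm(Path("img/abcd.png"))   # matches Path, so encodes runs
decoded = tfm.decode(encoded)         # matches int, so decodes runs
untouched = tfm.decode("skip")        # str does not match int, so it passes through
```

In this toy version, calling decode on a value of the annotated type always goes through decodes, which is the behaviour I was hoping to observe with the real Transform.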
import torch_snippets as ts
# and other fastai imports
class MyTfm(Transform):
    def __init__(self, **kwargs):
        super().__init__()
    def encodes(self, x:Path) -> TensorImage:
        x = Image.open(x)
        x = 1 - np.array(x)/255.
        x = resize(x, ('at-most', (32,128)))
        x = pad(x, (32,128), pad_value=0)
        return x # this is a numpy array of shape (32, 128)
    def decodes(self, x:TensorImage) -> str:
        return 'abcd'
def MyTfmBlock(): return TransformBlock([MyTfm()])
dblock = DataBlock(
    blocks=(MyTfmBlock),
    get_items=get_image_files,
    get_x=lambda x: x,
    splitter=RandomSplitter(seed=10),
)
root = Path('folder/to/images')
dls = dblock.dataloaders(root, bs=4)
b = dls.one_batch()
The code for creating one batch works fine. I’m able to see the tensor shape as expected:
>>> ts.inspect(*b)
==================================================================
TensorImage Shape: torch.Size([4, 32, 128]) Min: 0.000 Max: 1.000 Mean: 0.078 dtype: torch.float32
==================================================================
My main confusion is why decode is not working. I wanted it to decode each image to a simple dummy string, 'abcd', but I see the same TensorImage being returned. Why is that happening?
>>> ts.inspect(*dls.decode(b))
==================================================================
TensorImage Shape: torch.Size([4, 32, 128]) Min: 0.000 Max: 1.000 Mean: 0.078 dtype: torch.float32
==================================================================
Edit:
Here’s the GitHub gist: fastai-doubt.ipynb