Yes, the latter is way too much magic: if you want your transform to work at the tuple level, it will preserve the type at the tuple level, not inside the tuple.
I’ll look at why you need both when I have some time. You should not need both (note that as_item is probably superseded by the Pipeline setup methods, so you might need force_as_item).
Yah that was my plan if this isn’t possible … merge them, add a column for determining which dataset they should go in, go from there.
Btw, do you ever sleep and actually make your classes? You’re everywhere. I literally saw the notification pop up, and before I even looked I knew who it would be.
Hi all,
Finally I have some time to start exploring v2.
I’m playing with the Kannada-MNIST Kaggle dataset, where all the training images come as a single csv file (each line represents one image).
I transformed it into a tensor of shape (# images, 1 channel, 28, 28), but I’m not sure whether it’s possible to create an image Dataset/DataLoader directly from this tensor, or whether I first need to create an image file for each image (sounds wrong to me…)?
I’m sure it’s possible with the lower level API but I don’t know where to start looking…
Thanks!
Is there any good guide/tutorial on how to add your own transformation to the fastai v2 transforms/pipelines? I’ve been trying to add a Gaussian noise data aug without much success.
encodes() applies the transform and decodes() is supposed to reverse it. As far as I know, decodes is only used when displaying the items/batches, so IMHO you don’t have to implement a decodes (since you are adding noise, I can’t think of a way to remove it anyway).
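The encodes/decodes pairing can be sketched in plain Python (these are stand-in classes to illustrate the pattern, not fastai's actual Transform API): encodes applies the change when data flows forward, decodes undoes it purely so items can be displayed in their original units.

```python
# Plain-Python stand-in for the encodes/decodes pattern (not fastai's
# real Transform class): encodes applies the change, decodes reverses
# it so items can be shown in their original units.
class DivideBy255:
    def encodes(self, x):
        return x / 255.0   # forward: scale a pixel into 0-1
    def decodes(self, x):
        return x * 255.0   # reverse: only needed for display

tfm = DivideBy255()
enc = tfm.encodes(128)
print(enc)                 # ~0.502
print(tfm.decodes(enc))    # back to ~128.0
```

A noise transform has no sensible decodes because the original values are gone; a scaling transform like this one does.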
I don’t think so. I checked AddNoise on the MNIST dataset (which always has 0.0 in the top-left corner) and every item has different noise added.
To see the exact order of the transforms, run dblock.summary(). And you are right: currently it’s executed before IntToFloatTensor (which divides by 255) and thus adds noise in the 0–255 range. You could also add order = 20 to your transform to make sure it’s executed afterwards.
class AddNoise(Transform):
    order = 20  # <-- add this so it runs after IntToFloatTensor
    def __init__(self, mean=0., std=1., **kwargs):
        super().__init__(**kwargs)
        self.std = std
        self.mean = mean
        print("Mean/Std: {}/{}.".format(self.std, self.mean))
    def encodes(self, x:TensorImage):
        # draw fresh noise on the same device as the input
        noise = torch.randn(x.size(), device=x.device) * self.std + self.mean
        return x + noise
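The effect of the order attribute can be sketched with plain Python (stand-in classes, not the real fastai ones; fastai's Pipeline does this sorting internally): transforms run sorted by order, not by the position in which you listed them.

```python
# Plain-Python sketch: a pipeline sorts transforms by their `order`
# attribute, not by their position in the list (stand-in classes).
class ToTensor:         order = 5
class IntToFloatTensor: order = 10
class AddNoise:         order = 20  # bumped so it runs after the /255

listed = [AddNoise, IntToFloatTensor, ToTensor]  # declared in any order
pipeline = sorted(listed, key=lambda t: t.order)
print([t.__name__ for t in pipeline])
# ['ToTensor', 'IntToFloatTensor', 'AddNoise']
```

With order = 20, AddNoise sees input already divided by 255 and adds noise on the 0–1 scale, as intended.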
A good way to understand transforms is to build a pipeline and play around with it:
# split_idx=0 to make sure that RandTransforms are being executed
p = Pipeline([PILImage.create,ToTensor,IntToFloatTensor,AddNoise], split_idx=0)
# get one item / image
i = get_image_files(path)[0]
# put it through the pipeline
o = p(i)
# check what happened to your item
type(o)
o.float().mean(), o.float().max(), o.float().min()
o
Thanks. Is there any way to make the fastai data pipeline compatible with standard torchvision and kornia augmentations? I’m running into issues because fastai needs this “TensorImage” object, but torchvision/kornia doesn’t know what it is…
Note also that torchvision expects tensors for most transforms (some work with PIL images; check the specific transform you want). Tensors only exist after the ToTensor transform, so make sure to give your item transform an order property larger than 5:
class TVRRC(ItemTransform):
    order = 6  # after ToTensor (order 5), so the input is already a tensor
    def __init__(self, size=448):
        self.size = size
Thanks. For color jitter I have the code running with kornia (not crashing out), but the image looks wrong when I display it (I only get a black image?)
When I pass item_tfms and batch_tfms to the DataBlock API, show_batch gives me black images. When I remove the AddJitter() transform, it appears normal?
Ok… there’s something else fastai v2 is doing under the hood. Using the DataBlock I pass NO item_tfms or batch_tfms, and after pulling out the TensorImage, it turns out to be normalized to (0, 1)!!! Where is this coming from? It should be (0, 255).
My augs are already doing the normalization, and so with this hidden normalization it’s reducing all my values to 1e-7, no wonder I get a black image!
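The black image follows from plain arithmetic: if IntToFloatTensor has already divided by 255 and a custom aug divides (or normalizes) again, every pixel collapses toward zero, which renders as black.

```python
# Double normalization in plain arithmetic: dividing a uint8 pixel by
# 255 twice collapses it to nearly zero, which displays as black.
px = 200             # a bright pixel on the 0-255 scale
once = px / 255      # after IntToFloatTensor: ~0.784, displays fine
twice = once / 255   # after a second, redundant division: ~0.003
print(once, twice)
```

Normalizing with a mean/std on top of the extra division pushes values even smaller, which matches the ~1e-7 magnitudes described above.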
EDIT: Using the datablock.summary() command I was able to identify the hidden normalization. It is added automatically. Is there a way to not add this final IntToFloatTensor operation? It even looks hardcoded for RGB images, and it’s really a bad idea if you’re working with non-standard images.
IntToFloatTensor -- {'div': 255.0, 'div_mask': 1}
EDIT2: Found a solution: DataBlock adds the IntToFloatTensor operation by default as the first one, so just pop it out (it’s a list, so just remove the first element). This is important if your transformations already do some sort of normalization; if you don’t remove the initial IntToFloatTensor, you will be normalizing twice.
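The pop described above, sketched on a plain list (the names are stand-ins; the real object is the DataBlock's list-like collection of batch transforms, where pop(0) works the same way):

```python
# Stand-in sketch: removing a default first transform from a list of
# batch transforms. The real fastai collection is list-like, so the
# same pop(0) call applies.
batch_tfms = ["IntToFloatTensor", "MyNormalizingAug"]
removed = batch_tfms.pop(0)  # drop the default IntToFloatTensor
print(removed, batch_tfms)
# IntToFloatTensor ['MyNormalizingAug']
```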
Ok, there’s something still messed up with the default IntToFloatTensor operation that DataBlock adds. Does anyone know exactly what is going on and how the default fastai transformations relate to it? Initially I thought it divided all values by 255 to get them into (0…1), but that’s not what I observe.
I get ok results if I remove the first IntToFloatTensor operation and rely on my data augs, which apply an IntToFloatTensor operation internally. But when I add the default back in, the data aug that is called AFTER it still receives input in (0…255). What’s going on? Then, as I track the data further, it gets normalized twice.
Isn’t the output of the IntToFloatTensor operation supposed to be in (0…1)? Why does the downstream function still see (0…255)?
My workaround was to remove the original IntToFloatTensor operation and move my data-aug operation to be the first one instead of the last.
I did further digging, and it looks like the order of batch_tfms, no matter how I rearrange the list, is not reflected in the order in which the transformations are called, as evidenced by the pipeline printout!
How do you manipulate the order of transformations if you’re using the datablock API?