Also, Jeremy and Sylvain found that, in general, one batch is normally enough. If you look at my kernel you’ll see that the full-dataset and one-batch stats are extremely close.
This looks like a great idea! But doesn’t it cause a MemoryError, since you’re loading the whole dataset at once?
I have also worked out how to calculate the dataset stats, but I use this snippet for the calculations:
```python
from fastai.vision.all import *
from tqdm import tqdm

ds = Datasets(fnames, tfms=Pipeline([PILImage.create, Resize(320), ToTensor]))
dl = TfmdDL(ds, bs=32, after_batch=[IntToFloatTensor], drop_last=True)

mean, std = 0., 0.
for b in tqdm(dl, total=len(dl)):
    mean += b[0].mean((0, 2, 3))  # per-channel mean of this batch
    std  += b[0].std((0, 2, 3))   # per-channel std of this batch
mean /= len(dl)  # average of the per-batch means
std  /= len(dl)  # average of the per-batch stds (an approximation of the true std)
```
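Note that averaging per-batch stds is only an approximation. If you want the exact per-channel statistics without loading the whole dataset at once, you can accumulate sums and sums of squares instead; a minimal sketch in plain PyTorch (`channel_stats` is a hypothetical helper, not a fastai function):

```python
import torch

def channel_stats(batches):
    """Exact per-channel mean/std from running sums, one batch in memory at a time."""
    n, s, s2 = 0, 0., 0.
    for x in batches:                                # x: (bs, c, h, w) float tensor
        n  += x.shape[0] * x.shape[2] * x.shape[3]   # pixel count per channel
        s  += x.sum((0, 2, 3))                       # per-channel sum
        s2 += (x ** 2).sum((0, 2, 3))                # per-channel sum of squares
    mean = s / n
    std = (s2 / n - mean ** 2).sqrt()                # population std
    return mean, std
```

You can pass the same `dl` as the `batches` iterable (taking `b[0]` from each batch); the result matches what you would get from the concatenated dataset.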
It actually doesn’t (as far as I’ve found), since the loop only holds one batch in memory at a time. I ran mine with 4GB of memory just fine on the entire Plant Pathology dataset. It took quite a bit of time (a minute or two). However, your way works as well, I believe.
I am using sigmoid in decodes as a workaround for show_batch, since matplotlib requires float values to be in the [0, 1] range. If I don’t scale them as required, matplotlib just clamps the Tensor, which is definitely a loss of information.
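The idea above can be sketched in plain PyTorch (this is an illustrative analogue, not the actual fastai Transform): encodes moves [0, 1] pixel values into logit space, and decodes applies sigmoid so matplotlib gets values back in [0, 1] instead of clamping out-of-range values.

```python
import torch

class SigmoidDecode:
    """Sketch of a paired encode/decode: logit on the way in, sigmoid on the way out."""
    def encodes(self, x):
        # clamp away from exact 0/1 so the logit stays finite
        return torch.logit(x.clamp(1e-6, 1 - 1e-6))
    def decodes(self, x):
        # squashes any real value back into [0, 1] for display
        return torch.sigmoid(x)
```

Because sigmoid is the true inverse of logit, the round trip recovers the original pixels (up to the clamp), rather than discarding information the way clamping does.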
I agree we should reverse the procedure done in encodes to undo its effect, but what if I want to look at the image with the change made in encodes (in the case of pre-processing)?
Thanks for replying. After I posted this I dug deeper and found that b[1] is a tuple; b[1][0] had 12 masks of the 1st channel, and so on.
I tried n_inp and it didn’t help.
As for the stacking, I wanted to keep them separated. And even if I wanted them stacked, I don’t know how to stack PILMasks into a single tensor.
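For reference, once each mask has been converted to a tensor (e.g. via ToTensor), stacking is just `torch.stack`; a minimal sketch with stand-in tensors in place of real tensorized PILMasks:

```python
import torch

# Two stand-in single-channel masks (in practice: tensors converted from PILMasks)
masks = [torch.zeros(320, 320), torch.ones(320, 320)]

# Stack along a new leading dim: one channel per mask
stacked = torch.stack(masks, dim=0)  # shape: (2, 320, 320)
```

All masks must share the same height and width for the stack to succeed, which is why a Resize is applied beforehand in the pipelines discussed below.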
I’ve worked with image+masks input recently. I’ll post a tutorial soon but this is how I did it
```python
class MultiMask(Tuple):
    ...
    def stack(self):
        # To be used on a batch only
        return L(self).stack(dim=1)
```
To make the stack work for you, all of your masks need to be the same size, and maybe square (I’m not sure about the square part). So if the stack fails, just put Resize(&lt;desired size&gt;) in the pipeline before ToTensor, since you might be customizing ToTensor for your use case:
You can simply pass in a list of PILMasks to this tuple, so your array_to_mask could be modified like so:
```python
@Transform
def array_to_mask(x):
    return MultiMask([PILMask.create(o) for o in x[:4]])
```
This will give you a single TensorMask object with all your masks, with an expected shape of (4, size, size).
You won’t be able to use show_batch anymore, as TensorMask isn’t designed to show multiple masks at once. This requires you to @typedispatch show_batch for your use case. I’ll share a tutorial soon explaining the whole process.
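fastai’s @typedispatch selects an implementation of a function like show_batch based on the argument types. As a quick illustration of the dispatch idea, here is a stdlib analogue using functools.singledispatch (not fastai code; fastai’s version dispatches on multiple arguments, and `show` is a made-up name):

```python
from functools import singledispatch

@singledispatch
def show(x):
    # fallback used when no type-specific version is registered
    return "default show"

@show.register
def _(x: int):
    # version picked when the argument is an int
    return "int-specific show"
```

With fastcore’s @typedispatch you would register a show_batch whose type annotations match your custom input/target types, and fastai calls it automatically.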
Apologies if this info is available somewhere already. Does anyone know the current state of pretrained xresnets? Back in January, it seems that they weren’t yet ready. I’m seeing better performance with old pretrained resnets so I’m assuming this is still the case?
You haven’t given read access to your notebook, so no one can really help you. Also, I wouldn’t @-mention the admins that way if I were you; it just makes it less likely that you’ll get a reply from them. Grant read access to your notebook.
Asking this question here, which was previously posted in this topic:
How can I customize the batch sampling method? In metric-learning approaches we need some control over the number of positive/negative examples in a batch. Where can I define this logic?
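With a plain PyTorch DataLoader, this kind of control usually goes into a custom batch sampler; here is a minimal sketch of a hypothetical P×K sampler (a common metric-learning scheme: P classes per batch, K examples of each, so every batch is guaranteed positives and negatives). `PKSampler` is an illustrative name, not a library class:

```python
import random
from torch.utils.data import Sampler

class PKSampler(Sampler):
    """Yields batches containing p classes with k examples of each."""
    def __init__(self, labels, p=2, k=2):
        self.labels, self.p, self.k = labels, p, k
        self.by_class = {}
        for i, y in enumerate(labels):
            self.by_class.setdefault(y, []).append(i)

    def __iter__(self):
        classes = list(self.by_class)
        for _ in range(len(self)):
            batch = []
            for c in random.sample(classes, self.p):       # pick p distinct classes
                batch += random.sample(self.by_class[c], self.k)  # k examples each
            yield batch

    def __len__(self):
        return len(self.labels) // (self.p * self.k)
```

You would then pass it as `DataLoader(dataset, batch_sampler=PKSampler(labels))`; hooking the same idea into fastai’s DataLoader is a separate question, but the batching logic lives in a sampler like this either way.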