@init_27, @akashpalrecha, @barnacl, @DrHB and I have been working on implementing the CutMix research paper for FastaiV2. We did this together and everyone contributed equally to the implementation. We are sharing the experiment notebook below; in our experiments, for the same number of epochs with a resnet50, we get accuracy very similar to a plain `cnn_learner`.
Our CutMix implementation changes the batch to:
There is no real computation overhead, and training times are very similar to those without CutMix.
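For context, the bounding box follows the sampling scheme from the CutMix paper: the box centre is drawn uniformly over the image and the side lengths scale with sqrt(1 − λ), so the cut region covers a (1 − λ) fraction of the image. A minimal standalone sketch of that scheme (the helper name `rand_bbox` is ours, not from the library):

```python
import torch

def rand_bbox(W, H, lam):
    """Return (x1, y1, x2, y2) for a box covering ~(1 - lam) of a WxH image."""
    cut_w = int(W * (1 - lam) ** 0.5)
    cut_h = int(H * (1 - lam) ** 0.5)
    # Centre of the box is drawn uniformly over the image
    cx = int(torch.randint(W, (1,)))
    cy = int(torch.randint(H, (1,)))
    # Clamp the box to the image; clipping can shrink the actual cut area,
    # so lambda should be recomputed from the final box afterwards
    x1, x2 = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, W)
    y1, y2 = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, H)
    return x1, y1, x2, y2
```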
We adapt the MixUp implementation inside the library and use the same loss function to compute the CutMix loss.
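Concretely, with a per-sample mixing weight λ (the fraction of each image kept from the original), the mixed loss is λ·loss(pred, y) + (1 − λ)·loss(pred, y_shuffled). A minimal sketch of this idea (the function name and the use of cross-entropy are our assumptions for illustration):

```python
import torch
import torch.nn.functional as F

def cutmix_loss(pred, y, y_shuffled, lam):
    # lam: (B,) per-sample fraction of each image kept from the original
    loss_orig = F.cross_entropy(pred, y, reduction='none')           # (B,)
    loss_shuf = F.cross_entropy(pred, y_shuffled, reduction='none')  # (B,)
    # lerp(a, b, w) = a + w*(b - a) = (1-w)*a + w*b
    return torch.lerp(loss_shuf, loss_orig, lam).mean()
```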
However, we are trying to refactor the code and need help, particularly with indexing into 4d tensors: we want a batched implementation to keep things fast, and we are not sure how to index into these 4d tensors with rectangular bounding boxes of a different size per sample. Our implementation currently looks like:
```python
class CutMix(Callback):
    run_after,run_valid = [Normalize],False
    def __init__(self, alpha=1.): self.distrib = Beta(tensor(alpha), tensor(alpha))

    def begin_fit(self):
        self.stack_y = getattr(self.learn.loss_func, 'y_int', False)
        if self.stack_y: self.old_lf,self.learn.loss_func = self.learn.loss_func,self.lf

    def after_fit(self):
        if self.stack_y: self.learn.loss_func = self.old_lf

    def begin_batch(self):
        bs = self.y.size(0)
        W, H = self.xb[0].size(3), self.xb[0].size(2)
        # One lambda per image, folded so that lambda >= 0.5
        lam = self.distrib.sample((bs,)).squeeze().to(self.x.device)
        lam = torch.stack([lam, 1-lam], 1)
        self.lam = lam.max(1)[0]
        shuffle = torch.randperm(bs).to(self.x.device)
        xb1,self.yb1 = tuple(L(self.xb).itemgot(shuffle)),tuple(L(self.yb).itemgot(shuffle))
        # Per-image boxes: centre (rx, ry), size (rw, rh) so the cut
        # area fraction is (1 - lam); was hardcoded to 64 before, now uses bs
        rx = (self.distrib.sample((bs,))*W).type(torch.long).to(self.x.device)
        ry = (self.distrib.sample((bs,))*H).type(torch.long).to(self.x.device)
        rw = (torch.sqrt(1-self.lam)*W).to(self.x.device)
        rh = (torch.sqrt(1-self.lam)*H).to(self.x.device)
        x1 = torch.round(torch.clamp(rx-rw//2, min=0, max=W)).to(self.x.device).type(torch.long)
        x2 = torch.round(torch.clamp(rx+rw//2, min=0, max=W)).to(self.x.device).type(torch.long)
        y1 = torch.round(torch.clamp(ry-rh//2, min=0, max=H)).to(self.x.device).type(torch.long)
        y2 = torch.round(torch.clamp(ry+rh//2, min=0, max=H)).to(self.x.device).type(torch.long)
        # Paste the shuffled patches (note: x indexes dim 2 and y indexes
        # dim 3 here, which is fine for square images)
        for i in range(bs):
            self.learn.xb[0][i, :, x1[i]:x2[i], y1[i]:y2[i]] = xb1[0][i, :, x1[i]:x2[i], y1[i]:y2[i]]
        # Recompute lambda from the actual (clamped) box areas
        self.lam = (1 - ((x2-x1)*(y2-y1))/(W*H)).type(torch.float)
        if not self.stack_y:
            ny_dims = len(self.y.size())
            self.learn.yb = tuple(L(self.yb1,self.yb).map_zip(torch.lerp,weight=unsqueeze(self.lam, n=ny_dims-1)))

    def lf(self, pred, *yb):
        if not self.training: return self.old_lf(pred, *yb)
        with NoneReduce(self.old_lf) as lf:
            loss = torch.lerp(lf(pred,*self.yb1), lf(pred,*yb), self.lam)
        return reduce_loss(loss, getattr(self.old_lf, 'reduction', 'mean'))
```
We believe that we particularly need help in this part of the code:
```python
for i in range(len(x1)):
    self.learn.xb[0][i, :, x1[i]:x2[i], y1[i]:y2[i]] = xb1[0][i, :, x1[i]:x2[i], y1[i]:y2[i]]
```
We admit that there is a lot of cleanup yet to be done. @sgugger @jeremy, if you could please guide us on how to index this properly and give us feedback, we would appreciate it.
For the indexing, so far we have tried:
```python
import torch

xb1 = torch.randn(64, 3, 128, 128)
xb2 = torch.randn(64, 3, 128, 128)

x1 = torch.randint(0, 128, (64,))
x2 = torch.randint(0, 128, (64,))
y1 = torch.randint(0, 128, (64,))
y2 = torch.randint(0, 128, (64,))
x1.shape

xb1[:, :, x1:x2, y1:y2] = xb2[:, :, x1:x2, y1:y2]
```
but we get an error:
```
TypeError                                 Traceback (most recent call last)
<ipython-input-11-88c5a959f6b1> in <module>
     13 x1.shape
     14
---> 15 xb1[:, :, x1:x2, y1:y2] = xb2[:, :, x1:x2, y1:y2]

TypeError: only integer tensors of a single element can be converted to an index
```
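One loop-free direction we have been considering is to build a per-sample boolean mask by broadcasting the box bounds against coordinate ranges, then combine the two batches with `torch.where`. This is only a sketch of the idea (the helper name `paste_boxes` is ours), keeping the same convention as our loop above, where `x1:x2` slices dim 2 and `y1:y2` slices dim 3:

```python
import torch

def paste_boxes(xb1, xb2, x1, x2, y1, y2):
    # xb1, xb2: (B, C, H, W); x1/x2/y1/y2: (B,) long tensors of per-sample bounds
    B, C, H, W = xb1.shape
    rows = torch.arange(H, device=xb1.device)  # indexes dim 2
    cols = torch.arange(W, device=xb1.device)  # indexes dim 3
    # Broadcast each sample's bounds against the coordinate ranges;
    # half-open intervals match the x1:x2 / y1:y2 slicing semantics
    row_mask = (rows[None, :] >= x1[:, None]) & (rows[None, :] < x2[:, None])  # (B, H)
    col_mask = (cols[None, :] >= y1[:, None]) & (cols[None, :] < y2[:, None])  # (B, W)
    mask = row_mask[:, :, None] & col_mask[:, None, :]  # (B, H, W)
    mask = mask[:, None, :, :]                          # (B, 1, H, W), broadcasts over C
    return torch.where(mask, xb2, xb1)
```

`torch.where` returns a new tensor rather than writing in place, which sidesteps the per-sample slicing that raised the `TypeError` above.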
Here is the GIST for the notebook: