RuntimeError: The size of tensor a (2) must match the size of tensor b (96) at non-singleton dimension 1

(Ranjit) #1

Hi, I'm trying to do gender detection on images.

df.head()

full_path                           gender
17/10000217_1981-05-05_2009.jpg     M
48/10000548_1925-04-04_1964.jpg     M
12/100012_1948-07-03_2008.jpg       M
65/10001965_1930-05-23_1961.jpg     M
16/10002116_1971-05-31_2012.jpg     F

tfms = get_transforms(do_flip=False)
np.random.seed(42)
src = (ImageItemList.from_df(df, path)
.random_split_by_pct(0.2)
.label_from_df(cols=1))

data = (src.transform(tfms, size=128)
.databunch().normalize(imagenet_stats))

After running one cycle, it gives me this error:

RuntimeError                              Traceback (most recent call last)
<ipython-input> in <module>
----> 1 learn.fit_one_cycle(2, slice(lr))

/notebooks/fastai/train.py in fit_one_cycle(learn, cyc_len, max_lr, moms, div_factor, pct_start, wd, callbacks, **kwargs)
     18     callbacks.append(OneCycleScheduler(learn, max_lr, moms=moms, div_factor=div_factor,
     19                                        pct_start=pct_start, **kwargs))
---> 20     learn.fit(cyc_len, max_lr, wd=wd, callbacks=callbacks)
     21
     22 def lr_find(learn:Learner, start_lr:Floats=1e-7, end_lr:Floats=10, num_it:int=100, stop_div:bool=True, **kwargs:Any):

/notebooks/fastai/basic_train.py in fit(self, epochs, lr, wd, callbacks)
    160         callbacks = [cb(self) for cb in self.callback_fns] + listify(callbacks)
    161         fit(epochs, self.model, self.loss_func, opt=self.opt, data=self.data, metrics=self.metrics,
--> 162             callbacks=self.callbacks+callbacks)
    163
    164     def create_opt(self, lr:Floats, wd:Floats=0.)->None:

/notebooks/fastai/basic_train.py in fit(epochs, model, loss_func, opt, data, callbacks, metrics)
     92     except Exception as e:
     93         exception = e
---> 94         raise e
     95     finally: cb_handler.on_train_end(exception)
     96

/notebooks/fastai/basic_train.py in fit(epochs, model, loss_func, opt, data, callbacks, metrics)
     87         if hasattr(data,'valid_dl') and data.valid_dl is not None:
     88             val_loss = validate(model, data.valid_dl, loss_func=loss_func,
---> 89                                 cb_handler=cb_handler, pbar=pbar)
     90         else: val_loss=None
     91         if cb_handler.on_epoch_end(val_loss): break

/notebooks/fastai/basic_train.py in validate(model, dl, loss_func, cb_handler, pbar, average, n_batch)
     52         if not is_listy(yb): yb = [yb]
     53         nums.append(yb[0].shape[0])
---> 54         if cb_handler and cb_handler.on_batch_end(val_losses[-1]): break
     55         if n_batch and (len(nums)>=n_batch): break
     56     nums = np.array(nums, dtype=np.float32)

/notebooks/fastai/callback.py in on_batch_end(self, loss)
    236         "Handle end of processing one batch with loss."
    237         self.state_dict['last_loss'] = loss
--> 238         stop = np.any(self('batch_end', not self.state_dict['train']))
    239         if self.state_dict['train']:
    240             self.state_dict['iteration'] += 1

/notebooks/fastai/callback.py in __call__(self, cb_name, call_mets, **kwargs)
    184     def __call__(self, cb_name, call_mets=True, **kwargs)->None:
    185         "Call through to all of the CallbakHandler functions."
--> 186         if call_mets: [getattr(met, f'on_{cb_name}')(**self.state_dict, **kwargs) for met in self.metrics]
    187         return [getattr(cb, f'on_{cb_name}')(**self.state_dict, **kwargs) for cb in self.callbacks]
    188

/notebooks/fastai/callback.py in <listcomp>(.0)
    184     def __call__(self, cb_name, call_mets=True, **kwargs)->None:
    185         "Call through to all of the CallbakHandler functions."
--> 186         if call_mets: [getattr(met, f'on_{cb_name}')(**self.state_dict, **kwargs) for met in self.metrics]
    187         return [getattr(cb, f'on_{cb_name}')(**self.state_dict, **kwargs) for cb in self.callbacks]
    188

/notebooks/fastai/callback.py in on_batch_end(self, last_output, last_target, train, **kwargs)
    269         if not is_listy(last_target): last_target=[last_target]
    270         self.count += last_target[0].size(0)
--> 271         self.val += last_target[0].size(0) * self.func(last_output, *last_target).detach().cpu()
    272
    273     def on_epoch_end(self, **kwargs):

/notebooks/fastai/metrics.py in accuracy_thresh(y_pred, y_true, thresh, sigmoid)
     20     "Compute accuracy when y_pred and y_true are the same size."
     21     if sigmoid: y_pred = y_pred.sigmoid()
---> 22     return ((y_pred>thresh)==y_true.byte()).float().mean()
     23
     24 def dice(input:Tensor, targs:Tensor, iou:bool=False)->Rank0Tensor:

RuntimeError: The size of tensor a (2) must match the size of tensor b (96) at non-singleton dimension 1

1 Like

(Brad) #2

Not sure if you're still looking for an answer, but I got a similar error because I changed a hyper-parameter (batch size, in my case) and then tried to load data saved from a different run that used different hyper-parameters.

0 Likes

(Jeremy Easterbrook) #3

I am getting the same error on a binary classification problem with tabular data. I have tried the following:

acc_imb = partial(accuracy_thresh, thresh=0.1)
m = tabular_learner(data, layers=[1000,500], metrics=[acc_imb])

m = tabular_learner(data, layers=[500,250], metrics=[accuracy_thresh(thresh=0.1)])

In both cases I get:
RuntimeError: The size of tensor a (2) must match the size of tensor b (64) at non-singleton dimension 1

2 Likes

(Junlin) #4

According to the docs, accuracy_thresh is intended for one-hot-encoded targets (often in a multi-label classification problem). I guess that's why your tensor sizes don't match.
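
A minimal plain-PyTorch sketch of what goes wrong (the shapes here are illustrative, mirroring the batch size of 96 in the traceback above):

import torch

y_pred = torch.randn(96, 2)          # model output: (batch, n_classes) logits
y_true = torch.randint(0, 2, (96,))  # single-label targets: (batch,)

# accuracy_thresh assumes y_pred and y_true have the same shape
# (multi-label, one-hot targets). Here (y_pred > 0.5) has shape (96, 2)
# while y_true has shape (96,), so the elementwise == broadcasts 2
# against 96 at dimension 1 and raises exactly the RuntimeError above:
# ((y_pred.sigmoid() > 0.5) == y_true.byte()).float().mean()  # RuntimeError

# for single-label targets, argmax-based accuracy is the matching metric:
acc = (y_pred.argmax(dim=1) == y_true).float().mean()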

0 Likes

(Jeremy Easterbrook) #5

Thanks Junlin! I realized we can actually pass a threshold into the model's loss_func.

0 Likes

(Junlin) #6

You're welcome. :grinning: Would you like to share how you got around the one-hot-encoded situation?

0 Likes

(Jeremy Easterbrook) #7

I've always had a binary one-hot-encoded situation, so I did:

learn = tabular_learner(data, layers=[500,250], metrics=[accuracy], loss_func=imbalance_loss_func)

0 Likes

(Junlin) #9

Just out of curiosity, what did you pass as loss_func?

0 Likes

(Jeremy Easterbrook) #10

I passed a CUDA tensor of weights with cross-entropy loss.
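
For example, something along these lines (the weight values here are placeholders you would tune to your own class imbalance):

import torch
import torch.nn as nn

# hypothetical class weights for an imbalanced binary problem;
# the minority class gets the larger weight
class_weights = torch.tensor([0.3, 0.7]).cuda()
imbalance_loss_func = nn.CrossEntropyLoss(weight=class_weights)

learn = tabular_learner(data, layers=[500,250], metrics=[accuracy],
                        loss_func=imbalance_loss_func)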

0 Likes

(chloe) #11

I still don't get it: if it is binary classes, how do we avoid the tensor mismatch problem? Can you please explain the details? Thanks.

1 Like

(Kiran) #12

This happens when you use torch.nn.CrossEntropyLoss() incorrectly: it expects the predictions as a (batch, n_classes) matrix and the targets as a vector of class indices.

I would suggest you use LabelEncoder from sklearn for the target values, and this will work:

import torch
from sklearn.preprocessing import LabelEncoder

le = LabelEncoder()
dftrain['label_encoded'] = le.fit_transform(dftrain['label'])
dfvalid['label_encoded'] = le.transform(dfvalid['label'])

Then,

loss_criterion = torch.nn.CrossEntropyLoss()

Then,

# convert the encoded labels to a tensor of class indices before the loss
actual = torch.tensor(dftrain['label_encoded'].values)
loss = loss_criterion(prediction, actual)
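
For instance, with dummy tensors the expected shape contract looks like this (sizes are illustrative):

import torch

prediction = torch.randn(64, 2)        # (batch, n_classes) matrix of logits
actual = torch.randint(0, 2, (64,))    # (batch,) vector of class indices
loss = torch.nn.CrossEntropyLoss()(prediction, actual)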

0 Likes

#13

Hi Kiran,
I am facing the same issue. Would you please help me? I am new to PyTorch.
Issue: The size of tensor a (128) must match the size of tensor b (9) at non-singleton dimension 0
I believe the issue is in this block of code. I am using PyTorch 1.0.1, and I believe CrossEntropyLoss is causing the problem here, as you mentioned in your previous post. Would you please help with it?

import torch
import torch.nn as nn
import tqdm

# BATCH_SIZE, N_EPOCH and DEVICE are defined elsewhere in my script

def train(model, optimizer, train_loader, test_loader):
    n_batch = len(train_loader.dataset) // BATCH_SIZE
    criterion = nn.CrossEntropyLoss()

    for e in range(N_EPOCH):
        model.train()
        correct, total_loss = 0, 0
        total = 0
        # the traceback points at this line as the source of the error
        for index, (sample, target) in enumerate(train_loader):
            sample, target = sample.to(DEVICE).float(), target.to(DEVICE).long()
            sample = sample.view(-1, 9, 1, 128)
            output = model(sample)
            loss = criterion(output, target)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            total_loss += loss.item()
            _, predicted = torch.max(output.data, 1)
            total += target.size(0)
            correct += (predicted == target).sum()

            if index % 20 == 0:
                tqdm.tqdm.write('Epoch: [{}/{}], Batch: [{}/{}], loss:{:.4f}'.format(
                    e + 1, N_EPOCH, index + 1, n_batch, loss.item()))
        acc_train = float(correct) * 100.0 / (BATCH_SIZE * n_batch)
        tqdm.tqdm.write('Epoch: [{}/{}], loss: {:.4f}, train acc: {:.2f}%'.format(
            e + 1, N_EPOCH, total_loss * 1.0 / n_batch, acc_train))
0 Likes