Learn.summary() throws ValueError: Expected more than 1 value per channel when training, got input size [1, 1024]

hiromi · January 16, 2018, 3:47am

I have attached a stripped down version of lesson2-image_models.ipynb below.

What is strange is that learn.summary without parentheses works, and so does learn. But when I run learn.summary(), it throws the following error:


---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-11-bc39e9e85f86> in <module>()
----> 1 learn.summary()

~/fastai/courses/dl1/fastai/conv_learner.py in summary(self)
   117         precompute = self.precompute
   118         self.precompute = False
--> 119         res = super().summary()
   120         self.precompute = precompute
   121         return res

~/fastai/courses/dl1/fastai/learner.py in summary(self)
    51     def data(self): return self.data_
    52 
---> 53     def summary(self): return model_summary(self.model, [3,self.data.sz,self.data.sz])
    54 
    55     def __repr__(self): return self.model.__repr__()

~/fastai/courses/dl1/fastai/model.py in model_summary(m, input_size)
   161         x = [to_gpu(Variable(torch.rand(1,*in_size))) for in_size in input_size]
   162     else: x = [to_gpu(Variable(torch.rand(1,*input_size)))]
--> 163     m(*x)
   164 
   165     for h in hooks: h.remove()

~/src/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
   323         for hook in self._forward_pre_hooks.values():
   324             hook(self, input)
--> 325         result = self.forward(*input, **kwargs)
   326         for hook in self._forward_hooks.values():
   327             hook_result = hook(self, input, result)

~/src/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/nn/modules/container.py in forward(self, input)
    65     def forward(self, input):
    66         for module in self._modules.values():
---> 67             input = module(input)
    68         return input
    69 

~/src/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
   323         for hook in self._forward_pre_hooks.values():
   324             hook(self, input)
--> 325         result = self.forward(*input, **kwargs)
   326         for hook in self._forward_hooks.values():
   327             hook_result = hook(self, input, result)

~/src/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/nn/modules/batchnorm.py in forward(self, input)
    35         return F.batch_norm(
    36             input, self.running_mean, self.running_var, self.weight, self.bias,
---> 37             self.training, self.momentum, self.eps)
    38 
    39     def __repr__(self):

~/src/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/nn/functional.py in batch_norm(input, running_mean, running_var, weight, bias, training, momentum, eps)
  1009         size = list(input.size())
  1010         if reduce(mul, size[2:], size[0]) == 1:
-> 1011             raise ValueError('Expected more than 1 value per channel when training, got input size {}'.format(size))
  1012     f = torch._C._functions.BatchNorm(running_mean, running_var, training, momentum, eps, torch.backends.cudnn.enabled)
  1013     return f(input, weight, bias)

ValueError: Expected more than 1 value per channel when training, got input size [1, 1024]

https://gist.github.com/anonymous/f395fad6fb9dbd9b764273d765871922

Is anybody else experiencing this issue?

Thank you!

ecdrid · January 19, 2018, 5:45pm

Hi,
The issue seems to be related to the PyTorch version.
Downgrading to 0.2 should bring things back to normal.

hiromi · January 19, 2018, 6:43pm

That is it! I can’t say I understand what colesbury is saying, but I will poke around and see if I can figure it out.

Thanks for taking a look

ecdrid · January 19, 2018, 7:00pm

It will fail. Don't train on batches of size 1 if you use feature-wise batch normalization. (Inference is fine with batch size 1.) Skip over the left-over batch.

Batch normalization computes:

y = (x - mean(x)) / (std(x) + eps)

If you have one sample per batch, then mean(x) = x and the output will be entirely zero (ignoring the bias). You can't use that for learning.
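To make the arithmetic concrete, here is a minimal pure-Python sketch of the formula as quoted above, applied to one feature across a batch. The function name `batch_norm_1d` and the `eps` default are illustrative assumptions; this is not PyTorch's implementation, which is usually written with sqrt(var + eps) in the denominator.

```python
import math

def batch_norm_1d(values, eps=1e-5):
    """Normalize one feature across a batch using y = (x - mean) / (std + eps).

    Illustrative only: mirrors the formula quoted in the thread.
    """
    n = len(values)
    mean = sum(values) / n
    # Biased (population) variance over the batch.
    var = sum((v - mean) ** 2 for v in values) / n
    std = math.sqrt(var)
    return [(v - mean) / (std + eps) for v in values]

single = batch_norm_1d([3.7])      # batch of size 1
pair = batch_norm_1d([1.0, 3.0])   # batch of size 2
print(single)
print(pair)
```

With a single sample the numerator is always zero, whatever the input value, so the layer passes no information forward. With two or more samples the outputs differ from each other.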

I have read the BatchNorm paper, and it makes sense after that.

Pleasure…

BatchNorm paper link:

https://arxiv.org/abs/1502.03167

raspstephan · February 4, 2018, 2:17am

I also ran into this error recently when the last batch in my training was of size one. So basically

n_samples % bs = 1

I simply changed the random seed of my train/valid split to get around the problem, but how could this be fixed properly?
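One possible approach, following the "skip over the left-over batch" advice above, is to drop a trailing batch of exactly one sample when forming batches. The sketch below works on plain index lists; `batch_indices` and `drop_singleton` are hypothetical names, not fastai or PyTorch APIs. (Plain PyTorch's `DataLoader` has a `drop_last` flag that discards any incomplete final batch; the thread does not confirm whether the fastai loader used here exposes it.)

```python
def batch_indices(n_samples, bs, drop_singleton=True):
    """Yield lists of sample indices, one list per batch."""
    for start in range(0, n_samples, bs):
        batch = list(range(start, min(start + bs, n_samples)))
        # A batch holding exactly one sample is what triggers the ValueError.
        if drop_singleton and len(batch) == 1:
            continue
        yield batch

batches = list(batch_indices(n_samples=9, bs=4))
print([len(b) for b in batches])
```

Only a batch of size one is skipped here, so a shorter final batch of two or more samples is still used for training.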

srmsoumya · March 22, 2018, 10:46am

Has anyone solved this issue yet?

I tried calling learn.predict() and then learn.predict_array(), which works around the issue, but it doesn’t make much sense to me why this is happening.

hiromi · March 22, 2018, 2:17pm

I’ve been adding a check that training data size % batch size is not 1 before I start training. Maybe we could put something in the fastai library so that, if the last batch has only 1 item in it and the model has batch norm, that batch is thrown away. But that seems rather disruptive, so maybe we should just be cautious.
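A check like the one described could look roughly like this. It is a sketch only: `check_last_batch` is a hypothetical helper, not part of fastai, and it assumes the model contains BatchNorm layers.

```python
def check_last_batch(n_samples, bs):
    """Raise early if some training batch would hold exactly one sample."""
    remainder = n_samples % bs
    # bs == 1 makes every batch a singleton; remainder == 1 makes the last one.
    if bs == 1 or remainder == 1:
        raise ValueError(
            f"{n_samples} samples with bs={bs} produces a batch of size 1; "
            "BatchNorm layers will fail on it in training mode."
        )
    return remainder

check_last_batch(n_samples=64, bs=4)     # passes, remainder is 0
try:
    check_last_batch(n_samples=5, bs=4)  # last batch would have 1 sample
except ValueError as err:
    print(err)
```

Running this before training starts surfaces the problem immediately instead of at the end of the first epoch.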

raspstephan · March 22, 2018, 2:44pm

I will look later today at the batch norm paper/definition. Maybe we can then include a check as you suggested and create a merge request.

hiromi · March 22, 2018, 2:53pm

The other day, I checked everything with a sample data set and all was well. At the end of the day, I kicked off training with the bigger set, which just happened to have a last batch with 1 item in it. I woke up the next day to a failed training run.

raspstephan · March 22, 2018, 5:52pm

This seems like something we should fix

raspstephan · March 23, 2018, 4:11am

I looked at the issue tonight, but in the process of creating a minimal example I stumbled onto another issue for which I created a GitHub issue: https://github.com/fastai/fastai/issues/240

Once this is fixed, I will look at the bs=1 problem.

giusvit · March 25, 2018, 6:53pm

Hi hiromi, hi everybody,
it works if you do this:

learn.model.eval()
learn.summary()

Somehow, you need to set the model in evaluation mode to make the summary method work.

By the way, thanks to @ramesh the predict_array method now works correctly too. The reason is that the module should be set in evaluation mode when making predictions, because it changes the behavior of certain modules (e.g. BatchNorm).

hiromi · March 25, 2018, 7:01pm

Hello

Yes, .eval() makes BatchNorm use its running statistics instead of per-batch statistics and disables Dropout, so that when you are running on the validation set or test set you get a better result (at that point, you are not concerned with avoiding overfitting). For training, however, we want BatchNorm to normalize with batch statistics. I initially came across this issue when printing out the summary, but the same root cause will also make your training fail.

If @raspstephan doesn’t get to it first, I can look into creating a PR to at least check the final batch size so that you do not have to wait until the end of the epoch to see the failure. Hope that clears things up.

raspstephan · March 25, 2018, 7:04pm

Hi, I will look at it again tonight. I tried creating a minimal example with the ImageDataLoader.from_array() function. I created a training set with 65 samples and a batch size of 64, but training worked fine!?

If you have time, maybe you could try to create a minimal example that produces the error.

hiromi · March 25, 2018, 7:06pm

Certainly! I’m almost done with what I’m working on right now, so I will create a notebook with a minimal reproducible example. It’s certainly possible somebody else got to it since the last time I looked at it.

giusvit · March 25, 2018, 7:19pm

Yes, you are right, in fact I was planning to use learn.model.train() thereafter to set the model in training mode, so I would just use eval() to print the summary. But yeah, I agree it’s not the smartest way to do that

hiromi · March 25, 2018, 11:29pm

Here is a batch size of 4 with a training dataset of size 5:

https://gist.github.com/hiromis/8cea0c05546cb1b9a5136c1869024a4b

So it’s definitely reproducible, and it is probably good to be aware of how to work around the issue, because once we start creating our own models we will need to know.

Gabriel_Syme · June 13, 2018, 2:17am

Hi,

I was wondering if this was ever dealt with. I was getting the same error here and bypassed it by leaving some data out, but that feels a bit inefficient.

Would a crude solution be to replicate, only for training, a few rows / images of the training dataset in order to have a complete final batch? It should at least be better than deleting input data.
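The replication idea could be sketched as padding the list of training indices until the final batch is full. `pad_indices` is a hypothetical helper; it repeats samples from the start of the dataset for simplicity, whereas sampling the repeats at random would be an equally valid choice.

```python
def pad_indices(n_samples, bs):
    """Return sample indices, repeating early ones so the last batch is full."""
    indices = list(range(n_samples))
    remainder = n_samples % bs
    if remainder:
        needed = bs - remainder
        # Reuse samples from the start; the modulo handles needed > n_samples.
        indices += [i % n_samples for i in range(needed)]
    return indices

padded = pad_indices(n_samples=5, bs=4)
print(padded)
```

The repeated samples are seen slightly more often per epoch, which is usually negligible for a large dataset but worth keeping in mind for a very small one.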

Kind regards,
Theodore.

hiromi · June 14, 2018, 2:23pm

The issue only happens when the last batch has only 1 element. I usually adjust the batch size so that the remainder is anything but one, but your solution also works.
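Adjusting the batch size can be automated with a small search. The sketch below uses a hypothetical helper, `safe_batch_size`, which walks downward from the requested batch size until the remainder is no longer one.

```python
def safe_batch_size(n_samples, bs):
    """Return the largest batch size <= bs whose final batch is not size 1."""
    for candidate in range(bs, 1, -1):
        if n_samples % candidate != 1:
            return candidate
    raise ValueError("no batch size of 2 or more avoids a final batch of size 1")

print(safe_batch_size(n_samples=65, bs=64))
```

For 65 samples and a requested batch size of 64, the remainder is 1, so the search falls back to 63, which leaves a final batch of 2.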

dineshydv · July 8, 2020, 5:39am

Thanks. It solved my problem. I was trying to extract features of intermediate layers and putting eval() solved this error.