Learner.backward(item) usage documentation, goal: gradients per sample

I need the per-sample gradients (not the mini-batch gradients) of the last layer group, so I intended to use: https://docs.fast.ai/basic_train.html#Learner.backward
However, I can't find any usage documentation for this function, nor any tests demonstrating it. My question is: where do I get the item parameter from? I have my Learner object, it has a train_dl, and I can iterate over the dataloader, but that does not yield item objects.
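For context, the effect I'm after can be sketched in plain PyTorch: loop over individual samples, call backward() once per sample, and collect the gradient of the chosen parameters each time. The toy model and loss below are placeholders, not my actual fastai setup:

```python
import torch
import torch.nn as nn

# Toy stand-in; the final Linear plays the role of the "last layer group".
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))
last_layer = model[2]
loss_func = nn.MSELoss()

xs = torch.randn(5, 4)   # 5 samples
ys = torch.randn(5, 1)

per_sample_grads = []
for x, y in zip(xs, ys):
    model.zero_grad()                      # reset accumulated gradients
    loss = loss_func(model(x.unsqueeze(0)), y.unsqueeze(0))
    loss.backward()                        # gradient for this one sample
    per_sample_grads.append(last_layer.weight.grad.clone())

# One gradient tensor per sample, each shaped like last_layer.weight.
assert len(per_sample_grads) == 5
assert per_sample_grads[0].shape == last_layer.weight.shape
```

This is what I hoped Learner.backward(item) would do for me per item.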

So far I have:

for item in learn.dl(vision.DatasetType.Valid):
  learn.backward(item[0])

But this fails with:

  File "…/lib/python3.8/site-packages/fastai/data_block.py", line 654, in __getitem__
    x = x.apply_tfms(self.tfms, **self.tfmargs)
AttributeError: 'Tensor' object has no attribute 'apply_tfms'
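My reading of this error (an assumption on my part): iterating the dataloader yields already-collated tensor batches, while apply_tfms lives on the dataset's item objects, so feeding a batch tensor back into item-level code fails. The distinction, illustrated with a toy plain-PyTorch dataset (not fastai's):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

ds = TensorDataset(torch.randn(6, 3), torch.randn(6, 1))

# Indexing the dataset yields one raw item: a (x, y) pair of single tensors.
x_item, y_item = ds[0]
print(x_item.shape)   # torch.Size([3])

# Iterating a DataLoader yields collated batches: stacked tensors.
dl = DataLoader(ds, batch_size=2)
xb, yb = next(iter(dl))
print(xb.shape)       # torch.Size([2, 3])
```

So presumably learn.backward() wants something item-shaped, not the batch tensors I'm passing it.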

It is still unclear to me how to obtain the correct input for learn.backward(), so I tried using the code from .backward directly:

for item in learn.dl(vision.DatasetType.Valid):
  basic_train.loss_batch(learn.model.eval(), item[0], item[1], learn.loss_func, opt=basic_train.FakeOptimizer())

But this fails with:

  File "~/.local/lib/python3.8/site-packages/fastai/basic_train.py", line 26, in loss_batch
    out = model(*xb)
  File "~/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "~/.local/lib/python3.8/site-packages/torch/nn/modules/container.py", line 100, in forward
    input = module(input)
  File "~/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "~/.local/lib/python3.8/site-packages/torch/nn/modules/container.py", line 100, in forward
    input = module(input)
  File "~/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "~/.local/lib/python3.8/site-packages/torch/nn/modules/container.py", line 100, in forward
    input = module(input)
  File "~/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "~/.local/lib/python3.8/site-packages/torchvision/models/resnet.py", line 101, in forward
    out = self.bn1(out)
  File "~/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "~/.local/lib/python3.8/site-packages/torch/nn/modules/batchnorm.py", line 104, in forward
    return F.batch_norm(
  File "~/.local/lib/python3.8/site-packages/torch/nn/functional.py", line 1668, in batch_norm
    return torch.batch_norm(
RuntimeError: CUDA out of memory. Tried to allocate 384.00 MiB (GPU 0; 11.17 GiB total capacity; 10.25 GiB already allocated; 256.31 MiB free; 10.60 GiB reserved in total by PyTorch)

I don't get an OOM during training or during normal prediction. Also, this exception occurs the first time .loss_batch is called.
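One factor I suspect (my assumption, not confirmed): normal prediction runs the forward pass under torch.no_grad(), so intermediate activations can be freed immediately, whereas loss_batch with an optimizer keeps the full autograd graph alive, and the same batch then needs far more memory. The difference in a small sketch:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(10, 10), nn.ReLU(), nn.Linear(10, 1))
x = torch.randn(4, 10)

# Inference path: no graph is built, activations are not retained.
with torch.no_grad():
    out_infer = model(x)
assert not out_infer.requires_grad

# Training path: the graph (and the activations it references) is kept
# around until backward() runs or the output goes out of scope.
out_train = model(x)
assert out_train.requires_grad
```

That would explain why the very first .loss_batch call already exceeds memory while prediction on the same data does not.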