Hey folks,
I’m getting desperate.
I’ve got 61 pairs of images [input: dark image (fast shutter), target: bright image (slow shutter)] of the same objects. Each image is NxM=4272x2848 pixels.
I’m asking for your help with two issues:
- Problem 1: Configure the mini-batches / patches.
- Problem 2: Find a solution to the CUDA memory problem.
So:
Problem 1:
I want to split the whole dataset into roughly 10 batches. That’s why I set bs=6 (which creates batches of 6 images each).
But within each batch, I’d like to split every image pair into 10-200 smaller patches. That is, for every image, many patches of size WxD=256x256 would be created (out of the NxM=4272x2848 pixels) and fed to the model every epoch.
The goal is to use less GPU memory, since only 12GB is available at a time.
The patches shouldn’t be squished, rescaled, padded, or anything else. They should only be smaller crops of the original image.
Which should I choose?
item_tfms=RandomCrop(256)
batch_tfms=RandomCrop(256)
How can I tell whether it creates 10 batches of 6 images of size NxM, each with 10-200 patches of size WxD?
Because right now it looks like it creates 10 batches of 6 images of size WxD. Right?
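A minimal sketch of the patch idea described above, in plain Python so it runs without fastai. It samples random crop corners once per image pair and applies the same corner to the dark and the bright image, so input and target patches stay aligned and nothing is resized or padded. The helper names (`sample_crop_boxes`, `crop`, `batches_per_epoch`) and the tiny stand-in images are assumptions made for illustration, not fastai API.

```python
import math
import random

def sample_crop_boxes(height, width, size, n_patches, seed=None):
    """Return n_patches (top, left) corners of size x size crops that fit inside the image."""
    if size > height or size > width:
        raise ValueError("crop size must not exceed the image size")
    rng = random.Random(seed)
    return [
        (rng.randint(0, height - size), rng.randint(0, width - size))
        for _ in range(n_patches)
    ]

def crop(image_rows, top, left, size):
    """Cut a size x size window out of a 2-D list of pixel rows (no resizing, no padding)."""
    return [row[left:left + size] for row in image_rows[top:top + size]]

def batches_per_epoch(n_images, patches_per_image, batch_size):
    """Number of mini-batches when every patch counts as one training item."""
    return math.ceil(n_images * patches_per_image / batch_size)

# Tiny stand-in images (hypothetical data) so the example runs without an image library.
H, W, SIZE = 8, 12, 4
dark = [[(r, c) for c in range(W)] for r in range(H)]
bright = [[(r, c) for c in range(W)] for r in range(H)]

boxes = sample_crop_boxes(H, W, SIZE, n_patches=5, seed=0)
# The same box is applied to input and target, so each pair shows the same region.
pairs = [(crop(dark, t, l, SIZE), crop(bright, t, l, SIZE)) for t, l in boxes]

print(len(pairs), "aligned patch pairs")
print(batches_per_epoch(61, 100, 6), "batches for 61 images x 100 patches at bs=6")
```

As far as I understand fastai, item_tfms run on each item before the items are collated into a batch, while batch_tfms run on the already collated batch. Cropping per item, as in this sketch, therefore corresponds more closely to the item_tfms option; with batch_tfms the full-size images would have to be batched first.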
Problem 2:
When I tried to start training the model, I got this issue:
I don’t get it. What is actually happening behind the scenes?
I searched the forum and found someone recommending a function that prints out all the tensors that have been created in GPU RAM.
import gc
import torch

def pretty_size(size):
    """Pretty prints a torch.Size object"""
    assert isinstance(size, torch.Size)
    return " × ".join(map(str, size))

def dump_tensors(gpu_only=True):
    """Prints a list of the Tensors being tracked by the garbage collector."""
    total_size = 0
    for obj in gc.get_objects():
        try:
            if torch.is_tensor(obj):
                if not gpu_only or obj.is_cuda:
                    # is_pinned is a method; without the call the bound method
                    # is always truthy and every line is labelled "pinned".
                    print("%s:%s%s %s" % (type(obj).__name__,
                                          " GPU" if obj.is_cuda else "",
                                          " pinned" if obj.is_pinned() else "",
                                          pretty_size(obj.size())))
                    total_size += obj.numel()
            elif hasattr(obj, "data") and torch.is_tensor(obj.data):
                # Wrapper objects that hold a tensor in .data
                if not gpu_only or obj.data.is_cuda:
                    print("%s → %s:%s%s%s %s" % (type(obj).__name__,
                                                 type(obj.data).__name__,
                                                 " GPU" if obj.data.is_cuda else "",
                                                 " pinned" if obj.data.is_pinned() else "",
                                                 " grad" if getattr(obj, "requires_grad", False) else "",
                                                 pretty_size(obj.data.size())))
                    total_size += obj.data.numel()
        except Exception:
            # Some tracked objects raise on attribute access; skip them.
            pass
    # Sum of element counts (numel), not bytes.
    print("Total size:", total_size)

Here is the output. What am I actually seeing here?
Tensor: GPU pinned
Tensor: GPU pinned 6 × 64 × 512 × 512
TensorImage: GPU pinned 6 × 64 × 512 × 512
Tensor: GPU pinned 6 × 64 × 512 × 512
TensorImage: GPU pinned 6 × 64 × 512 × 512
Tensor: GPU pinned
Tensor: GPU pinned 6 × 64 × 512 × 512
TensorImage: GPU pinned 6 × 64 × 512 × 512
Tensor: GPU pinned 6 × 64 × 512 × 512
TensorImage: GPU pinned 6 × 64 × 512 × 512
Tensor: GPU pinned
Tensor: GPU pinned 6 × 64 × 512 × 512
TensorImage: GPU pinned 6 × 64 × 512 × 512
Tensor: GPU pinned 6 × 128 × 256 × 256
TensorImage: GPU pinned 6 × 128 × 256 × 256
Tensor: GPU pinned
Tensor: GPU pinned 6 × 128 × 256 × 256
TensorImage: GPU pinned 6 × 128 × 256 × 256
Tensor: GPU pinned 6 × 128 × 256 × 256
TensorImage: GPU pinned 6 × 128 × 256 × 256
Tensor: GPU pinned
Tensor: GPU pinned 6 × 128 × 256 × 256
TensorImage: GPU pinned 6 × 128 × 256 × 256
Tensor: GPU pinned 6 × 128 × 256 × 256
TensorImage: GPU pinned 6 × 128 × 256 × 256
Tensor: GPU pinned
Tensor: GPU pinned 64
Tensor: GPU pinned 64
Tensor: GPU pinned 64
Tensor: GPU pinned 64
Tensor: GPU pinned 64
Tensor: GPU pinned 64
Tensor: GPU pinned 64
Tensor: GPU pinned 64
Tensor: GPU pinned 64
Tensor: GPU pinned 64
Tensor: GPU pinned 64
Tensor: GPU pinned 64
Tensor: GPU pinned 64
Tensor: GPU pinned 64
Tensor: GPU pinned 128
Tensor: GPU pinned 128
Tensor: GPU pinned 128
Tensor: GPU pinned 128
Tensor: GPU pinned 128
Tensor: GPU pinned 128
Tensor: GPU pinned 128
Tensor: GPU pinned 128
Tensor: GPU pinned
Tensor: GPU pinned 128
Tensor: GPU pinned 128
Tensor: GPU pinned
Tensor: GPU pinned 128
Tensor: GPU pinned 128
Tensor: GPU pinned
Tensor: GPU pinned 128
Tensor: GPU pinned 128
Tensor: GPU pinned
Tensor: GPU pinned 128
Tensor: GPU pinned 128
Tensor: GPU pinned
Tensor: GPU pinned 128
Tensor: GPU pinned 128
Tensor: GPU pinned
Tensor: GPU pinned 256
Tensor: GPU pinned 256
Tensor: GPU pinned
Tensor: GPU pinned 256
Tensor: GPU pinned 256
Tensor: GPU pinned
Tensor: GPU pinned 256
Tensor: GPU pinned 256
Tensor: GPU pinned
Tensor: GPU pinned 256
Tensor: GPU pinned 256
Tensor: GPU pinned
Tensor: GPU pinned 256
Tensor: GPU pinned 256
Tensor: GPU pinned
Tensor: GPU pinned 256
Tensor: GPU pinned 256
Tensor: GPU pinned
Tensor: GPU pinned 256
Tensor: GPU pinned 256
Tensor: GPU pinned
Tensor: GPU pinned 256
Tensor: GPU pinned 256
Tensor: GPU pinned
Tensor: GPU pinned 256
Tensor: GPU pinned 256
Tensor: GPU pinned
Tensor: GPU pinned 256
Tensor: GPU pinned 256
Tensor: GPU pinned
Tensor: GPU pinned 256
Tensor: GPU pinned 256
Tensor: GPU pinned
Tensor: GPU pinned 256
Tensor: GPU pinned 256
Tensor: GPU pinned
Tensor: GPU pinned 256
Tensor: GPU pinned 256
Tensor: GPU pinned
Tensor: GPU pinned 512
Tensor: GPU pinned 512
Tensor: GPU pinned
Tensor: GPU pinned 512
Tensor: GPU pinned 512
Tensor: GPU pinned
Tensor: GPU pinned 512
Tensor: GPU pinned 512
Tensor: GPU pinned
Tensor: GPU pinned 512
Tensor: GPU pinned 512
Tensor: GPU pinned
Tensor: GPU pinned 512
Tensor: GPU pinned 512
Tensor: GPU pinned
Tensor: GPU pinned 512
Tensor: GPU pinned 512
Tensor: GPU pinned
Tensor: GPU pinned 512
Tensor: GPU pinned 512
Tensor: GPU pinned
Tensor: GPU pinned 512
Tensor: GPU pinned 512
Tensor: GPU pinned
Tensor: GPU pinned 256
Tensor: GPU pinned 256
Tensor: GPU pinned
Tensor: GPU pinned 128
Tensor: GPU pinned 128
Tensor: GPU pinned
Tensor: GPU pinned 64
Tensor: GPU pinned 64
Tensor: GPU pinned
Tensor: GPU pinned 64
Tensor: GPU pinned 64
Tensor: GPU pinned
Tensor: GPU pinned 6 × 3 × 2048 × 2048
TensorImage: GPU pinned 6 × 3 × 2048 × 2048
Tensor: GPU pinned 6 × 3 × 2048 × 2048
Tensor: GPU pinned 6 × 64 × 1024 × 1024
TensorImage: GPU pinned 6 × 64 × 1024 × 1024
Tensor: GPU pinned
Tensor: GPU pinned 6 × 64 × 1024 × 1024
TensorImage: GPU pinned 6 × 64 × 1024 × 1024
Tensor: GPU pinned 6 × 64 × 512 × 512
TensorImage: GPU pinned 6 × 64 × 512 × 512
Tensor: GPU pinned 6 × 64 × 512 × 512
TensorImage: GPU pinned 6 × 64 × 512 × 512
Tensor: GPU pinned
Tensor: GPU pinned 6 × 64 × 512 × 512
TensorImage: GPU pinned 6 × 64 × 512 × 512
Tensor: GPU pinned 6 × 64 × 512 × 512
TensorImage: GPU pinned 6 × 64 × 512 × 512
Tensor: GPU pinned
Tensor: GPU pinned 6 × 64 × 512 × 512
TensorImage: GPU pinned 6 × 64 × 512 × 512
Tensor: GPU pinned 6 × 64 × 512 × 512
TensorImage: GPU pinned 6 × 64 × 512 × 512
Tensor: GPU pinned
Tensor: GPU pinned 6 × 64 × 512 × 512
TensorImage: GPU pinned 6 × 64 × 512 × 512
Tensor: GPU pinned 6 × 64 × 512 × 512
TensorImage: GPU pinned 6 × 64 × 512 × 512
TensorImage: GPU pinned 6 × 3 × 2048 × 2048
Tensor: GPU pinned 1 × 3 × 1 × 1
Tensor: GPU pinned 1 × 3 × 1 × 1
Parameter: GPU pinned 256 × 256 × 3 × 3
Parameter: GPU pinned 256
Parameter: GPU pinned 3
Parameter: GPU pinned 3 × 99 × 1 × 1
Parameter: GPU pinned 99
Parameter: GPU pinned 99 × 99 × 3 × 3
Parameter: GPU pinned 99
Parameter: GPU pinned 99 × 99 × 3 × 3
Parameter: GPU pinned 384
Parameter: GPU pinned 384 × 96 × 1 × 1
Parameter: GPU pinned 96
Parameter: GPU pinned 96 × 96 × 3 × 3
Parameter: GPU pinned 96
Parameter: GPU pinned 96 × 192 × 3 × 3
Parameter: GPU pinned 64
Parameter: GPU pinned 64
Parameter: GPU pinned 512
Parameter: GPU pinned 512 × 256 × 1 × 1
Parameter: GPU pinned 64
Parameter: GPU pinned 64
Parameter: GPU pinned 64
Parameter: GPU pinned 64
Parameter: GPU pinned 64
Parameter: GPU pinned 64
Parameter: GPU pinned 64
Parameter: GPU pinned 64
Parameter: GPU pinned 64
Parameter: GPU pinned 64
Parameter: GPU pinned 64
Parameter: GPU pinned 64
Parameter: GPU pinned 64
Parameter: GPU pinned 64
Parameter: GPU pinned 128
Parameter: GPU pinned 128
Parameter: GPU pinned 128
Parameter: GPU pinned 128
Parameter: GPU pinned 128
Parameter: GPU pinned 128
Parameter: GPU pinned 128
Parameter: GPU pinned 128
Parameter: GPU pinned 128
Parameter: GPU pinned 128
Parameter: GPU pinned 128
Parameter: GPU pinned 128
Parameter: GPU pinned 128
Parameter: GPU pinned 128
Parameter: GPU pinned 128
Parameter: GPU pinned 128
Parameter: GPU pinned 128
Parameter: GPU pinned 128
Parameter: GPU pinned 256
Parameter: GPU pinned 256
Parameter: GPU pinned 256
Parameter: GPU pinned 256
Parameter: GPU pinned 256
Parameter: GPU pinned 256
Parameter: GPU pinned 256
Parameter: GPU pinned 256
Parameter: GPU pinned 256
Parameter: GPU pinned 256
Parameter: GPU pinned 256
Parameter: GPU pinned 256
Parameter: GPU pinned 256
Parameter: GPU pinned 256
Parameter: GPU pinned 256
Parameter: GPU pinned 256
Parameter: GPU pinned 256
Parameter: GPU pinned 256
Parameter: GPU pinned 256
Parameter: GPU pinned 256
Parameter: GPU pinned 256
Parameter: GPU pinned 256
Parameter: GPU pinned 256
Parameter: GPU pinned 256
Parameter: GPU pinned 256
Parameter: GPU pinned 256
Parameter: GPU pinned 512
Parameter: GPU pinned 512
Parameter: GPU pinned 512
Parameter: GPU pinned 512
Parameter: GPU pinned 512
Parameter: GPU pinned 512
Parameter: GPU pinned 512
Parameter: GPU pinned 512
Parameter: GPU pinned 512
Parameter: GPU pinned 512
Parameter: GPU pinned 512
Parameter: GPU pinned 512
Parameter: GPU pinned 512
Parameter: GPU pinned 512
Parameter: GPU pinned 512
Parameter: GPU pinned 512
Parameter: GPU pinned 1024
Parameter: GPU pinned 512
Parameter: GPU pinned 1024
Parameter: GPU pinned 256
Parameter: GPU pinned 256
Parameter: GPU pinned 512
Parameter: GPU pinned 512
Parameter: GPU pinned 1024
Parameter: GPU pinned 128
Parameter: GPU pinned 128
Parameter: GPU pinned 384
Parameter: GPU pinned 384
Parameter: GPU pinned 768
Parameter: GPU pinned 64
Parameter: GPU pinned 64
Parameter: GPU pinned 256
Parameter: GPU pinned 1024 × 512 × 3 × 3
Parameter: GPU pinned 512 × 1024 × 3 × 3
Parameter: GPU pinned 1024 × 512 × 1 × 1
Parameter: GPU pinned 512 × 512 × 3 × 3
Parameter: GPU pinned 512 × 512 × 3 × 3
Parameter: GPU pinned 1024 × 512 × 1 × 1
Parameter: GPU pinned 384 × 384 × 3 × 3
Parameter: GPU pinned 384 × 384 × 3 × 3
Parameter: GPU pinned 768 × 384 × 1 × 1
Parameter: GPU pinned 256 × 256 × 3 × 3
Parameter: GPU pinned 64 × 3 × 7 × 7
Parameter: GPU pinned 64 × 64 × 3 × 3
Parameter: GPU pinned 64 × 64 × 3 × 3
Parameter: GPU pinned 64 × 64 × 3 × 3
Parameter: GPU pinned 64 × 64 × 3 × 3
Parameter: GPU pinned 64 × 64 × 3 × 3
Parameter: GPU pinned 64 × 64 × 3 × 3
Parameter: GPU pinned 128 × 64 × 3 × 3
Parameter: GPU pinned 128 × 128 × 3 × 3
Parameter: GPU pinned 128 × 64 × 1 × 1
Parameter: GPU pinned 128 × 128 × 3 × 3
Parameter: GPU pinned 128 × 128 × 3 × 3
Parameter: GPU pinned 128 × 128 × 3 × 3
Parameter: GPU pinned 128 × 128 × 3 × 3
Parameter: GPU pinned 128 × 128 × 3 × 3
Parameter: GPU pinned 128 × 128 × 3 × 3
Parameter: GPU pinned 256 × 128 × 3 × 3
Parameter: GPU pinned 256 × 256 × 3 × 3
Parameter: GPU pinned 256 × 128 × 1 × 1
Parameter: GPU pinned 256 × 256 × 3 × 3
Parameter: GPU pinned 256 × 256 × 3 × 3
Parameter: GPU pinned 256 × 256 × 3 × 3
Parameter: GPU pinned 256 × 256 × 3 × 3
Parameter: GPU pinned 256 × 256 × 3 × 3
Parameter: GPU pinned 256 × 256 × 3 × 3
Parameter: GPU pinned 256 × 256 × 3 × 3
Parameter: GPU pinned 256 × 256 × 3 × 3
Parameter: GPU pinned 256 × 256 × 3 × 3
Parameter: GPU pinned 256 × 256 × 3 × 3
Parameter: GPU pinned 512 × 256 × 3 × 3
Parameter: GPU pinned 512 × 512 × 3 × 3
Parameter: GPU pinned 512 × 256 × 1 × 1
Parameter: GPU pinned 512 × 512 × 3 × 3
Parameter: GPU pinned 512 × 512 × 3 × 3
Parameter: GPU pinned 512 × 512 × 3 × 3
Parameter: GPU pinned 512 × 512 × 3 × 3
Total size: 5074405187
Considering that I’m using a Tesla T4 (11.17GB of GPU memory) and that 61 images x 4MB each = 244MB, how did it end up storing 10.47GB already? What settings or configuration should I change to fix it?
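A rough back-of-the-envelope sketch for the numbers above. It converts a few of the shapes listed in the dump into memory, assuming every element is a 4-byte float32 (an assumption; the dump does not print dtypes). It also shows why the 4MB file size is not a good guide: a decoded 3 x 2848 x 4272 image is far larger than its compressed file. The helper name `tensor_gib` is made up for illustration.

```python
from math import prod

BYTES_PER_FLOAT32 = 4  # assumption: all tensors are float32
GIB = 1024 ** 3

def tensor_gib(shape, bytes_per_element=BYTES_PER_FLOAT32):
    """Memory of one dense tensor of the given shape, in GiB."""
    return prod(shape) * bytes_per_element / GIB

# Shapes copied from the dump output above.
shapes = [
    (6, 3, 2048, 2048),
    (6, 64, 1024, 1024),
    (6, 64, 512, 512),
    (6, 128, 256, 256),
]
for shape in shapes:
    print(shape, f"{tensor_gib(shape):.4f} GiB")

# One full-resolution image once decoded to float32 (3 channels, 2848 x 4272).
print(f"decoded image: {tensor_gib((3, 2848, 4272)) * 1024:.1f} MiB")

# "Total size" in the dump is an element count, not a byte count.
total_elements = 5074405187
print(f"{total_elements * BYTES_PER_FLOAT32 / GIB:.2f} GiB if every listed element were distinct")
```

Two things stand out under these assumptions. The dump lists 6 x 3 x 2048 x 2048 tensors, which suggests the batch reaching the model is 2048x2048 rather than 256x256, and a single 6 x 64 x 1024 x 1024 activation alone takes 1.5 GiB. The printed total, read as float32, would exceed the card's memory, so some lines probably describe the same storage twice (for example the Tensor and TensorImage lines with identical shapes).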
Thanks
Link to my script: