I am trying to train a model, but with 256x256 images I can only feed 8 images at a time (batch size = 8), and I am getting really bad results. As I understand it, a bigger batch size gives less noisy gradient estimates and usually better results.
Is there a way to make the GPU use less memory so that I can increase the batch size?
You can use the GradientAccumulation callback to make a bunch of smaller batches act like a bigger one, i.e. you set your batch size to 1 or 2 but use the callback to update the weights only after, say, 20 batches.
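For reference, here is roughly what the callback does under the hood, as a minimal plain-PyTorch sketch (`model`, `loader`, and `criterion` are placeholders for your own network, dataloader, and loss):

```python
import torch

accum_steps = 8                      # 8 physical batches of 2 -> effective batch of 16
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

optimizer.zero_grad()
for step, (xb, yb) in enumerate(loader):
    loss = criterion(model(xb), yb)
    (loss / accum_steps).backward()  # scale so the accumulated gradient matches one big batch
    if (step + 1) % accum_steps == 0:
        optimizer.step()             # update weights only every accum_steps batches
        optimizer.zero_grad()
```

Memory-wise you only ever hold the activations of the small physical batch, while the optimizer step sees the averaged gradient of the larger effective batch.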
@FraPochetti the problem with this is that the updates still behave like tiny-batch updates (BatchNorm statistics, for instance, are still computed per physical batch), so you end up with long training times before you get any decent results, or the model never generalizes.
@FraPochetti, because you need high-resolution images to get better classification performance in medical diagnostics.
and
In most cases (not all, for example in GANs) using bigger batches is better. But we usually have a memory limitation on our GPU. In this competition we have big images (1400x2100), and if we want to use the original size, then even a P100 only allows small batches (~2 images per batch).
How about breaking the large images up into smaller patches while still retaining full resolution? The patch size would be dependent on what you are trying to classify, but it may be a possible solution.
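Something along these lines (a rough sketch; the file name and patch size are just examples):

```python
import numpy as np
from PIL import Image

Image.MAX_IMAGE_PIXELS = None  # large TIFFs can exceed PIL's default pixel limit

def tile_image(path, patch=512):
    """Split one big image into non-overlapping patch x patch tiles,
    dropping any partial tiles at the right/bottom edges."""
    img = np.asarray(Image.open(path))
    h, w = img.shape[:2]
    return [img[y:y + patch, x:x + patch]
            for y in range(0, h - patch + 1, patch)
            for x in range(0, w - patch + 1, patch)]

# hypothetical usage:
# tiles = tile_image("slide_001.tiff", patch=512)
```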
@amritv thanks for the suggestion, I have done that. The images are between 25 and 60 MB each (TIFF files), and I have broken them down into 32 patches.
But if the patch is 512x512, I can only use a batch size of 8 with 8 patches; with 128x128 patches I can do 32 patches at a batch size of 16. That gives me 0.83 - 0.84,
but I need higher resolution in order to get better generalization.