I wonder, is it possible to compute the batch size in advance, taking into account image resolution, the number of GPUs, model size, etc.? I guess it should be possible, since all the parameters are known ahead of time; the only question is how difficult it is to come up with a generic solution.
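To show what I mean, here is a rough sketch of the kind of thing I'm imagining (the function name is made up, and measuring a single-sample forward/backward pass is only a crude approximation, since the bs=1 peak also counts weights and gradients, so it should err on the conservative side):

```python
import torch

def estimate_max_batch_size(model, sample_shape, device="cuda", safety=0.8):
    """Rough upper bound on batch size from measured per-sample memory.

    Runs one forward/backward pass at batch size 1, measures peak CUDA
    memory, and divides the remaining free memory by it. An estimate,
    not a guarantee: fragmentation and non-linear activation growth
    make an exact prediction hard.
    """
    model = model.to(device).train()
    torch.cuda.reset_peak_memory_stats(device)
    x = torch.randn(1, *sample_shape, device=device)
    model(x).sum().backward()  # one forward/backward pass at bs=1
    per_sample = torch.cuda.max_memory_allocated(device)
    model.zero_grad(set_to_none=True)
    free, _total = torch.cuda.mem_get_info(device)
    return max(1, int(free * safety / per_sample))
```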
I know there is an interesting context manager that helps manage GPUs while working in Jupyter. However, as far as I understand, it helps you work with the GPU without kernel restarts. But I guess it should also be possible to pre-compute an optimal batch size from the hardware and data setup, even if some parameters have to be entered manually.
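(I don't remember exactly which one it was, but I picture it as something minimal like the following, which just releases cached GPU memory after a cell fails, so the kernel survives an OOM:)

```python
import gc
from contextlib import contextmanager

import torch

@contextmanager
def gpu_mem_guard():
    """Hypothetical sketch: on exit, even after a CUDA OOM, collect
    dangling references and release cached GPU memory so the Jupyter
    kernel can keep running without a restart."""
    try:
        yield
    finally:
        gc.collect()
        torch.cuda.empty_cache()
```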
Because so far, every once in a while, I run into issues picking the right value and get CUDA out-of-memory errors when, for example, unfreezing the model and continuing training: more parameters become trainable, so gradients and optimizer state grow and the run fails. It's probably not a big deal if you're working in a notebook, but it becomes a problem when writing end-to-end training scripts.
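Right now the only workaround I can think of for scripts is the empirical one, something like the sketch below (try_step is a hypothetical callback that runs a single training step at a given batch size and raises on OOM):

```python
import torch

def find_batch_size(try_step, start=256):
    """Halve the batch size until one forward/backward step fits in memory."""
    bs = start
    while bs >= 1:
        try:
            try_step(bs)
            return bs
        except RuntimeError as e:
            if "out of memory" not in str(e):
                raise  # not an OOM, re-raise
            torch.cuda.empty_cache()  # release the failed allocation
            bs //= 2
    raise RuntimeError("even batch size 1 does not fit")
```

But that wastes a few warm-up steps and feels like a hack, which is why I'm asking whether an up-front computation is feasible.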