I am training an image-to-image GAN to colorize photos. I am using a free GPU on the Gradient platform, which automatically shuts down after six hours and has 16 GB of memory. My dataset contains more than 100k images. The largest image size I can train on within six hours is one epoch of 320x320 photos. I would like to train on larger photos because I think the model would be more useful at inference. But to train on 384x384 photos, the batch size has to be so small that training takes about eight hours.
What can I do?
What’s your batch size?
Training time is affected much more by image size than by batch size: compute scales roughly with the number of pixels, and 384x384 has about 44% more pixels than 320x320, so moving up will definitely make training take longer.
If your batch size is too small, you could accumulate gradients, i.e. do a gradient update every nth batch rather than every batch. I know the callback exists for v1; not sure about v2.
Another option: $8/month.
I’m paying the $8 a month to store the data on there. To pay for GPU usage would cost way too much by the time I finished training the model, unfortunately.
I’ll look into accumulating gradients. Thanks. That’s a good idea.
Ah cool, I think it isn’t documented on the website yet then
Right here (it’s named a little differently, with no space, so it can be tricky to find)
hahaha, I’m lazy and it didn’t show up in the search bar so I jumped to conclusions