Tabular training is slow and only uses 25% of my GPU

I’ve been reading a bit more and have to come back to correct my previous statement: batch size *does* affect generalisation performance! Check out *Train longer, generalize better: closing the generalization gap in large batch training of neural networks*, and this thread more generally.
I maintain it’s a more advanced, fine-tuning topic, but it felt important to set the record straight!
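One intuition behind that paper’s “train longer” advice: with a fixed number of epochs, a larger batch size means fewer weight updates overall, so the model has had less optimisation time, not just a different noise level. A tiny illustration (the dataset size here is just a made-up number for the example):

```python
# Illustrative only: how batch size changes the number of optimizer
# steps per epoch. Fewer steps per epoch is one reason large-batch
# runs may need to train longer to close the generalisation gap.
def updates_per_epoch(n_samples, batch_size):
    # Ceiling division: a final partial batch still triggers one update.
    return -(-n_samples // batch_size)

n = 50_000  # assumed dataset size, purely for illustration
for bs in (64, 512, 4096):
    print(f"batch_size={bs:>5} -> {updates_per_epoch(n, bs)} updates/epoch")
```

So a batch size of 4096 gives roughly 60× fewer updates per epoch than 64 here, which is why simply cranking up the batch size to saturate the GPU isn’t automatically free.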