At test time, is torch.no_grad() automatically called when we call learn.model.eval()?
Because in the code below, I can't see any change in run time whether I use with torch.no_grad() or not:
import time

learn.model.eval()  # switch dropout/batchnorm layers to inference behaviour
start = time.process_time()
with torch.no_grad():
    probs_kasre = learn.model(t_x)  # forward pass with autograd disabled
print("\n total time is:\t", time.process_time() - start)
All I see is PyTorch code, so that may be where the confusion lies: learn.model.eval is the same as model.eval from torch, because that's all learn.model is. Does that help some?
There's nothing inherently special about fastai's learn.model.
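To make the distinction concrete, here is a minimal sketch in plain PyTorch (the toy nn.Linear model is just a placeholder, not anything from fastai) showing that eval() and no_grad() are independent: eval() only changes layer behaviour, while no_grad() is what actually disables gradient tracking.

import torch
import torch.nn as nn

model = nn.Linear(10, 2)
x = torch.randn(1, 10)

model.eval()              # eval mode alone does not touch autograd
out = model(x)
print(out.requires_grad)  # True -- gradients are still being tracked

with torch.no_grad():     # only no_grad() turns tracking off
    out = model(x)
print(out.requires_grad)  # False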
I'm not sure about the running time, but here's what I've seen: if I've just set up a model and throw some dummy data into it to check that the output shapes look correct, not using torch.no_grad() keeps the computation graph (the intermediate activations needed for backprop) around. That takes up GPU memory, overloads my GPU, and leads to an out-of-memory error that can't be fixed with torch.cuda.empty_cache(), so you have to restart your notebook kernel. All of this is when the model is not in eval mode.
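A rough sketch of that memory behaviour, assuming a toy nn.Linear model (hypothetical, just for illustration): as long as an output tensor with a grad_fn is referenced, its whole graph stays alive, whereas under no_grad() nothing is retained.

import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Linear(10, 10).to(device)
model.eval()

kept = []
for _ in range(5):
    out = model(torch.randn(64, 10, device=device))
    kept.append(out)   # each out.grad_fn keeps its graph and activations alive
print(kept[0].grad_fn)  # e.g. <AddmmBackward0 ...> -- graph retained despite eval()

with torch.no_grad():
    kept = [model(torch.randn(64, 10, device=device)) for _ in range(5)]
print(kept[0].grad_fn)  # None -- no graphs retained, so memory stays flat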
Is ONNX always a better choice? For inference on CPU, yes, there's a massive difference, but assuming you're running inference on a GPU, is this ranking accurate:
TorchScript > PyTorch > ONNX
Last I checked, onnxruntime doesn't support GPU inference from Python yet, just from C++.
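For the CPU case, a minimal sketch of the export-and-run path, assuming the onnxruntime package is installed; the toy nn.Linear model and the "model.onnx" file name are placeholders, not anything from this thread.

import numpy as np
import torch
import torch.nn as nn
import onnxruntime as ort

model = nn.Linear(10, 2)
model.eval()
dummy = torch.randn(1, 10)

# Export the traced model to an ONNX file
torch.onnx.export(model, dummy, "model.onnx",
                  input_names=["input"], output_names=["output"])

# Run CPU inference through onnxruntime
sess = ort.InferenceSession("model.onnx")
outputs = sess.run(None, {"input": dummy.numpy().astype(np.float32)})
print(outputs[0].shape)  # (1, 2)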