Select GPU while using load_learner

Hello everybody,

I want to do inference on different machines. The training takes place on a machine with several GPUs. To be able to run several experiments at the same time, I assign a specific GPU to each one using torch.cuda.set_device(). After training is finished, I export the model with the export function. At inference time I want to load this model on another machine that has only one GPU by using the load_learner function, but I get the following error message:

Attempting to deserialize object on CUDA device 2 but torch.cuda.device_count() is 1. Please use torch.load with map_location to map your storages to an existing device.

Is it somehow possible to specify the GPU the load_learner function is using or am I doing something wrong?

Since you trained on a machine with several GPUs, the exported file still references a GPU (device 2) that doesn't exist on the inference machine. If you look at the code for load_learner, you will see:

state = torch.load(source, map_location='cpu') if defaults.device == torch.device('cpu') else torch.load(source)

However, the default device is CUDA whenever it is available, so the second branch runs and no map_location is passed. Try adding map_location='cuda:0', which maps the saved tensors onto the first GPU:

torch.load(source, map_location='cuda:0')
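To see what map_location does in isolation, here is a minimal sketch in plain PyTorch. It assumes nothing about fastai: pick_device and load_remapped are hypothetical helper names, and an in-memory buffer stands in for the exported file.

```python
import io

import torch


def pick_device():
    # Use the first GPU if one is visible, otherwise fall back to the CPU.
    return torch.device('cuda:0') if torch.cuda.is_available() else torch.device('cpu')


def load_remapped(source, device):
    # map_location remaps every saved storage onto `device`,
    # no matter which device it lived on when it was saved.
    return torch.load(source, map_location=device)


# Stand-in for an exported file: save a small state dict into memory.
buffer = io.BytesIO()
torch.save({'weight': torch.ones(2, 3)}, buffer)
buffer.seek(0)

state = load_remapped(buffer, pick_device())
print(state['weight'].device)
```

map_location also accepts a dict such as {'cuda:2': 'cuda:0'} if only one specific device should be remapped.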

@Tom2718 Thanks for your reply. I just tried out the solution you proposed, but unfortunately I get an error message.

File "/home/woe2a/PycharmProjects/test/GUI.py", line 81, in __init__
    erg = self.learn.predict(input)
File "/home/woe2a/.local/lib/python3.6/site-packages/fastai/basic_train.py", line 379, in predict
    x = self.data.denorm(x)
File "/home/woe2a/.local/lib/python3.6/site-packages/fastai/vision/data.py", line 61, in denormalize
    return x.cpu().float()*std[...,None,None] + mean[...,None,None] if do_x else x.cpu()
RuntimeError: expected device cpu and dtype Float but got device cuda:0 and dtype Float

The data is normalized with imagenet_stats, so it seems something goes wrong with the mean and std tensors used for denormalization.
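The traceback suggests that x is moved to the CPU while std and mean are still on cuda:0, so the multiplication mixes devices. A minimal sketch of one way around this, assuming the mismatch really comes from those two tensors: denormalize_on_cpu is a hypothetical helper, not fastai's own denormalize, and it simply moves all three tensors to the CPU first.

```python
import torch


def denormalize_on_cpu(x, mean, std):
    # Hypothetical helper, not fastai's denormalize: bring all three tensors
    # to the CPU before combining them, so their devices always match.
    x, mean, std = x.cpu().float(), mean.cpu().float(), std.cpu().float()
    # mean and std hold one value per channel; the two extra trailing axes
    # let them broadcast over the height and width of the image.
    return x * std[..., None, None] + mean[..., None, None]


# ImageNet channel statistics (the values behind imagenet_stats).
mean = torch.tensor([0.485, 0.456, 0.406])
std = torch.tensor([0.229, 0.224, 0.225])

normalized = torch.zeros(3, 4, 4)  # an all-zero normalized image
image = denormalize_on_cpu(normalized, mean, std)
print(image.device, image.shape)
```

Whether the same idea can be applied to the loaded learner depends on where fastai keeps the mean and std after loading, which I have not checked.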

I am having the same issue, but with fastai v2. Is there any way to select a different GPU when loading a .pkl model? Thanks.