GPU usage lag-time

Hi all,

Using the NVIDIA Nsight profiler, I saw that the call to learn.get_preds() has a startup period during which nothing seems to be happening on the GPU. See the screenshot below. Can anyone shed some light on what might be going on during this time?

Note that this trace was taken during a second iteration of the code, meaning the GPU had already been initialized.

The workflow is as follows:

import torch
from fastai.vision.all import load_learner

learn = load_learner(f'{modelname}.pkl', cpu=False)
dl = learn.dls.test_dl(imgs, bs=batch_size, device=torch.device('cuda'))
predictions, ignored, preds_binary = learn.get_preds(dl=dl, with_decoded=True)
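
For reference, here is the same workflow with NVTX ranges around each stage, which should show up as labelled regions in the Nsight timeline and make it easier to see which step the gap belongs to. This is just a sketch; modelname, imgs and batch_size below are placeholder values standing in for my real inputs.

import torch
from fastai.vision.all import load_learner

# Placeholder inputs, for illustration only
modelname, imgs, batch_size = 'export', ['img0.jpg', 'img1.jpg'], 16

torch.cuda.nvtx.range_push('load_learner')
learn = load_learner(f'{modelname}.pkl', cpu=False)
torch.cuda.nvtx.range_pop()

torch.cuda.nvtx.range_push('test_dl')
dl = learn.dls.test_dl(imgs, bs=batch_size, device=torch.device('cuda'))
torch.cuda.nvtx.range_pop()

torch.cuda.nvtx.range_push('get_preds')
predictions, ignored, preds_binary = learn.get_preds(dl=dl, with_decoded=True)
torch.cuda.synchronize()  # ensure queued GPU work is attributed to this range
torch.cuda.nvtx.range_pop()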

Thanks!