Has anyone done any work on compressing the UNet model in memory and/or quickly moving it from system memory to GPU memory?
I'm finding it takes about 1.7 s to move it to system memory with `.to('cpu')` and 0.429 s to move it to GPU memory with `.to('cuda')`.
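One caveat when measuring these numbers: CUDA copies can be asynchronous, so wall-clock timing around `.to(...)` can under- or over-report unless you synchronize first. A minimal sketch of how I'd time the move (the `timed_move` helper is my own name, not a library function):

```python
import time
import torch

def timed_move(model, device):
    """Move `model` to `device` and return the elapsed seconds.

    Synchronizes before and after so pending/async CUDA work
    isn't accidentally included in (or excluded from) the timing.
    """
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    t0 = time.perf_counter()
    model.to(device)
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    return time.perf_counter() - t0
```

On a CUDA box you'd call `timed_move(unet, 'cuda')` and `timed_move(unet, 'cpu')` a few times and ignore the first (warm-up) run.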
Don’t you just do it once? At init time?
I intend to swap different UNets in and out.
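For that swap-in/swap-out pattern, one thing that may help the host-to-device direction is keeping the idle models in pinned (page-locked) CPU memory, which allows `.to('cuda', non_blocking=True)` to use faster async DMA copies. A rough sketch, assuming the models start on the CPU (pinning is skipped when CUDA isn't available, since `pin_memory()` needs a CUDA context; `pin_module` and `swap_in` are names I made up):

```python
import torch

def pin_module(model):
    """Pin a CPU-resident module's parameters and buffers so that
    host-to-device copies can run asynchronously via DMA."""
    if torch.cuda.is_available():
        for p in model.parameters():
            p.data = p.data.pin_memory()
        for b in model.buffers():
            b.data = b.data.pin_memory()
    return model

def swap_in(model, device="cuda"):
    """Move a (preferably pinned) CPU model onto the GPU.
    non_blocking only has an effect when the source is pinned."""
    return model.to(device, non_blocking=True)
```

Note that pinned memory is a limited resource (it can't be paged out), so pinning several full UNets keeps that much RAM locked; worth measuring whether the transfer speedup justifies it for your sizes.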