I’m a bit surprised to see that the AWS p2.xlarge is quite slow considering its high cost.
I’m currently working on the statefarm sample, and I notice that each epoch takes 26-33s, although the initial output showed each epoch running in 11s.
I’ve checked that I’m really using the GPU with Theano’s test script, after changing cuda.use('gpu1') → cuda.use('gpu0') (as nvidia-smi shows, there’s only one GPU available on a p2.xlarge).
So I’m curious what hardware was used when building the course. Can anyone shed some light on this?
Using gpu device 0: Tesla K80 (CNMeM is enabled with initial size: 75.0% of memory, cuDNN 5103)
/home/ubuntu/anaconda2/lib/python2.7/site-packages/theano/sandbox/cuda/__init__.py:600: UserWarning: Your cuDNN version is more recent than the one Theano officially supports. If you see any problems, try updating Theano or downgrading cuDNN to version 5.
warnings.warn(warn)
That looks good. You can up the memory to 80-90% in the Theano config, but that won’t make a huge difference.
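For reference, that memory fraction is the CNMeM setting from the log above; a minimal `.theanorc` sketch (the 0.85 value is just an illustration, not a recommendation from this thread):

```
[global]
device = gpu0
floatX = float32

[lib]
cnmem = 0.85
```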
I would recommend running GPUTest to see how you score. I don’t know how a K80 would score, but if I had to guess, around 3000-4000. If you score significantly lower, it’s likely something isn’t right.
Can you run lesson 1’s first fit and see what time you get? A p2 instance should give you around 650s per epoch.
Hi,
650s per epoch means that for 5 epochs we’re getting close to an hour… for a single fit after unfreeze?
How much time should the dog breeds experiment take? There are several fits there…
I’m running on a p2, and I can see the GPU is busy with the task when I type nvidia-smi. I haven’t measured timings yet, because I was sure it was supposed to take several minutes, not hours…
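To actually measure it, a quick timing helper can be wrapped around a fit call in the notebook (the `time_call` helper and the stand-in workload below are mine, not from the course notebooks):

```python
import time

def time_call(fn, *args, **kwargs):
    """Run fn(*args, **kwargs) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

# Stand-in workload; in the notebook you would wrap e.g. your fit call instead
_, elapsed = time_call(sum, range(10**6))
print(f"call took {elapsed:.3f}s")
```

Dividing the elapsed time by the number of epochs gives a per-epoch figure you can compare against the ~650s quoted above.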