Justin Johnson from Stanford has some good hardware benchmarks for how fast different CNNs run.
Sample table from his page:
Happy New Year, everybody!
I am switching from Keras to PyTorch. I apparently installed everything fine, but while trying to run the first lesson's block:
arch=resnet34
data = ImageClassifierData.from_paths(PATH, tfms=tfms_from_model(arch, sz))
learn = ConvLearner.pretrained(arch, data, precompute=True)
learn.fit(0.01, 3)
I get the following error:
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-12-42d5f498bf97> in <module>()
1 arch=resnet34
2 data = ImageClassifierData.from_paths(PATH, tfms=tfms_from_model(arch, sz))
----> 3 learn = ConvLearner.pretrained(arch, data, precompute=True)
4 learn.fit(0.01, 3)
~/fastai/courses/dl1/fastai/conv_learner.py in pretrained(cls, f, data, ps, xtra_fc, xtra_cut, **kwargs)
96 def pretrained(cls, f, data, ps=None, xtra_fc=None, xtra_cut=0, **kwargs):
97 models = ConvnetBuilder(f, data.c, data.is_multi, data.is_reg, ps=ps, xtra_fc=xtra_fc, xtra_cut=xtra_cut)
---> 98 return cls(data, models, **kwargs)
99
100 @property
~/fastai/courses/dl1/fastai/conv_learner.py in __init__(self, data, models, precompute, **kwargs)
89 elif self.metrics is None:
90 self.metrics = [accuracy_multi] if self.data.is_multi else [accuracy]
---> 91 if precompute: self.save_fc1()
92 self.freeze()
93 self.precompute = precompute
~/fastai/courses/dl1/fastai/conv_learner.py in save_fc1(self)
135 m=self.models.top_model
136 if len(self.activations[0])!=len(self.data.trn_ds):
--> 137 predict_to_bcolz(m, self.data.fix_dl, act)
138 if len(self.activations[1])!=len(self.data.val_ds):
139 predict_to_bcolz(m, self.data.val_dl, val_act)
~/fastai/courses/dl1/fastai/model.py in predict_to_bcolz(m, gen, arr, workers)
12 m.eval()
13 for x,*_ in tqdm(gen):
---> 14 y = to_np(m(VV(x)).data)
15 with lock:
16 arr.append(y)
~/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
323 for hook in self._forward_pre_hooks.values():
324 hook(self, input)
--> 325 result = self.forward(*input, **kwargs)
326 for hook in self._forward_hooks.values():
327 hook_result = hook(self, input, result)
~/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/nn/modules/container.py in forward(self, input)
65 def forward(self, input):
66 for module in self._modules.values():
---> 67 input = module(input)
68 return input
69
~/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
323 for hook in self._forward_pre_hooks.values():
324 hook(self, input)
--> 325 result = self.forward(*input, **kwargs)
326 for hook in self._forward_hooks.values():
327 hook_result = hook(self, input, result)
~/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/nn/modules/conv.py in forward(self, input)
275 def forward(self, input):
276 return F.conv2d(input, self.weight, self.bias, self.stride,
--> 277 self.padding, self.dilation, self.groups)
278
279
~/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/nn/functional.py in conv2d(input, weight, bias, stride, padding, dilation, groups)
88 _pair(0), groups, torch.backends.cudnn.benchmark,
89 torch.backends.cudnn.deterministic, torch.backends.cudnn.enabled)
---> 90 return f(input, weight, bias)
91
92
RuntimeError: CUDNN_STATUS_NOT_INITIALIZED
I checked the NVIDIA CUDA toolkit version with nvcc --version and it seems OK:
nvcc: NVIDIA ® Cuda compiler driver
Copyright © 2005-2017 NVIDIA Corporation
Built on Fri_Sep__1_21:08:03_CDT_2017
Cuda compilation tools, release 9.0, V9.0.176
Is there any PyTorch configuration file, like .theanorc, that needs to be adjusted for the GPU?
Thanks for any help!
Sounds like you might not have cudnn installed, or may have the wrong version. Try using the fastai AMI or Paperspace if you want to get up and running quickly.
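A quick way to check is to run a few lines inside the fastai environment (a minimal sketch using standard PyTorch calls; the exact version number will differ per install):
import torch
print(torch.cuda.is_available())       # True if PyTorch can see the GPU
print(torch.backends.cudnn.enabled)    # True if the cuDNN backend is switched on
print(torch.backends.cudnn.version())  # e.g. 7005 when cuDNN is found
If the first line already prints False, the problem is more likely the CUDA install or driver than cuDNN.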
Hi: somehow, with my new Paperspace machine set up using your new script for v2 (curl http://files.fast.ai/setup/paperspace | bash),
I can't seem to get PyTorch to recognize CUDA.
Simple test script:
import torch
import torch.utils.data
from torch import nn, optim
from torch.autograd import Variable
from torch.nn import functional as F
from torchvision import datasets, transforms
from torchvision.utils import save_image
print(torch.cuda.is_available())
exit()
It prints False.
I tried this because the lesson1 resnet portion was training very slow (60s/it).
I wonder how to debug this.
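One way to narrow it down (a sketch; attribute availability varies a bit across PyTorch versions) is to check which build conda actually installed, since a CPU-only build will always report False:
import torch
print(torch.__version__)                     # which PyTorch build is installed in this env
print(getattr(torch.version, "cuda", None))  # CUDA version the build was compiled against, where the attribute exists
print(torch.cuda.is_available())             # False means the build or the NVIDIA driver cannot see the GPU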
I resolved the problem of the GPU not being found. I uninstalled pytorch and torchvision (conda uninstall) and then reinstalled them in the fastai environment. The GPU is now found and training is quite fast.
That’s odd. Did you run the whole script on a fresh machine? It should have installed it into the fastai env for you…
Hi Jeremy, yes, I ran the whole script on a new machine and it did install pytorch etc. into the fastai env. But the installed versions didn't use the GPU, and standalone tests failed with errors when torch tried to use CUDA, as I said in my earlier message. I then created a new env with just torch, and that passed the GPU tests. After that I simply uninstalled and reinstalled torch and torchvision in the fastai env, and now things are good.
Thanks for the tip.
I reinstalled/upgraded CUDA & CUDNN and it fixed the error.
I have the same issue: PyTorch cannot recognize CUDA/the GPU when running the fastai script on a fresh Paperspace machine. I found that the installed CUDA could not see the GPU when I ran the CUDA samples, and reinstalling PyTorch and CUDA still didn't fix it. Then I created another fresh machine and followed these common steps: install CUDA and cuDNN --> reboot --> verify that CUDA works with the GPU by running the CUDA samples --> install Anaconda --> create a new Python environment --> install PyTorch with the conda command --> verify that torch.cuda.is_available() returns True --> pip install fastai and check out fastai from GitHub. After that, the notebook code ran on the GPU. I don't know why; I just ran the lines from Jeremy's script in a different order.
Did you check the versions? My CUDA version was the second latest and it somehow created a problem.
Both should be the latest. Check Nvidia's website to find out the latest versions. Also (as in the Paperspace script), the environment variables have to be set.
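To confirm the environment variables from inside Python (a minimal sketch; PATH and LD_LIBRARY_PATH are the variables the usual CUDA install instructions ask you to extend):
import os
# a CUDA path (e.g. /usr/local/cuda-9.0/...) should normally appear in both of these after setup
for var in ("PATH", "LD_LIBRARY_PATH"):
    print(var, "=", os.environ.get(var, "<not set>"))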
I checked the versions of CUDA. The version of CUDA in the script is 9.0, but the wget command for CUDA 9.0 downloads a suspiciously small file; the installer then still downloads and installs the latest version (9.1) automatically when the CUDA install command runs.
That's weird. CUDA is over 1 GB, I think. To download cuDNN directly from the Nvidia site you need to log in, I think. The script from Jeremy goes directly to a URL where you don't need to log in.
Check whether your setup files are valid.
Are you uninstalling everything before reinstalling?
I am sorry, I made it confusing. I downloaded CUDA from the Nvidia site and the cudnn file from the fastai site, using the same lines as in the script.
I tried to uninstall CUDA with the apt command after finding your tips, but I am not sure whether my way of uninstalling it is correct and complete.
Here is my nvcc --version output:
To check your cuDNN version in a Jupyter notebook you can use the following lines. Mine outputs 7005:
import torch
torch.backends.cudnn.version()
and
print(torch.backends.cudnn.is_acceptable(torch.cuda.FloatTensor(1)))
print(torch.backends.cudnn.version())
returns
True
7005
which means it recognizes cuDNN.
Try uninstalling everything and then do a clean install.
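Beyond checking the version number, you can exercise cuDNN directly with a tiny convolution on the GPU (a minimal sketch; a broken cuDNN setup should fail here with the same CUDNN_STATUS_NOT_INITIALIZED error shown above):
import torch
import torch.nn as nn
from torch.autograd import Variable
# a small conv layer run on the GPU goes through cuDNN, so it fails fast if cuDNN is misconfigured
conv = nn.Conv2d(3, 8, kernel_size=3).cuda()
x = Variable(torch.randn(1, 3, 32, 32).cuda())
y = conv(x)
print(y.size())  # expected: (1, 8, 30, 30)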
Thanks so much for your advice. I will run your code to verify it again. Actually, I have already made torch recognize cuDNN and run the model on the GPU by following the steps I posted.
It returns
True
7003
You want to upgrade it to 7005?