Need some advice, guys.
Is the GTX 1080 Ti worth an additional $200 over the GTX 1080? I know it has 3GB more memory, so it should be quicker, since it can potentially process more pictures at once. But when I was training models on a p2.xlarge I never saw memory load above 60% (it's a K80 if I'm not mistaken, so 60% is about 7GB). The limitation was probably hard drive IO. A Patriot Hellfire MLC can do 3000/2400 MB/s, but I can't work through all this math to figure out whether I could actually load all 11GB of the 1080 Ti.
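(For context, I was checking memory load by watching nvidia-smi during training; something like this shows whether memory or compute is the limit:)

    # Poll memory and GPU utilization once a second during a training run
    watch -n 1 nvidia-smi
    # Or log just the relevant numbers:
    nvidia-smi --query-gpu=memory.used,memory.total,utilization.gpu --format=csv -l 1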
And a simpler question: should 8GB of RAM be enough? Or maybe somebody can give feedback on the config?
AWS is a great place to start, so I think it's important to expand on this a bit. Using an online platform is a great way to test the waters and decide whether you want to explore deep learning further. In the long run, it makes more sense to buy a personal machine and change the cost model from pay-per-hour to pay once and explore all you want.
Yes. Get the bigger GPU. I have the 1080 Ti and it is really a great one. I guess I can't speak for the 1080 as I've never had it, but I don't regret spending the extra money in this area at all.
Just to also point out: if you already had a GTX 1080 I probably wouldn't recommend upgrading to the 1080 Ti, but when you're buying one or the other, go for the better hardware.
If I have a personal DL box, how can I replicate the whole AWS fastai environment/library on Linux?
I tried to set up a virtual environment for Py36 and started installing a few dependencies, but I got lots of error messages about other missing dependencies. Should I ignore the error messages and keep installing all the dependencies (see the full list below)? Is there any shortcut for this process? For future maintenance, am I required to run "git pull" and "conda env update" on a regular basis?
My laptop has a GTX 1070, 32GB RAM, and 2 x 1TB SSDs (dual OS - Windows and Linux).
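In case it helps to see the exact steps, this is roughly the workflow I was attempting (the repo URL and env name are my assumptions about the standard setup, not verified):

    # Clone the course library and build its conda environment
    git clone https://github.com/fastai/fastai.git
    cd fastai
    conda env update        # reads environment.yml in the repo
    source activate fastai
    # For maintenance, periodically:
    git pull && conda env update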
I set up an Ubuntu box a couple of weeks ago. I didn't use conda, but I do have everything in a venv. I haven't bothered updating any Python modules after installing them the first time and haven't run into any problems.
What kind of error messages are you getting? Python modules should install their dependencies automatically.
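For what it's worth, my venv setup was basically just the following (the module list here is illustrative, not my exact one):

    python3.6 -m venv ~/venvs/dl
    source ~/venvs/dl/bin/activate
    pip install --upgrade pip
    pip install numpy pandas bcolz   # pip resolves each module's declared dependencies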
I would definitely get the 1080 Ti if I were you. I was being cheap and got the 1070, and I regret it now.
Also, I would get 32GB of RAM if you want to be able to handle larger datasets with bigger batch sizes. I have 16GB and wish I had more. I definitely think 8GB is too little, so get at least 16GB.
What architecture are you running? I am just curious to know … I have been playing around with ResNets and VGG for a while and haven't faced any such issues.
@jeremy When I started running the notebooks, I came across "cannot find module bcolz". So I installed bcolz individually, but then I got an error message saying bcolz was already installed. I removed everything and reinstalled the whole repo using the zip download from GitHub instead of git clone, but no improvement. I found others discussing a similar problem on Paperspace a few days ago, but there was no solution in the forums. Any ideas? In the meantime, I will use AWS.
BTW, it's not necessary to run the git clone command from the ~/anaconda3/envs directory as you describe in step 1. You can run that from anywhere on your hard drive!
I prefer to keep all the source code that I download in one place, under ~/source, so I would have done cd ~/source in step 1 instead.
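So step 1 for me would just be (same clone command as in the guide, different working directory):

    mkdir -p ~/source    # one place for all downloaded code
    cd ~/source          # then run the git clone from the guide as-is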
I've found that sometimes, when pip install bcolz fails, trying conda install bcolz instead does the trick; if the package is available through conda, it usually works after that. Have you tried using conda to install bcolz?
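i.e. something like:

    pip install bcolz     # try pip first
    conda install bcolz   # if pip fails and conda carries the package, this usually works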
Sure, I had no problem with VGG-like architectures, but my input images were grayscale and relatively small. When I tried to switch to Xception (the Keras version), which is pretty deep, I needed to decrease the batch size substantially - same with ResNet, a smaller batch helped.
But currently I'm running a CycleGAN, and the OOM error causes a more serious problem: since the batch size is already 1, I had to decrease the training image size instead - practically halve it - to make it work. The generated images don't look too bad, but maybe the result would be better if I were running at full size…
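To illustrate the two knobs I mean, here is a generic Keras sketch (not my actual CycleGAN code; the numbers are placeholders):

    # Sketch: two ways to dodge OOM on a deep net (generic Keras, illustrative values)
    from keras.applications.xception import Xception

    # Knob 1: batch size - activation memory scales with it,
    # so halving the batch roughly halves the per-step footprint.
    batch_size = 8   # e.g. down from 32

    # Knob 2: input resolution - in deep nets the activations dominate,
    # so halving height/width cuts memory by roughly 4x.
    model = Xception(weights=None, include_top=False, input_shape=(150, 150, 3))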
For those of you who have been building a new box: I just finished installing the NVIDIA CUDA and cuDNN drivers last night. I had some issues getting the drivers working on my Ubuntu box, but found a very useful tutorial on the OpenCV website. If you are just setting up your desktop build for the first time, I hope these notes are helpful.
If anyone else has feedback, installation notes, or a better guide, I would be interested in your experiences as well. I know some of my fellow USF master's students have been re-configuring old computers to use as DL boxes.
Cheers,
Tim
Installation Guide covers:
installation of NVIDIA drivers on Ubuntu, specifically the CUDA and cuDNN components
setup of Python environments for deep learning frameworks (skip this if you want to use conda for package installations)
A couple of caveats:
Know your framework / driver version compatibility: before you start installing any of the software, note the compatibility issues with Torch. From the website, the only links available are for CUDA 7.5 or 8.0, which are older versions. To make Torch run on CUDA 9, you have to clone a repo and install from source (a bit more complicated).
Restart your computer after the drivers are installed: once CUDA is installed, make sure to reboot your machine so that the drivers actually load.
Check versions between CUDA + cuDNN: make sure the cuDNN and CUDA versions are matched correctly with the framework you want to use (see the version-check commands after this list).
Note Python version + framework compatibility: if you're ever interested in TensorFlow, make sure your Python version matches (sometimes TF wants 3.5 instead of the current 3.6).
Recommend the .deb installation method: there are two ways of installing the NVIDIA CUDA drivers - the .deb option and the local runfile option. IMO, the .deb (local) approach is much cleaner and easier to manage.
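If it helps, these are the commands I used to sanity-check what actually got installed (the cudnn.h path assumes a .deb-style install; tarball installs put it under /usr/local/cuda/include instead):

    nvcc --version                                # CUDA toolkit version
    nvidia-smi                                    # driver version
    grep -A 2 CUDNN_MAJOR /usr/include/cudnn.h    # cuDNN version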