Personal DL box

More RAM would allow you to handle more data in preprocessing, correct?

Not really - preprocessing a batch takes very little RAM. Shouldn’t ever be a bottleneck.

1 Like

I have a GeForce GTX 1080 (8 GB), and recent NN architectures have brought up the “out of memory” issue for me to deal with.

1 Like

Haha. I got stuck as expected. :sweat_smile: Thanks to @johnnyv for being my remote IT support. Here is the procedure to replicate the fastai environment on a local machine.

  1. In a terminal, in the ~/anaconda3/envs/ directory:
    $ git clone https://github.com/fastai/fastai.git

  2. $ cd fastai/ (the environment.yml file is in this directory)

  3. In the ~/anaconda3/envs/fastai/ directory:
    $ conda env create -f environment.yml
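The three steps above can be wrapped in one small function, if you'd rather run them as a script. This is just a sketch; it assumes git and conda are already on your PATH:

```shell
# Sketch of the three setup steps above as a single function.
setup_fastai_env() {
  git clone https://github.com/fastai/fastai.git &&
  cd fastai &&                          # environment.yml is in the repo root
  conda env create -f environment.yml
}
# then call it from the directory where you keep source checkouts:
# setup_fastai_env
```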

1 Like

What architecture are you running? Just curious … I have been playing around with ResNets and VGG for a while and haven’t faced any such issues.

Regards,

Gokkul

@jeremy When I started running the notebooks, I came across “cannot find module bcolz”. So I installed bcolz individually, but then I got an error message saying bcolz was already installed. I removed everything and reinstalled the whole repo using the zip file download from GitHub instead of git clone, but no improvement. I found others discussing a similar problem on Paperspace a few days ago, but no solution in the forums. Any idea? In the meantime, I will use AWS.

Hi Sarada,

BTW, it’s not necessary to run the git clone command from the ~/anaconda3/envs directory as you describe in step 1. You can run that from anywhere on your hard drive!

I prefer to keep all the source code that I download in one place, under ~/source, so I would have done cd ~/source in step 1 instead.

I’ve found that when I can’t install a package with pip install bcolz, trying conda install bcolz instead usually works, as long as the package is available through conda. Have you tried using conda to install bcolz?

1 Like

Sure, I had no problem with VGG-like networks, but my input images were grayscale and relatively small. When I tried to switch to Xception (the Keras version), which is pretty deep, I needed to decrease the batch size substantially; the same with ResNet, where a smaller batch helped.
But currently I’m running a CycleGAN, and the OOM error causes a more serious problem: since the batch size is already 1, I needed to decrease the training image size, practically halving it, to make it work. The generated images don’t look too bad, but maybe the results would be better if I were running at full size…
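A quick back-of-the-envelope check on why halving the image size frees so much memory: activation memory scales roughly with batch_size × height × width, so halving each image dimension frees about 4×, while halving only the batch size frees 2×. The 256×256 figures below are illustrative, not the actual CycleGAN input size:

```shell
# Rough activation-memory ratio when each image dimension is halved
# (batch size 1, illustrative 256x256 input).
awk 'BEGIN {
  full = 1 * 256 * 256        # batch 1, full-size images
  half = 1 * 128 * 128        # same batch, dimensions halved
  printf "memory ratio: %.0fx\n", full / half
}'
# prints: memory ratio: 4x
```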

Thanks for posting that! One more step:

source activate fastai

You need to run that step every time you log in, or else put it in your .bashrc
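If you go the .bashrc route, a guarded one-liner avoids adding the line twice on re-runs:

```shell
# Append the activation line to ~/.bashrc, but only if it isn't there yet.
grep -qxF 'source activate fastai' ~/.bashrc 2>/dev/null \
  || echo 'source activate fastai' >> ~/.bashrc
```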

2 Likes

Hi all,

For those of you who have been building a new box: I just finished installing the NVIDIA CUDA and cuDNN drivers last night. I had some issues getting the drivers working on my Ubuntu box, but found a very useful tutorial on the OpenCV website. If you are just setting up your desktop build for the first time, I hope these notes are helpful.

If anyone else has any feedback or installation notes, or a better guide, I would be interested in their experiences as well. I know some of my fellow USF master’s students have been re-configuring old computers to use as DL boxes.

Cheers,

Tim

Installation Guide covers:

  • installation of NVIDIA drivers on Ubuntu, specifically CUDA and cuDNN
  • setup of Python environments for deep learning frameworks (ignore this if you want to use conda for package installations)

A couple of caveats:

  1. Know your framework / driver version compatibility: Before you start installing any of the software, note compatibility issues with Torch. On the website, the only links available are for CUDA 7.5 or 8.0, which are older versions. To make Torch run on CUDA 9, you have to clone a repo and install from source (a bit more complicated).
  2. Restart your computer after the drivers are installed: Once CUDA is installed, reboot your machine so the drivers load properly.
  3. Check versions between CUDA and cuDNN: Make sure the cuDNN and CUDA versions match the framework you want to use.
  4. Note Python version / framework compatibility: If you're interested in TensorFlow, make sure your Python version matches (sometimes TF wants 3.5 instead of the current 3.6).
  5. I recommend the .deb installation method: There are two ways of installing the NVIDIA CUDA drivers, the .deb package or the local run file. IMO, the .deb (local) approach is much cleaner and easier to manage.
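For caveats 2 and 3, a few sanity checks are handy after the install. These are guarded so they just print a message if a component is missing, and the cudnn.h path assumes the default /usr/local/cuda install location:

```shell
# Check which CUDA toolkit and driver versions actually got installed.
command -v nvcc >/dev/null && nvcc --version || echo "nvcc not on PATH"
command -v nvidia-smi >/dev/null && nvidia-smi || echo "nvidia-smi not on PATH"
# cuDNN's version is recorded in the header it ships with:
[ -f /usr/local/cuda/include/cudnn.h ] \
  && grep -A 2 CUDNN_MAJOR /usr/local/cuda/include/cudnn.h \
  || echo "cudnn.h not found"
```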

Installing Deep Learning Frameworks on Ubuntu with Cuda Support

https://www.learnopencv.com/installing-deep-learning-frameworks-on-ubuntu-with-cuda-support/

My Rig:
Intel i5 (from 2011)
32 GB RAM
NVIDIA GTX 1080
1 × 500 GB SSD (Ubuntu)
some other HDDs for storage and a Windows boot

6 Likes

Here’s my config: https://pcpartpicker.com/list/DZJLtJ … In addition, I am using a K40 GPU.

I just finished putting together my DL box a couple of days ago. I follow a simple process :wink:

  1. wget and run this file - this installs vim, copies all my dot files, etc. I use it on any new box I am setting up.
  2. wget and run this file - this is a modified script from part 1 v1 that includes everything for part 1 v2 and also installs Keras. It could be useful to someone setting up their own box (I try to maintain it, and it has been tested by a couple of people in the course).

With regards to the hardware, I didn’t expect I would enjoy having a box of my own so much. I have automated AWS provisioning quite a bit, but there is still something extremely nice about being able to just open up your laptop and reconnect to your box without any further ado. This convenience is really valuable: you can leave notebooks open and resume working on them throughout the day.

Having said that, I went for a 1080 Ti, but I’m not sure this was the best choice for me. It is definitely quite pricey, and there is still a long way to go until I can use its full potential. I’m thinking a 1070 might be the sweet spot given how much compute it offers and how affordable it is; at least in Poland, I could have picked one up for 50% of the price of a 1080 Ti.

I mention this because I don’t think you need a top-end card to benefit a lot from having a DL box. Thus far my experience is that once you remove the AWS overhead, it is much easier to experiment, and I think that is what much of my learning will be about, at least for the next couple of weeks. Had I already owned any GPU to speak of, I probably wouldn’t have even considered getting a new one.

Maybe this will change later on, but at least for now this is how I see it, and should you have a GPU with any reasonable amount of RAM, I think you will be in for a real treat working on a setup of your own vs. relying only on AWS :slight_smile:

EDIT: The consensus among more experienced folks seems to be to go for the biggest and greatest GPU you can reasonably afford, so please take my comments with a grain of salt :slight_smile:

6 Likes

I agree with that, but I wouldn’t buy something more expensive than a 1080ti - I don’t think it’s worth it. If you’re lucky enough to have more money than that, buy a 2nd one!

4 Likes

Would you say the RAM is the biggest reason to get the 1080 Ti vs the 1070? I mean, how big is the actual speed difference between those two GPUs?

Yes absolutely the RAM!

Ok, that is what I thought. Thanks for clarifying.

Plus you may have so much fun cooling it :wink:

6 Likes

The 1080 Ti should be about 40% faster than the 1070 for CNN workloads based on the benchmarks I’ve looked into.

The new 1070 Ti was designed to slot in between the 1070 and the 1080 in terms of performance, so we should expect it to be somewhere around 65% of the speed of a 1080 Ti. (All four cards are Pascal-generation parts.)

                         1070   1070 Ti   1080   1080 Ti
Cuda Cores               1920   2432      2560   3584
FP32 (TFLOPS)            6.5    8.1       9.0    11.5
Memory Speed (Gbps)      8      8         11     11
Memory Bandwidth (GB/s)  256    256       352    484
Rel CNN Perf             0.6    0.7       –      1.0
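As a rough cross-check of the “Rel CNN Perf” row, you can normalize the FP32 TFLOPS figures against the 1080 Ti. This is a crude proxy, since real CNN throughput also depends on memory bandwidth:

```shell
# Relative FP32 throughput vs the 1080 Ti (11.5 TFLOPS), from the table above.
for entry in "1070:6.5" "1070 Ti:8.1" "1080:9.0" "1080 Ti:11.5"; do
  card=${entry%:*}
  tflops=${entry#*:}
  awk -v c="$card" -v t="$tflops" 'BEGIN { printf "%-8s %.2f\n", c, t / 11.5 }'
done
```

This gives roughly 0.57, 0.70, 0.78, and 1.00, in line with the relative numbers in the table.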
1 Like

I did not understand your post until I faced this need. Guys, ngrok is amazing if you don’t have a static IP: it allows you to expose a web server running on your local machine to the internet. All you need to do is set up Jupyter notebook security, run the notebook on your DL box, and open the same port with ngrok. You get a public URL to your DL box through the ngrok website. So, to ensure you don’t lose access to your home server:

  • set up your PC to power on automatically (in case something happens)
  • auto-launch jupyter notebook
  • auto-launch ngrok
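A hypothetical sketch of the jupyter and ngrok auto-launch pieces, assuming both are already installed and using port 8888 purely as an example:

```shell
# Launch the notebook in the background and tunnel its port through ngrok.
start_remote_notebook() {
  jupyter notebook --no-browser --port=8888 &   # notebook in the background
  ngrok http 8888                               # prints a public forwarding URL
}
# Set a login password once with `jupyter notebook password` before
# exposing the port, then call:
# start_remote_notebook
```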
6 Likes