Personal DL box

I've found that when pip install bcolz fails, conda install bcolz often works instead, as long as the package is available through conda. Have you tried using conda to install bcolz?
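
For example, something like this (assuming you have already activated the course environment; the || is just shorthand for falling back to conda when pip fails):

    # try pip first, and fall back to conda if pip cannot install or build it
    pip install bcolz || conda install -y bcolz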

1 Like

Sure - I had no problem with a VGG-like network, but my input images were grayscale and relatively small. When I tried to switch to Xception (the Keras version), which is pretty deep, I needed to decrease the batch size substantially; same with ResNet, where a smaller batch helped.
But currently I'm running a CycleGAN, and the OOM error causes a more serious problem: since the batch size is already 1, I needed to decrease the training image size (practically halve it) to make it work. The generated images don't look too bad, but maybe the result would be better if I were running at full size...

Thanks for posting that! One more step:

source activate fastai

You need to do that step every time you log in, or else put it in your .bashrc.
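
For example, a one-time addition (assuming the environment is named fastai and you use the older source activate syntax, as above):

    # append the activation step so every new login shell picks up the environment
    echo "source activate fastai" >> ~/.bashrc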

2 Likes

Hi all,

For those of you who have been building a new box: I just finished installing the NVIDIA CUDA and cuDNN drivers last night. I had some issues getting the drivers working on my Ubuntu box, but found a very useful tutorial on the OpenCV website. If you are just setting up your desktop build for the first time, I hope these notes are helpful.

If anyone else has feedback, installation notes, or a better guide, I would be interested in your experiences as well. I know some of my fellow USF master's students have been re-configuring old computers to use as DL boxes.

Cheers,

Tim

Installation Guide covers:

  • installation of NVIDIA drivers on Ubuntu, specifically the CUDA and cuDNN libraries
  • setup of Python environments for deep learning frameworks (ignore this if you want to use conda for package installations)

A couple of caveats:

  1. Know your framework / driver version compatibility: Before you start installing any of the software, note compatibility issues with Torch. From the Torch website, the only links available are for CUDA 7.5 or 8.0, which are older versions. To make Torch run on CUDA 9, you have to clone a repo and build it yourself (a bit more complicated).
  2. Restart your computer after the drivers are installed: Once CUDA is installed, reboot your machine so the drivers are properly loaded.
  3. Check versions between CUDA + cuDNN: Make sure the cuDNN and CUDA versions match the framework you want to use (see the version-check commands after this list).
  4. Note Python version + framework compatibility: If you are interested in TensorFlow, make sure your Python version matches (sometimes TF wants 3.5 instead of the current 3.6).
  5. Recommend the .deb installation method: There are two ways of installing the NVIDIA CUDA drivers, the .deb package or the local run file. IMO, the .deb (local) approach is much cleaner and easier to manage.
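
To help with caveats 2-4, here is a rough sketch of commands for checking what is actually installed (paths assume the standard /usr/local/cuda location from the .deb install; yours may differ):

    # confirm the driver is loaded and sees the GPU
    nvidia-smi

    # CUDA toolkit version
    nvcc --version

    # cuDNN version from the header (location can vary by install method)
    grep -A 2 "CUDNN_MAJOR" /usr/local/cuda/include/cudnn.h

    # Python side: confirm the interpreter version matches what your framework expects
    python --version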

Installing Deep Learning Frameworks on Ubuntu with Cuda Support

https://www.learnopencv.com/installing-deep-learning-frameworks-on-ubuntu-with-cuda-support/

My Rig:
Intel i5 (from 2011)
32 GB RAM
NVIDIA GTX 1080
1 x 500 GB SSD (Ubuntu)
Some other HDs for storage and a Windows boot

6 Likes

Here's my config: https://pcpartpicker.com/list/DZJLtJ ... In addition, I am using a K40 GPU.

I just finished putting together my DL box a couple of days ago. I follow a simple process :wink:

  1. wget and run this file: it installs vim, copies all my dotfiles, etc. I use it on any new box I am setting up.
  2. wget and run this file: a modified script from part 1 v1 that includes everything for part 1 v2 and also installs Keras. This could be useful to anyone setting up their own box (I try to maintain it, and it has been tested by a couple of people in the course). A generic sketch of the wget-and-run pattern follows below.
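
The general pattern looks roughly like this (setup.sh and the URL are just placeholders; substitute the actual raw link of whichever script above you want):

    # download the script, make it executable, and run it
    wget https://example.com/setup.sh   # placeholder URL, use the real link from above
    chmod +x setup.sh
    ./setup.sh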

With regards to the hardware, I didn't expect I would enjoy having a box of my own so much. I have automated AWS provisioning quite a bit, but there is still something extremely nice about just opening up your laptop and reconnecting to your box without any further ado. This convenience is really valuable, as is being able to keep notebooks open and resume working on them throughout the day.

Having said that, I went for a 1080 Ti, but I'm not sure this was the best choice for me. It is definitely quite pricey, and there is still a long way to go before I can use its full potential. I am thinking a 1070 might be the sweet spot between how much compute it offers and how affordable it is; at least in Poland I could have picked one up for 50% of the price of a 1080 Ti.

I mention this because I don't think you need a top-end card to benefit a lot from having a DL box. Thus far my experience is that once you remove the AWS overhead, it is much easier to experiment, and I think that is what much of my learning will be about, at least for the next couple of weeks. Had I already owned any GPU to speak of, I probably wouldn't even have considered getting a new one.

Maybe this will change later on, but at least for now this is how I see it: if you have a GPU with any reasonable amount of RAM, I think you will be in for a real treat working on a setup of your own vs relying only on AWS :slight_smile:

EDIT: The consensus among more experienced folks seems to be to go for the biggest and greatest GPU you can reasonably afford, so please take my comments with a grain of salt :slight_smile:

6 Likes

I agree with that, but I wouldn't buy anything more expensive than a 1080 Ti; I don't think it's worth it. If you're lucky enough to have more money than that, buy a second one!

4 Likes

Would you say the RAM is the biggest reason to get the 1080 Ti vs the 1070? I mean, how big is the actual speed difference between those two GPUs?

Yes absolutely the RAM!

Ok, that is what I thought. Thanks for clarifying.

Plus you may have so much fun cooling it :wink:

6 Likes

The 1080 Ti should be about 40% faster than the 1070 for CNN workloads, based on the benchmarks I've looked into.

The new 1070 Ti was designed to slot in between the 1070 and 1080 in terms of performance, so we should expect it to be somewhere around 65% of the speed of a 1080 Ti.

                          1070    1070 Ti        1080    1080 Ti
CUDA Cores                1920    2432           2560    3584
FP32 (TFLOPS)             6.5     8.1            9.0     11.5
Memory Speed (Gbps)       8       8              11      11
Memory Bandwidth (GB/s)   256     256            352     484
Rel. CNN Perf             0.6     ~0.65 (est.)   0.7     1.0
1 Like

I did not understand your post until I faced this need myself. Guys, ngrok is amazing if you don't have a static IP: it allows you to expose a web server running on your local machine to the internet. All you need to do is set up Jupyter notebook security, run the notebook server on your DL box, and open the same port with ngrok. You get a public URL to your DL box through ngrok's website. So, to make sure you don't lose access to your home server (rough commands sketched after the list):

  • set up the PC to power on automatically (in case something happens, e.g. after a power outage)
  • auto-launch the Jupyter notebook server
  • auto-launch ngrok
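
A minimal sketch of the manual version (assuming ngrok is installed and the notebook server runs on the default port 8888; adjust to your setup):

    # set a notebook password once (stored under ~/.jupyter)
    jupyter notebook password

    # start the notebook server on the DL box
    jupyter notebook --no-browser --port=8888

    # in another terminal / tmux pane: expose the same port; ngrok prints a public URL
    ngrok http 8888
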
6 Likes

Nice. I've been meaning to set up reverse SSH tunneling. This seems much simpler.

There's also the open-source frp, which some folks on HN recommended. Older HN discussion here.
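
For comparison, the reverse-SSH version needs some always-reachable middle host; relay.example.com below is just a placeholder for whatever VPS or server you control:

    # on the DL box: keep a tunnel open that forwards the relay's port 8888 back home
    ssh -N -R 8888:localhost:8888 user@relay.example.com

    # from your laptop, anywhere: hop through the relay and the tunnel carries you to the box
    ssh -L 8888:localhost:8888 user@relay.example.com
    # then open http://localhost:8888 in the browser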

2 Likes

Fun fact! With an Intel Core i5-7500 (3.4 GHz, 6 MB cache) running the lesson 1 notebook (with data augmentation), the CPU cores are maxed out:

While the GPU, a 1080 Ti, sits nearly idle :slight_smile: It never went over 60°C, and nvidia-smi dmon showed utilization not exceeding 50%.
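
If you want to check where your own box is bottlenecked, this is roughly what I was watching (dmon's column layout can differ a bit between driver versions):

    # per-second GPU stats: power/temperature and SM/memory utilization
    nvidia-smi dmon -s pu

    # per-core CPU load in another terminal (htop needs to be installed)
    htop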

I think this is really useful to know when thinking of setting up your own box. Especially when it comes to random forests (which I completely didn't foresee learning about, but I am super happy the ml1 lectures are being shared!), getting a stronger CPU and more RAM might not be a bad idea if you can afford it.

If I were buying the parts again, I would probably look at Ryzen, would likely skip the bigger HDD (I got a 3 TB one), and would consider getting more RAM.

Oh well - the only way to learn is through experience :slight_smile:

6 Likes

I tried to re-do everything and followed all the installation steps. After I ran a notebook, I got the error message "No module named 'bcolz'". Then I used the command conda install bcolz, but it says bcolz is already installed.

I closed everything and re-launched the terminal. The "No module named 'bcolz'" error message persists. Please help!

If you run the Python shell from the terminal, can you import bcolz? If so, maybe try killing the Jupyter notebook process and starting it again, making sure you do source activate fastai first. That sets up your environment to point to the right Python. (A quick check is sketched below.)
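
Something like this, from the same terminal you normally launch the notebook from:

    source activate fastai
    python -c "import bcolz; print(bcolz.__version__)"

    # if the import works here but the notebook still fails, restart Jupyter from this shell
    jupyter notebook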

Still no luck.

Sorry, I'm out of ideas.

Maybe add the --force flag to conda install bcolz?
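
i.e. something like (newer conda versions renamed the flag to --force-reinstall, in case plain --force complains):

    conda install --force bcolz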