Making your own server

Thanks for this. After more tweaking without result, I suspect it is my Anaconda Python 3.6 conflicting with CUDA 8 (which officially supports Python >=3.3 and <3.6), but I do not want to downgrade. I also have the latest pygpu backend installed already, and that is not working either; it cannot be imported.

Have you considered dual-booting Win10 and Ubuntu 16.04?
It’s pretty easy to install and will give you access to a wider range of “stable” machine learning tools.
Plus you’ll have to switch to Ubuntu for Part 2 down the road.

Pro-tip: make sure to keep a Win10 installation DVD or USB for the day you decide to uninstall your Ubuntu partition, as you’ll need it to repair/restore your boot manager :wink:

E.


I never got CUDA 8 working with Python 3.6. I totally hear you about not wanting to downgrade (I am also stubborn like this), but in this case it might be easier to just relent and use 3.5 (you can set up the latest Anaconda to use 3.5, no need to downgrade the Anaconda distribution).
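For reference, a minimal sketch of what that looks like with conda (the environment name py35 is just an example):

conda create -n py35 python=3.5 anaconda
source activate py35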

I also agree that, pragmatically, using Linux is probably easiest right now, but again some of us are stubborn and want to promote the use of the OS we use daily. If no one puts the time in to smooth out the bumps, it will never get better. It's also quite a pain to quit all my tasks and leave the rest of my software behind when I reboot into Linux.

A third possibility is SO close, with the WSL Bash on Windows 10. But unfortunately it does not yet support the use of the GPU, so you’d need to be in CPU-only mode when using the WSL. Vote here if you want MS to implement this: https://wpdev.uservoice.com/forums/266908-command-prompt-console-bash-on-ubuntu-on-windo/suggestions/16108045-opencl-cuda-gpu-support
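If you want to experiment inside WSL in the meantime anyway, a minimal sketch of forcing CPU-only mode, assuming the Theano backend the course uses:

THEANO_FLAGS='device=cpu,floatX=float32' jupyter notebook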


I will be setting up the deep learning server on the Costco machine I hinted at above (they have a $250 discount now). It looks like a great machine for $2k (1080 Ti).

I will set it up as dual boot.

Are people still sticking to the Ubuntu 16.04 + Python 3.5 combo?
Any specific recommendations/wisdom before I start configuring?

I will also be experimenting with OpenCV3 on this machine.

https://www.costco.com/CyberpowerPC-SLC3600C-Desktop---Intel-Core-i7---11GB-NVIDIA-GeForce-GTX-1080Ti-Graphics.product.100350563.html

@ai88

  1. I recall having some silly issue with the Ubuntu installation USB key regarding UEFI vs. Legacy BIOS compatibility mode, with it failing to boot for the install. I think I had to go with Legacy BIOS mode in the end, despite my motherboard supporting UEFI & co.

  2. Make sure you also keep a USB stick with Win10 installation files ready; see my post #379 above.
    It sounds cryptic but trust me: “been there, done that” :scream:

  3. Contrary to the Windows ecosystem, your Ubuntu deep-learning setup won't appreciate the "keep all drivers updated" approach. The Nvidia GeForce driver in particular needs to stay in step with the one installed by CUDA, say 375.66 for both. If you update to 381.23 for the GeForce driver only, it may break the setup and you'll get weird error messages when running epochs in Jupyter notebooks (see the quick check after this list).
    “Sometimes better is the enemy of good” :sunglasses:
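A quick way to check what you are currently running (the apt-mark line assumes the driver came from Ubuntu's nvidia-375 package; adjust to your version):

nvidia-smi                         # driver version is shown in the top banner
cat /proc/driver/nvidia/version    # same info from the kernel module
sudo apt-mark hold nvidia-375      # optionally pin the driver so apt won't upgrade it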

A quick update for those using the epoch time of Cell #7 in lesson1.ipynb as a benchmark for their setup (@RogerS49, @stephenl, @kzuiderveld, @brendan).

After upgrading to Python 3.6 and Keras 2.0 for Part #2 (Part #1 used Python 2.7 and Keras 1.2.2), I used @Robi's updated notebooks.
It took just a few minutes to adjust file names here and there, like utils_p3.py, vgg16_p3.py etc. :+1:

My 2015 gaming PC with an i5 4690K + Z97 + 16GB DDR3 + 500GB SSD and an Asus GTX 1080 Ti FE used to run the epoch in 210 sec.

Now the same epoch runs in about 125-126 sec.

Eric
PS: I do not recommend this approach to new students of Part #1, unless you are familiar with debugging code. Best to start with Python 2.7 and Keras 1.2 imho.

@RogerS49, @stephenl, @brendan, @EricPB:

I have a similar hardware setup to Eric's (NVIDIA 1080 Ti, Z800); here's my "benchmark":
Epoch 1/1
360/360 [==============================] - 103s - loss: 0.1239 - acc: 0.9680 - val_loss: 0.0433 - val_acc: 0.9865

My system seems considerably faster than Eric's (103s vs 126s), but the trick is to massage the vgg16.py code and add workers=3 to the parameters of the fit_generator() call. Multithreading will then kick in, leading to increased throughput (see my post at Huge performance improvement with network training!).
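For reference, a minimal sketch of that change, assuming the Keras 2 fit_generator() signature and the Keras 2 DirectoryIterator attributes (samples, batch_size); on Keras 1.x the equivalent argument is nb_worker:

# Sketch of a fit() wrapper like the one in vgg16.py, with worker threads added
def fit(self, batches, val_batches, nb_epoch=1):
    self.model.fit_generator(
        batches,
        steps_per_epoch=batches.samples // batches.batch_size,
        epochs=nb_epoch,
        validation_data=val_batches,
        validation_steps=val_batches.samples // val_batches.batch_size,
        workers=3)  # 3 worker threads keep the GPU fed with batches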

This trick will get even more important with the next generation of NVIDIA Volta cards - data augmentation needs to be multi-threaded so that the GPU will not starve for data.


Very interesting, thanks for the link.

It made me realize that I was potentially throttling the GTX 1080 Ti down to PCIe 8x because I had kept a 750 Ti installed as a backup; after removing it, PCIe usage is down to 30% like yours, vs 70% before (I use Psensor for real-time monitoring).
https://wpitchoune.net/psensor/
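For reference, a quick way to confirm the PCIe link width from a terminal, assuming a reasonably recent Nvidia driver (output layout can vary):

nvidia-smi -q | grep -A 2 "Link Width"   # shows Max vs Current lanes, e.g. 16x vs 8x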

But workers=2 or 3 didn't reduce the epoch time one iota: it still runs at 125-126 sec, even after regenerating batches like you did in one of your notebooks.
So I guess the bottleneck, if there is one, is somewhere else.

E.


Eric and @Robi thanks for notifying us of these changes and performance increases. Great write up by @Robi.

I have a 'web of entanglement' to deal with: I have elected, perhaps wrongly, to use a separate Conda environment for Py3, which I have installed, and it's a slow process of spitting errors all over the place as I build it up. This is more of a project than a quick change-over at my end, I suspect.

Cheers,

Stephen

Up and running on my Acer Predator 15 with Win 10 & GTX 1070 8GB here https://www.acer.com/ac/it/IT/content/predator-model/NH.Q16ET.002 !!!

Many thanks to you all for your patience, and to this Win 10 native install tutorial from Phil Ferriere, updated to May 2017 and tried successfully with Python 3 last night: https://github.com/philferriere/dlwin

My Part 1 Lesson 1 benchmark is 292s with @Robi's script, with GPU clock set to NORMAL, CNMeM at 80% memory and cuDNN enabled.
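For anyone reproducing this on the Theano side, a sketch of the settings those flags correspond to, assuming Theano's old cuda backend (this is ~/.theanorc on Linux; in the dlwin Windows setup the same options go into THEANO_FLAGS):

[global]
device = gpu
floatX = float32

[lib]
cnmem = 0.8

[dnn]
enabled = True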

Off I go! Cheers, Gius


#CUDA 8.0 with cuDNN 6.0.20 on Python 3.6 on Ubuntu 16.04 LTS

Please find below the complete package build-out for Python 3.6 using a GPU, using Anaconda3, assuming a clean install of Ubuntu 16.04 LTS. This is now more of an interactive build-out for a home server, not AWS; I have removed a lot of unneeded pieces just to get you going. I have not spun this up yet on the modified Py3 code, but it throws out all the usual messages expected of a working CUDA GPU build as far as I can see. If anything needs amending, please notify me and I will change it. I didn't have much to say on the Jupyter notebook section; it's quick and simple, no fancy SHA hash keys or certs etc.

###Install Ubuntu 16.04 LTS on your machine with an Nvidia GPU, then:


###Open a terminal and copy and paste the Linux commands below.

###INSTALL OPENSSH SERVER
sudo apt-get install openssh-server
sudo service ssh status
###ctrl-c to get out

###locate your IP address and write it down; you will use it from your PC/Mac to log in to the lab machine remotely.
ifconfig

###log into your lab machine from e.g. a PC, using ssh to the lab machine, i.e. ssh your-remote-name@your-remote-ip-address

e.g. ssh fred@192.168.1.5

###ensure system is updated and has basic build tools as below.

sudo apt-get update
sudo apt-get --assume-yes upgrade

###install Anaconda3 for Python3 for current user (interactive)

cd ~/Downloads
curl -O https://repo.continuum.io/archive/Anaconda3-4.3.1-Linux-x86_64.sh
sha256sum Anaconda3-4.3.1-Linux-x86_64.sh

###Check SHA256 key
https://docs.continuum.io/anaconda/hashes/Anaconda3-4.3.1-Linux-x86_64.sh-hash

###Run the installer and follow the prompts

bash Anaconda3-4.3.1-Linux-x86_64.sh

source ~/.bashrc

sudo apt-get --assume-yes install software-properties-common

###download and install CUDA and GPU drivers

wget "http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_8.0.61-1_amd64.deb" -O "cuda-repo-ubuntu1604_8.0.61-1_amd64.deb"
sudo dpkg -i cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
sudo apt-get update
sudo apt-get -y install cuda
sudo modprobe nvidia
nvidia-smi

###Download cuDNN 6.0 from the Nvidia developer website https://developer.nvidia.com/ (you will need an account; sign up). Obtain the Linux cuDNN 6.0 for CUDA 8.0 archive and place it in your ~/Downloads folder.
###INSTALL cuDNN 6.0

tar -zxf cudnn-8.0-linux-x64-v6.0.tgz
cd cuda
sudo cp lib64/* /usr/local/cuda/lib64/
sudo cp include/* /usr/local/cuda/include/
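###Depending on your setup, you may also need CUDA on your PATH and library path so Theano can find nvcc and the cuDNN libraries; a hedged addition to ~/.bashrc:

echo 'export PATH=/usr/local/cuda/bin:$PATH' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc
source ~/.bashrc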

###reboot system

###Check CUDA


cd /usr/local/cuda/samples/1_Utilities/deviceQuery
sudo make
./deviceQuery

###prints details of your Nvidia GPU card

###configure theano to use GPU

echo "[global]
device = cuda0
floatX = float32
[cuda]
root = /usr/local/cuda" > ~/.theanorc

###Modify keras.json to use Theano by changing "tensorflow" to "theano"
cd ~
nano .keras/keras.json

{
"floatx": "float32",
"epsilon": 1e-07,
"backend": "theano",
"image_data_format": "channels_first"
}

###We had better test this rig

python

Python 3.6.0 |Anaconda custom (64-bit)| (default, Dec 23 2016, 12:22:00)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import keras

Using Theano backend.
WARNING (theano.sandbox.cuda): The cuda backend is deprecated and will be removed in the next release (v0.10). Please switch to the gpuarray backend. You can get more information about how to switch at this URL:
https://github.com/Theano/Theano/wiki/Converting-to-the-new-gpu-back-end(gpuarray)

Using gpu device 0: GeForce GTX 1080 Ti (CNMeM is disabled, cuDNN 6020)
/home/sl/anaconda3/lib/python3.6/site-packages/theano/sandbox/cuda/__init__.py:631: UserWarning: Your cuDNN version is more recent than the one Theano officially supports. If you see any problems, try updating Theano or downgrading cuDNN to version 5.1.
warnings.warn(warn)

ctrl-d to exit python

###configure jupyter notebook, then set a password - and remember it!
jupyter notebook --generate-config

###make a password
python -c "from notebook.auth import passwd; print(passwd())"
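###The command above prints a sha1 hash; a sketch of where it can go in the generated ~/.jupyter/jupyter_notebook_config.py (the value below is just a placeholder):

c.NotebookApp.password = 'sha1:<paste-your-hash-here>'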

###we will start jupyter notebook as follows on the local and remote machines
###Do this from the local machine first (the machine with the GPU running the labs):

jupyter notebook --no-browser --port=8889

###Do this from the remote machine, say a PC or Mac - you will need to modify the name and IP address to suit

ssh -N -f -L localhost:8888:localhost:8889 remotename@remoteipaddress

If you have strange issues, check that the ports are not already in use by another process on your Mac, e.g.

ps aux | grep localhost:8889

###if you have ssh'd to this IP address before (e.g. before a reinstall), you may need to edit the ~/.ssh/known_hosts file on your PC or Mac and erase the old host key entries for that IP address.

###on your pc or mac open a browser and type:

localhost:8888


OK, now you can run the course's Lesson 1 notebooks. Keep in mind this is now Python 3, and the Lesson 1 notebooks were written for Python 2, so they will throw errors due to the differences and will need modifying. There are a few posts on this from @Robi, @EricPB, @kzuiderveld etc.
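For orientation, the changes are mostly mechanical; a few hypothetical examples of the usual Python 2 to Python 3 edits you will hit:

print "shape:", arr.shape          ->  print("shape:", arr.shape)
for k, v in d.iteritems(): ...     ->  for k, v in d.items(): ...
import cPickle as pickle           ->  import pickle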


To be honest, I am just glad I got this working, with the libraries throwing the usual punches. The corrected modifications are above: mostly it was in the keras.json file (to use "image_data_format": "channels_first") and in .theanorc (to use cuda0). I ran @kzuiderveld's threading test; mine is a bit quicker on all tests because of the hardware. I have yet to digest all of this; just getting it working is an achievement. Thanks Karel for your mods. I will dive into this a bit more over the next week or so. On face value I feel I could take on the whole of ImageNet! :slight_smile:


Not sure if there’s any interest, but thought I’d try here before eBay. I just ended up upgrading my own setup and have the core PC components for sale (all you need to add is a graphics card, hard drive and case). The components are less than 2 months old, with probably ~150 hours of use, and everything still works great.

Includes:
Intel i7-7700k
MSI Z270A PRO
Corsair LPX 32GB (2x16GB) DDR4 3000 RAM
EVGA G3 750W PSU
Cooler Master Hyper 212 CPU Cooler

I think total retail came out to $950 after tax and shipping, would be willing to sell for $700 if anyone is interested.

I thought I would share my build, which can be found at https://pcpartpicker.com/b/NW4qqs. In the near future, I hope to post a nice walkthrough tutorial on how to install everything for this course on Windows 10. I just “finished” Part 1 and have now started Part 2.


I have a very similar build, but only one Ti card so far. Also just finishing Part 1.

Hi. Christoffer Björkskog from Finland here.
Btw, thanks to the people arranging it for a great course - the best one on the topic I have seen so far.

I have a p2 instance at the moment, but I am thinking that I could upgrade one of the computers I have at home with a GTX 1050 Ti GPU (4GB), which would mean I don’t need to replace its power supply (1050 Tis are quite cheap now), and the kids could play Minecraft a little more smoothly on it as well :wink: I could then later go for a custom-built server. This would be for learning deep learning and perhaps doing some Kaggle competitions for fun.
According to this site: http://timdettmers.com/2017/04/09/which-gpu-for-deep-learning/ it would be roughly as powerful as a P2 instance and would cost roughly as much as being on the P2 for the two courses.


What do you guys think?

I’ve just built a Ryzen 7 (3 GHz overclocked to 3.6 GHz - very easy), GTX 1080 Ti machine for deep learning.
Installing the software etc. was pretty simple, especially using Jeremy’s install-gpu.sh script (with minor modifications).

Beware - Ryzens don’t have onboard video support, so even though my motherboard (asus p350b plus) has video connectors, they don’t work. I have to run video off the 1080 Ti card.

Also, my mobo doesn’t have built-in wifi, so I need to attach a separate card/dongle. This is a pita.
Also, I bought Vengeance 3000 MHz memory as I read Ryzen can be overclocked beyond this. It doesn’t work; the memory is running at 2133 MHz. If I’d bought stock 2400 MHz memory I’d get that out of the box.

Something I’d not heard about before is GPU coil noise (or coil whine). Look it up; my GPU does it (not too much).

That all said, the 1080 Ti is faster than the P2 instances.
Your time is valuable; you’re buying more time.


Hi - I was wondering if you ever got the Jetson TX2, and if so, what do you think? It sounds interesting, having a dedicated deep learning device, but I wonder if it’s too stripped down for a beginner+ like me.

Question:

I have a 4-year-old PC at home, of which I think only the SSD fits the role in a “machine learning” server.

Could anybody give me advice on which components must be replaced, or whether it is simply better to build a new one from scratch?

Thanks,
Xi

  • Hard disk: Samsung SSD 830 Series - 1TB

  • CPU: Intel Core i3 Sandy Bridge 2105 - 3.1 GHz - 3 MB L3 cache - Socket LGA 1155 (BX80623I32105)

  • Graphics: ASUS GeForce GTX 550 Ti DirectCU - 1 GB GDDR5 - PCI-Express 2.0

  • Case: Cooler Master Elite 334U + 460 W power supply (RC-334U-KKP460)

  • Memory: Corsair Vengeance Performance 2 x 4 GB DDR3-1600

  • Motherboard: GIGABYTE GA-H77-DS3H - Socket 1155 - Chipset H77 – ATX

You should definitely get a better GPU; that one only has 1GB of memory. By comparison, the GTX 1080 Ti has 11GB. Some more RAM also can’t hurt; I’ve seen some people say that as a rule of thumb you should have at least twice as much system RAM as your GPU has memory, although I don’t know if that is well founded.
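By that rule of thumb, an 11GB 1080 Ti would call for at least 22GB of system RAM, so 32GB would be a comfortable choice.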
