Personal DL box

@radek, you probably already know this, but just in case: if I see that GPU utilization is low and GPU memory use is under, say, 50%, I increase the batch size, which usually makes training noticeably more efficient.
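
For example, here is a rough sketch (my own illustration, assuming a single GPU and that nvidia-smi is on the PATH) of checking utilization and memory from Python before deciding to raise the batch size:

import subprocess

# Query GPU utilization and memory use via nvidia-smi (single-GPU output assumed).
out = subprocess.check_output(
    ["nvidia-smi",
     "--query-gpu=utilization.gpu,memory.used,memory.total",
     "--format=csv,noheader,nounits"]).decode()
util, mem_used, mem_total = (float(x) for x in out.strip().split(","))

# If both utilization and memory use are low, a larger batch size is usually safe to try.
if util < 50 and mem_used / mem_total < 0.5:
    print("GPU looks under-used: try a larger batch size, e.g. doubling bs")

In the course notebooks the batch size is (if I recall correctly) the bs argument to ImageClassifierData.from_paths, so that is the knob to turn.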

About the CPU, I agree that even if it is not the number one bottleneck in DL, it pays to have a good enough one. For a few months this year I had a dedicated server with a Ryzen (8 cores, 16 threads) and was really happy with its performance; it benchmarks well as far as I know.

Lastly, I think CPU RAM is often the bottleneck, because data wrangling requires a lot of it (inversely proportional to how memory-efficient your code is). I have a 32 GB laptop locally and a minimum of 64 GB on rented servers… and I always find myself in situations where I need more. For a personal DL box, which I don't have, I wouldn't go below 128 GB if possible.

Kind of beefy specs, but well, Christmas is near already! :grinning:

The nice thing about RAM is that you can upgrade it after the fact. So as long as you are mindful of wanting 128GB at some point, you can build the system with that in mind and just put in, say, 32GB with the intention of adding the rest later. Just make sure not to use something like 8x4GB sticks if you want to bump it up later, since you would have no free slots and would have to replace sticks rather than add them.


I removed everything and reinstalled it, again and again. Once it was working, I stopped. :sweat_smile:


That's obviously the explanation of the day!!! :rose: :joy:


@beacrett What is your full DL box setup?

Currently:

  • Ryzen 5 1600 (6 cores, can overclock if desired, comes with a cooler)
  • AM4 B350 chipset motherboard (B350 is the middle tier of their chipsets - it is worth it for the small price bump)
  • 16GB DDR4 3200 RAM (fastest supported by my processor, going to get another 16GB)
  • 250GB NVMe M.2 SSD
  • 2TB HDD
  • 1080 Ti (EVGA GeForce GTX 1080 Ti SC Black Edition - very happy with this so far - great cooling)
  • 750 watt modular power supply (would need to get a bigger one if adding a second GPU)
  • Dual boot Windows 10 / Ubuntu 16.04 LTS

IMHO, it's worth getting the fastest RAM supported by your CPU (within reasonable cost). Keep track of the model and its timings - you may need to manually change settings in the BIOS to ensure it runs at full speed, and you want any new RAM you buy to match the speed and timings for optimal performance (try to order the same model to keep it simple).


There is no turning back :slight_smile: Delivered today, "feeling like a little kid" :nerd_face:


Has anybody tried to activate a conda virtualenv and run a Jupyter notebook from within a crontab job? source activate does not work for me, and source bin/activate drops me into the root user and does not activate anything. Nothing useful from the Google forest so far.

I haven't, but maybe you can try running a script as a login shell:

#!/bin/bash -l
# -l starts a login shell so conda's activate script is on the PATH when run from cron
# (call this script from your crontab, e.g. with an @reboot entry)
cd fastai
source activate fastai 2>/dev/null    # do not background this, or the env never gets activated
nohup jupyter-notebook 2>/dev/null &

That has usually solved my "this isn't working in cron" woes in the past.


Thanks @rob, this is what worked for me: link

UPDATE: no, it did not work after all. I can't activate a conda env from cron.


I tried setting up a deep learning machine on Azure for fast.ai and it's working fine.

During the setup, I ran into what seems to be an issue with Jupyter itself.

The issue is that while following the steps in the readme, even after creating and activating the fastai environment, I am not able to find the actual kernel for fastai in Jupyter.

I have browsed around, and many people have faced similar issues with Jupyter and conda.

I resolved it by manually installing the kernel after activating the fastai environment:

python -m ipykernel install --user --name fastai --display-name "Python (fastai)"

Please let me know if anyone else has faced the same issue.


Hello, I used the Paperspace script on a fresh install of Ubuntu 16.04 with a 1070. Everything was OK until I tried to run the "resnet34" training cell. I monitored my system and it uses all the RAM (8 GB) plus 1 GB of swap, at which point the kernel shuts itself down.

But it doesn't use any VRAM at all; I checked with the nvidia-smi tool. I'm guessing it doesn't use the GPU at all? But still, shouldn't 8 GB of RAM be enough? I also added the kernel with the command posted above and switched to that kernel, still no luck.

So what could be the problem?

Adding the notebook and some before-and-after screenshots might help other forum members respond; otherwise it's like shooting an arrow in the dark…
Thanks…

It is the Lesson 1 notebook.

@lymitshn I wouldn't suggest using your own box for learning this course - better to use the fast.ai AMI on AWS, Paperspace, or Crestle. Once you're comfortable with the basic techniques then you can come back to getting your own machine working. You'll know enough at that point to understand how to debug your issues and ask for help in a way that we can be useful to you.


Thank you for the suggestion, but I really want to run on my local machine.
I created a new env, verified that torch uses the GPU, and added 6 GB of swap, and it seemed to work this time. It ran until hitting 12% (slowly…), but it was using only 800 MB of VRAM and 16% GPU power at peak, and after it consumed all the DRAM and swap, the kernel restarted itself.
It can clearly access the GPU but still uses a lot of DRAM. Is this how the model is supposed to work, or is something wrong with my setup?

Hi! I ran into the same issue. The solution was just given on Wiki: Lesson 1. It was not a setup issue (at least in my case); reducing the number of workers was necessary for loading/transforming the data, as this part is done on the CPU/main RAM:
data = ImageClassifierData.from_paths(......, num_workers=1)
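
For reference, here is a rough sketch of how that fits into the rest of the Lesson 1 setup (just an illustration assuming the fastai 0.7 course API; PATH and sz are the notebook's usual data path and image-size variables, so adjust them to your own setup):

from fastai.conv_learner import *  # Lesson 1 style imports (fastai 0.7)

arch = resnet34
tfms = tfms_from_model(arch, sz)  # sz = image size used in the notebook
data = ImageClassifierData.from_paths(PATH, tfms=tfms, num_workers=1)  # fewer workers -> less RAM
learn = ConvLearner.pretrained(arch, data, precompute=True)

The num_workers argument controls how many CPU worker processes do the loading/augmentation, which is where the main-RAM pressure comes from.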


Regarding a personal DL box, I'm seeing some Presidents' Day deals on Windows machines with a dual-GPU setup (2 cards, 1050 Ti or 1070). The big price increase of the 1080 Ti since last November hasn't helped.

I'm weighing the benefit of having 2 cards to run two experiments versus a single faster 1080 Ti. The main use case is being ready for part 2 of this course and for my own learning.

Does anyone have such a setup or know about the pros and cons?
I'm wondering if I will have to do a dual-OS install since the box comes with Windows, and whether I'll be able to access both cards fully.

I have 2x 1080 Ti in a dual-boot, separate-hard-drives setup.

I would recommend a dual-boot setup only if you can install each OS separately so there is no interference. I have posted several links about this.

As far as multiple GPU cards go: if you can afford it, great, but the fastai library will not use both cards for a single training run at this time. However, there are other DL frameworks that can use all available GPUs with no real setup. I don't experiment while another job is training, so for fastai duty one card drives the display and the other trains.
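
If it helps with running things on a specific card, a minimal sketch (the index 1 is just an example; adjust it for your machine) is to restrict a process to one GPU before anything initializes CUDA:

import os

# Must be set before torch (or any other CUDA-using library) initializes the GPU.
os.environ["CUDA_VISIBLE_DEVICES"] = "1"  # "1" = second card; leave "0" for the display/other job

import torch
print(torch.cuda.is_available())   # True if the selected card is usable
print(torch.cuda.device_count())   # should report just 1 visible device

The same variable works on the command line, so the training job and the interactive notebook can each be pinned to their own card.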

Thanks! I did look at your dual-boot link. Right, I've seen people trying to use both cards for training without much success. I'm more interested in the interleaved approach, i.e. training on one card and using the other card for some interactive/lightweight work.

I saw the thread on fastai installation on Windows, but I'm not sure how mature that is. If I am able to get it to work, a single OS would suffice. Still a noob in this respect, I have questions about whether one card has to be reserved for display purposes, and whether Windows gives full access to both GPUs, etc.

While I've got you here, did NVMe drives help with performance significantly? And what RAM did you use? I'm wondering if 32GB is a must.