For those who run their own AI box, or want to

balnazzar · April 27, 2022, 11:00pm

Interesting. What problems, exactly?

Interogativ · April 27, 2022, 11:31pm

I suggest asking (through this forum or others) for course-related download information. I don’t presume to know exactly which versions of the notebooks @jeremy wishes you to download. But after reviewing the video of lesson one again, I was able to see a URL at 1:09:56 which, if you go to the fast.ai GitHub repository, should give you a hint for the URL (using git clone ...) of the notebooks I downloaded. How’s that for answering, but not answering, your question?

mike.moloch · April 27, 2022, 11:44pm

Well, you actually kind of did, so thank you!

Pomo · April 28, 2022, 12:28am

Hi Bart. I am pretty much a Linux ignoramus (and ideally will remain so), and your upgrade instructions worked perfectly for creating a new fastai5 conda environment. Thanks!

A couple of observations:

I did not install jupyter. I think it was installed along with fastbook.
source activate fastai5 does not work for me (bash: activate: No such file or directory)
But conda activate fastai5 does work.
In environment fastai5, cudatoolkit version is 11.3.1. But nvidia-smi shows cuda version 11.6. Is it because I have a system version separate from the conda environment version?
Some of my own Jupyter notebooks no longer create CUDA tensors by default with torch.Tensor(). The tensors are created on the CPU, but can be moved to the GPU explicitly.
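On the version mismatch: nvidia-smi reports the highest CUDA version the installed driver supports, while cudatoolkit inside the conda environment is the CUDA runtime that PyTorch uses, so seeing 11.6 next to 11.3.1 is normal. A minimal sketch for checking what PyTorch itself sees, and for creating a tensor on the GPU explicitly rather than relying on defaults. The function name describe_torch_cuda is made up for illustration, and the code assumes only that torch may or may not be installed:

```python
def describe_torch_cuda():
    """Report what PyTorch sees; returns a dict, or None if torch is missing."""
    try:
        import torch
    except ImportError:
        return None

    info = {
        "torch_version": torch.__version__,
        # CUDA runtime this PyTorch build targets (None for CPU-only builds)
        "built_with_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }
    device = torch.device("cuda" if info["cuda_available"] else "cpu")
    # Create the tensor directly on the chosen device
    x = torch.zeros(2, 3, device=device)
    info["tensor_device"] = x.device.type
    return info


if __name__ == "__main__":
    print(describe_torch_cuda())
```

If built_with_cuda comes back as None, the environment most likely has a CPU-only PyTorch build.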

And a question… I want to trace into fastai and PyTorch with the PyCharm debugger. Do I need to do anything special to allow this?

Thanks for your help!

Malcolm
P.S. Using an ancient GTX 1070.

falmerbid · April 28, 2022, 12:57am

major problem:
1- I cannot import any of the fastai modules in the intro chapter from the fastbook notebooks. (I will try to create a new environment and install again using fastchan to share the error message).

minor problem:
1- it doesn’t install the latest supported version of PyTorch (1.10.2)*; it installs PyTorch 1.9.1.
2- the installation got interrupted several times (possibly my machine was the reason).

That’s what I recall; the experience was 3 weeks ago, using Ubuntu 21.10.

*PyTorch 1.11 is not working with fastai on my machine; the latest version that works fine for me is PyTorch 1.10.2 (Ubuntu 22.04, NVIDIA 510, CUDA 11.6).

Interogativ · April 28, 2022, 2:11am

MiniConda (Anaconda) creates an “environment,” which is like a virtual workspace for your own private instance of Python and all of its packages. I created one with conda create --name fastai5, called, strangely enough, “fastai5.” You may have created one with a different name or not created one at all, or maybe just typed conda activate fastai5 instead of source activate fastai5. The source activate fastai5 command switches you into that environment (on Linux machines), and any Python packages you install there remain in that environment. The great thing about Conda environments is that you can experiment with different configurations by using different environments, and when you’re done you can delete a test environment, packages and all, without affecting the others.
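A quick way to confirm which environment a script or notebook kernel is actually running in, as a minimal sketch. It relies on the CONDA_DEFAULT_ENV variable that conda sets on activation; the function name active_conda_env is just for illustration:

```python
import os
import sys


def active_conda_env():
    """Return (env_name, interpreter_prefix) for the running Python."""
    # conda sets CONDA_DEFAULT_ENV on activation; it is unset outside conda
    name = os.environ.get("CONDA_DEFAULT_ENV")
    # sys.prefix is the directory of the Python installation in use
    return name, sys.prefix


if __name__ == "__main__":
    name, prefix = active_conda_env()
    print(f"conda env: {name!r}, python prefix: {prefix}")
```

Run in a Jupyter cell, this shows whether the kernel really lives in the environment you think it does (fastai5 in this thread).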

When you install Pytorch you have a choice of CUDA or non-CUDA versions. You may have accidentally chosen a non-CUDA version, or not.

I’m not sure of the differences between CUDA 11.X versions. There are a few ways the NVIDIA drivers can be installed, and each method will install its own version of CUDA; some methods check your video card, others let you specify which CUDA version to install.

Tracing into Pytorch is something I’ve tried to do in the past, but I usually just do a divide by zero (errors out like a breakpoint) and then use the standard debugger. I’m sure there are better methods, and you’ll probably get a great answer on how to debug and trace from someone else on the forum.
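One alternative to forcing an error, sketched here assuming Python 3.7 or newer: the built-in breakpoint() pauses execution at that line and opens pdb by default. The function and its debug flag below are hypothetical, purely for illustration:

```python
def scaled_mean(values, scale=1.0, debug=False):
    """Toy computation with an optional debugger stop."""
    total = sum(values)
    if debug:
        # Pauses here in pdb; inspect `total`, then type `c` to continue
        breakpoint()
    return scale * total / len(values)


if __name__ == "__main__":
    print(scaled_mean([1.0, 2.0, 3.0], scale=2.0))
```

Calling scaled_mean(..., debug=True) stops inside the function, and from there `s` steps into whatever call comes next.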

Using Python, Pytorch, Jupyter notebook, Linux, Ubuntu, fastbook, and all the other packages at the same time makes for a very difficult setup if you choose to do it yourself. So don’t be surprised at all when things don’t work. To make matters worse, what works under CUDA version X sometimes doesn’t under CUDA version Y. This is the reason that @jeremy doesn’t cover this and encourages the use of pre-built cloud-based solutions. Your mileage may vary.

balnazzar · April 28, 2022, 7:38am

Better not to manually install pytorch. Let fastai automatically install the pytorch version it likes.
Also, try not to have the installation process interrupted. If that happens, it’s generally not a good omen, even if no error messages are displayed.

wgpubs · April 28, 2022, 4:41pm

imo, the easiest (best maybe) way to ensure you are installing the right pytorch bits comes by using this library: https://pypi.org/project/light-the-torch/.

My usual workflow when running on bare metal is:

Create a conda environment (I use mini conda and mamba)
pip install light-the-torch
Use ltt to install pytorch, etc.
pip/mamba install everything else via an environment.yml file including fastai.

Interogativ · April 28, 2022, 6:57pm

definitely going to add that to my bag of tricks

Pomo · April 28, 2022, 8:07pm

Thanks, Bart. conda activate fastai5 is advertised as a more recent and more reliable version of source activate fastai5. But why source activate fastai5 fails for me is a Linux mystery.

I have used fastai/PyTorch/conda locally since 2017. Just wanted to create a fresh environment to use with the new course.

I will try to get PyCharm’s debugger set up and post here with details.

Malcolm

Michal_w · April 28, 2022, 10:31pm

I tried Ubuntu 22.04 LTS. It didn’t work on the first try, so I rolled back to 20.04. In the end it turned out the wrong NVIDIA drivers had been selected, and they didn’t work on 20.04 LTS either.
Officially there are no CUDA installation binaries for 22.04 on the NVIDIA site.
For now, I think it’s better to stick with 20.04.

FourMoBro · April 29, 2022, 1:51pm

Running these steps for NVIDIA driver installation is working fine for me on both Ubuntu 22.04 and 20.04 in non-Docker situations. (I haven’t really tested Docker for fastai.)

sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update
sudo apt install nvidia-driver-510

The 510 driver works for both my 30 series and 10 series cards.

EDIT: My setup was done before 4/27. Perhaps that is why others are seeing errors?

balnazzar · April 29, 2022, 4:51pm

If you just installed some Ubuntu 22.04 flavor and want to install the NVIDIA drivers, try this:

At the beginning of the installation process, select “install third-party etc etc…”
Do not attempt to install the drivers via the official PPA
Open a terminal and type sudo ubuntu-drivers devices. A list of your GPUs with the available and the recommended drivers will be displayed.
Install the recommended one with sudo ubuntu-drivers install <driver-name>, OR:
Open a terminal and type: sudo software-properties-gtk. This will open the drivers GUI
Select “Additional Drivers”. A list of available drivers will be displayed.
Select the recommended one and apply. The process will take some time since the kernel modules have to be built.

Tried (with success) just a few hours ago with Kubuntu 22.04, on my HTPC.

brismith · April 29, 2022, 8:36pm

The Ubuntu 22.04 WSL in the Microsoft store has worked fine for me with no driver issues on a Surface Book 3 with the built in NVIDIA GTX 1660. Just in case anyone is thinking of going that route.

jeremy · April 30, 2022, 1:05am

Yup I’ve found that WSL works fine with the RTX3050ti in my Surface Studio Laptop with the regular Windows drivers. That’s what I was using to train during the last lesson!

prairieguy · April 30, 2022, 3:14am

Thanks for sharing the NVIDIA repository key update! My Ubuntu 20.04 box with its RTX cards has been working with fastai notebooks as expected. I last updated the drivers about a month ago. I’m sure this would have bitten me the next time I tried an upgrade. I will update my box tomorrow and report back any issues.

@FourMoBro, I appreciate you saving me from what would likely have been a tedious many-houred debugging project!

devforfu · April 30, 2022, 7:59pm

Btw, regarding dev boxes, did anyone try to use some “prebuilt” solutions tailored for Deep Learning practitioners, like HP Z series, or Dell’s Precision towers? Do you have any experience with these? Especially when using them for long training runs, e.g. a few days in a row. I wonder how much power they consume, and how noisy these stations are to keep running at night.

prairieguy · April 30, 2022, 8:06pm

I’m using an Ubuntu 20.04 box running nvidia-driver-470. I wanted to upgrade to nvidia-driver-510 and hit a minor problem, which I want to document here in case others are experiencing anything similar.

I updated the NVIDIA Repository Key as described here.
sudo apt update
sudo apt install nvidia-driver-510

I then got the following error:

The following packages have unmet dependencies:
 nvidia-driver-510 : Depends: libnvidia-extra-510 (= 510.47.03-0ubuntu1) but it is not going to be installed
                     Depends: nvidia-compute-utils-510 (= 510.47.03-0ubuntu1) but it is not going to be installed
                     Recommends: libnvidia-compute-510:i386 (= 510.47.03-0ubuntu1)
                     Recommends: libnvidia-decode-510:i386 (= 510.47.03-0ubuntu1)
                     Recommends: libnvidia-encode-510:i386 (= 510.47.03-0ubuntu1)
E: Unable to correct problems, you have held broken packages.

In reading that others were having trouble with upgrading to nvidia-driver-510 on Ubuntu 20.04, I was a bit concerned. I was able to fix the issue as follows:

sudo apt install nvidia-compute-utils-510
sudo apt install libnvidia-extra-510
sudo apt install nvidia-driver-510

With the first install, nvidia-driver-470 was uninstalled. I rebooted, checked nvidia-smi and all was good.
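For anyone who wants to script that post-reboot check, here is a minimal sketch that calls nvidia-smi from Python and reads back the GPU name and driver version. The function name query_gpus is an assumption for illustration, and it simply returns None when nvidia-smi is missing or fails:

```python
import shutil
import subprocess


def query_gpus():
    """Return [(gpu_name, driver_version), ...], or None if nvidia-smi is unusable."""
    if shutil.which("nvidia-smi") is None:
        return None  # driver utilities not installed or not on PATH
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None  # e.g. kernel module not loaded yet
    gpus = []
    for line in result.stdout.strip().splitlines():
        if "," not in line:
            continue  # skip anything that is not a "name, version" row
        # Split on the last comma: the driver version never contains one
        name, driver = (part.strip() for part in line.rsplit(",", 1))
        gpus.append((name, driver))
    return gpus


if __name__ == "__main__":
    print(query_gpus())
```

After the upgrade described above, the driver version should begin with 510.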

Though I did not try it, I’m guessing that the following would probably work as well:

sudo apt remove nvidia-driver-470
sudo apt install nvidia-driver-510

balnazzar · April 30, 2022, 8:13pm

No direct (hands-on) experience, but I’ve seen the previous generation of both at work.
Noisy. Not unbearably so, but still annoying. The thing is that even with an idling gpu, under load the psu and/or the cpu fans are not quiet. The Dell was a bit noisier than the HP.
Maybe the current gen are more silent, but if you want some silent stuff, I think you will be better served by pre-built solutions by companies like exxact, puget, lambda, bizon, system_76, etc… (liquid cooled options too).

jeremy · April 30, 2022, 10:42pm

I’ve used liquid cooled ones and lambda (I’ve got one at home) and they’re still loud - there’s a lot of heat that needs to be removed somehow, and that takes fans!