How to run Pytorch 1.5 and Fast.ai V2.0 on the Jetson NVIDIA Xavier NX Board:
Version 1.0 – 5/25/20
It’s been a little over a year since I wrote the “How to run Pytorch 1.0 and Fast.ai V1.0 on an NVidia Jetson Nano Board” article. NVidia just (May 2020) announced the NVIDIA Jetson Xavier NX devkit, featuring the NVidia Volta architecture, 8GB of CUDA RAM, 384 NVidia CUDA cores and 48 Tensor cores. It has a bunch of other features too. It sells for $399 (got mine on Amazon), which is four times the price of the Jetson Nano. But since it has 8GB of CUDA RAM, you can train with larger batch sizes. I’m never the first one on my block to buy anything, except in this case. I bought one to try it out.
Here’s how you can also make it run the latest and greatest (as of May 2020) version of pytorch and fast.ai (V2). This install is for Python3 only. This install is NOT recommended if you don’t have much Linux experience, don’t know how to use SSH, or have no idea how IP networking works or what an IP address is.
What You Need:
- A ($399) Jetson NVIDIA Xavier NX Development kit – These can be ordered from many places, I got mine from Amazon.
- A (~$15) Class 10 64GB or larger Micro SD Card. Make sure it’s Class 10 or higher, speed-wise.
- A USB Keyboard – Got a PC? Use that one.
- An HDMI or DisplayPort cable and monitor
- An Ethernet cable and a wireless router or hub on your network. The NX does have native wireless support.
- A PC that you can plug the Micro SD card into to flash it. If you only have USB ports, that’s fine. Spend the extra $10 and buy a USB to Micro SD card adapter .
- Software for your PC that can create an SSH terminal, and software that can transfer files using SSH. For Windows I recommend Tera Term (free) and WinSCP (free). Use google to find where you can download these if you don’t have them already.
- Download this Zip File (xavier_nx_setup.zip) to your PC and remember where you put it. It contains these instructions as a PDF plus the scripts I’ve written. It contains these files:
1_setup_fastai_apt.sh
2_setup_fastai_pip.sh
3_setup_fastai_fastai.sh
4_setup_jupyter.sh
5_setup_course_v4.sh
6_xavier_headless.sh
0_venv.txt
jupyter_notebook_config.py
xavier_nx_setup.pdf
What to do first:
After your shiny new box arrives, go to the NVidia developer website and follow these instructions to get started. Be sure you do all of the following:
- Download the NVIDIA Ubuntu 18.04 Zipped Image for the NX.
- Flash it to the SD card using their instructions. I use balenaEtcher software to do the flashing.
- Put the SD card into the NX, plug in the USB keyboard, monitor and Ethernet cable attached to the router (to complete this process you must have Internet access).
- Boot the machine, accept their license, etc.
- Pick a machine name that works on your network, and pick a user name and password you can remember. You’ll need to know them!
Once it boots up and you’ve verified it’s on your network and the Internet:
- The NX flash card has Ubuntu 18.04 LTS and their version of the Unity desktop. On the top right is a mode dropdown. It defaults to the mode using only 10W with 2 CPU cores active. I set mine to use Mode_15W_6 Cores.
- Go to the Network Settings and find the IP V4 address of your machine and write it down, or if you understand IP networking, set up a fixed IP address.
- Set up the SSH server on the NX: “sudo apt-get install openssh-server”. You may have to do “sudo apt update” followed by “sudo apt upgrade” first.
- Use the PC terminal program to open an SSH session with your NX at the IP address (see step 2).
- Use your file transfer program to transfer the files in xavier_nx_setup.zip to your NX user’s ~/Downloads/xavier_setup_fastai directory.
From either the console or via an SSH connection, set execute permissions on the scripts you’ve just downloaded:
cd ~/Downloads/xavier_setup_fastai
chmod +x *.sh
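Before starting the long installs, it may be worth confirming that every script actually arrived and is executable. The following is only a minimal sketch: the function name check_setup_scripts is made up for illustration, and the file names are simply the ones listed in the zip above.

```shell
# Hypothetical helper (not part of the zip): list any setup script that is
# missing or not executable in the given directory.
check_setup_scripts() {
    dir="${1:-.}"
    missing=0
    for f in 1_setup_fastai_apt.sh 2_setup_fastai_pip.sh 3_setup_fastai_fastai.sh \
             4_setup_jupyter.sh 5_setup_course_v4.sh 6_xavier_headless.sh; do
        if [ ! -x "$dir/$f" ]; then
            echo "not ready: $f"
            missing=$((missing + 1))
        fi
    done
    if [ "$missing" -eq 0 ]; then
        echo "all scripts ready"
    fi
    # Exit status is the number of scripts that are not ready (0 means success).
    return "$missing"
}

# Usage:
#   check_setup_scripts ~/Downloads/xavier_setup_fastai
```

If anything is reported as not ready, re-copy the files or re-run the chmod command before going further.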
Use python venv to create a virtual python3 environment:
- These instructions are in the 0_venv.txt file.
- Go to your home directory:
cd ~
- Create an environments directory:
mkdir envs
- Go to the new envs directory:
cd envs
- Create a virtual environment for fastai2:
python3 -m venv fastai2
- Activate the fastai2 environment:
source ~/envs/fastai2/bin/activate
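Since every later step assumes the environment is active, a quick check can save a wasted overnight run. This sketch relies on the fact that the venv activate script sets the VIRTUAL_ENV variable; the function name venv_active is hypothetical.

```shell
# Hypothetical helper: succeed only when a virtual environment is active.
# The venv "activate" script exports VIRTUAL_ENV, so an empty value means
# the environment has not been sourced in this shell.
venv_active() {
    if [ -z "${VIRTUAL_ENV:-}" ]; then
        echo "no virtual environment active"
        return 1
    fi
    echo "active: $VIRTUAL_ENV"
}

# Usage:
#   venv_active || source ~/envs/fastai2/bin/activate
```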
Install pytorch and fast.ai:
If at this point you want to try the standard fast.ai and pytorch install, go right ahead; it will fail. For a bunch of reasons I’m not going to go into now, the standard pip commands simply won’t work for this.
If you just run the scripts you downloaded in order, you should be up and running by tomorrow. This will take several hours at best, so don’t hold your breath. Each script name starts with a number (for example, 1_setup_fastai_apt.sh); you must run ALL of them IN ORDER. You can try combining them all into one big script if you want, but in case of errors it’s better to just run them one at a time. My advice is to run the first one, check back in a couple of hours, run the next one if it worked and then call it a night. All of the scripts require sudo, so they may stop and ask you for a password. After the 2_setup_fastai_pip.sh script finishes, the rest will go more quickly.
- ./1_setup_fastai_apt.sh (takes a couple of hours)
- ./2_setup_fastai_pip.sh (takes a Loooong time, run overnight)
- ./3_setup_fastai_fastai.sh
- Logout and reboot (very important)
- Login again
- Activate your VENV:
source ~/envs/fastai2/bin/activate
- Go back to the directory where the scripts from xavier_nx_setup.zip are installed
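If you would rather not babysit each step, the run-in-order, stop-on-error idea can be sketched as a small wrapper. This is an illustration only: run_in_order is a made-up name, not something shipped in the zip, and it assumes you start it from the directory that holds the scripts. The scripts may still pause for a sudo password, and the logout and reboot after part 3 still has to be done by hand.

```shell
# Hypothetical wrapper: run each script in the order given and stop at the
# first one that exits with a non-zero status, so later steps never run on
# top of a failed one.
run_in_order() {
    for script in "$@"; do
        echo "running $script"
        if ! "./$script"; then
            echo "FAILED: $script"
            return 1
        fi
    done
    echo "all steps finished"
}

# Usage, from the directory holding the scripts:
#   run_in_order 1_setup_fastai_apt.sh 2_setup_fastai_pip.sh 3_setup_fastai_fastai.sh
```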
Install Jupyter notebook:
After fast.ai is installed, it tells you:
Done with part3 - fastai is now setup, you must logout and login again before doing part4
This is because the Jupyter install doesn’t export the shell variables it needs to run. So shut down all your terminals, SSH sessions etc. and just reboot the NX from the GUI. Once it comes back up, open a terminal from SSH or the GUI and run:
./4_setup_jupyter.sh
This also takes a while, so again, don’t hold your breath. The last step of this script asks for your Jupyter password. This IS NOT your login password; it is a separate password you use to log into Jupyter notebook from any PC on your network, so pick an appropriate password and write it down. The default Jupyter notebook install only lets you log in from the console or GUI; the modified jupyter_notebook_config.py file you downloaded and installed with the script allows you to log in from any machine on your network. To run Jupyter notebook, open a terminal or SSH session, activate your fastai2 environment, and then run Jupyter notebook:
source ~/envs/fastai2/bin/activate
cd ~/fastai2
jupyter notebook
If it doesn’t run, it’s probably because you didn’t log out and in again.
That’s it. You’re done; you can now run pytorch and fast.ai.
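A quick way to confirm the install from the command line is to ask pytorch for its version and whether it can see CUDA. The sketch below is an assumption-laden convenience, not part of the scripts: the function name check_torch is hypothetical, and it only uses torch.__version__ and torch.cuda.is_available().

```shell
# Hypothetical helper: report the installed pytorch version and whether CUDA
# is visible. Pass a different interpreter name as $1 if needed.
check_torch() {
    py="${1:-python3}"
    if ! command -v "$py" >/dev/null 2>&1; then
        echo "python not found: $py"
        return 1
    fi
    "$py" - <<'EOF'
try:
    import torch
except ImportError:
    print("torch is not installed in this environment")
    raise SystemExit(1)
print("torch", torch.__version__, "cuda available:", torch.cuda.is_available())
EOF
}

# Usage (with the fastai2 environment active):
#   check_torch
```

If it reports that torch is not installed, the most likely cause is that the fastai2 environment is not active in that terminal.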
If you want to install Version 4 of Part 1 of the Fast.AI Course:
cd ~/fastai2
git clone https://github.com/fastai/course-v4.git
or run 5_setup_course_v4.sh from the xavier_setup_fastai directory.
A Note about VENV
Whenever you log off, you have to get back into your virtual environment. You can do this with:
source ~/envs/fastai2/bin/activate
Now that you’re in a virtual environment, you can’t just say “python …” to do something or “pip …” to install something; you have to say “python3” or “pip3”, because that specifies which version of python we’re using. But if you’re lazy and forgetful, like me, you can add the alias command below to your .bashrc file, or just type it every time you do a VENV source command:
alias python=python3
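If you go the .bashrc route, it helps to add the line only once, so repeated runs don’t pile up duplicates. A minimal sketch follows; add_line_once is a hypothetical helper name.

```shell
# Hypothetical helper: append a line to a file only if that exact line is not
# already present (grep -x matches whole lines, -F treats it as plain text).
add_line_once() {
    line="$1"
    file="$2"
    touch "$file"
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Usage:
#   add_line_once 'alias python=python3' ~/.bashrc
```

The alias takes effect in new shells, or after you run source ~/.bashrc in the current one.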
Memory isn’t everything, but it’s definitely something:
Back in the old days (of say 2010), 8GB was a lot of memory. Today, if you’re not using the GPU or not training, this is enough to get your notebooks running well (the NX version of Ubuntu 18.04 also has a 4GB virtual swap file, which helps quite a bit). But CUDA doesn’t use swap space, so if you’re using it you need each and every byte of that 8GB.
To get it, it’s time to jettison the GUI and run via a remote console using SSH. Running the 6_xavier_headless.sh script will uninstall the GUI and purge a couple of unnecessary packages that take up over 300MB of RAM. After you run it and reboot, you’ll only have console access to the NX, but the machine will start up using only about 564MB of RAM, leaving you with 7.6GB for pytorch and fast.ai.
- run:
./6_xavier_headless.sh
- reboot and SSH into your NX.
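To see how much memory you actually got back, you can compare the numbers before and after going headless. The sketch below reads the standard Linux /proc/meminfo file, whose values are in kB; the helper name mem_mb is made up for illustration.

```shell
# Hypothetical helper: print one /proc/meminfo field (e.g. MemTotal or
# MemAvailable) converted from kB to whole MB. A different file can be
# passed as $2.
mem_mb() {
    field="$1"
    file="${2:-/proc/meminfo}"
    awk -v key="$field:" '$1 == key { printf "%d\n", $2 / 1024 }' "$file"
}

# Usage:
#   mem_mb MemTotal
#   mem_mb MemAvailable
```

Run it once from the GUI session and once after the headless reboot to compare.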
NVIDIA Utilities you need to install and know about:
- There is a great package called jetson_stats that contains a utility called jtop, an equivalent to nvidia-smi (which doesn’t work on the NX). There is also a jetson_config utility that can purge the desktop for you as well as do other neat things.
- The NX has a built-in command line utility, nvpmodel (must be run with sudo), which is very useful. It lets you set the board modes (number of CPUs, Watts used, fan speed, etc.). I set my processor mode to 6 CPUs using 15W:
sudo nvpmodel -m MODE_15W_6CORE
(if nvpmodel rejects the mode name, it may expect the numeric mode ID instead; sudo nvpmodel -q shows the current mode) and set my fan to max with:
sudo nvpmodel -d cool
See their development guide for complete information on this utility.
Something didn’t work, Trouble-shooting:
I’ve spent 4 solid days doing my best to debug these instructions and scripts, and each time something fails, I correct the instructions or scripts and continue. Sometimes, for whatever reason, one of the pip installs fails, so you have to run it again. If a library is not found once Jupyter notebook is up and running, look for the correct pip3 command in the 2_setup_fastai_pip.sh file.
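Because a failed pip install often succeeds on a second try, a small retry wrapper can take some of the tedium out of re-running commands. This is a sketch under stated assumptions: retry and the RETRY_DELAY variable are hypothetical names, not something used by the setup scripts.

```shell
# Hypothetical helper: run a command up to N times, pausing between attempts.
# RETRY_DELAY (seconds, default 5) is a made-up knob for this sketch.
retry() {
    attempts="$1"
    shift
    n=1
    while ! "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            echo "giving up after $n attempts: $*"
            return 1
        fi
        n=$((n + 1))
        echo "retrying ($n/$attempts): $*"
        sleep "${RETRY_DELAY:-5}"
    done
}

# Usage: copy the failing pip3 line from 2_setup_fastai_pip.sh after the count, e.g.
#   retry 3 pip3 install <package>
```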
So the best I can say is “It works for me.” If it doesn’t work for you, try to figure it out using google, because that’s what I did. I didn’t create any of these libraries or tools, so if something fails I probably can’t help you. So don’t be mad if I don’t reply to trouble-shooting queries. If something doesn’t work AND you fix it, tell me how you did it so I can amend these instructions and scripts.
I’ve also found that the tabular examples hang even with small batch sizes. I think this is because the ZRAM compression (swap file) runs out of memory while loading batches. If someone finds a work-around, let me know. Maybe a fixed swap file will help.
A Note about changes:
As of May 2020, this hacky install method works and installs the latest versions of both pytorch 1.5 and fast.ai 2.0, but things change. In the future you will have to update one or more packages or fast.ai itself. Hopefully some clever soul will figure out how to do that and maybe even build a GIT repo. My work here is done.