How to do an Ubuntu local setup for part1 v2?


#1

Hi, I have my own Linux machine with a GTX 1080. Is there a script somewhere for setting up the libraries required for fastai v2 on a local Ubuntu machine? I’m looking for something similar to this, but for v2 of the course.

I had a look at the paperspace script here: http://files.fast.ai/setup/paperspace
As far as I can tell, this assumes some things are already installed on the Paperspace machines. Is that the case? What other libraries do I need on a fresh Ubuntu 16 install? Do I need PyTorch? What else? What is in the conda env activated by the line “source activate fastai”? Is that an environment already set up on the Paperspace machines?

Paperspace, AWS, etc. have way too much latency to be usable from where I live. I would like some simple instructions for getting set up on my own Linux machine.

Cheers,


#2

Jeremy mentioned that it should work on most fresh installs of Linux, not only on Paperspace.


#3

Thanks, looks like it is indeed working using the paperspace script - I was just missing bcolz


(Cedric Chee) #4

Which version of Ubuntu 16 are you using for your setup? I am about to try the same local setup using the Paperspace script. So, the only problem you encountered was the missing bcolz Python package? If you don’t mind, could you share the updated Paperspace script here for the benefit of others? Thank you.


(Sanyam Bhutani) #5

To install bcolz, you can just run:
pip install -U bcolz


(sergii makarevych) #6

Another option is to use environment.yml or requirements.txt to install most of the libraries.


(Cedric Chee) #7

Yeah, we know how to use pip to install Python packages manually. As we are already using Anaconda, I think it’s better to stick to conda install -y bcolz. I think all these missing commands should be included in the Paperspace script, or in another flavour of the script specifically tested for Ubuntu 16. We should optimize to make things easier and smoother for newcomers choosing this route.


(Ben Eacrett) #8

I posted these steps in another discussion thread:

Everything needed to set up on Ubuntu is covered in the thread on setting up your own DL box (Personal DL box).
Basically:

Install Anaconda
Install NVIDIA drivers (I followed this: http://www.linuxandubuntu.com/home/how-to-install-latest-nvidia-drivers-in-linux -> makes it easy)
Install CUDA and cuDNN (I followed this from the DL box thread: https://www.learnopencv.com/installing-deep-learning-frameworks-on-ubuntu-with-cuda-support/)
Set up fastai, PyTorch, TensorFlow, Keras, etc. as desired

Then follow @sermakarevich’s note above for a fastai environment config (basically, clone or pull the fastai repository and use either environment.yml with conda or requirements.txt with pip to set up a fastai-specific environment).

I did this on a fresh Ubuntu 16.04 LTS install and it all worked smoothly. If there are issues with any of it, check out the ‘building your own deep learning box’ thread, which covers a lot of troubleshooting and alternatives.
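
After following steps like the ones above, one way to sanity-check the result is to ask Python which packages it can find in the active environment. This is only a minimal sketch: the package list is an assumption based on libraries mentioned in this thread (adjust it to match your environment.yml), and missing_packages is a hypothetical helper name, not part of fastai.

```python
import importlib.util

# Assumed list, taken from packages mentioned in this thread.
REQUIRED = ["torch", "bcolz", "ipywidgets"]


def missing_packages(names):
    """Return the names that cannot be found in the active Python environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All listed packages can be found.")
```

Run it with the fastai environment activated; anything it reports as missing can then be installed with conda or pip.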


(Cedric Chee) #9

Thank you for cross-posting that thread here. I was not aware of that thread until now. Perhaps a better title like “Personal Ubuntu DL box setup” for that thread will help in future discovery :)

Aside from that, you are right: pretty much everything needed to set up a personal Ubuntu 16 LTS DL box is covered in that thread.

radek's post caught my attention there.

Personal DL box

wget and run this file - this is a modified script from part1 v1 that includes everything for part1 v2 and also installs keras. This could potentially be useful to a person setting up their own box (I try to maintain it and it has been tested by a couple of people in the course)

I prefer his method as it helps to simplify the setup process using a shell script that we can pull in from some place and run in our DL box. The nice part about this approach is, the script will be maintained and updated (hopefully) as we progress along the course and avoid bit rot. But before that, we have to first install NVIDIA drivers by following your first link “how to install latest NVIDIA drivers in Linux”.

These are the commands that will be run when the shell script is executed in bash terminal:

Another nice thing about this script is that it also automates setting up the fast.ai part 1 v2 specific environment configuration.


(Jeremy Howard) #10

The paperspace script already installs bcolz since it’s in the environment.yml. Just make sure you have activated the environment (done automatically by the script).
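
To check from Python whether a conda environment is active, one option is to read the CONDA_DEFAULT_ENV variable that conda sets on activation. A minimal sketch; the expected name "fastai" comes from the “source activate fastai” line quoted earlier in the thread, and active_conda_env is a hypothetical helper name.

```python
import os


def active_conda_env(environ=os.environ):
    """Return the name of the active conda environment, or None if none is active."""
    return environ.get("CONDA_DEFAULT_ENV")


if __name__ == "__main__":
    env = active_conda_env()
    if env == "fastai":
        print("The fastai environment is active.")
    else:
        print("Active conda environment:", env)
```

If this reports a different environment or None, packages listed in environment.yml, such as bcolz, may appear to be missing even though they were installed.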


#11

Thanks everyone. I am running Ubuntu 16.04.3 with KDE. To set up, I simply installed git and then ran the Paperspace script shown at the start of the first video:

curl http://files.fast.ai/setup/paperspace | bash

I’m not sure why bcolz didn’t install; I do see it in the environment.yml file. I installed it manually via Anaconda and everything seems to be working now.


(Kevin Dewalt) #14

Ditto for me, it went pretty smoothly.

One issue I did encounter was ipywidgets. Fix was a simple upgrade to 7.0 which I describe here:


#15

Hi all,
I’m just curious: will this setup work on Linux Mint 18.1 Serena?
(Mint is based on Ubuntu, so it should?)
Thanks!


(Malcolm McLean) #16

Hello all,

Glad to come back to the course (v2) after a nine month gap - life and cancer happen.

I successfully installed Pytorch etc. using just the paperspace script (Ubuntu 16.04). The script threw a couple of errors which I fixed manually, then restarted the remainder of the script. However, I am too ignorant of Linux to offer a proper diagnosis. In any case, I can now run Lesson 1v2 using the GTX 1070.

One issue is that after suspending/waking Ubuntu, the GPU is no longer available to Pytorch. Google and a forum search offer diverse workarounds. Can anyone recommend a “best” solution?

Thanks in advance,
Malcolm


(Jeremy Howard) #17

Cancer?!? Well I’m really glad to see you back on the deep learning journey! I’ve not seen that GPU sleep problem before - works OK on my laptop…


(Dave Martinez) #19

I have installed the fastai environment using the README instructions. I’m having a problem with this code: every time I run it, my whole machine freezes. I tried lowering the sz variable to 100, with the same result. I don’t really know what sz is for, but I think it is the batch size, so I tried lowering it because I suspect my graphics card’s memory is at its limit. I’m using a GTX 960 with 2 GB of memory.

Also, whenever I run this code, it loads for a moment, the progress bar appears and stays at 0%, and then the freeze starts. I tried googling it; some say it is keras-tqdm, but none of the suggested fixes worked for me. Any help?

arch=resnet34
data = ImageClassifierData.from_paths(PATH, tfms=tfms_from_model(arch, 100))
learn = ConvLearner.pretrained(arch, data, precompute=True)
learn.fit(0.01, 3)

#20

sz refers to the image size, so you would want to leave it at 224 for this pass.
To control the batch size, which defaults to 64, you need to pass the bs argument in that line:

data = ImageClassifierData.from_paths(PATH, bs=32, tfms=tfms_from_model(arch, sz))

Then instead of seeing 360 iterations, you should see 719 or so. Try it out and go smaller. You may need bs=16 with only 2 GB.
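
The iteration counts follow from dividing the number of training images by the batch size and rounding up. A minimal sketch of that arithmetic, assuming a training set of 23,000 images (an assumption, chosen because it is consistent with the 360 and 719 figures above); iterations_per_epoch is a hypothetical helper, not a fastai function.

```python
import math


def iterations_per_epoch(n_train, bs):
    """Number of batches needed to see every training image once."""
    return math.ceil(n_train / bs)


n_train = 23000  # assumed training set size

for bs in (64, 32, 16):
    print(f"bs={bs}: {iterations_per_epoch(n_train, bs)} iterations")
# bs=64: 360 iterations
# bs=32: 719 iterations
# bs=16: 1438 iterations
```

Halving bs roughly doubles the number of iterations while halving the number of images held in GPU memory per step.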


(Malcolm McLean) #21

Hi Jeremy. Thanks for your note. I’m certainly glad to be able to focus more on study.

The cancer was mentioned only in passing, but I’m inspired to say more as it relates to machine learning. Of course, what I had already learned in part of v1 was in the back of my mind during this adventure into the world of medicine and the medical system. The diagnosis and treatment of prostate cancer relies on many standard tests, each of which has distressingly low specificity and low sensitivity. Urologists then use tables, nomograms, and their own experience (and frankly their biases) to make a prognosis and treatment recommendation. And those assessments are all over the map, leaving the patient to play the odds as best he can.

I wondered if there is a way to combine these tests into a measure of higher accuracy. Such would be a great benefit to patients and doctors. Problems: the field is hampered by a lack of outcome measures (and agreement of what those should be), and a lack of “big data”. To be medically legitimate/publishable, any learned function would also have to provide not just a prediction but also a confidence interval, and be able to “justify” its reasoning. Those outputs are not something that ANNs typically provide. Of course, I don’t know much yet about machine learning, but intend to stay awake to any possible applications.

Second, I sat down with my radiologist (rumored to be one of the best in the US), and watched him rapidly and simultaneously look at transverse image slices in four signal channels. Then, with more deliberation, evaluate the level of malignancy and the certainty of his assessment. How does he achieve his high accuracy? “I’ve looked at tens of thousands of scans.” I thought, there must be a better way than looking at four 2D slices, one that would free up an expert’s diagnostic acumen from some of the mechanics of visualization. As a patient, honestly I would be reluctant to ultimately trust a computer’s assessment over that of an acknowledged expert. However, I think tools that automatically discover and visually augment important features could make an expert even better. Furthermore, tumors found on MRI are these days often biopsied and given a histological Gleason score. That provides some concrete data (though the histological scoring itself is highly subjective and variable). I can imagine someday a radiologist taking a “virtual biopsy” of a specific volume in the scan and getting back a predicted Gleason score with confidence interval. Such an assistant could prevent unnecessary biopsies and encourage necessary ones.

I’m just musing on these ideas at this point, and hope to better understand what is feasible as I work through the courses.


(Jeremy Howard) #22

Yes these are much the same observations I’ve had!

I should add however that interpretable ML is much further along than most people understand. In the machine learning course here we talk about it a lot. Hopefully you find it helpful.


(Malcolm McLean) #23

Hi all,

Can you help this Linux newbie with a couple of questions?

  1. I’m in the habit of adding learning experiments and personal annotations into the lesson notebooks. The course also recommends running “git pull” occasionally to update fastai.

Is this command going to overwrite my edited notebooks, or delete any renamed copies added to fastai? But to get to the point, what is a recommended workflow that lets me both keep my notes and have access to the latest notebook version?

  2. I tried to make a shell script that starts nvidia-smi and the Jupyter server. Here’s what I have in MLStart.sh:
    x-terminal-emulator -e "watch nvidia-smi"
    #x-terminal-emulator -e "jupyter notebook"
    jupyter notebook

The watch nvidia-smi line works perfectly. Either of the last two commands however gives “Failed to execute child process “jupyter” (No such file or directory)”.

But when I manually open a Terminal, either command starts the server as desired, and opens a browser window as specified in the jupyter config.

I’m sure this has a solution so simple that Google can’t find it. Thanks for your help. (Ubuntu 16.04 LTS)
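
One possible cause of a “No such file or directory” error like the one above is that the script runs with a PATH that does not include the directory where jupyter is installed, although this has not been confirmed for this setup. A minimal diagnostic sketch: find_executable is a hypothetical helper, and the ~/anaconda3 directories are assumed install locations that may differ on your machine.

```python
import os
import shutil


def find_executable(name, extra_dirs=()):
    """Look for `name` on PATH first, then in each directory in `extra_dirs`."""
    found = shutil.which(name)
    if found:
        return found
    for directory in extra_dirs:
        candidate = shutil.which(name, path=os.path.expanduser(directory))
        if candidate:
            return candidate
    return None


if __name__ == "__main__":
    # Assumed Anaconda locations; change these to match your install.
    guesses = ["~/anaconda3/bin", "~/anaconda3/envs/fastai/bin"]
    print("PATH:", os.environ.get("PATH", ""))
    print("jupyter:", find_executable("jupyter", guesses))
```

Running this from inside the script and again from a manually opened terminal shows whether the two see the same PATH. If jupyter is only found in one of the extra directories, calling it by its full path in the script is one workaround.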