Ready-to-use cross-platform environment for the course

Hey everyone,

I’ve created a batteries-included Docker image for the course that makes it even easier to get started. It’s available at https://github.com/deeprig/fastai-course-1. I’ve tested it on Windows, Linux, and macOS. To use it:

  1. Install Docker on your machine: https://docs.docker.com/engine/installation/
  2. Run the container: docker run -it -p 8888:8888 deeprig/fastai-course-1

That’s it. All notebooks will be available on http://localhost:8888.
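If you want downloaded datasets or notebook edits to survive container restarts, you can also mount a local directory into the container (the -v flag is standard Docker; the /data path inside the container is just an example location):

# optional: mount a local folder at /data inside the container
docker run -it -p 8888:8888 -v $(pwd)/data:/data deeprig/fastai-course-1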

The container also works with GPUs when used with nvidia-docker. I have preconfigured AMIs with Docker/nvidia-docker ready to go on AWS p2 and t2 instances, and I’m working on a Python package that spins them up using your AWS credentials, so you don’t have to install Docker locally. Let me know if you’re interested in something like that and I can help you set it up.
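For example (nvidia-docker takes the same arguments as docker run):

# on a GPU machine with nvidia-docker installed
nvidia-docker run -it -p 8888:8888 deeprig/fastai-course-1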


You actually don’t need to write that AWS script - Docker has done that for you already; it’s called “Docker Machine”. :)

After you install Docker Machine and set up your credentials, it’s really easy to create a new instance with Docker installed - see the amazonec2 driver.

e.g.

docker-machine create \
  --driver amazonec2 \
  --amazonec2-region us-west-2 \
  --amazonec2-ami $AMI_WITH_NVIDIA_DRIVERS \
  --amazonec2-instance-type p2.xlarge \
  my-awesome-machine-name

You may also want to change some of the other parameters (e.g. root disk size, availability zone, etc.); the full AWS driver documentation lists all the options.

It will set up your VPC and security credentials automatically (default is a “docker-machine” security group whose permissions you can tweak on the AWS console or via command line).
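For example, to inspect the rules in that auto-created group from the command line (this assumes you have the AWS CLI configured with the same credentials):

# show the docker-machine security group and its rules
aws ec2 describe-security-groups --group-names docker-machine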

As far as I can tell, there’s no way to have it automatically associate an Elastic IP, but it will recognize the new IP once you associate one yourself.
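Something like this should work (the instance and allocation IDs are placeholders; regenerate-certs is needed because docker-machine’s TLS certs embed the old IP):

# associate an Elastic IP with the instance (IDs below are placeholders)
aws ec2 associate-address --instance-id i-0123456789abcdef0 --allocation-id eipalloc-12345678
# refresh the machine’s certs so they match the new IP
docker-machine regenerate-certs my-awesome-machine-name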

After creating your instance, accessing it is easy, e.g.

docker-machine ssh my-awesome-machine-name

There are several other useful commands too (start, stop, status, ls, ip, env, regenerate-certs, etc).
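A few of these in action (all standard docker-machine subcommands):

# stop the instance so you stop paying for compute
docker-machine stop my-awesome-machine-name
# boot it back up (note the public IP may change)
docker-machine start my-awesome-machine-name
# point your local docker CLI at the remote daemon
eval $(docker-machine env my-awesome-machine-name)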

In case things get messed up, the SSH keys are stored in ~/.docker/machine/machines/my-awesome-machine-name.

Enjoy!


Good point! Yep, if you know Docker (and it’s a really great tool to know) you can manage everything using docker-machine as David pointed out. I’ll build and post links to AWS AMIs (p2/t2) and add instructions to the Docker repo.

I’ve created public AMIs that include docker and nvidia-docker, and added instructions to the Docker repo README. Pasting from there:

# spin up a p2.xlarge instance
docker-machine create \
  --driver amazonec2 \
  --amazonec2-region='us-west-2' \
  --amazonec2-root-size=50 \
  --amazonec2-ami='ami-e03a8480' \
  --amazonec2-instance-type='p2.xlarge' \
  fastai-p2

# open Jupyter port 8888
aws ec2 authorize-security-group-ingress --group-name docker-machine --port 8888 --protocol tcp --cidr 0.0.0.0/0

# open an SSH shell on the new machine
docker-machine ssh fastai-p2

# (on the remote machine fastai-p2) run Jupyter interactively
nvidia-docker run -it -p 8888:8888 deeprig/fastai-course-1

# (on your local machine) get the IP of the new machine:
docker-machine ip fastai-p2

Open http://$INSTANCE_IP:8888 in your browser to view notebooks.
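Or in one step on a Mac (open is macOS-specific; use xdg-open on most Linux desktops):

# open the notebook server straight from the shell
open http://$(docker-machine ip fastai-p2):8888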

Just wanted to add a couple of things…

  1. Use nvidia-docker instead of docker if you want to use the GPU.

You can get it from https://github.com/NVIDIA/nvidia-docker.

Try nvidia-docker run nvidia/cuda nvidia-smi to check it.

  2. If you’re running a notebook, you probably want to start screen or tmux first so it stays up after you disconnect from your SSH session, e.g.:
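A minimal tmux workflow (the session name fastai is arbitrary):

# on the remote machine: start a named session, then launch Jupyter inside it
tmux new -s fastai
nvidia-docker run -it -p 8888:8888 deeprig/fastai-course-1
# detach with Ctrl-b d; re-attach later with: tmux attach -t fastai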

Thanks so much for doing this @anurag. How does this perform when running locally on a mac? Not sure if I can get by doing that or if I should pay up and use AWS.

You probably won’t be able to use GPU acceleration on your Mac (no NVIDIA card -> no CUDA), so large networks will train slowly. Aside from that, it should work.

You’ll most likely want to use AWS at some point though…

I’m testing it on a 2016 MBP. It gets the job done if you’re just stepping through the notebooks, but CPU training is really slow on a full Kaggle dataset (e.g. Dogs vs. Cats). I think it should be fine for training on samples.

Thanks David. Forgot about nvidia-docker in the README (just updated it).

And good point about using tmux.

Note that you can also run the Docker container in detached mode (docker run -d instead of docker run -it) and the notebook container will keep running even after you log out.
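For example (the --name is just a convenience so you can refer to the container later):

# run detached, with a name for easy management
docker run -d --name fastai -p 8888:8888 deeprig/fastai-course-1
# follow the container’s output
docker logs -f fastai
# stop it when you’re done
docker stop fastai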

@anurag is there a way to install unzip? I’m running the container and ssh’d in (docker exec -it <container-id> /bin/bash), and downloaded dogscats.zip. Unzip isn’t installed, running apt-get install unzip results in “permission denied”, and sudo apt-get install unzip says “sudo: command not found”.

Just pushed a new Dockerfile with unzip. You can pull it once it’s built on Docker Hub in ~15 minutes [1]. I’m also going to update the file sometime today to allow the deeprig user to apt-get install things.

[1] The status is visible here: https://hub.docker.com/r/deeprig/fastai-course-1/builds/
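In the meantime, a workaround is to exec in as root (docker exec’s -u flag sets the user; get the container ID from docker ps):

# open a root shell in the running container
docker exec -u root -it <container-id> /bin/bash
apt-get update && apt-get install -y unzip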

Awesome, thanks.

@anurag have a couple more comments (and I’m happy to contribute to the repo if you agree with my points):

  1. How about just a generic “docker” user?
  2. The .theanorc file has device = gpu. I’m not sure how much of a difference it makes (could be big, could be small), but Jeremy said to change that to cpu if you’re not using a GPU, e.g.:
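A minimal CPU-only config would look like this (floatX = float32 is the usual companion setting in the course scripts; adjust if yours differs):

# ~/.theanorc
[global]
device = cpu
floatX = float32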

Just pushed an updated image with a generic docker user.

On gpu vs. cpu in .theanorc, I haven’t tried it myself. Worth investigating if you get a chance.

Cool!! Thank you for taking the time to build this. I will certainly report back on my experience once I get to try it out.

If anyone wants to use Python 3 and TensorFlow rc12, you can also check out:

I’ve mainly been using a version of this combined with some of the Continuum recipes: https://hub.docker.com/u/continuumio/


Thanks so much, David! Good to learn about docker-anaconda as well as Keras.

I’m not too familiar with Docker, but I’m interested in this idea of anaconda-docker / nvidia-docker containers. Are there performance penalties with this approach as opposed to installing natively?

It adds a bit of overhead (although a lot less than a VM) and an additional layer of complexity, but once a recipe works, it works, which can save a lot of troubleshooting time. You can specify how much memory and how many CPUs your container controls, as well as which ports are exposed (you’ll usually want at least 8888, plus 6006 if you use TensorBoard).
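For example (--cpus needs a fairly recent Docker; older versions use --cpuset-cpus and -m instead):

# cap the container at 4 CPUs and 8 GB of RAM, exposing Jupyter and TensorBoard
docker run -it --cpus=4 --memory=8g -p 8888:8888 -p 6006:6006 deeprig/fastai-course-1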

To use the GPU, there are some additional issues:

  1. You need to install drivers on the host machine that match the image you’re using.
  2. You need to use nvidia-docker instead of docker; it’s a wrapper around docker that lets the container access the host GPU. A quick check for both is shown below.
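Something like this verifies both (the nvidia-smi query flags are standard):

# on the host: the driver must support the CUDA version baked into the image
nvidia-smi --query-gpu=driver_version --format=csv,noheader
# then confirm a container can actually see the GPU
nvidia-docker run --rm nvidia/cuda nvidia-smi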

I’ve been using a modified version of the Keras Dockerfile (on their GitHub repo).

FYI: there’s also a Kaggle one (they use it for their Kernels) which has almost any package you’d need for a competition. I don’t think it uses the GPU, though…


Are people using the image? I see 150+ pulls on Docker Hub. Would love to hear if it’s working out or if you’re running into any issues.