Need advice for setting up a custom AMI from scratch

wgpubs · January 3, 2017, 1:43am

I was able to configure everything fine using the setup_p2.sh script … but I’d like to see if I can create my own AMI that uses virtualenv and python3 instead of anaconda and python2.

I’m comfortable with python … and totally noob when it comes to AWS EC2 and AMIs.

Anyhow, any advice for setting up an instance identical to the one used in the course? The one exception, it would seem, is a custom “install-gpu.sh” script that I would need to adapt to use/configure python3 and virtualenv, etc…

Thanks - wg
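
For the python3 and virtualenv part of this question, here is a minimal sketch using the standard library's venv module. It stands in for virtualenv and is not what the course's install-gpu.sh does. The helper name create_env and the directory name myenv are hypothetical, and no GPU drivers or packages are installed by it.

```python
import os
import tempfile
import venv


def create_env(path, with_pip=False):
    """Create an isolated Python 3 environment at `path`.

    Returns the path to the environment's own interpreter.
    with_pip=False keeps the sketch fast; pass True to bootstrap pip.
    """
    builder = venv.EnvBuilder(with_pip=with_pip, clear=True)
    builder.create(path)
    # The interpreter lives in Scripts\ on Windows and bin/ elsewhere.
    bin_dir = "Scripts" if os.name == "nt" else "bin"
    exe = "python.exe" if os.name == "nt" else "python"
    return os.path.join(path, bin_dir, exe)


# Hypothetical location: a throwaway directory so the sketch has no side effects.
env_dir = os.path.join(tempfile.mkdtemp(), "myenv")
python_path = create_env(env_dir)
print(python_path)
```

On an EC2 instance you would point the path at a permanent directory and pass with_pip=True, so that packages can then be installed with that environment's own pip.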

rachel · January 3, 2017, 11:15pm

@wgpubs We will be using Python 3 (and TensorFlow) in Part 2 of the course, so if you wait we’ll be sharing a new setup script for it in the future. Why are you interested in using virtualenv instead of anaconda?

wgpubs · January 4, 2017, 12:57am

@rachel … a few reasons:

Familiarity.
I like the ability to install just the packages I need (however I want to install them) for a given environment. It seems like Anaconda is primarily geared towards data science and by default installs a bunch of software that I usually don’t use. I use python for all kinds of things (web applications, simple utility scripts, basic machine learning projects, etc…) and it’s just cleaner, imo, to do a “pip list” and not have to guess what the application is actually using.
I have both 2.7 and 3.5 pythons on my machine and I understand how to use virtualenv to configure the proper python for whatever project I’m working on. Again, Anaconda may support this, but I already know how to do it with virtualenv.
As I already have virtualenv working in my environment, adding yet another python environment seems prone to confusing me (at some point) and the machine.

With all this said, I’m willing to be persuaded to use Anaconda and be corrected on any false assumptions I may have about it. It may even be superior in ways I don’t know, so I’d love to know why folks use it over things like virtualenv. Every single tutorial on the subject of setting up a deep learning environment uses it … so that says something.

Great to hear about the Part 2 environment. I was able to get the AWS instance up and running just fine from the “Getting Started” page and plan to use that for most of this course.

Additionally, with some help on the forums, I was able to configure a python 3.5 based environment on my Windows 10 laptop (with Nvidia 960M graphics card), with Theano and Keras and full GPU support. I’m going to try doing some of the course using this environment and modify whatever .py files as needed.

wgpubs · January 4, 2017, 1:00am

@rachel Btw, any thoughts on the new (I think new) Deep Learning AMIs from Amazon?

https://aws.amazon.com/marketplace/pp/B01M0AXXQB

Highlights:
New in Version 1.5 - CPU Instance Type Support, MXNet built with MKL support.

6 Deep Learning Frameworks - contains the most popular Deep Learning Frameworks (MXNet, Caffe, Tensorflow, Theano, Torch and CNTK) all prebuilt and pre-installed.

Pre-installed components to speed productivity include Nvidia drivers, CUDA, cuDNN, Anaconda, Python2 and Python3.

Description:
The Deep Learning AMI is a supported and maintained Amazon Linux image provided by Amazon Web Services for use on Amazon Elastic Compute Cloud (Amazon EC2). It is designed to provide a stable, secure, and high performance execution environment for deep learning applications running on Amazon EC2. It includes popular deep learning frameworks, including MXNet, Caffe, Tensorflow, Theano, and Torch, as well as packages that enable easy integration with AWS, including launch configuration tools and many popular AWS libraries and tools. It also includes the Anaconda Data Science Platform for Python2 and Python3. Amazon Web Services provides ongoing security and maintenance updates to all instances running the Amazon Linux AMI. The Deep Learning AMI is provided at no additional charge to Amazon EC2 users.

jeremy · January 4, 2017, 2:28am

You can install anaconda without other packages, but I don’t see the point unless you have a really small machine. The packages it has are very helpful.

I really think conda is better than virtualenv and pip in pretty much every respect, and I’m seeing a lot of people move over. But the older tools still work, so as long as you don’t mind handling it yourself (since the course doesn’t use them) feel free.

I’ve not looked at the most recent AMIs, but the previous ones always seemed to be not quite set up correctly… so I guess I’m not in a hurry to use them, now that we’ve got our own AMIs (and simple scripts to repro them).

wgpubs · January 4, 2017, 2:35am

@jeremy Yah I’m noticing the shift myself … especially wrt anything machine learning.

davecg · January 24, 2017, 2:00pm

Conda is actually pretty great!

You can install miniconda instead of anaconda and then select which packages you want to install.

You can manually add them with conda install numpy scipy pandas etc or conda install anaconda to get everything in anaconda. You can also create a YAML file with all of your requirements and create a new environment with conda env create -f environment.yml. You can specify your python version in that file as well.

Example file:

name: myenv

dependencies:
  - python>=3.5   # or 2.7
  - bokeh=0.9.2
  - numpy=1.9.*
  - scipy
  - pandas
  - scikit-image
  - flask
  - pip:
    - some_pip_pkg1
    - some_pip_pkg2

(If you don’t want to create it by hand, you can also manually install once and export your environment.)
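
The create and export steps can also be scripted. The sketch below only builds the command lines and runs one if the tool is actually on the PATH. The helper names conda_commands and run_if_available are hypothetical. The conda env export command is what the "export your environment" step usually refers to, and myenv and environment.yml are taken from the example above.

```python
import shutil
import subprocess


def conda_commands(env_name="myenv", env_file="environment.yml"):
    """Return the conda command lines for creating and exporting an environment."""
    return {
        # Build a new environment from the YAML file.
        "create": ["conda", "env", "create", "-f", env_file],
        # Print an existing environment's packages as YAML (redirect to a file to save).
        "export": ["conda", "env", "export", "-n", env_name],
    }


def run_if_available(cmd):
    """Run `cmd` and return its stdout, or None if the executable is not installed."""
    if shutil.which(cmd[0]) is None:
        return None
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.stdout


commands = conda_commands()
print(" ".join(commands["create"]))
print(" ".join(commands["export"]))
```

Calling run_if_available(commands["export"]) on a machine with conda installed returns the YAML text, which can be written to environment.yml and reused on another machine.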

Switching to a new environment on conda is also straightforward:
source activate myenv (Linux/Mac) or activate myenv (Windows)

Conda environment info