From zero to running lesson 1 notebook on AWS instance in 80 seconds

Figuring out the setup can seem quite intimidating when in fact it isn’t. Also, having a nice environment seems to be - at least for me - important to enjoying myself and getting things done.

I wanted to show you how easy things can be (this is the only run-in with AWS I have on a daily basis - I don’t even log in to the AWS console) and to lift your spirits a bit should you be intimidated by some of the info posted here and there. So I produced my first YouTube video ever :slight_smile:

Important: in reality this takes more than 80 seconds. Provisioning the instance and starting the Jupyter notebook take a bit longer, but I edited the video so that it is nicer to watch.

I :heart: Linux: get idea while taking shower - google for software - download kazam and openshot - publish :smiley:

Which window manager are you using? Is that where the nice informational bar at the bottom comes from? Thanks!

Yes :slight_smile: I use i3 - it is amazing IMHO :slight_smile:

It reminded me of ratpoison.

How much storage do you provision for your EC2 instances? I tried your setup script on p2.xlarge and ran into disk space issues (with 50 GB storage). The only other thing I had set up myself was anaconda, so I am not sure what caused the disk space increase.

IIRC p2 has 20 GB SSD and I think after install it was 65% full - not sure if it was after deleting the installation files or before
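
If you want to see where the space is going, a rough sketch like the one below may help. It is not part of the setup script: the function names `disk_used_percent` and `largest_entries` are made up for illustration, and it relies only on the standard `df`, `du`, `sort`, `tail` and `awk` tools.

```shell
#!/bin/sh
# Hypothetical helper (not from the setup script): print how full the
# filesystem containing a given path is, as a whole-number percentage.
disk_used_percent() {
    # -P forces the portable one-line-per-filesystem format, so the
    # capacity value is always the 5th field of the 2nd line.
    df -P "${1:-/}" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

# Hypothetical helper: list the five largest top-level entries (sizes in KB)
# inside a directory, biggest last.
largest_entries() {
    du -sk "$1"/* 2>/dev/null | sort -n | tail -n 5
}

echo "Root filesystem is $(disk_used_percent /)% full"
```

Running `largest_entries "$HOME"` on the instance should then show which directories (datasets, anaconda, leftover installers) are taking up the most room.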

So when I run as is I get:

AMI with tag:Name set to main-compute-instance not found.

So … I’m assuming that before we can run the script we have to have already done the following:

  1. Create a p2.xlarge instance named “main-compute-instance”
  2. Create a workspace volume named “main-compute-instance”
  3. Create a network interface named “main-compute-instance”

Thanks

Yes, you are right! But that is only necessary if you are going for an instance with persistent storage and persistent public IP address. This allows me to have a bookmark in my browser that takes me directly to the jupyter notebook without any shenanigans of looking up whatever public IP address the instance I just created was assigned :slight_smile:

Alternatively, if you are only interested in starting and connecting to a spot instance from your terminal (without any of the persistent goodies), you can go the ./request-spot-instance route.

Cool.

Yah, I like what you are doing in there where you attach the spot instance to your volume. If I understand things correctly, that means we won’t have to constantly download code from git or rebuild the datasets.

Anything we need to do when creating the p2 instance on Amazon besides naming it “main-compute-instance”?

When configuring the p2 …

I’m assuming we have to use the VPC (e.g. main-env) that was created in ./create-env.sh AND also select the “main-env-security-group” that was created as well?

Running ./request-main-compute-instance will spin up a spot instance of your choice using an AMI named main-compute-instance. The way I go about this is I first build an AMI that I like for a specific purpose (for example, I used to have an AMI for part1 v1) and I install everything that I need - vim, my dot-files, etc. When I am done, I create an AMI - remembering or sometimes unfortunately forgetting :wink: - to shred the keys as outlined in the howto, step #7. All I then have to do is slap the name main-compute-instance on the AMI and it will get booted up.

If I ever want to update anything or make any changes, I just create a new AMI in the console and move the name across.

Apart from naming the AMI main-compute-instance there is nothing you need to do for the script to pick it up and for it to work with this setup, assuming you remember to shred the keys!

All of this (the VPC, the security group) is set up automatically by the scripts in the background, so you should never have to worry about it. The same goes for generating keys, moving them to your .ssh directory, etc.

I would encourage you to work through the howto. It walks you through all the necessary steps, and much of the complexity (creating the VPC and security groups, managing the teardown, etc.) is abstracted away.
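
To make the AMI lookup a bit more concrete: the error message quoted earlier in the thread suggests the script finds the AMI by its Name tag. A minimal sketch of that kind of lookup with the AWS CLI could look like the following. This is an assumption about how such a lookup can be done, not a copy of the actual script. The function name `find_ami_by_name` is made up, and it assumes the AWS CLI is installed and configured with your credentials.

```shell
#!/bin/sh
# Hypothetical helper (not taken from the actual scripts): print the ID of
# the first AMI owned by your account whose Name tag matches the argument.
# With --output text the CLI prints "None" when nothing matches.
find_ami_by_name() {
    aws ec2 describe-images \
        --owners self \
        --filters "Name=tag:Name,Values=$1" \
        --query 'Images[0].ImageId' \
        --output text
}

# Example (requires configured AWS credentials):
#   find_ami_by_name main-compute-instance
```

If this prints `None`, no AMI carries that Name tag yet, which would match the "not found" message above.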


I have already put the fastai folder into the home or the downloads folder, but when I launch the jupyter notebook over ssh on AWS, the folder doesn’t appear. What should I do?

Thanks in advance.

You need to download the data on the AWS instance into the directory where you start the notebook.

What do you mean by downloading the data on AWS?

To start jupyter notebook you are running the jupyter notebook command on your remote instance, not locally on your Ubuntu Desktop.

It cannot see whatever files you have on your local computer. If you want your jupyter notebook to see the fastai repository, execute

git clone https://github.com/fastai/fastai.git

before starting the notebook.
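
If you keep your work on a persistent volume, you may not want to clone again on every boot. A small sketch of "clone only if missing", to be run on the remote instance, could look like this. The helper name `clone_if_missing` and the `$HOME/fastai` target directory are just assumptions for illustration.

```shell
#!/bin/sh
# Run this on the remote instance (after ssh-ing in), not on your local machine.
# Hypothetical helper: clone a repository only if it is not already there.
clone_if_missing() {
    url=$1
    dir=$2
    if [ -d "$dir/.git" ]; then
        echo "$dir already present, skipping clone"
    else
        git clone "$url" "$dir"
    fi
}

# Example:
#   clone_if_missing https://github.com/fastai/fastai.git "$HOME/fastai"
#   cd "$HOME" && jupyter notebook
```

Starting the notebook from the directory that contains the clone is what makes the folder show up in the Jupyter file browser.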

Thanks! I have the files on the instance now. I think the command to execute is
git clone https://github.com/fastai/fastai

Yo @radek,

When creating the AMI it asks/requires me to specify storage for it. Since your post has us build and associate a volume separately, what should we do in this part of the setup?

Thanks

This is the root volume - you should be okay with leaving whatever the defaults are (should be just one EBS volume).

If this doesn’t answer your question, could you please post a screenshot showing where you are in the setup and what the options are?

Not sure if this is of value to anyone at this point but this is now even simpler due to an /etc/fstab entry and an entry in /etc/rc.local.

Amending fstab (see step #7)