AWS GPU install script and public AMI


#21

AMI created - the new AMI is ami-acac0ad5.


(Zao Yang) #22

I had this issue:

A client error (InvalidParameterValue) occurred when calling the CreateSubnet operation: Value (eu-west-1a) for parameter availabilityZone is invalid. Subnets can currently only be created in the following availability zones: us-west-2a, us-west-2b, us-west-2c.
usage: aws [options] [parameters]
aws: error: argument --subnet-id: expected one argument

Also afterwards I got this issue:

A client error (InternetGatewayLimitExceeded) occurred when calling the CreateInternetGateway operation: The maximum number of internet gateways has been reached.
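
The first message suggests the script asked for a subnet in eu-west-1a while the AWS CLI was pointed at us-west-2. Here is a minimal sketch of a pre-flight check, assuming standard zone names where the region is the zone name minus its trailing letter. The function names `region_of_az` and `az_matches_region` are made up for illustration and are not part of any script in this thread:

```shell
# Derive the region from an availability zone name by dropping the
# trailing zone letter, e.g. eu-west-1a -> eu-west-1.
# Assumption: a standard zone name (region plus a single letter).
region_of_az() {
    printf '%s\n' "${1%?}"
}

# Succeeds only if the availability zone belongs to the given region.
az_matches_region() {
    [ "$(region_of_az "$1")" = "$2" ]
}

# Hypothetical real-world use: compare against the configured region.
#   configured=$(aws configure get region)
#   az_matches_region eu-west-1a "$configured" || echo "mismatch"

if az_matches_region eu-west-1a us-west-2; then
    echo "zone and region agree"
else
    echo "eu-west-1a is not in us-west-2"
fi
```

The second error may mean that internet gateways from earlier runs are still around. `aws ec2 describe-internet-gateways` lists the ones in the current region.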


#23

Oh dear :smiley: Having compiled pytorch on p3.2xlarge this now fails on a p2.xlarge :slight_smile:

Todo for tomorrow for myself:

  • see if the install script works on p2.xlarge
  • if yes, create a new AMI for use with p2.xlarge instances

#24

At what point did you get these errors? Are you using my scripts or the scripts from part1 v1 of the course?


(Ken) #25

@radek

Thanks for posting these instructions. I’m looking forward to getting this set up.

In the first step to spin up the p2.xlarge instance, should I be using “setup_p2.sh” in courses/setup? I was getting an error like this:

(aws) kmatsuda12ctower:setup kmatsuda$ ssh -i /Users/kmatsuda/.ssh/aws-key-fast-ai.pem ubuntu@ec2-52-32-247-202.us-west-2.compute.amazonaws.com
Enter passphrase for key ‘/Users/kmatsuda/.ssh/aws-key-fast-ai.pem’:

Looking in the forums, I found this thread in which somebody ran into the same issue. Jeremy’s response in that thread was:
“Somehow you’ve ended up with a password protected key. Might be easiest to start over.”

I ran “fast-ai-remove.sh” and started over, but am still getting the same error. Should I still be following the instructions in the AWS setup video to get the p2 instance going?

Is anyone else running into this? I can wait for the new AMI, but I would like to understand what the problem is and how to resolve it (if anyone knows).
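
One way to check whether the key file itself is passphrase-protected is to look at its header. This is only a sketch: it assumes a PEM key that marks encryption in its header, and it cannot tell anything about keys stored in the newer OpenSSH format. The function name `key_looks_encrypted` and the demo file are hypothetical:

```shell
# Succeeds if a PEM private key file carries an encryption marker.
# Covers the traditional PEM header line and the PKCS#8 banner only.
key_looks_encrypted() {
    grep -q -e '^Proc-Type: 4,ENCRYPTED' \
            -e '^-----BEGIN ENCRYPTED PRIVATE KEY-----' "$1"
}

# Demo on a throwaway file that imitates an encrypted key header.
# With a real key, pass its path instead, e.g. the .pem file used with ssh -i.
demo_key=$(mktemp)
printf '%s\n' '-----BEGIN RSA PRIVATE KEY-----' 'Proc-Type: 4,ENCRYPTED' > "$demo_key"

if key_looks_encrypted "$demo_key"; then
    echo "key looks passphrase-protected"
else
    echo "no encryption marker found"
fi

rm -f "$demo_key"
```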

Thanks


#26

I do not use those scripts so it is hard for me to comment. I use a slightly modified version that you can find here.

I have not visited this repo of mine in quite a while and forgot what was in the readme but this information seems like it might be useful :slight_smile:

In the request-spot-instance.sh on line #11 replace the ami with an ami of your choice:

export ami="ami-785db401" <- the default one is Ubuntu Server 16.04 LTS I believe.
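
If you would rather not edit the file by hand, the same replacement can be sketched with `sed`. This assumes the line starts with `export ami=` as shown above. The helper name `set_ami` is made up, and the demo runs on a temporary stand-in file rather than the real request-spot-instance.sh:

```shell
# Rewrite the `export ami=...` line of a script in place.
# $1: path to the script, $2: new AMI id
set_ami() {
    sed "s/^export ami=.*/export ami=\"$2\"/" "$1" > "$1.tmp" &&
        cat "$1.tmp" > "$1" &&   # write back through the original file to keep its permissions
        rm -f "$1.tmp"
}

# Demo on a stand-in for the real script.
script=$(mktemp)
printf '%s\n' '#!/bin/bash' 'export ami="ami-785db401"' > "$script"

# ami-004aec79 is the p2.xlarge AMI mentioned later in this thread (eu-west-1 only).
set_ami "$script" ami-004aec79
grep '^export ami=' "$script"

rm -f "$script"
```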

The instructions from the original post in this thread assume you already have some way of doing the above (step 1 and step 2: spinning up an instance and SSHing into it), but now that I think of it, this might be useful for other people as well. Let me update the original post.

BTW this assumes you have the AWS cli configured.


(Ken) #27

Thanks @radek. I think I may have a combination of problems, but re-reading your post it sounds like I’m not in the correct region. When I run these scripts I get stuck with needing that AMI you reference which is in ‘eu’ and I’m in ‘us’. I may still run into the issue I had mentioned before, but I think I’ll wait for Jeremy’s new AMI and re-ask the question if I run into it then. Sorry for the noise.


#28

No worries at all! I think you are spot on regarding the AMI availability. I think I could copy it to a different region, but I'm not sure if there is a region that everyone in the US uses, and I'm also not sure there would really be any interest in the AMI. Besides, the one from @jeremy is definitely the way to go.
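
For reference, copying an AMI between regions is done with `aws ec2 copy-image`. The sketch below only prints the command instead of running it. The helper name `copy_ami_command` and the image name `fastai-part1v2-copy` are placeholders, and the regions are examples based on this thread:

```shell
# Print (rather than run) the AWS CLI call that copies an AMI from
# one region to another.
# $1: source AMI id, $2: source region, $3: destination region
copy_ami_command() {
    printf 'aws ec2 copy-image --source-image-id %s --source-region %s --region %s --name %s\n' \
        "$1" "$2" "$3" "fastai-part1v2-copy"
}

# Example: eu-west-1 to us-west-2, using the AMI id from the top of the thread.
copy_ami_command ami-acac0ad5 eu-west-1 us-west-2
```

The copy gets a new AMI id in the destination region and is private until its launch permissions are changed.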


#29

Well, quite unsurprisingly, I guess pytorch built on p3.2xlarge will only work on p3.2xlarge, and the same goes for p2.xlarge.

I built it again on p2.xlarge and created a public AMI: ami-004aec79. Again, available only in eu-west-1.
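
A possible explanation, which is an assumption and not something verified here: the two instance families have GPUs with different CUDA compute capabilities (Tesla K80 on p2 is 3.7, Tesla V100 on p3 is 7.0), and a source build targets the GPU it finds by default. PyTorch's build reads the `TORCH_CUDA_ARCH_LIST` environment variable, so a build covering both might be sketched like this. The helper name `cuda_arch_for_instance` is made up:

```shell
# Map an EC2 GPU instance type to the compute capability of its GPU.
cuda_arch_for_instance() {
    case "$1" in
        p2.*) echo "3.7" ;;   # Tesla K80
        p3.*) echo "7.0" ;;   # Tesla V100
        *)    echo "unknown instance family: $1" >&2; return 1 ;;
    esac
}

# Hypothetical build step: ask for both GPU generations before
# compiling pytorch from source.
TORCH_CUDA_ARCH_LIST="$(cuda_arch_for_instance p2.xlarge);$(cuda_arch_for_instance p3.2xlarge)"
export TORCH_CUDA_ARCH_LIST
echo "$TORCH_CUDA_ARCH_LIST"
```

Whether a single build like this would have worked with the install script in this thread has not been tested.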


(Sanyam Bhutani) #30

@radek
I couldn’t run the Jupyter notebook on a reserved instance.
The install script ran without any errors.
However the ./start-jupyter-notebook doesn’t do anything.
Nor was I required to do the steps 3,4,5.


#31

hey @init_27 - did ./start-jupyter-notebook give you an error? also, after downloading the installation script did you run bash install-gpu-part1-v2.sh?

Could you please run history 20 and copy the output?


(Sanyam Bhutani) #32

@radek Thanks
I reran the script on a fresh instance and got it running (the reserved instance one, for a p2.xlarge; I will try the spot instance one tomorrow). :sweat_smile:

Could you please help me with logging into the Jupyter notebook via my laptop’s browser?
I have Jupyter up and running on the instance. It says I can log in at http://allip:8888/
I tried doing that with the public IPv4 address but couldn’t.


(Sanyam Bhutani) #33

Also, kindly help me with the step for spot instances ‘Configure the VPC and its dependencies’.


#34

Please try https://

You might also need to open ports in security group or do ssh tunneling (recommended)
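
A rough sketch of the tunneling option, assuming the notebook listens on port 8888 on the instance and the login user is ubuntu. The helper name `tunnel_command`, the key file name and the host name are placeholders. The function only prints the command so you can check it before running it:

```shell
# Print the ssh command that forwards local port 8888 to port 8888
# on the instance.
# $1: path to the private key, $2: public DNS name or IP of the instance
tunnel_command() {
    # -N: do not run a remote command, just keep the tunnel open
    # -L: forward local port 8888 to localhost:8888 as seen from the instance
    printf 'ssh -i %s -N -L 8888:localhost:8888 ubuntu@%s\n' "$1" "$2"
}

tunnel_command my-key.pem ec2-host.example.com
```

With the tunnel open, the notebook should be reachable in the local browser at localhost:8888 (with https:// if the server is set up for it).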

Sorry on mobile for the rest of today - can’t type much


#35

Not sure where you came across this step - I do this via modified scripts from part 1 v1 but if attempting this it is best to follow an end to end tutorial or have the scripts do it for you :slight_smile:


(Sanyam Bhutani) #36

Thanks!
I’ve used tunnelling. However, if I start the Jupyter notebook with your start- script, it doesn’t allow me to launch it on my local machine. It works fine if I use the command jupyter notebook Nb.ipynb and then try logging in via localhost on my local machine (through tunnelling).


(Sanyam Bhutani) #37

VPC was mentioned in your Medium post.

‘2. Bring up the VPC along with all the necessary pieces (Internet gateway, subnet, security group, etc).’

So instead of following the Medium post, should I directly run your script on my local machine (without any instance running on my AWS, that is…)?


#38

Ah okay, yes, this is just a heading of the section - if you run the scripts as outlined in that section you should be fine


#39

I'm trying to get my AWS setup completed. I’ve logged into AWS but can’t see ami-acac0ad5. I’m in the eu-west-1 region.

The AMIs I see are:

  • ami-9b32e8e2 (ubuntu, cuda 9)
  • ami-dca37ea5 (ubuntu, cuda 8)

The two other AMIs are Amazon Linux, and I assume we aren’t using Amazon Linux for this course, or does it not matter?

Can someone please tell me the correct ami to use? Thanks
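
One thing that may help when an AMI does not show up: public AMIs shared by someone else are not listed among your own images, so in the console the AMI view has to be switched to public images. From the command line, an AMI can be looked up by id with `aws ec2 describe-images`. The sketch below only prints the command, and the helper name `find_ami_command` is made up:

```shell
# Print the AWS CLI call that looks up an AMI by id in a given region.
# $1: AMI id, $2: region to search in
find_ami_command() {
    printf 'aws ec2 describe-images --image-ids %s --region %s\n' "$1" "$2"
}

# The two AMI ids posted earlier in this thread.
find_ami_command ami-acac0ad5 eu-west-1
find_ami_command ami-004aec79 eu-west-1
```

According to the earlier posts, ami-acac0ad5 was built on p3.2xlarge and ami-004aec79 on p2.xlarge.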


(Sanyam Bhutani) #40

Okay, thanks.