Setup problems: AWS

alronlam · September 27, 2017, 6:16am

Hello, I am also getting the same error. Have you found a fix to this? Can anyone help?

P.S. I’ve also requested AWS to increase service limit to allow me to use P2, and they’ve already approved my request.

radek · September 27, 2017, 9:59am

You can access the keys on the EC2 dashboard.

You are using Linux, right? And are checking for the .ssh folder locally on your computer and not on the remote machine?

I am sorry I cannot be of more help, but I use a different setup with some AMIs / volumes already configured, and I would need to tear down my env to troubleshoot this. Maybe someone following the canonical way of doing this for the course could chime in.

I follow this setup as described here, but only look to it as a last resort. Hopefully someone who uses the default scripts will be able to help you troubleshoot further.

lion137 · September 27, 2017, 12:04pm

Hi all, does anyone use Crestle (https://www.crestle.com/faq)? I’m interested in how it compares to AWS, and whether there are any issues with Docker. Thank you very much!

fsantos · September 28, 2017, 7:36pm

I guess the AWS agreement is still on; I got this limit increase not so long ago. There are also other ways to get a GPU machine running quickly: check https://www.floydhub.com/ or https://www.paperspace.com/

lion137 · September 29, 2017, 8:11am

I’m going to use crestle.com. There is a Docker repo designed for the course (thanks to Mr. Anurag Goel): https://hub.docker.com/r/deeprig/fastai-course-1/

wally · September 29, 2017, 10:24pm

To save some money, I’ve always used aws spot instances. Here in the eastern US, specifically in zone 1c, prices rarely go above $0.25/hr. The downside is I sometimes get kicked off if prices rise. That hasn’t happened yet with a p2 instance (~20 hrs usage).

Below is the command I run:

aws ec2 request-spot-instances --spot-price 0.30 --instance-count 1 --type "one-time" --launch-specification file://./p2.json

json (sorry couldn’t upload):

{
  "ImageId": "ami-31ecfb26",
  "KeyName": "xxxxxxxxxx",
  "InstanceType": "p2.xlarge",
  "NetworkInterfaces": [
    {
      "DeviceIndex": 0,
      "SubnetId": "subnet-xxxxxxx",
      "Groups": [ "sg-xxxxxxxxx" ],
      "AssociatePublicIpAddress": true
    }
  ],
  "BlockDeviceMappings": [
    {
      "DeviceName": "/dev/sda1",
      "Ebs": {
        "Encrypted": false,
        "DeleteOnTermination": true,
        "VolumeSize": 128,
        "VolumeType": "gp2"
      }
    }
  ],
  "Placement": { "AvailabilityZone": "us-east-1c" }
}
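
Hand-edited JSON is easy to break (one typographic quote or a trailing comma is enough for the CLI to reject it). A minimal sketch that builds the same launch specification from Python, so the file is always valid JSON. The helper names `build_launch_spec` and `launch_spec_json` are hypothetical, and the key name, subnet and security group are placeholders you would supply yourself, as in the post:

```python
import json


def build_launch_spec(key_name, subnet_id, security_group_id,
                      image_id="ami-31ecfb26", zone="us-east-1c"):
    """Return the spot launch specification from the post as a plain dict.

    The defaults mirror the values quoted in the thread; they are not
    guaranteed to be valid for other regions or accounts.
    """
    return {
        "ImageId": image_id,
        "KeyName": key_name,
        "InstanceType": "p2.xlarge",
        "NetworkInterfaces": [{
            "DeviceIndex": 0,
            "SubnetId": subnet_id,
            "Groups": [security_group_id],
            "AssociatePublicIpAddress": True,
        }],
        "BlockDeviceMappings": [{
            "DeviceName": "/dev/sda1",
            "Ebs": {
                "Encrypted": False,
                "DeleteOnTermination": True,
                "VolumeSize": 128,
                "VolumeType": "gp2",
            },
        }],
        "Placement": {"AvailabilityZone": zone},
    }


def launch_spec_json(**kwargs):
    """Serialize the specification; json.dumps always emits straight quotes."""
    return json.dumps(build_launch_spec(**kwargs), indent=2)
```

Writing the result of `launch_spec_json(key_name=..., subnet_id=..., security_group_id=...)` to p2.json gives a file that can be passed to `--launch-specification file://./p2.json`.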

I also hacked install_gpu. I ran into an issue, I think because CUDA 9.0 became the current version a couple of days ago. So I removed ‘sudo apt-get upgrade’ as well as the CUDA driver and Anaconda installs, since those two items are already installed on the AMI.

For now, I keep the server in an ‘old’ state so I can concentrate on the deep learning.

sandeepreddys09 · September 30, 2017, 12:00pm

Hi, I started this course today and am very excited. Following exactly what was shown in the video walkthrough, I am getting the following error while trying to set up a t2 server. My user credentials are correct and I’ve tried with two users. My default location is set to us-west-2 (I’m in India but I’m OK with the extra latency). Can someone help me proceed?

An error occurred (AuthFailure) when calling the CreateVpc operation: AWS was not able to validate the provided access credentials

An error occurred (AuthFailure) when calling the CreateTags operation: AWS was not able to validate the provided access credentials
usage: aws [options] <command> <subcommand> [<subcommand> ...] [parameters]
To see help text, you can run:

  aws help
  aws <command> help
  aws <command> <subcommand> help
aws: error: argument --vpc-id: expected one argument
usage: aws [options] <command> <subcommand> [<subcommand> ...] [parameters]
To see help text, you can run:

  aws help
  aws <command> help
  aws <command> <subcommand> help
aws: error: argument --vpc-id: expected one argument

An error occurred (AuthFailure) when calling the CreateInternetGateway operation: AWS was not able to validate the provided access credentials

An error occurred (AuthFailure) when calling the CreateTags operation: AWS was not able to validate the provided access credentials
usage: aws [options] <command> <subcommand> [<subcommand> ...] [parameters]
To see help text, you can run:

  aws help
  aws <command> help
  aws <command> <subcommand> help
aws: error: argument --internet-gateway-id: expected one argument
usage: aws [options] <command> <subcommand> [<subcommand> ...] [parameters]
To see help text, you can run:

  aws help
  aws <command> help
  aws <command> <subcommand> help
aws: error: argument --vpc-id: expected one argument

An error occurred (AuthFailure) when calling the CreateTags operation: AWS was not able to validate the provided access credentials
usage: aws [options] <command> <subcommand> [<subcommand> ...] [parameters]
To see help text, you can run:

  aws help
  aws <command> help
  aws <command> <subcommand> help
aws: error: argument --vpc-id: expected one argument

An error occurred (AuthFailure) when calling the CreateTags operation: AWS was not able to validate the provided access credentials
usage: aws [options] <command> <subcommand> [<subcommand> ...] [parameters]
To see help text, you can run:

  aws help
  aws <command> help
  aws <command> <subcommand> help
aws: error: argument --route-table-id: expected one argument
usage: aws [options] <command> <subcommand> [<subcommand> ...] [parameters]
To see help text, you can run:

  aws help
  aws <command> help
  aws <command> <subcommand> help
aws: error: argument --vpc-id: expected one argument
usage: aws [options] <command> <subcommand> [<subcommand> ...] [parameters]
To see help text, you can run:

  aws help
  aws <command> help
  aws <command> <subcommand> help
aws: error: argument --group-id: expected one argument
usage: aws [options] <command> <subcommand> [<subcommand> ...] [parameters]
To see help text, you can run:

  aws help
  aws <command> help
  aws <command> <subcommand> help
aws: error: argument --subnet-id: expected one argument

An error occurred (AuthFailure) when calling the CreateTags operation: AWS was not able to validate the provided access credentials

An error occurred (AuthFailure) when calling the AllocateAddress operation: AWS was not able to validate the provided access credentials
Waiting for instance start…

Waiter InstanceRunning failed: AWS was not able to validate the provided access credentials
usage: aws [options] <command> <subcommand> [<subcommand> ...] [parameters]
To see help text, you can run:

  aws help
  aws <command> help
  aws <command> <subcommand> help
aws: error: argument --instance-id: expected one argument

An error occurred (AuthFailure) when calling the DescribeInstances operation: AWS was not able to validate the provided access credentials

An error occurred (AuthFailure) when calling the RebootInstances operation: AWS was not able to validate the provided access credentials

All done. Find all you need to connect in the fast-ai-commands.txt file and to remove the stack call fast-ai-remove.sh
Connect to your instance: ssh -i /Users/sandeep.reddy/.ssh/aws-key-fast-ai.pem ubuntu@

Thanks in advance.

sandeepreddys09 · September 30, 2017, 1:00pm

I was able to resolve this issue by correcting the time set on my computer (which was off by 8 minutes). Hope this helps someone.
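
AWS requests are signed with a timestamp, so a local clock that is several minutes off can produce the AuthFailure errors shown above even when the keys are right. A minimal sketch for measuring the drift, assuming you compare your clock against the `Date` header of any HTTP response; `clock_skew_seconds` is a hypothetical helper, not part of the AWS CLI, and the exact tolerance AWS applies is not stated in this thread:

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def clock_skew_seconds(http_date_header, local_now=None):
    """Return how many seconds the local clock is ahead of the server.

    http_date_header is a standard HTTP Date value such as
    "Sat, 30 Sep 2017 12:00:00 GMT". A negative result means the
    local clock is behind.
    """
    server_time = parsedate_to_datetime(http_date_header)
    if local_now is None:
        local_now = datetime.now(timezone.utc)
    return (local_now - server_time).total_seconds()
```

The header can come from any HTTPS response, for example `urllib.request.urlopen(url).headers["Date"]`. A result of several hundred seconds in either direction is a good reason to resync the system clock before re-running the setup script.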

jaspb · September 30, 2017, 2:36pm

For eu-west-1 (Ireland), in line 16 in the setup_t2.sh file, change to ami-9e1a35ed (from ami-f8fd5998 for us-west-2). That worked for me.

Best, Jasper

kora_am · October 1, 2017, 6:28am

Hi - I have been having the same problem and don’t see a solution posted for it. Could someone please let me know if there is one?
I get the following error:
’ is not validred (InvalidID) when calling the CreateRoute operation: The ID 'rtb-8ac560f3

Thanks
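
The scrambled message above looks like what a terminal prints when the ID itself ends in a carriage return: the text after the `\r` is drawn over the start of the line. That would be consistent with a setup script saved with Windows (CRLF) line endings, though this is an assumption and not something confirmed in the thread. A minimal sketch for detecting and stripping such line endings; the helper names are hypothetical:

```python
def has_crlf(data: bytes) -> bool:
    """True if the raw file content contains Windows line endings."""
    return b"\r\n" in data


def to_unix_newlines(data: bytes) -> bytes:
    """Replace every CRLF pair with a plain LF."""
    return data.replace(b"\r\n", b"\n")


def fix_file(path):
    """Rewrite the file in place if it has CRLF endings.

    Returns True when the file was changed, False when it was already clean.
    """
    with open(path, "rb") as handle:
        data = handle.read()
    if not has_crlf(data):
        return False
    with open(path, "wb") as handle:
        handle.write(to_unix_newlines(data))
    return True
```

Running `fix_file("setup_p2.sh")` (or whichever script you downloaded) before `bash setup_p2.sh` would rule this cause in or out.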

deesoni · October 1, 2017, 7:49am

Hi Jeremy/Rachel,

I am trying to do the setup for a t2 instance, but I am facing an issue when connecting with the ssh command: I get a “Resource temporarily unavailable” message. Sorry, I was not able to find any relevant post about this.

root@01HW1102594:~# ssh -i /root/.ssh/aws-key.pem ubuntu@ec2-52-11-56-142.us-west-2.compute.amazonaws.com
ssh: connect to host ec2-52-11-56-142.us-west-2.compute.amazonaws.com port 22: Resource temporarily unavailable

Thanks!!
Deepak
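
A failure at “connect to host … port 22” happens before any key is checked, so it points at the network path (instance not running, security group, local firewall or proxy) rather than at the .pem file. A minimal sketch for testing whether the port is reachable at all from your machine; `port_is_open` is a hypothetical helper built on the standard socket module:

```python
import socket


def port_is_open(host, port=22, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

If `port_is_open("<your instance's public DNS name>")` returns False while the instance shows as running in the EC2 console, the security group's inbound rule for port 22 is a reasonable next thing to check.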

Manishankar · October 1, 2017, 9:41am

Hello,
I’m unable to connect to the p2 instance through ssh. I’ve made sure the instance is running, gone through the forum for possible solutions, and tried terminating and starting properly a couple of times (releasing all resources as mentioned in the wiki). There are a few posts above where @rachel has advised on similar issues, but I’m not getting any errors here, the connection simply times out! I’m using the scripts from git (https://github.com/fastai/courses/tree/master/setup). Any help is greatly appreciated. TIA.

Edit: I tried the below using aws-alias.sh, with a t2.xlarge instance as restarts are cheaper.

h.khanpour · October 1, 2017, 9:36pm

I have the same problem… can anybody help? My clock is set correctly…

jayfree · October 2, 2017, 6:03am

Make sure that you ran this line of code

export PATH="$HOME/anaconda2/bin:$PATH"

I was setting up my home deep learning machine and was manually typing the commands from the “install-gpu.sh” script. I forgot to type this line and was getting the same error as you. After updating the PATH variable, the error was gone.

sabliao · October 2, 2017, 8:02pm

I’m following the above video to set up my environment. I tried pip install awscli like it’s done in the video but I get:

File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/os.py", line 157, in makedirs mkdir(name, mode)
OSError: [Errno 13] Permission denied: '/Library/Python/2.7/site-packages/awscli'

Usually I would rerun the pip install w/ sudo in these cases, but I assumed Anaconda would take care of permission issues like this. Why is the user/instructor in the video able to run w/o sudo?

prav · October 2, 2017, 9:09pm

I did the install as described but was not able to launch the aws instance. Then I deleted everything and did the whole process again. Now I am getting this error:
$ bash setup_p2.sh
rtbassoc-e4d9c49e
True

An error occurred (InvalidAMIID.NotFound) when calling the RunInstances operation: The image id ‘[ami-bc508adc]’ does not exist

I downloaded setup_p2.sh from http://files.fast.ai/files/setup_p2.sh. Is that correct?
Also, do I have to wait for Amazon to approve my request before I go ahead and create an instance?

Please help.

Thanks

sabliao · October 3, 2017, 5:33am

To add more context for what I’ve done since posting this question: I have the Anaconda path added to my ~/.zshrc file (/Users/sabrinaliao/anaconda2/bin) and sourced it (but I’m not sure my default python is the Anaconda installation b/c I didn’t see anything that says ‘anaconda’ when I run python, unlike what’s shown in the example here). The pip install of awscli still failed, so I thought to create a virtual environment using conda. I ran conda create -n fast_ai and then activated the virtual environment. However, I still got a permission denied error when I tried pip install. I then tried conda install awscli but got:

Fetching package metadata ...........

PackageNotFoundError: Packages missing in current channels:

  - awscli

We have searched for the packages in the following channels:

  - https://repo.continuum.io/pkgs/main/osx-64
  - https://repo.continuum.io/pkgs/main/noarch
  - https://repo.continuum.io/pkgs/free/osx-64
  - https://repo.continuum.io/pkgs/free/noarch
  - https://repo.continuum.io/pkgs/r/osx-64
  - https://repo.continuum.io/pkgs/r/noarch
  - https://repo.continuum.io/pkgs/pro/osx-64
  - https://repo.continuum.io/pkgs/pro/noarch

Searching for awscli for my fast_ai venv via the Anaconda Navigator brought up 0 results too (was just double checking). I decided to run the command given here: conda install -c conda-forge awscli, and that didn’t error out. Even so, awscli still isn’t recognized in this venv, and I don’t know why, because during the install with conda-forge I saw

Fetching package metadata .............
Solving package specifications: .

Package plan for installation in environment /Users/sabrinaliao/anaconda2/envs/fast_ai:

The following NEW packages will be INSTALLED:

    awscli:          1.11.120-py27_0    conda-forge
...

which implies to me that awscli was installed. How am I supposed to get awscli installed the right way?
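
The path in the earlier permission error (/Library/Python/2.7/site-packages) suggests that the pip being run belonged to the system Python rather than to Anaconda, which would explain both the need for sudo and a freshly installed awscli not being found. That is an inference from the traceback, not something confirmed in the thread. A minimal sketch for seeing which executables the shell would actually pick up; `describe_tools` is a hypothetical helper:

```python
import shutil
import sys


def describe_tools(names=("python", "pip", "aws")):
    """Map each command name to the full path found on PATH, or None."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    # The interpreter running this script; compare it with the paths below.
    print("running interpreter:", sys.executable)
    for name, path in describe_tools().items():
        print("{:>6} -> {}".format(name, path or "not found on PATH"))
```

If the paths printed for pip and aws do not sit under the same anaconda2 directory as the interpreter, the PATH change in ~/.zshrc has probably not taken effect in the current shell.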

Manishankar · October 3, 2017, 10:22am

Hi,
You should be getting your scripts from https://github.com/fastai/courses/tree/master/setup .
And yes, you need to wait until your p2 instance request is approved (I believe you’ve already raised the request as mentioned in the video).

Regards,
Manishankar

prav · October 3, 2017, 2:54pm

Thanks Mani,

But if I get it from git and run it, I get errors about newline, doctype, etc. (I used wget with the web address of the file. Did I get the HTML page?)

Thanks
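
Errors mentioning doctype are what you would expect if wget saved GitHub's HTML page for the file instead of the script itself. A minimal sketch of the two checks involved: converting a GitHub file-page URL into its raw form, and testing whether a downloaded file is really HTML. Both helper names are hypothetical, and the example URL assumes setup_p2.sh sits directly in the setup folder linked above:

```python
from urllib.parse import urlsplit


def github_raw_url(page_url):
    """Convert a github.com '/blob/' file page URL to its raw-content URL."""
    parts = urlsplit(page_url)
    segments = parts.path.strip("/").split("/")
    # Expected shape: <user>/<repo>/blob/<branch>/<path...>
    if parts.netloc != "github.com" or len(segments) < 5 or segments[2] != "blob":
        raise ValueError("not a GitHub file page URL: " + page_url)
    user, repo, _, *rest = segments
    return "https://raw.githubusercontent.com/" + "/".join([user, repo] + rest)


def looks_like_html(text):
    """True if the content starts like an HTML document rather than a script."""
    head = text.lstrip()[:200].lower()
    return head.startswith("<!doctype html") or head.startswith("<html")
```

For example, `github_raw_url("https://github.com/fastai/courses/blob/master/setup/setup_p2.sh")` gives a URL that wget can fetch as plain text, and `looks_like_html(open("setup_p2.sh").read())` tells you whether the file you already have needs to be downloaded again.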