AWS GPU install script and public AMI


#61

You can check it under your account > credits. I am not sure how reliable it is though.


(Ismaël Koné) #62

Hi @radek,
I’m about to switch from Reserved Instances to Spot Instances (yeah, a bit late!).
Could you please show me a screenshot of your bill for November, so I can get
a clear idea of the cost and compare it to mine (below)?

Thank you so much.


#63

hey @iskode!

Please take a look below (this is for November)


(Ismaël Koné) #64

Thank you so much for your prompt reply.
Wow, it’s really economical to run spot instances: about 3 times cheaper, and with more compute time (76 hrs vs. 59 hrs for mine). And even the storage cost is nearly the same as with Reserved Instances.


#65

For the longest time I only kept a 20 GB SSD drive, essentially a workspace. At 10 cents / GB per month that ends up being just $2 :slight_smile:
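
The arithmetic behind that figure, as a small sketch. The size and price are the numbers quoted in this post; actual EBS pricing depends on region and volume type, so treat them as assumptions.

```shell
# Monthly cost of a small workspace volume, computed in cents to avoid floats.
# Both values are taken from the post, not from current AWS pricing.
size_gb=20
cents_per_gb_month=10

monthly_cents=$((size_gb * cents_per_gb_month))
printf 'Monthly storage cost: $%d.%02d\n' \
  $((monthly_cents / 100)) $((monthly_cents % 100))
```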

Another nice benefit here is that once you set everything up, it becomes very quick and easy to switch between instance types. For example, I am now considering learning embeddings on my local machine and uploading them to AWS to train random forests / xgboost classifiers and to take advantage of those crazily beefed up CPU instances.

Not sure if / when I will get a chance to get around to this cause of time constraints, but the idea has quite a bit of appeal to me :slight_smile:

So many awesome things from p1 v2 / ML course I still haven’t had a chance to play around with!!!


(Upendar) #66

Please help me with this…
I’m just stuck at this point.

$ bash setup_p2.sh
rtbassoc-87ceaefc

An error occurred (InvalidID) when calling the CreateRoute operation: The ID 'rt ’ is not valid

An error occurred (InvalidGroupId.Malformed) when calling the AuthorizeSecurityG "oupIngress operation: Invalid id: "sg-ad7e33d1

An error occurred (InvalidGroupId.Malformed) when calling the AuthorizeSecurityG "oupIngress operation: Invalid id: "sg-ad7e33d1
setup_p2.sh: line 13: /home/welcome/.ssh/aws-key.pem: No such file or directory
chmod: cannot access ‘/home/welcome/.ssh/aws-key.pem’: No such file or directory

An error occurred (InvalidKeyPair.NotFound) when calling the RunInstances operat ion: The key pair ‘aws-key’ does not exist
Waiting for instance start…
Waiter InstanceRunning failed: Max attempts exceeded
usage: aws [options] <command> <subcommand> [<subcommand> ...] [parameters]
To see help text, you can run:

aws help
aws <command> help
aws <command> <subcommand> help
aws.exe: error: argument --instance-id: expected one argument
securityGroupId=sg-ad7e33d1
subnetId=subnet-6ad8f822
instanceId=
instanceUrl=None
Connect: ssh -i /home/welcome/.ssh/aws-key.pem ubuntu@None
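
The garbled IDs in the output above (a route table ID that prints as `rt` followed by a gap, and a quote landing where the `r` of `AuthorizeSecurityGroupIngress` should be) look like what happens when captured values end in a carriage return. That can occur when `aws.exe` runs under a Unix-style shell on Windows. This diagnosis is an assumption and is not confirmed anywhere in this thread. A minimal sketch of stripping the carriage return from a captured value; `strip_cr` is a hypothetical helper and not part of setup_p2.sh:

```shell
# Remove carriage returns from whatever is piped in.
strip_cr() {
  tr -d '\r'
}

# Simulate a value captured from a Windows build of the CLI:
# command substitution strips trailing newlines but keeps the \r.
raw=$(printf 'sg-ad7e33d1\r')
clean=$(printf '%s' "$raw" | strip_cr)

printf 'raw length: %d, clean length: %d\n' "${#raw}" "${#clean}"
```

If this is the cause, piping each captured `aws` result through `tr -d '\r'` before using it in the next command should make the IDs valid again.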


(Jeremy Howard) #67

@Rishit you should simply use the fast.ai AMI, rather than trying to set up from scratch.


(Ismaël Koné) #68

Hi Radek, I’ve been working my way through the instructions using the fastai AMI instead of installing everything from scratch. But now I’m worried about persistence, since it only covers the workspace directory. If I update a package or conda, will that be saved? If not, how can I persist the system state? This would be a big problem, because with each spot request the system is restored to the state of the AMI image. How do you deal with this situation?
Thank you so much.


(Aditya) #69

If a package is updated, someone will make the corresponding changes to the notebooks, and we can also track them ourselves, since the notebooks serve as a reference (for how to work with the datasets).
Also, package upgrades usually keep backward compatibility…


(Vincent) #70

Hi @radek, thanks again for the great work (I actually came here from your Medium article). I’ve followed that tutorial’s instructions up to the command $HOME/aws_scripts/spot-instance-connect. Before, it was throwing an error saying it couldn’t connect; now it just hangs (for about 1 hour so far). I changed the availability zone and the AMI to the one provided by jhoward in this post to stop the error, but now it hangs… Do you have any suggestions or ideas about how to fix this? Thanks again for your help!


#71

Hey Vincent! Good to see you around these parts! :slight_smile:

Generally, it is hard to say what could be going on. Since you had some issues earlier on, maybe the environment is in an inconsistent state - these scripts are really simple: they do not check much and expect that the things they require will just be there.

On the other hand, I have never had them hang indefinitely and usually the AWS CLI produces at least somewhat legible error messages. If you are trying to connect to the spot instance, did you first request it and did you see a message saying that the instance was spun up for you?

I think what could be really useful here would be logging into the AWS Console (their web portal) and going to the EC2 Dashboard. Navigate to the region of interest and you will be able to see whether a VPC is already there, etc. (the VPC is the environment). More importantly, you will also be able to check the status of instances - whether you have any running, etc. Also, under spot instance requests you can check whether you have any pending requests - this is very useful, as each request carries a message that can point you to the reason for the issues you are experiencing (incorrect config, something missing, too low a spot price, etc.).
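
The same spot request check can also be done from the command line. This is a minimal sketch, assuming the AWS CLI is installed and configured; `list_spot_requests` is a hypothetical wrapper and the region is a placeholder you would replace with your own.

```shell
# Show each spot instance request with its state and status message.
# The status code and message usually explain why a request is not fulfilled.
list_spot_requests() {
  local region="$1"
  aws ec2 describe-spot-instance-requests \
    --region "$region" \
    --query 'SpotInstanceRequests[].[SpotInstanceRequestId,State,Status.Code,Status.Message]' \
    --output table
}

# Example call (placeholder region):
# list_spot_requests us-west-2
```

A request that stays in the `open` state is one that has not been fulfilled yet; the status message column is the place to look for the reason.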

The two scenarios that come to mind where you could experience this hanging are: a) requesting an instance with too low a spot price, b) not authorizing your IP address to connect (the authorize-current-ip command).
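
For scenario b), authorizing an address by hand looks roughly like the sketch below. It is not necessarily what the authorize-current-ip script does internally; `ip_to_cidr` and `authorize_ssh_from_ip` are hypothetical helpers, and the security group ID in the example is a placeholder.

```shell
# Turn a single IPv4 address into a /32 CIDR block.
ip_to_cidr() {
  printf '%s/32\n' "$1"
}

# Allow SSH (TCP port 22) into a security group from one address.
authorize_ssh_from_ip() {
  local group_id="$1" ip="$2"
  aws ec2 authorize-security-group-ingress \
    --group-id "$group_id" \
    --protocol tcp \
    --port 22 \
    --cidr "$(ip_to_cidr "$ip")"
}

# Example (placeholders): look up your public IP, then authorize it.
# my_ip=$(curl -s https://checkip.amazonaws.com)
# authorize_ssh_from_ip sg-xxxxxxxx "$my_ip"
```

If your IP address changes between sessions (for example on a home connection), the rule has to be added again for the new address, otherwise ssh will hang without an error.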

Do take a look at the AWS console and see if anything stands out. If you have any further questions, you know where to find me :slight_smile: