Thoughts on spot instances

These are some of my observations from running a Spot Instance for the first time:

Advantages: Cost - you can save roughly 30% on average compared to on-demand prices for p2 instances.

Downsides:

  1. You have to create a new instance each time, since you need to terminate it after your job is done. So the model, data, etc. must be copied to another existing instance or downloaded before termination.
  2. Your job can be terminated if the current fluctuating price exceeds your max bid price (this has not happened to me yet). You can circumvent that by setting your max bid price to the on-demand price.

If you have used spot instances before please share your experience …

If you search through the pricing by region, you’ll find some spot prices that are 80%+ cheaper!

You can detach your volume from one instance and attach it to another, rather than copying all your data around.

It is good to know about the flexibility of moving volumes between instances. One small cost, though, is a few extra dollars for each instance’s storage.

If you’re just moving an EBS volume around, it should not cost extra. And any extra storage costs should be offset by major savings on the instance. Spot instances are awesome if you can deal with the possibility of losing the instance unexpectedly.

Also, note that you get 2 minutes’ notice of a shutdown. So I think we could create some methods to use spot instances effectively over the coming weeks - it would be a really useful project.
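One way to make use of that notice is to poll the instance metadata service from inside the instance and save work when a termination time shows up. The sketch below is only an illustration: it assumes the `spot/termination-time` metadata path returns a 404 until a shutdown is scheduled and a UTC timestamp afterwards, and that the instance allows plain (token-free) metadata requests. The function names are made up for this example.

```python
from datetime import datetime, timezone
from typing import Callable, Optional
import time
import urllib.error
import urllib.request

# Metadata path assumed to be populated once a spot termination is scheduled.
TERMINATION_URL = "http://169.254.169.254/latest/meta-data/spot/termination-time"


def fetch_termination_time(url: str = TERMINATION_URL) -> Optional[str]:
    """Return the scheduled termination timestamp, or None if none is set."""
    try:
        with urllib.request.urlopen(url, timeout=2) as resp:
            return resp.read().decode().strip()
    except urllib.error.HTTPError as err:
        if err.code == 404:  # assumed to mean "no termination scheduled yet"
            return None
        raise


def seconds_remaining(timestamp: str, now: datetime) -> float:
    """Seconds between `now` and a timestamp such as 2017-01-05T18:02:00Z."""
    deadline = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")
    return (deadline.replace(tzinfo=timezone.utc) - now).total_seconds()


def wait_for_notice(fetch: Callable[[], Optional[str]],
                    on_notice: Callable[[float], None],
                    poll_seconds: float = 5.0,
                    max_polls: Optional[int] = None) -> bool:
    """Poll `fetch` until it returns a timestamp, then call on_notice(seconds_left).

    Returns True if a notice was seen, False if max_polls ran out first.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        stamp = fetch()
        if stamp is not None:
            on_notice(seconds_remaining(stamp, datetime.now(timezone.utc)))
            return True
        polls += 1
        time.sleep(poll_seconds)
    return False


# Example use on the instance itself (save_checkpoint is a hypothetical callback):
#   wait_for_notice(fetch_termination_time, lambda secs: save_checkpoint())
```

Passing the fetch function in as an argument keeps the polling loop testable off AWS; on a real instance the callback would typically write the model weights to an EBS volume or S3 before the shutdown.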

Yes, that would be great! Happy to contribute.

A quick question: say I have one on-demand instance, a t1 with a root volume from the fast-ai AMI, and a spot instance, a p2 with a root volume also from the fast-ai AMI.

When the spot instance p2 is terminated, will the volume be automatically saved with the new data? In other words, can a single volume be shared and its data persisted across multiple instances (assuming for now that only one instance is running at any given time and that an instance can also be terminated)?

Yes, in general, as long as it is based on an EBS volume - details here: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-detaching-volume.html . Note, however, that in your specific case the answer is “no”, since the t1 AMI and p2 AMI are different and incompatible, because one uses HVM and one doesn’t: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/virtualization_types.html .

Can anyone share their CLI command for starting a p2 spot instance with the fast-ai AMI?

How do I create a spot instance from the p2 AI instance provided by Jeremy in this course? I am new to AWS, and when I select a spot instance I don’t know how to select this particular AI AMI. Thanks.

I manually started the spot instance from AWS console and then used the ip address to log into the instance. This was my personal preference over using my own CLI command to initiate a spot instance.

When you are selecting the instance type, you are given the option to select the AMI as well. I have regularly used spot instances and they are a gem, as your AWS bill will be drastically reduced. Make sure you save your results somewhere persistent, for example in S3 storage.

I’ve been using the following specification to request p2.xlarge spot instances via the CLI.

You’ll need to change the following parameters for your requests:

  • SpotPrice
  • KeyName (if different - make sure this key exists in the region)
  • Placement.AvailabilityZone - Availability zone to start in
  • SubnetId
  • SecurityGroupIds

Make sure the subnet exists in the provided availability zone.
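As a rough illustration of how those parameters fit together, here is a small sketch that builds a request document in the general shape `aws ec2 request-spot-instances` accepts as JSON. It is not the original specification file from this post; every value shown (key name, subnet, security group, AMI id) is a placeholder, and the helper name is made up for the example.

```python
import json


def build_spot_request(spot_price, key_name, availability_zone, subnet_id,
                       security_group_ids, image_id, instance_type="p2.xlarge"):
    """Build a dict shaped like a request-spot-instances JSON input."""
    return {
        "SpotPrice": spot_price,  # maximum price per hour, passed as a string
        "InstanceCount": 1,
        "Type": "one-time",
        "LaunchSpecification": {
            "ImageId": image_id,
            "InstanceType": instance_type,
            "KeyName": key_name,  # must exist in the chosen region
            "Placement": {"AvailabilityZone": availability_zone},
            "SubnetId": subnet_id,  # must live in the availability zone above
            "SecurityGroupIds": list(security_group_ids),
        },
    }


if __name__ == "__main__":
    # All identifiers below are placeholders, not real resources.
    spec = build_spot_request(
        spot_price="0.20",
        key_name="my-key",
        availability_zone="us-west-2a",
        subnet_id="subnet-00000000",
        security_group_ids=["sg-00000000"],
        image_id="ami-00000000",
    )
    print(json.dumps(spec, indent=2))
```

If the printed JSON is saved to a file, the AWS CLI generally expects it to be referenced with a `file://` prefix, as in `--cli-input-json file://request_spot.json`.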

You can use the following command to request the spot instance

To get the public IP of the instance you can use

You can also modify the specification file to attach additional existing EBS volumes if required.

Hi Mike,
Thank you so much for this helpful post!

I have a few questions on how to proceed accordingly. Could you have a look?

  1. I should do the “starting over with AWS” step first, right?
  2. If I download request-spot.json into the same directory where I source aws-alias.sh, what should I put after --cli-input-json=? Should I just type the file name request_spot.json?
  3. As for SpotPrice, what price would you recommend for a beginner? Is it complicated to change SpotPrice later on?
  4. Where can I find my KeyName? I noticed that in the specification your KeyName is fastio. I am taking the deep learning course using the AWS West Oregon region; should I use the same fastio too?
  5. For the availability zone, should I replace your region input with --region us-west-oregon or --region us-west-2a, or something else, given that my account is showing the US West (Oregon) region now?
  6. I found my SubnetId OK, but I have two SecurityGroupIds (not sure why). Which one should I use for the deep learning course? They are differentiated by their descriptions: one says default VPC security group, the other says Generated by setup_vpn.sh. Which one should I choose? Or should I list both?
  7. I verified that the subnet exists and is available under us-west-2a.
  8. In your code for getting the public IP, what is the number in --output text | grep b43d1ec7 for? Should I use the same one?

Given my limited understanding at this moment, what I could think of doing is the following, could you verify them for me, please?

  1. starting over with AWS
  2. download request_spot.json
  3. under SubnetId and SecurityGroupID, type my own IDs
  4. run the following code to request spot instance:
    aws ec2 request-spot-instances --cli-input-json=request_spot.json --spot-price 0.20 --region us-west-2a
  5. run the following code to get public IP for the instance:
    aws ec2 describe-instances --region us-west-2a --output text | grep b43d1ec7
  6. After I successfully get spot Instance and public IP, can I still use source aws-alias.sh, aws-start, aws-ssh to start? If not, how can I get started to use instance?

Thank you very much and Happy New Year!

Daniel

Hi @Daniel, if you’re new to AWS it may be quite a bit easier to use on-demand instances (~$0.90/hour), which all the scripts work with, rather than spot instances, which are cheaper but often less convenient. This is mainly for a few reasons:

  1. The instance may shut down with little (~2 minutes) notice.
  2. Data is not preserved unless it’s on an EBS volume that is detached when the instance terminates, or the data/code has been saved externally (on Git or S3).
  3. It’s not possible to shut down a spot instance the same way you can shut down an on-demand instance (related to 2).
  4. The scripts the course has provided all assume that you’re using an on-demand instance. For this reason the AWS management scripts are unlikely to work without some degree of modification to suit spot instances.

With that disclaimer, here are the answers to your questions:

  1. Yes, unless those things already exist.
  2. The aws-alias (and other scripts) provided in the course assume that you’re using on-demand instances rather than spot instances so these scripts aren’t likely to work without modification.
  3. Spot price varies by region, zone and time for a given instance type. I generally try to bid at least 15% above the current spot price for a given zone. You can look up the price variations for a given zone in the interface under EC2 => Instances => Spot requests => Pricing history, selecting the p2.xlarge instance type. For the most part prices will be significantly lower than on-demand, but there will be occasional spikes based on demand for that instance type in a certain period.
  4. This will be whatever you’ve named your private key (and the private key you wish to use for the instance). You can get a list of your private keys under EC2 => Network & Security => Key Pairs.
  5. It’ll be us-west-2 if that’s the region you wish to start it in.
  6. I’ve created one from scratch. If you’ve got an existing security group created previously you may be able to use this. The security group should have SSH access (port 22) and Jupyter notebook access (port 8888) from your IP.
  7. You should be able to use the same command. It lists the instances in a given region and filters for those created from the fast.io AMI, which has an id of b43d1ec7.
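To make the bidding rule of thumb from answer 3 concrete, here is a minimal sketch. It assumes a 15% margin over the current spot price and caps the bid at the on-demand price (the cap is an added assumption, following the earlier suggestion in this thread to never bid above on-demand). The function name and the example prices are illustrative only.

```python
def suggest_bid(current_spot_price, on_demand_price, margin=0.15):
    """Return a bid `margin` above the current spot price, capped at on-demand."""
    if current_spot_price <= 0 or on_demand_price <= 0:
        raise ValueError("prices must be positive")
    bid = current_spot_price * (1 + margin)
    # Never bid more than the on-demand price; round to 4 decimals for display.
    return round(min(bid, on_demand_price), 4)


if __name__ == "__main__":
    # Illustrative numbers: a $0.20 spot price against ~$0.90 on-demand.
    print(suggest_bid(0.20, 0.90))  # 0.23
    print(suggest_bid(0.85, 0.90))  # 0.9 (capped at on-demand)
```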

Unfortunately you won’t be able to use the aws-alias commands (start, stop, ssh, etc.), as these assume an on-demand instance.
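On the public IP lookup from answer 7: grepping the text output works, but the same filtering can be done more explicitly on the JSON output of `aws ec2 describe-instances --output json`. The sketch below assumes that output has the usual `Reservations` / `Instances` nesting with `ImageId` and `PublicIpAddress` fields; the sample data and the helper name are invented for illustration.

```python
def public_ips_for_ami(describe_output, ami_fragment):
    """Return public IPs of instances whose ImageId contains `ami_fragment`."""
    ips = []
    for reservation in describe_output.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            matches_ami = ami_fragment in instance.get("ImageId", "")
            # Stopped or still-pending instances may have no public IP yet.
            if matches_ami and "PublicIpAddress" in instance:
                ips.append(instance["PublicIpAddress"])
    return ips


if __name__ == "__main__":
    # Made-up sample shaped like describe-instances JSON output.
    sample = {
        "Reservations": [
            {"Instances": [
                {"ImageId": "ami-b43d1ec7", "PublicIpAddress": "203.0.113.10"},
                {"ImageId": "ami-00000000", "PublicIpAddress": "203.0.113.11"},
            ]},
            {"Instances": [{"ImageId": "ami-b43d1ec7"}]},  # no public IP yet
        ]
    }
    print(public_ips_for_ami(sample, "b43d1ec7"))  # ['203.0.113.10']
```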

Hi Mike,

Thank you so much for your detailed explanation! I think you are right that I should stay with on-demand instance for now.

Happy New Year

Daniel

I’ve been thinking about doing something similar to try and reduce p2 costs. I’m curious what your strategy is for dealing with the data. Do you attach an existing EBS volume with your code and data already stored? Or do you download each time you spin up? Do you do anything to deal with an unexpected shutdown?

I’m thinking about using an EBS volume that holds all my code/data and attaching that each time I do a spot request, then mounting that drive and doing all the work there.

Amazon now offers automatic bidding, where they bid for you so your EC2 instance will not be automatically terminated. You pay the current spot price, and the ceiling is the on-demand price.

I was able to create a spot p2.xlarge instance, but I failed to connect using SSH, because I first need to set up a VPC, an internet gateway, a subnet and a route table, and then somehow make them work together. I have no idea how any of that works. Does anyone know of good resources on how to set these up? At this point it feels overwhelming, as I have no experience in networking.

I’ve been using docker-machine to use AWS and it handles the networking stuff (VPC, etc) for you automatically (or you can specify what you already have). It also has the option to bid on spot instances. You should be able to use it to spin up a P2 instance with the fastai AMI pretty easily.

See:

(You can do this even if you aren’t going to use docker on your remote machine.)
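As a rough sketch of what such a docker-machine invocation might look like, the snippet below assembles a `docker-machine create` command using the `amazonec2` driver’s spot-related flags. The machine name, AMI id, region and price are placeholders, the helper name is made up, and AWS credentials are assumed to be configured already (for example in environment variables or `~/.aws/credentials`).

```python
import shlex


def docker_machine_create_args(machine_name, ami, region, spot_price,
                               instance_type="p2.xlarge"):
    """Assemble a docker-machine command that requests a spot instance."""
    return [
        "docker-machine", "create",
        "--driver", "amazonec2",
        "--amazonec2-region", region,
        "--amazonec2-instance-type", instance_type,
        "--amazonec2-ami", ami,
        "--amazonec2-request-spot-instance",  # ask for spot instead of on-demand
        "--amazonec2-spot-price", str(spot_price),
        machine_name,
    ]


if __name__ == "__main__":
    # Placeholder values; substitute your own AMI id, region and bid.
    args = docker_machine_create_args("fastai-spot", "ami-00000000", "us-west-2", "0.25")
    print(shlex.join(args))  # paste into a shell, or pass args to subprocess.run
```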

I don’t know much about docker, but I think I will be able to follow the guide. Do I then use ssh to connect to AWS, or is there another method? What is the reason to use docker in this case, and why is it better than the default aws-get, aws-start and aws-stop aliases that are introduced in the course? How do I stop an AWS instance created using Docker?

docker-machine ssh machine_name
docker-machine start machine_name
docker-machine stop machine_name

I’m not saying this is a better method, just another option.