Persistent AWS Spot Instances (How to)

slavivanov · March 7, 2017, 5:08pm

@jamestdsmith I believe @slazien had the same issue. Check out his post above, as it might be helpful.
As for Jupyter Notebook, you might want to look into this script:

github.com

fastai/courses/blob/master/setup/install-gpu.sh

# This script is designed to work with ubuntu 16.04 LTS

# ensure system is updated and has basic build tools
sudo apt-get update
sudo apt-get --assume-yes upgrade
sudo apt-get --assume-yes install tmux build-essential gcc g++ make binutils
sudo apt-get --assume-yes install software-properties-common

# download and install GPU drivers
wget "http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_8.0.44-1_amd64.deb" -O "cuda-repo-ubuntu1604_8.0.44-1_amd64.deb"

sudo dpkg -i cuda-repo-ubuntu1604_8.0.44-1_amd64.deb
sudo apt-get update
sudo apt-get -y install cuda
sudo modprobe nvidia
nvidia-smi

# install Anaconda for current user
mkdir downloads
cd downloads

This file has been truncated.

Especially this part:
# configure jupyter and prompt for password
jupyter notebook --generate-config
jupass=`python -c "from notebook.auth import passwd; print(passwd())"`
echo "c.NotebookApp.password = u'"$jupass"'" >> $HOME/.jupyter/jupyter_notebook_config.py
echo "c.NotebookApp.ip = '*'
c.NotebookApp.open_browser = False" >> $HOME/.jupyter/jupyter_notebook_config.py
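
Because that part of the script appends to jupyter_notebook_config.py, re-running it adds the same settings again. A minimal sketch of a guard against that, assuming you are fine with an exact-line match; append_once is a hypothetical helper name and is not part of install-gpu.sh:

```shell
# Append a line to a file only if that exact line is not already there.
# append_once is a hypothetical helper, not part of install-gpu.sh.
append_once() {
    line="$1"
    file="$2"
    # Make sure the file exists so grep does not fail on a missing path.
    touch "$file"
    # -x: match the whole line, -F: treat the text literally, -q: no output.
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Usage (path taken from the script excerpt above):
#   append_once "c.NotebookApp.open_browser = False" "$HOME/.jupyter/jupyter_notebook_config.py"
```

This only helps for single-line settings; the password line would still need to be regenerated by hand if you want to change it.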

jamestdsmith · March 7, 2017, 7:31pm

awesome, many thanks - I’m very new to development so this helps me loads.

xinxin.li.seattle · March 9, 2017, 7:04am

@slavivanov I am getting this error message when I try to run the bash script fast_ai/start_spot.sh (the second approach, using an existing instance).

“An error occurred (InvalidAMIID.NotFound) when calling the RequestSpotInstances operation: The image id ‘[ami-6edd3078]’ does not exist
Spot request ID:
Waiting for spot request to be fulfilled…”

It doesn’t seem to like the image id, but the conf file specifically says not to change this image id. Can you help me take a look at this? Thank you!

z0k · March 9, 2017, 7:34am

Hi,

The AMI should correspond to the region you’re in. Here are a couple of snippets from the ondemand_to_spot.sh script:

export region=`aws configure get region`
# The ami to boot up the spot instance with.
# Ubuntu-xenial-16.04 in diff regions.
# Ubuntu 16.04.1 LTS
if [ "$region" = "us-west-2" ]; then
    export ami=ami-a58d0dc5 # Oregon
elif [ "$region" = "eu-west-1" ]; then
    export ami=ami-405f7226 # Ireland
elif [ "$region" = "us-east-1" ]; then
    export ami=ami-6edd3078 # Virginia
fi

…

# The AMI to be used as the pre-boot environment. This is NOT your target system installation.
# Do Not Modify this unless you have a need for a different Kernel version from what's supplied.
ec2spotter_preboot_image_id=$ami
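
Note that the snippet leaves $ami empty for any region it does not list. A rough sketch of the same mapping that fails loudly instead, using only the three AMI ids quoted above; ami_for_region is a hypothetical name and does not exist in ec2-spotter:

```shell
# Map an AWS region to the Ubuntu 16.04 pre-boot AMI quoted in this post.
# ami_for_region is a hypothetical helper, not part of ec2-spotter.
ami_for_region() {
    case "$1" in
        us-west-2) echo "ami-a58d0dc5" ;;  # Oregon
        eu-west-1) echo "ami-405f7226" ;;  # Ireland
        us-east-1) echo "ami-6edd3078" ;;  # Virginia
        *)
            # Unknown region: report on stderr and signal failure.
            echo "No AMI listed for region '$1'" >&2
            return 1
            ;;
    esac
}

# Usage (requires a configured aws-cli):
#   region=$(aws configure get region)
#   ami=$(ami_for_region "$region") || exit 1
```

With a check like this, an unlisted region stops the script early rather than sending an empty or wrong image id to AWS.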

xinxin.li.seattle · March 9, 2017, 7:59am

@z0k I used the correct AMI and got the spot instance launched, but the root volume swap still hasn’t happened after 15 minutes (see attached screenshot). I did uncomment and update the value for the elastic IP, but other than that I followed every step in the instructions. Is there anything I need to do manually to swap the root device? If not, can you point me to the right script to debug?

The root volume attached to the spot instance is the 8 GB one in green (in-use), and my spotter volume (available) is in blue.

z0k · March 9, 2017, 9:03am

Hm, can you verify that the name in your .conf script matches what you wrote (spotter) in the console?

# Name of root volume.
ec2spotter_volume_name=spotter

Other than that, I’m not sure what the problem is, but for the time being you can manually attach the volume to your spot instance in the AWS console, and then mount it after SSHing into your instance:

$ mkdir spotter
$ sudo mount /dev/xvdf1 spotter

Hopefully @slavivanov can shed some light on this.

slavivanov · March 9, 2017, 9:26am

Hi @xinxin.li.seattle,
My first guess is the same as @z0k’s: the name of the volume in the my.conf file is different from the actual name of the volume (spotter).

Secondly, the Elastic IP setting in my.conf should be the elastic IP id, not the IP itself. You can find the id from the IP by running:
aws ec2 describe-addresses --public-ips $ip --output text --query 'Addresses[0].AllocationId'
Replace $ip with your elastic IP.
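
A quick way to tell the two apart before editing my.conf, as a loose sketch: allocation ids start with "eipalloc-", while the IP itself is four dot-separated numbers. classify_eip is a hypothetical helper name, and the IP check is deliberately simple (it does not validate number ranges):

```shell
# Classify an elastic IP setting as an allocation id or a raw IPv4 address.
# classify_eip is a hypothetical helper; the IP check is intentionally loose.
classify_eip() {
    case "$1" in
        eipalloc-*)       echo "allocation-id" ;;  # what my.conf expects
        "" | *[!0-9.]*)   echo "unknown" ;;        # empty, or has chars other than digits and dots
        *.*.*.*)          echo "ip-address" ;;     # looks like a dotted IPv4 address
        *)                echo "unknown" ;;
    esac
}

# Usage:
#   classify_eip "$ec2spotter_elastic_ip"
```

If this prints ip-address for the value in my.conf, run the describe-addresses command above to get the id instead.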

Another reason might be that the ec2spotter_volume_zone is not set correctly (it should be us-west-2a by your screenshot). You can post (or message me) your my.conf file if unsure of any of the settings.

If the above are all set correctly, there might have been some hiccup during the boot. To check for this, go to Instances in the EC2 Dashboard, select your instance, then Actions, then Instance Settings, then Get System Log. See the last few lines of the log for any errors (or post them here if unsure).

Lastly, you can check if the swap commands failed for some reason by running them by hand:
ssh into the server, run sudo su - to use the root account, and then:

Check if the credentials file exists in /root/.aws.creds and that the credentials are correct.
Check that awscli is installed
Check if there are files in /root/ec2-spotter/
Finally, try to run the swap root volume script by hand:
cd ec2-spotter
./ec2spotter-remount-root --force 1 --vol_name ${ROOT_VOL_NAME} --vol_region ${ROOT_REGION} --elastic_ip $ec2spotter_elastic_ip
but replace ${ROOT_VOL_NAME} with spotter, ${ROOT_REGION} with us-west-2a, and $ec2spotter_elastic_ip with your elastic IP id.
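
The three checks above can be run in one go before trying the remount script. This is only a sketch: spotter_preflight is a hypothetical name, and the default paths are the ones mentioned in this post. It checks that the credentials file exists and is non-empty, but not that the credentials themselves are valid.

```shell
# Pre-flight checks mirroring the manual steps listed above.
# spotter_preflight is a hypothetical helper; defaults are the paths from this post.
spotter_preflight() {
    creds_file="${1:-/root/.aws.creds}"
    spotter_dir="${2:-/root/ec2-spotter}"
    status=0

    # 1. The credentials file should exist and be non-empty.
    if [ -s "$creds_file" ]; then
        echo "OK: credentials file $creds_file"
    else
        echo "MISSING: credentials file $creds_file"
        status=1
    fi

    # 2. The aws command should be on the PATH.
    if command -v aws >/dev/null 2>&1; then
        echo "OK: awscli installed"
    else
        echo "MISSING: awscli"
        status=1
    fi

    # 3. The ec2-spotter directory should exist and contain files.
    if [ -d "$spotter_dir" ] && [ -n "$(ls -A "$spotter_dir" 2>/dev/null)" ]; then
        echo "OK: files in $spotter_dir"
    else
        echo "MISSING: files in $spotter_dir"
        status=1
    fi

    return "$status"
}

# Usage (as root, after sudo su -):
#   spotter_preflight
```

Any MISSING line points at the step to fix before running ec2spotter-remount-root by hand.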

Let me know what happens.

xinxin.li.seattle · March 9, 2017, 7:26pm

I just got the spot instance working! I hope I didn’t take up too much of your time, @slavivanov. It’s now working beautifully at a fraction of the original cost. I can’t thank you enough!

slavivanov · March 10, 2017, 11:39am

I’m glad to have helped!
If it’s not too much to ask, please “Like” the original post.

slavivanov · March 10, 2017, 4:28pm

@shgidi Part 2 should work exactly the same. Then, after you start up your instance, run the commands that Jeremy listed.

stevelizcnao · March 13, 2017, 8:17am

You are a GODSEND! Saving me so much money! Thank you!!!

One thing I’ve noticed is that my elastic IP is not attaching to the instance, has anyone had this problem before?

I have to manually attach it in the AWS Console page, which is no big deal, but I’m wondering why it’s not updating automatically. The start_spot.sh script lists the correct elastic IP address and tells me to connect to it, but the console lists a different IP address.

I double-checked the names in my.conf, and they match what I have in my console. Odd!

slavivanov · March 13, 2017, 2:27pm

I’m glad to have helped @stevelizcnao.
The elastic IP code got removed at some point (I was probably debugging something) and I forgot to add it back. The fix is now pushed to the repo, so download the latest code and it should work from now on.

karthik_k314 · March 14, 2017, 2:07pm

Firstly @slavivanov, thank you so much for this! This works neatly for the most part.

Here are a couple of issues I ran into.
a) I was unable to attach my existing volume to the spot instance, and I’m not sure why. I followed the instructions to a T.
The script created a new instance for me. I have set the name of the volume in my.conf, etc. What the script does is create a new volume AND attach the existing volume to the instance.
b) The new instance had only 8 GB of disk, which I noticed while running install-gpu.sh. So if you’re creating a new instance, do check the size before running that script, because it’s painful to debug that script and re-run parts of it.

How do I debug it?
Thanks!

karthik_k314 · March 14, 2017, 2:25pm

Okay, wait. I didn’t know about this concept of swapping volumes. I pretty much ignored the last line. It works PERFECTLY NOW.
Also, what’s the cheapest I can go with the instance cost?

Thanks @slavivanov. All of us using spot instances owe you one.

z0k · March 14, 2017, 3:34pm

Hi @karthik_k314,

The prices vary by region and availability zone. To see prices by region, have a look here. You can also see a graph of price by AZ for the time period you specify in your AWS console. For example, here’s the graph for AZs in Ireland for the past month:

[Image: spot price history by AZ in Ireland over the past month]

karthik_k314 · March 14, 2017, 3:41pm

Got it. Based on the rates in my area, I think I can go as low as 0.2 or 0.17/hour. Thanks for the tip, @z0k.

karthik_k314 · March 14, 2017, 3:44pm

I’m also noticing one more thing: the billing isn’t updated in my dashboard. Usually, when I’m done with on-demand instances, the billing is reflected in my dashboard almost immediately. Is it normal to have delayed updates for spot instances, or do I have to look elsewhere, @slavivanov?

Thanks!

z0k · March 14, 2017, 3:50pm

Yep, I’ve noticed the billing sometimes takes a little while to update. If it doesn’t, you can open a ticket with AWS support.

slavivanov · March 15, 2017, 8:31am

Here is a little bash command I use to check the current prices of AWS spot instances in my region:
aws ec2 describe-spot-price-history --instance-types p2.xlarge --end-time $(date +%FT%T%Z) --start-time $(date +%FT%T%Z) --product-description "Linux/UNIX (Amazon VPC)" --output text --query 'SpotPriceHistory[*].[SpotPrice, AvailabilityZone]'
This requires that you’ve set your aws-cli region via aws configure.
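
With --output text, that command prints one "price, zone" pair per line, so the cheapest AZ can be picked with a numeric sort. A small sketch, assuming tab- or space-separated "price zone" lines on stdin; cheapest_az is a hypothetical helper name:

```shell
# Print the lowest-priced line from "price<TAB>zone" text read on stdin.
# cheapest_az is a hypothetical helper name.
cheapest_az() {
    # LC_ALL=C keeps the decimal point handling independent of the user's locale.
    LC_ALL=C sort -n -k1,1 | head -n 1
}

# Usage (same command as above, piped into the helper):
#   aws ec2 describe-spot-price-history --instance-types p2.xlarge \
#       --end-time $(date +%FT%T%Z) --start-time $(date +%FT%T%Z) \
#       --product-description "Linux/UNIX (Amazon VPC)" --output text \
#       --query 'SpotPriceHistory[*].[SpotPrice, AvailabilityZone]' | cheapest_az
```

Keep in mind that the current price is only a snapshot, so the cheapest zone right now is not guaranteed to stay the cheapest.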

atul · March 15, 2017, 10:15pm

Hi @slavivanov, I’m trying to use your ec2_spotter.git, but I can’t find any file called ondemand_to_spot.sh.

Am I missing something?
Thx