I have set up a process for myself to use spot instances (p2.xlarge) instead of on-demand instances; it is working well and saving me a lot of money. I set up the instance every time and terminate everything when I am done. This is my "checklist" (I have not yet spent the time to automate it, e.g. using aws-cli, but that could be done as well):
EDIT: I will not update this post if I find bugs or make modifications. Instead I will update the README.md of this GitHub repository, where you can find my latest version:
So you can use it if you like.
1) Request Spot Instance
AWS Console -> (Login) -> EC2 Dashboard -> Spot Requests
"Request Spot Instances"
(only changed parameters shown - leave rest as default)
Request type: Request
AMI: Ubuntu Server 16.04 LTS (HVM)
Instance type: p2.xlarge (delete c3.,...)
Set your max price: e.g. 0.3
Instance store: attach at launch
EBS volumes / Size: 32 GiB
Security groups: default
(Next / Review)
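The same request could probably be made from the command line with aws-cli. This is a minimal sketch, not something I have run as part of this checklist: the AMI id, key pair name and the file name spot-spec.json are placeholders you must replace, /dev/sda1 is assumed to be the root device of the Ubuntu AMI, and "one-time" is assumed to correspond to the console's "Request" type.

```shell
# Write a launch specification matching the console settings above.
# $1 = AMI id (placeholder), $2 = key pair name (placeholder), $3 = output file
write_spot_spec() {
    cat > "$3" <<EOF
{
  "ImageId": "$1",
  "InstanceType": "p2.xlarge",
  "KeyName": "$2",
  "SecurityGroups": ["default"],
  "BlockDeviceMappings": [
    {
      "DeviceName": "/dev/sda1",
      "Ebs": { "VolumeSize": 32, "DeleteOnTermination": true }
    }
  ]
}
EOF
}

# Submit the spot request.
# $1 = max price in USD per hour (e.g. 0.3), $2 = launch specification file
request_spot() {
    aws ec2 request-spot-instances \
        --spot-price "$1" \
        --instance-count 1 \
        --type "one-time" \
        --launch-specification "file://$2"
}
```

Usage would be something like `write_spot_spec ami-xxxxxxxx my-key spot-spec.json` followed by `request_spot 0.3 spot-spec.json`.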
You may need to change the security group settings if you cannot login to your instance:
AWS Console -> (Login) -> EC2 Dashboard -> Instances
Select instance -> Security Groups -> "default" (or whichever you are using)
Tab "Inbound" -> Edit
Port Range: 22
Port Range: 8888-8898
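If you prefer the command line, the two inbound rules above can be sketched with aws-cli as follows. This assumes the security group can be addressed by name (as in a default VPC) and that you want to allow only one address range; the CIDR in the usage line is an example value, not something from my setup.

```shell
# Open SSH (22) and the Jupyter port range (8888-8898) for inbound TCP.
# $1 = security group name, $2 = CIDR allowed to connect (e.g. your own IP as x.x.x.x/32)
open_ports() {
    for port in 22 8888-8898; do
        aws ec2 authorize-security-group-ingress \
            --group-name "$1" \
            --protocol tcp \
            --port "$port" \
            --cidr "$2"
    done
}
```

Usage would be something like `open_ports default 203.0.113.5/32`.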
2) Configure SSH
copy / paste the HostName (Public DNS) of your AWS instance
It can look something like this:
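A minimal sketch of adding such an entry, assuming it goes into your SSH config file under the host alias aws-p2 (the alias used in the rsync commands further down). The user name ubuntu is the default for Ubuntu AMIs; the key file path is a placeholder. Since every new spot instance gets a new Public DNS, the entry has to be updated each time.

```shell
# Append a host entry for the instance to an SSH config file.
# $1 = Public DNS of the instance, $2 = private key file (placeholder), $3 = SSH config file
add_ssh_host() {
    printf '\nHost aws-p2\n    HostName %s\n    User ubuntu\n    IdentityFile %s\n' \
        "$1" "$2" >> "$3"
}
```

Usage would be something like `add_ssh_host <your Public DNS> ~/.ssh/my-key.pem ~/.ssh/config`, after which `ssh aws-p2` should log you in.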
4) Setup AWS Instance
git clone https://github.com/jonas-pettersson/fast-ai-courses
(this is my forked copy of https://github.com/fastai/courses/ including my own work)
sudo apt install python-pip
(pip is not installed by the script)
pip install kaggle-cli
sudo apt-get install unzip
(unzip is not installed by the script)
pip install backports.shutil_get_terminal_size
(otherwise jupyter notebook does not work properly)
5) Setup for Kaggle Competition
(this is the directory structure I use)
kg config -g -u "your_kaggle_username" -p "your_kaggle_password" -c "your_kaggle_competition"
(this is my own setup script for Kaggle; it sets up directories, creates the validation set, sample sets, etc.:
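The following is not that script, only a rough illustration of the validation-set step: it moves a random selection of files from one directory into another. The directory names in the usage line are assumptions, and `shuf` is assumed to be available (it is part of GNU coreutils on Ubuntu).

```shell
# Move a random sample of files into a validation directory.
# $1 = source directory, $2 = target directory, $3 = number of files to move
make_valid_set() {
    mkdir -p "$2"
    find "$1" -maxdepth 1 -type f | shuf -n "$3" | while IFS= read -r f; do
        mv "$f" "$2/"
    done
}
```

Usage would be something like `make_valid_set train valid 2000`.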
6) Transfer Files
Transfer the files you need from your local machine using rsync, e.g.
rsync -avp --progress dogs-cats-redux-model.h5 aws-p2:~/fast-ai/data/dogs-cats-redux/models
Unfortunately this takes some time, which is a drawback of the spot-instance approach, since nothing is kept between sessions. It is best to consider carefully what you really need.
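To see what a transfer would involve before starting it, rsync can be run in dry-run mode. A small sketch, where the wrapper function name is my own invention:

```shell
# Show what rsync would transfer, without copying anything.
# -n is rsync's dry-run switch; --stats prints totals such as the total file size.
# $1 = local file or directory, $2 = remote destination
preview_upload() {
    rsync -avn --stats "$1" "$2"
}
```

Usage would be something like `preview_upload dogs-cats-redux-model.h5 aws-p2:~/fast-ai/data/dogs-cats-redux/models`. If the totals look acceptable, run the real rsync command without `-n`.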
7) Start Working
8) Save Results
When you are done, transfer your models back to your local machine:
rsync -avp --progress aws-p2:~/fast-ai/data/dogs-cats-redux/models/dogs-cats-redux-model.h5 .
You will also want to save your notebooks, scripts etc. to GitHub:
git add ...
git commit -m "..."
git push origin master
9) Terminate Instance
Make sure you have saved everything you need, because the instance's storage is gone after termination
AWS Console -> (Login) -> EC2 Dashboard -> Spot Requests -> Actions -> cancel spot request + check box "Terminate instances"
Check in EC2 Dashboard -> Instances that your instance is terminated
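The termination step could also be done with aws-cli. A minimal sketch, assuming you have looked up the spot request id (sir-...) and the instance id (i-...) beforehand; the function name is hypothetical.

```shell
# Cancel the spot request, terminate the instance and confirm its final state.
# $1 = spot request id (sir-...), $2 = instance id (i-...)
terminate_spot() {
    aws ec2 cancel-spot-instance-requests --spot-instance-request-ids "$1"
    aws ec2 terminate-instances --instance-ids "$2"
    # Block until AWS reports the instance as terminated
    aws ec2 wait instance-terminated --instance-ids "$2"
    # Print the instance state as a final check
    aws ec2 describe-instances --instance-ids "$2" \
        --query 'Reservations[].Instances[].State.Name' --output text
}
```

Cancelling the request alone does not stop a running instance, which is why the instance is terminated explicitly here as well.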