Running Keras in Production on AWS?

Hey everyone!

I have fine-tuned a CNN for my own image classification task on a P2 instance. Now I’d like to use that model to predict image classes for an application, but on a cheaper CPU instance rather than paying $0.90 * 24 * 30 = $648 per month.
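For comparing instance types, the arithmetic above can be wrapped in a tiny helper. This is only a sketch: the $0.90 hourly rate is the figure quoted above, `monthly_cost` is a made-up name, and any other rate you plug in is your own assumption.

```python
def monthly_cost(hourly_rate, hours_per_day=24, days=30):
    """Cost in dollars of keeping one instance running for a month."""
    # Rounded to cents so floating-point noise does not leak into the result.
    return round(hourly_rate * hours_per_day * days, 2)


if __name__ == "__main__":
    # The P2 figure quoted in the post: $0.90/hour, running 24/7 for 30 days.
    print(monthly_cost(0.90))
```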

Question 1
Is my understanding correct that we train (fine-tune) the model on a GPU instance because training is very compute-intensive, but can then use the trained model to make predictions on a CPU instance (doing model.load on the CPU instance)?

Question 2
If the answer to Q1 is yes, can anyone please give some pointers on how to set up an instance to use Keras & Theano on a CPU instance other than the T2 instance? The “setup_t2” script has different AMI IDs than “setup_p2”. Can someone please help me understand how I can use the setup script to set up an arbitrary EC2 instance (which I’m assuming will require setting up my own AMI)?

Question 3 - bonus points
Does anyone have any indication as to the speed drop for predicting an image’s class using a CPU vs GPU instance using an already trained model? I will post my results here once I get it working :slight_smile:




I don’t know what problems you might run into in production, but TensorFlow has some videos. If you are running in the browser, you might want to check out keras.js

Hey Suresh, thanks for your reply. I’m not trying to run Keras in the browser, I’m just trying to build a server that receives images and produces predictions, but I want to do this on CPU instances, not GPU instances.

I wonder if the t2 (CPU) ami might work on any sized instance - gonna try this now…

Ok so just in case anyone is wondering, it’s super easy to change the instance type in the CPU setup script and set up the environment on a different-sized CPU machine:

just change line 18 of the script to whatever instance type you’d like, for example to use a t2.medium instance instead of t2.xlarge (say you’re running the server to show demos, need it up 24/7, and don’t want to pay for the xlarge to run constantly)

I only tried instances in the t2 family but I think it’ll work for any instance type as long as you choose the region that contains the AMI.
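If you’d rather not edit the file by hand, the change can be scripted. A rough sketch, assuming the script contains exactly one line of the form `export instanceType="..."` (as in the copy posted further down this thread); `set_instance_type` is a hypothetical helper, not part of the course scripts.

```python
import re

# Matches a line such as: export instanceType="t2.xlarge"
_INSTANCE_LINE = re.compile(r'^([ \t]*export[ \t]+instanceType=)"[^"]*"', re.MULTILINE)


def set_instance_type(script_text, new_type):
    """Return script_text with the instanceType assignment set to new_type."""
    updated, count = _INSTANCE_LINE.subn(
        lambda m: m.group(1) + '"' + new_type + '"', script_text
    )
    if count != 1:
        # Refuse to guess if the line is missing or appears more than once.
        raise ValueError("expected exactly one instanceType line, found %d" % count)
    return updated


if __name__ == "__main__":
    example = 'export ami="ami-f8fd5998"\nexport instanceType="t2.xlarge"\n'
    print(set_instance_type(example, "t2.medium"))
```

Read the script into a string, pass it through the function, and write the result to a copy so the original stays untouched.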

There’s a list of api names for all instances (and other useful pricing info) here:



Great info!

Are you happy using the CPU for predictions then? I’m also wondering about best practices for production.

I wouldn’t say I’m the best person to say what ‘best practice for production’ would be (haha) but it’s working fine for me

I’ve got a demo app running based on the CNN in lesson 2 trained on my own data where you upload an image, it gets fed through the NN, and the next page shows you the class distribution - and it’s taking like 1 second between upload and seeing results which is fine for my purposes…
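For anyone wanting to try the same thing, here is a minimal sketch of the CPU side, not the actual demo code. It assumes the model was saved on the GPU box with `model.save(...)` and copied over; the file name `model.h5` and both function names are made up for illustration. `KERAS_BACKEND` and `THEANO_FLAGS` are the standard environment variables, and they have to be set before Keras is imported.

```python
import os


def configure_cpu_backend(env=None):
    """Ask Keras to use Theano on the CPU. Call this before importing keras."""
    env = os.environ if env is None else env
    # setdefault leaves any value you have already exported untouched.
    env.setdefault("KERAS_BACKEND", "theano")
    env.setdefault("THEANO_FLAGS", "device=cpu,floatX=float32")
    return env


def load_model_on_cpu(path):
    """Load a model saved with model.save(path); `path` is a hypothetical file."""
    configure_cpu_backend()
    # Imported late on purpose so the environment variables above take effect.
    from keras.models import load_model
    return load_model(path)


# Usage on the CPU instance (assumes keras, theano and h5py are installed):
#   model = load_model_on_cpu("model.h5")
#   probabilities = model.predict(image_batch)  # image_batch: preprocessed array
```

If only the weights were saved (`model.save_weights`), the architecture would need to be rebuilt in code first and the weights loaded with `model.load_weights` instead.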

Also, I think the GPU’s power comes into play when you’re running many things at once. In production, if you created a queue and handled batches of, say, 64 images at a time, the GPU would crush it, but for one-image-at-a-time demo purposes a lower-powered CPU instance is fine.
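The batching idea could look roughly like this. It is only a sketch of the grouping step: `batches` is a hypothetical helper, 64 is the batch size mentioned above, and how requests reach the queue is left out entirely.

```python
from itertools import islice


def batches(items, size=64):
    """Yield consecutive lists of up to `size` items from any iterable."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return  # iterable exhausted
        yield batch


# Usage idea: stack each batch into one array and call model.predict once per
# batch instead of once per image.
#   for batch in batches(pending_images, size=64):
#       predictions = model.predict(stack(batch))  # stack() is hypothetical
```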


Hi Alex,

I am a newbie to AWS, the course and everything, so my apologies if my query is very basic. I would like to set up a t2 machine for the cheap runs, but I figured that in order to run the script I must connect to us-west (Oregon), as the AMI provided in that script (“ami-b43d1ec7”) is available only for that region. Given the fact that I am connecting to Ireland, I was wondering whether there is an image available for my region which I could switch to.

Any help/hint is appreciated!


Hi Luca,

There are AMIs available for a couple of different regions - see full script below.

Also, I’m seeing a different AMI ID from the one you mentioned. There is an AMI for Ireland. Just make sure your AWS config (~/.aws/config) has Ireland saved (region = eu-west-1).

export region=$(aws configure get region)
if [ "$region" = "us-west-2" ]; then
   export ami="ami-f8fd5998" # Oregon
elif [ "$region" = "eu-west-1" ]; then
   export ami="ami-9e1a35ed" # Ireland
elif [ "$region" = "us-east-1" ]; then
   export ami="ami-9c5b438b" # Virginia
else
   echo "Only us-west-2 (Oregon), eu-west-1 (Ireland), and us-east-1 (Virginia) are currently supported"
   exit 1
fi

export instanceType="t2.xlarge"

. $(dirname "$0")/
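To check in advance which AMI the script would pick, you can read the region out of the config file yourself. A small sketch, assuming the usual INI layout of ~/.aws/config with a `[default]` section; the AMI IDs are copied from the script above, and the function names are made up.

```python
import configparser

# Region -> AMI mapping copied from the setup script in this post.
SUPPORTED_AMIS = {
    "us-west-2": "ami-f8fd5998",  # Oregon
    "eu-west-1": "ami-9e1a35ed",  # Ireland
    "us-east-1": "ami-9c5b438b",  # Virginia
}


def region_from_config(config_text, section="default"):
    """Return the region set in the given section, or None if it is absent."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    return parser.get(section, "region", fallback=None)


def ami_for_region(region):
    """Return the AMI the script would use, or None for unsupported regions."""
    return SUPPORTED_AMIS.get(region)


if __name__ == "__main__":
    sample = "[default]\nregion = eu-west-1\noutput = json\n"
    region = region_from_config(sample)
    print(region, ami_for_region(region))
```

A `None` from `ami_for_region` corresponds to the "Only ... are currently supported" branch of the script.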

Hope that helps?




Thanks, I tried with the AMI you listed and it worked like a charm!

Thanks a lot!


Happy to hear it! Good luck :blush:


So how long does a single inference take on CPU vs GPU according to your tests?

Depends on a lot. Don’t have comparison numbers on hand. Try for yourself and you tell me :slight_smile:
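If you do want to measure it, a simple timing harness along these lines should do (Python 3). It is a sketch rather than a proper benchmark: `predict_fn` stands in for whatever wraps your `model.predict` call, and `time_predictions` is a made-up name. Run the same script on the CPU and the GPU instance and compare.

```python
import time


def time_predictions(predict_fn, inputs, warmup=1):
    """Return (mean_seconds, worst_seconds) for calling predict_fn on each input."""
    if not inputs:
        raise ValueError("need at least one input to time")
    # Warm-up calls are not timed; the first call often pays one-off setup costs.
    for item in inputs[:warmup]:
        predict_fn(item)
    timings = []
    for item in inputs:
        start = time.perf_counter()
        predict_fn(item)
        timings.append(time.perf_counter() - start)
    return sum(timings) / len(timings), max(timings)


# Usage idea (model and images are your own objects):
#   mean_s, worst_s = time_predictions(lambda x: model.predict(x), images)
```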

I tried the setup_t2 without changing the t2.micro and saw that Theano was not installed. Did you have to manually install Theano on your t2.medium instance?

Weird, I seem to recall everything being installed automatically. Check that you launched it in the right region: Theano comes preinstalled as part of the AMIs, which are only available for the regions shown in the setup script (open the .sh script in a text editor and make sure the region matches the region in your aws config).
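A quick way to confirm what is actually on the instance is to ask Python directly. A minimal sketch (Python 3), with `missing_packages` being a made-up helper name:

```python
import importlib.util


def missing_packages(names=("theano", "keras")):
    """Return the names from `names` that cannot be imported on this machine."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("Theano and Keras are both importable")
```

An empty list means both packages are importable from the Python you ran it with, which may differ from the one the AMI set up if several are installed.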