Using AWS P3 instances

I just checked, and our fastai part1v2 AMI for the p2 also works just fine on the p3! :slight_smile:

However, the conda version of pytorch that’s installed isn’t optimized for the p3, so you need to conda remove pytorch and then install pytorch from source using the steps on their website (which turns out to be very easy, thanks to the pytorch team’s awesome process and docs). After that, you can pip install torchvision to get that back.
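
For anyone who hasn’t done it before, the build looks roughly like the following (just a sketch based on the pytorch README at the time; the exact dependency list and the CUDA/cuDNN setup already on the AMI may differ, so treat the README as the authority):

```
# remove the pre-built conda package, which isn't tuned for the p3's V100
conda remove pytorch

# build pytorch from source, following the steps in the pytorch README
# (assumes conda, CUDA and cuDNN are already set up on the AMI)
export CMAKE_PREFIX_PATH="$(dirname $(which conda))/../"
conda install -y numpy pyyaml mkl setuptools cmake cffi
git clone --recursive https://github.com/pytorch/pytorch
cd pytorch
python setup.py install

# reinstall torchvision afterwards
pip install torchvision
```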

I haven’t had a chance to benchmark properly yet, but it’s looking pretty good…

Oh wow - the P3 is 800% faster than P2 for training with fastai!

(Time to do 12 epochs on the new seedlings competition with the latest fastai repo and new pytorch installed from source.)

Thanks @spisakjo and team for making these available :slight_smile:

Also a good place to take nasnet for a test run

Hey @jeremy, I’ve done some rudimentary benchmarking, and it certainly is way faster - see: [Adv] Significant changes to fastai just pushed

However, creating the learner object is taking more than 8 minutes. After that, it’s fast. Are you seeing something similar on the p3?

Currently I’m using the conda version of PyTorch. I could try benchmarking again by installing it from source.

That’s your problem - once you install from source, it will go away.

Making the changes right away. Thank you.

After building PyTorch from source, there’s no initialization delay in conv_learner. Works smoothly. :slight_smile:
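
If anyone else wants a quick sanity check on their build, something like this (assuming a single GPU is visible) shows whether torch sees the V100. The long delay with the pre-built package was most likely the CUDA kernels being recompiled for the V100’s newer compute capability, which a from-source build avoids:

```
# print the installed torch version, the GPU name, and its compute capability
python - <<'EOF'
import torch
print(torch.__version__)
print(torch.cuda.get_device_name(0))
print(torch.cuda.get_device_capability(0))
EOF
```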

Just want to confirm I’ve picked the right Package Manager and CUDA version from http://pytorch.org/ for an AWS p3 instance.
(screenshot of the pytorch.org install selector)

That’s not source - that’s from a package. You’ll find the install from source docs in the pytorch readme.

Does anybody here know the pricing details for P3 instances?

@mmr
Pricing info is available at the end of this file: fastai aws image

Thank you. I could not find their pricing structure anywhere on the internet. It is really odd.

@mmr
Yes, AWS does not make it straightforward. Here’s a link to the AWS Pricing Worksheet (aka Simple Monthly Calculator), which requires making assumptions and forecasting your usage.

I got this chart after emailing them :

p3.2xlarge - $3.06 per Hour
p3.8xlarge - $12.24 per Hour
p3.16xlarge - $24.48 per Hour

They’re also listed on the EC2 pricing page under GPU Instances - Current Generation.

Hi @jeremy, hi all users of AWS p3, and hi everyone who would like to use an AWS p3,

Just to keep you informed: I asked AWS to let me use a p3 instance.

  • Their answer: no.
  • Their reasoning: you have to spend (more money) on p2.xlarge first, and then we will see whether we let you use a p3 (and spend your money there).

The full answer from AWS:

I received an update from the Service Team and they were not able to grant the limit increase of this type due to the amount of Spend on your account.
In order to grant access to this type of instances, you would need to show that you have at least ran large instance types for a while.
I tried my very best to advocate for you as I understand that you needed the P3 instances for Fastai international program.
My suggestion is that you continue to use the p2.xlarge instance that you do have access to in order to increase the spend on your account so that we can reassess the request for p3 instance types.

You can use p3 spot instances without jumping through $$$ hoops.

@fizx: I did not understand your answer. How can I use a p3 instance if AWS won’t let me?

If you request a SPOT instance, they will let you have whatever you like. Last I checked, spot instances come from a different allocation pool and the base access level is much more permissive.
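
For example, with the AWS CLI a one-off p3 spot request looks roughly like this (a sketch only; the AMI ID and key name below are placeholders for your own values, and you may also want to set a max price):

```
# request a single one-time p3.2xlarge spot instance
# ami-xxxxxxxx and my-key are placeholders for your own AMI and key pair
aws ec2 request-spot-instances \
    --instance-count 1 \
    --type "one-time" \
    --launch-specification '{
        "ImageId": "ami-xxxxxxxx",
        "InstanceType": "p3.2xlarge",
        "KeyName": "my-key"
    }'
```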

That’s crazy - I’ll send them a nasty-gram right away!