Cost-effectiveness of the different Cloud/Server renting options

Not everyone has their own server (and some of us missed out on the free AWS credits), so I am aiming to compile a list of the different cloud options for doing ML and their cost-effectiveness.

  1. Not everyone knows of them, but for a while the cheapest option for full-time use has been Hetzner, with their dedicated server (GTX 1080, 64 GB RAM) for 100-120 euros a month, plus a setup fee of another 100-120 euros.

  2. AWS (familiar to most here). K80 - $0.90/hr for p2.xlarge (as low as $0.1703 with spot pricing). M60 - $1.14/hr for g3.4xlarge (as low as $0.40 with spot pricing). V100 - $3.06/hr for p3.2xlarge ($0.613 for spot).
    See the on-demand and spot pricing pages for a few more options - some instance types are available only in specific regions (like Oregon).

  3. GCP - $0.70/$0.77 per hour (gets as low as $0.49) for half a K80, or $2.30/$2.53 for a P100 (gets as low as $1.61).

  4. Azure - K80/P100/P40 at $0.90/hr, M60 at $1.093/hr. Seems like a surprisingly good option, but I haven’t heard of many people using them.

  5. Crestle - again, familiar to most here - $0.34/hr for K80.

  6. Paperspace - P5000 for $0.65/hr, P6000 for $0.9/hr, V100 for $2.30/hr

And then there are some monthly options that are a bit worse than Hetzner - 1, 2, 3 - and some hourly options that are a bit worse than the rest, e.g. FloydHub at $0.70/hr for a K80.

I haven’t (recently) calculated what is most cost-effective, and I haven’t tried most of the options. It’d be very helpful if others could chime in so we can add any providers I might’ve missed and compile the best options depending on what one wants to do, e.g.
Hetzner for full-time use, AWS/Azure for hourly use, Crestle for cheap hourly use with fast setup, etc.
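As a starting point for those calculations, here’s a minimal break-even sketch. The numbers are the Hetzner and AWS figures quoted above; currency conversion between euros and dollars is ignored for simplicity:

```shell
# At how many hours of use per month does a flat-fee dedicated box
# beat hourly on-demand rental? Figures from this thread.
monthly_fee=120     # Hetzner dedicated 1080, ~120 euros/month
hourly_cents=90     # AWS p2.xlarge (K80) on-demand, $0.90/hour
break_even_hours=$(( monthly_fee * 100 / hourly_cents ))
echo "Break-even at ~${break_even_hours} hours/month"
```

So at the on-demand rate, anything beyond roughly 133 GPU-hours a month favours the dedicated box; with spot pricing the break-even point moves far higher.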

Some additional metrics, like FLOPS or the time to run a model on the different GPUs, would also be useful.

Best case scenario, it’d be nice if we can turn this thread into a Wiki and keep it up-to-date, as a sort of sister thread to the Making your own server one.


I’ve used FloydHub, Crestle and Paperspace.

These are my preliminary thoughts.

FloydHub has a unique structure that takes getting used to.
You create a new ‘Project’ and attach data to it.
Then, you can have a series of ‘jobs’ via .py scripts or Jupyter Notebooks.
The workflow is not very intuitive, IMO.
Their pricing is a bit higher than the other two’s ($0.75/hr for a K80, even on their highest $100/mo plan).

Regardless, they have good funding (Y-Combinator), are on an aggressive growth spurt (improvements every few weeks), and have two dedicated co-founders in @sai and @narenst.
Things can only get better.

Crestle, on the other hand, scores a 10/10 on the ‘intuitiveness’ scale.
It literally cannot get any easier to do Deep Learning on the cloud.
You just sign-up and BOOM - there’s a Jupyter Notebook staring at you with unlimited home storage.
Apart from the simplicity, two things I like about them are:

  1. You can switch between CPU and GPU using a neat little toggle switch. This saves you a ton of money, given that most of your time will be spent coding and debugging, with very little time spent on training. It makes it easier to tinker with your code without worrying too much about the cost.

  2. You get your own home directory, unlike on FloydHub. This is much closer to what you would actually have in a local setup: it’s easier to share files between notebooks and save data in a folder without having to worry about ‘mounting’ it. In Floyd, if you need to download a dataset (say from Kaggle using the CLI tool), you first need to run a job, and then ‘mount’ the output of that job every single time you want to use the data.

Created by our own @anurag, Crestle gives the same amount of compute (K80) at about half the price ($0.34/h vs FloydHub’s $0.75/h at best).

The only downside is the storage cost - $0.014/GB/day. So if you’re using ~50GB, say, it’ll end up costing you ~$21/mo.
Again, this depends if you intend to persist the data (which for most purposes you won’t need to).
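A quick sanity check on that storage math (using the $0.014/GB/day figure from above, expressed in tenths of a cent to stay in integer arithmetic):

```shell
# Crestle storage: $0.014/GB/day = 14 tenths-of-a-cent per GB per day.
gb=50
days=30
cents=$(( gb * days * 14 / 10 ))   # monthly cost in cents
echo "~\$$(( cents / 100 ))/mo for ${gb}GB"
```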

Beware, though, that both FloydHub and Crestle use EFS for file storage.

This means, for instance, that you will have a painful time handling a large number of small files (which is what most datasets in Deep Learning/CV have).
It took me an hour and a half to extract the StateFarm dataset on both platforms!

Paperspace alleviates most of these problems.
You get a full Desktop experience (not just a CLI).
They have a P4000 GPU (comparable to the K80) at $0.40/hr, and also a P5000 ($0.65/hr) and a P6000 ($0.90/hr - almost 3x the performance of the K80, comparable to the 1080 Ti).
They also have the latest V100 (only AWS and Paperspace have it as of now) at $2.30/hr!
You can also subscribe monthly at a discount although Hetzner would be much more cost-effective in that case.
They currently have 3 data centres (2 in the US and 1 in Amsterdam) and are expanding.

Apart from the straightforward, familiar experience and workflow (it’s like your own computer - only not local), the big upside for me is the SSD storage (not EFS, unlike the other two). StateFarm took only a few seconds to extract - much like on my local build, which is what you’d expect.
Also, the storage starts at $5/mo for 50GB (against ~$20/mo in Crestle’s case) and increases in $1 increments up to 250GB. The maximum is 2TB for $40/mo.

Paperspace also has a dedicated VM specifically for FastAI.

In summary, if you’re just getting started with Deep Learning, I’d recommend Crestle for its sheer simplicity.

If you want a full cloud desktop and SSD, use Paperspace.

I would recommend against using FloydHub for now.

Personally, Paperspace is the winner for me. A personal cloud desktop which I can customize at will and treat just like my local build seals the deal. They also have multiple GPU options, multiple data centres, and lower pricing.


Thanks for the thoughtful comparison, @svaisakh! (I’m one of the co-founders at FloydHub)

  • As of last week, FloydHub offers SSD storage for all jobs. No more EFS, so dealing with a large number of files should be blazing fast.

  • We hear your feedback about making the Jupyter Notebook workflow more intuitive. We’re working on it, and will have some improvements to share soon!

For the time being - a Project is like a Github repo, a collection of your jobs and data. Here’s a quick 3-step tutorial to start running your Jupyter Notebook on FloydHub: https://docs.floydhub.com/getstarted/quick_start_jupyter/

  • Pricing: Our current pricing is tiered, with the cheapest being $0.59/hr for K80 GPU (with the 100 hour GPU Powerup). We’re working on a much simpler pricing plan, which should be out in a couple of weeks. Will update here then.

  • GPUs: We currently have the Tesla K80 GPUs. We will be offering V100 very soon.

  • We have a dedicated page for Fast AI: https://www.floydhub.com/explore/courses/fast-ai-part-1. This has all the class projects and datasets, along with instructions on how to get started easily.

Other features FloydHub offers are version control and reproducibility - you get a full history of all your experiments, and you can compare runs and resume your work.

Overall, thanks for your feedback! We hear you and are actively working towards making FloydHub more useful for you guys. I’ll be around to answer questions, or take suggestions/feedback.


Nice writeup!

For me, I’m sticking with AWS for the time being for three reasons:

  1. It represents what most companies are using in the real world.
  2. I’ve automated the startup/shutdown via bash scripts, so I don’t even need to touch the AWS console and have a Jupyter notebook running in under 60 seconds.
  3. The $500 credit.
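The automation in point 2 could look roughly like the sketch below. This is a minimal version of my own, not the poster’s actual scripts: the instance id, key path and ubuntu user are placeholders, and it assumes the AWS CLI is installed and configured.

```shell
#!/usr/bin/env bash
# Sketch: start/stop an EC2 deep-learning box and tunnel to Jupyter.
# INSTANCE_ID, the key path and the ssh user are placeholders.
set -euo pipefail
INSTANCE_ID="${INSTANCE_ID:-i-0123456789abcdef0}"

start_box() {
  aws ec2 start-instances --instance-ids "$INSTANCE_ID"
  aws ec2 wait instance-running --instance-ids "$INSTANCE_ID"
  ip=$(aws ec2 describe-instances --instance-ids "$INSTANCE_ID" \
        --query 'Reservations[0].Instances[0].PublicIpAddress' --output text)
  # Forward local port 8888 to the Jupyter server on the instance.
  ssh -i ~/.ssh/aws-key.pem -L 8888:localhost:8888 "ubuntu@$ip"
}

stop_box() {
  # Stop (not terminate), so you only pay for storage while idle.
  aws ec2 stop-instances --instance-ids "$INSTANCE_ID"
}
```

Source the file and call start_box or stop_box as needed.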

I really like Paperspace and would be tempted to switch, or at least split my development time between the two, if they matched the sweet deal we got from Amazon.


Thanks, @sai.
Wouldn’t expect any less of you guys.

The SSD alone is a huge improvement for me.

I guess I did forget to mention that the FloydHub workflow gives you full version control - which is unique among these providers as far as I can see.

Good luck moving forward.

P.S. Bring on the V100s, baby :wink:

Nvidia GPU Cloud is free, at least for now. https://www.nvidia.com/en-us/gpu-cloud/?ncid=pa-pai-nsdplgnclh3-25128&gclid=Cj0KCQiA84rQBRDCARIsAPO8RFwZ6c0Mbr1-iZt4KbPB65kCy6uMjzxOXPk2OguyJRR2Gv_Nyh6riP8aAn4oEALw_wcB

As far as I can tell this is just software (docker images), not free gpu time or something.

My bad: this one is free: https://research.google.com/colaboratory/faq.html

I am using it now… you have to use TF, though. It should work for your class projects - well, if you convert from PyTorch back to TF. :slight_smile:


Interesting :face_with_monocle:

Seems invite only.
Details are sparse.

@dc333, if you’ve got access to it, could you please share your experience?

What’s the config like?
I don’t think a GPU is included, in which case it won’t take us very far.

!pip install -q http://download.pytorch.org/whl/cu75/torch-0.2.0.post3-cp27-cp27mu-manylinux1_x86_64.whl
!pip install future

colab runs torch…

Update: GCP lowers GPU pricing by up to 36%: K80 at 40c/hr and P100 at $1.46/hr

Crestle currently gives only 1 free hour of GPU usage, and costs $0.59/hr.
At this moment, Google Cloud is cheaper at $0.49/hr for a K80, comes with $300 of free credit, and was surprisingly easy to set up even for a beginner like me.

So, atm, Google Cloud seems like a great option - about 600 hours of free GPU use, and relatively easy to set up.
It’s also possible to choose a P100 instead of a K80, which costs $1.60/hr.

@malrod,
I tried Google Cloud (a.k.a. GCP).
It seems that GPUs cannot be provisioned under the free trial.

Have you successfully managed to create and use a GPU instance?

@svaisakh You can use your $300 credit for GPUs! But you need to increase your GPU quota first (it’s at 0 by default - that’s why you cannot launch a GPU instance straight away).
Go to the quotas page (Compute Engine → Quotas → Edit Quotas) and request an increase for the K80 and/or preemptible K80 in your zone - the process is automatic, and it should be approved within minutes.
https://cloud.google.com/compute/quotas
You’ll be asked to upgrade your account first, meaning that after using all $300 of free credit you’ll be automatically charged for any additional use. But as long as you have any promotional funds left, they’ll be used first.
Sounds too good to be true, but with the K80 costing $0.49/hr and the preemptible K80 at $0.22/hr, that works out to hundreds of hours of free GPU use.
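If you prefer the CLI to the console, the instance itself can be created with gcloud once the quota request is approved. A hypothetical sketch - the instance name, zone, machine type, image and disk size are my own placeholder choices, not values from this thread:

```shell
# Sketch: create a preemptible K80 instance with the gcloud CLI.
# Name, zone, machine type, image and disk size are placeholders.
create_k80_box() {
  gcloud compute instances create ml-box \
    --zone=us-east1-c \
    --machine-type=n1-standard-4 \
    --accelerator=type=nvidia-tesla-k80,count=1 \
    --preemptible \
    --maintenance-policy=TERMINATE \
    --image-family=ubuntu-1604-lts \
    --image-project=ubuntu-os-cloud \
    --boot-disk-size=50GB
}
```

(GPU instances require --maintenance-policy=TERMINATE, since they can’t be live-migrated.)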

I used the K80, the preemptible (spot) K80 and the P100, and the billing details tell me that I spent $7.70, have $292 remaining on my account, and the total amount billed is $0:

Credits:
Promotion ID: Free Trial | Expires: 17 Dec 2018 | Promotion value: $300.00 | Amount remaining: $292.26

Wow!

And I always thought the ‘upgrade’ meant they’d get rid of the promotional credits.

Guess I should start reading more carefully. :sweat_smile:

Thanks a lot @malrod :blush:

I’ve discovered the following educational discount. I’m going to ask to see if this can be used in conjunction with the fast.ai course.

… Time Passes …

It looks like they have granted me a discount code. It seems you’ll have to ask for your own code if you’re in any way associated with an educational institution.

Just waiting for my Ubuntu 16.04 box in the East Coast region with a P4000 GPU.

… More time Passes …
I was refused a P4000 GPU, so I’m using the GPU+ (Quadro M4000, 30 GB RAM) system instead.

I’m not sure, but you can get up to $30 from Paperspace:

First, create the account via the Fast.ai promo link:

https://www.paperspace.com/&R=FASTAI15

Once the account is created, go to Billing - don’t add your credit card yet.

Add this promo code: DDQRI0U - you’ll get $10.

Now, on the Machine tab, create your machine as instructed in the video.

Below the credit card tab there is another promo tab.

Add this promo code there: HNGPU5 - you’ll get another $5.

Once you’ve done this, add your credit card details. You’ll have $30.


Alas, this code has expired. :frowning:

For those who already have access to Google’s TPUs, here is a benchmark making the rounds which suggests that TPUs are significantly more cost-effective (in at least some cases) than the V100s and P100s available on Google/AWS.
The HN discussion mentions some caveats.


Try LAUNCH5PX instead of HNGPU5.