Setup help

Thank you :slight_smile:

I agree with @jeremy that AWS instances are priced too high. That being said, my company is AWS only right now, and my account under my company covers the cost for me.

I’m currently working on a new AWS image off the g3.4xlarge instance type.

I’d recommend against the p2.xlarge; it just doesn’t cut it value-wise. It’s way too low on memory, and I’ve used it in the past. If you’d like, I can create an AMI for you to use once I finish getting mine set up.

Edit: Also, bummer that Amazon didn’t provide anyone to help with the course. If someone from Amazon reads this, please ask around about why not.

Also, there is a SageMaker Studio currently in preview (you just go to the Ohio region and click on SageMaker Studio). It can be much more cost-effective than running the usual SageMaker notebook instances, but it will be a little more involved to run fastai there and take advantage of the cost savings. So for now it is probably for enthusiasts who want to take this new course in SM Studio…

Although I personally prefer some of the other options posted by @jeremy, my company is all AWS, and therefore it’s easier for me to use that platform.

For anyone else who wants to run the course on Amazon, I’ve gotten all of the course contents and dependencies set up in an AMI.

The AMI id:
ami-0079f47689becf81f

Edit: Hold off on this for now. I just realized that when I installed fastai v2 from the dev repo, it upgraded PyTorch to a version that isn’t compatible with the CUDA drivers. I will update this when I get the image dialed in.

Second Edit: There appears to be an issue with the base image, causing a CUDA version conflict. Not worth the time for anyone to use this image. Will update if/when I get this working.

**Third Edit:**
Amazon appears to not really be investing in EC2 Deep Learning images anymore, likely in favor of SageMaker. :frowning: Apologies if anyone tried this guide, but the incompatibilities on the images are pretty bad now.

The AMI name:
fastai2_spring_2020

Note that I’ve just now created the image, and it might not show up in AWS image search.

You’ll want to put it on a g3.4xlarge instance, which runs around $1.22 an hour. That’s pricey, but you get plenty of memory and a big GPU. I recommend avoiding p2.xlarge: it’s a little cheaper, but way less value in my opinion.

When you ssh into the box, run “source activate fastai2”, navigate to the course_v4 directory, and launch Jupyter. All the deps are there for running the current course notebooks.

Thanks,

JP

I created a fork of course-v4 where I will be setting up the notebooks for running on Google Colab without any other change. Here’s the link for running the material of the first class:

@wittmannf please add this to the colab wiki post FAQ too.

Done! Feel free to review it.

Thanks – would appreciate that!

Where can I click to access the video session?
Thanks!

Hey John, thanks for putting this together!
Have you tried running on SageMaker, or are you planning to stick with EC2 only?

@jeremy and/or @rachel just to clarify one thing. For this year’s version of the course, there is no AWS support. Therefore your recommendation of using the options listed in this channel

I was able to spin up an AWS SageMaker instance with the notebook. However, when I run this:
from fastai2.vision.all import *
I see the following error:

AttributeError                            Traceback (most recent call last)
<ipython-input-10-a14e0323ec5b> in <module>()
      1 # Import necessary libraries
----> 2 from fastai2.vision.all import *
      3 # import matplotlib.pyplot as plt

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/fastai2/vision/all.py in <module>()
      1 from ..basics import *
----> 2 from ..callback.all import *
      3 from .augment import *
      4 from .core import *
      5 from .data import *

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/fastai2/callback/all.py in <module>()
      3 from .fp16 import *
      4 from .hook import *
----> 5 from .mixup import *
      6 from .progress import *
      7 from .schedule import *

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/fastai2/callback/mixup.py in <module>()
      6 from ..basics import *
      7 from .progress import *
----> 8 from ..vision.core import *
      9 
     10 from torch.distributions.beta import Beta

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/fastai2/vision/core.py in <module>()
     19 # Cell
     20 if not hasattr(Image,'_patched'):
---> 21     _old_sz = Image.Image.size.fget
     22     @patch_property
     23     def size(x:Image.Image): return Tuple(_old_sz(x))

AttributeError: type object 'Image' has no attribute 'size'

I have the following installed
!pip show fastai2 Pillow

Name: fastai2
Version: 0.0.11
...
---
Name: Pillow
Version: 7.0.0
Summary: Python Imaging Library (Fork)
...

I found this thread https://github.com/fastai/fastai2/issues/22
However, that doesn’t seem to fix this, since I have the correct PIL installed. Any help or pointers?

It seems you installed it in the pytorch_p36 conda environment? Or have you created a new env as described in the repo (GitHub - fastai/fastai2: Temporary home for fastai v2 while it's being developed)?
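One way to answer the "which environment?" question is to ask the notebook kernel itself. The following is a minimal sketch, not part of the original posts: the helper names `installed_version` and `describe_kernel` are made up for illustration, and the package list is an assumption based on the names mentioned in this thread.

```python
import sys

try:
    # Python 3.8+: package metadata from the standard library
    from importlib.metadata import version, PackageNotFoundError
except ImportError:
    # Older interpreters (such as the Python 3.6 env in the traceback above)
    import pkg_resources

    PackageNotFoundError = pkg_resources.DistributionNotFound

    def version(name):
        return pkg_resources.get_distribution(name).version


def installed_version(name):
    """Return the installed version of `name`, or None if it is missing."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def describe_kernel(packages=("fastai2", "Pillow", "torch", "torchvision")):
    """Collect the interpreter path and the package versions this kernel sees."""
    info = {"python": sys.executable}
    for name in packages:
        info[name] = installed_version(name)
    return info


if __name__ == "__main__":
    for key, value in describe_kernel().items():
        print(f"{key}: {value}")
```

Run in a notebook cell, this prints the interpreter path, which shows the conda env the kernel is using. If that path differs from the env where `pip show` reported the packages, the kernel and the shell are looking at different installations.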

One thing to note: when you restart an SM notebook instance, the preinstalled environments, including packages and conda, revert to the original ones (as if you had just created the notebook).

This happens because the user volume is mounted at the $HOME/SageMaker folder; everything outside it is reinitialized when you stop and start the notebook instance.

The way to make it persistent is to use lifecycle configurations: Customize a SageMaker notebook instance using an LCC script - Amazon SageMaker
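As a rough illustration of the lifecycle-configuration approach, the sketch below builds the request for boto3's `create_notebook_instance_lifecycle_config` call, which expects the script base64-encoded. The script body, the config name `fastai2-on-start`, the `ec2-user` home path, and the region default are all assumptions for illustration; only the `pytorch_p36` env name comes from the traceback earlier in the thread.

```python
import base64

# Hypothetical on-start script: reinstall fastai2 into the pytorch_p36 env
# every time the instance starts. The user name and anaconda path are
# assumptions; adjust them to match your instance.
ON_START_SCRIPT = """#!/bin/bash
set -e
sudo -u ec2-user -i <<'EOF'
source /home/ec2-user/anaconda3/bin/activate pytorch_p36
pip install --upgrade fastai2
EOF
"""


def build_lifecycle_request(name, on_start_script):
    """Build keyword arguments for create_notebook_instance_lifecycle_config."""
    encoded = base64.b64encode(on_start_script.encode("utf-8")).decode("ascii")
    return {
        "NotebookInstanceLifecycleConfigName": name,
        "OnStart": [{"Content": encoded}],
    }


def create_lifecycle_config(name, on_start_script, region="us-east-1"):
    """Register the script with SageMaker (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the helper above works without it

    client = boto3.client("sagemaker", region_name=region)
    return client.create_notebook_instance_lifecycle_config(
        **build_lifecycle_request(name, on_start_script)
    )
```

Calling `create_lifecycle_config("fastai2-on-start", ON_START_SCRIPT)` would register the script; it still has to be attached to the notebook instance before it takes effect.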

Also, the CloudFormation template from this tutorial https://course.fast.ai/start_sagemaker.html can be updated by replacing the old fastai installation steps with the new ones.
To do that, go to https://course.fast.ai/start_sagemaker.html, download the template locally, and then edit the YAML file (I mean this one: https://s3-eu-west-1.amazonaws.com/mmcclean-public-files/sagemaker-fastai-notebook/sagemaker-cfn.yml).

Thanks @tensoralex. I did edit the yml file from there, but only replaced the pip install of fastai with fastai2.
I did not change anything else. I will follow the instructions in https://github.com/fastai/fastai2 for the new env and try to run it. Thanks a bunch!

Thanks for all your help. I was able to create a notebook on AWS SageMaker using a CloudFormation script.
Here is the script that I used to create a notebook instance in us-east-1

Please make sure you have requested access to ml.p2.xlarge before using this, as documented here: https://course.fast.ai/start_sagemaker.html

FYI:

I had an issue here: I conda-installed graphviz, which installed version 2.40.2, which Jupyter failed to import. I then turned to pip install graphviz, which installed version 0.13.2. This worked.
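A likely explanation for the version difference is that the conda package named graphviz ships the Graphviz binaries (such as `dot`), while the pip package named graphviz is the Python wrapper that `import graphviz` needs. The sketch below checks for both pieces; `graphviz_status` is a hypothetical helper name, not something from fastai or graphviz.

```python
import importlib.util
import shutil


def graphviz_status():
    """Report whether the Python wrapper and the `dot` binary are available."""
    return {
        # True if `import graphviz` would find a module in this environment
        "python_module": importlib.util.find_spec("graphviz") is not None,
        # True if the Graphviz command-line tool is on PATH
        "dot_binary": shutil.which("dot") is not None,
    }


if __name__ == "__main__":
    print(graphviz_status())
```

If `python_module` is False, the import fails regardless of which binaries are installed; if only `dot_binary` is False, the import may succeed but rendering a graph can still fail.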

Error running notebook 02_production: PermissionDenied returned from image search.

At code cell 2 the key is set to ‘XXX’. I guess that’s the issue?

I guess we already discussed that one here: https://forums.fast.ai/t/02-production-permissiondenied-error/65823

(Adding it for others to find, if needed.)

Francesco,

I didn’t try SageMaker, but I will probably pursue it today. I ended up doing Colab Pro, but I’m pretty dissatisfied with it since I can’t easily flip working code to a web app the way I would like to.

The EC2 image issue I had seemed to be related to a CUDA mismatch, and it occurred when I ran the conda create command against the environment.yaml file in the fastai2 repo. Conda installs PyTorch 1.4 (due to the torchvision 0.5 requirement), and it seems it went ahead and updated the CUDA packages, which broke compatibility. No idea for sure, though.
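A quick way to confirm this kind of mismatch is to compare the CUDA version PyTorch was built for with what the driver on the box supports. This is a minimal sketch, assuming only that PyTorch may or may not be installed; `cuda_report` is a made-up helper name.

```python
def cuda_report():
    """Summarize the PyTorch build and whether it can reach a GPU."""
    try:
        import torch
    except ImportError:
        return {"torch_installed": False}

    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # CUDA toolkit version this build targets (None for CPU-only builds)
        "built_for_cuda": torch.version.cuda,
        # False when no usable GPU and driver are visible to this build
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If `built_for_cuda` is newer than the version the installed NVIDIA driver supports (as shown by `nvidia-smi`), `cuda_available` will typically be False, which would be consistent with the upgrade described above.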

Have you used SageMaker before? I haven’t tried it, but I guess I will give it a whirl this evening and will let you know if I get it rolling.

Just looking at Google Cloud options last night, and they are radically cheaper than AWS, which further sours me on Amazon for this class.

Edit: I will give the instructions referenced by @pinaki a try.