I got a Gradient P5000 on the East Coast…
Unfortunately with Gradient, you might end up getting this kind of error when you start a new clone of your notebook…
Error! We are currently out of capacity for the selected VM type
I got approved for GPU machines, maybe because I had a support ticket open for something unrelated and mentioned the pending application in it. I still don’t see Gradient GPUs lower than K80 but maybe they are available now and then. I do see P4000 available now so I guess they come and go. Anyway things are looking good for now. Thanks again!
OK, made some progress. I launched a P4000 instance because Gradient GPU instances smaller than K80 kept going in and out of stock. It occurs to me maybe I should have tried a CPU instance. Some observations:
You don’t get a public IP address with an instance–you have to buy it separately ($3/mo) and connect it to the instance. It seems like it takes a minute or two after you connect it before you can use it. That was required even after rebooting the instance.
I think it’s worth getting the public IP so you can “ssh paperspace@[address]” instead of using the in-browser console. Or it might be possible to run a VPN client from the in-browser console and connect to an external VPN that way, but I didn’t feel like setting all this up and getting it to re-up every time the instance was booted.
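Since the public IP took a minute or two to become usable after attaching it, a small poll can tell you when ssh is worth trying. This is only a sketch: the function name wait_for_port, the timeout values and the example address are my own placeholders, not anything Paperspace provides.

```python
import socket
import time


def wait_for_port(host, port=22, timeout=180.0, interval=5.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful TCP connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused or unreachable: give up once the deadline has passed.
            if time.monotonic() >= deadline:
                return False
            time.sleep(min(interval, max(0.0, deadline - time.monotonic())))


# Hypothetical usage (203.0.113.10 is a documentation-only placeholder address):
#   if wait_for_port("203.0.113.10"):
#       print("ready: ssh paperspace@203.0.113.10")
```

It only checks that port 22 accepts connections, so it says nothing about whether your login will work.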
Running the fast.ai setup script (curl http://files.fast.ai/setup/paperspace | bash) took close to an hour, as mentioned in the video. Some of the steps produced diagnostics, but it seems to have basically worked. It installed a new Ubuntu kernel and a bunch of drivers from developer.nvidia.com etc., which is why a reboot is required.
The setup script uses ubuntu-isms and doesn’t work out of the box on debian, but it could probably be modified to do so, or maybe there’s another version at fast.ai. I had tried it on a Hetzner cloud instance with their debian/conda template just for laughs (Hetzner cloud doesn’t have gpus either).
By default you get snapshots included with your instance, which for 250GB storage (I picked that since it’s not much more expensive than 50GB) is $1.40 per snapshot, or per month, I’m not sure which. Once you have created a machine with snapshots there seems to be no way to turn off the snapshot feature. So if you don’t want snapshots make sure to turn them off when you first create the machine.
The script installs a cudnn 9.1 zip from files.fast.ai which is basically current for the P4000, but if you want to go crazy and use a V100 ($2.30/hour) for something, I think you need cudnn 9.2 to use the tensor hardware which should give a big speed boost. It would be nice if fast.ai (@jeremy ?) were to update the file repo for this.
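To see which CUDA and cuDNN versions actually ended up on the machine after the script runs, something like the following might help. It assumes PyTorch is what the setup installed (I have not verified that here), and gpu_stack_report is just a name I made up.

```python
def gpu_stack_report():
    """Collect CUDA/cuDNN versions as seen by PyTorch, or note that torch is missing."""
    try:
        import torch
    except ImportError:
        # Assumption not met: PyTorch is not installed in this environment.
        return {"torch": None}
    report = {
        "torch": torch.__version__,
        "cuda_available": torch.cuda.is_available(),
        # CUDA version this torch build was compiled against (None for CPU-only builds).
        "cuda": torch.version.cuda,
        # cuDNN version as an integer, or None if cuDNN is not usable.
        "cudnn": torch.backends.cudnn.version(),
    }
    if report["cuda_available"]:
        report["device"] = torch.cuda.get_device_name(0)
    return report


print(gpu_stack_report())
```

This reports what PyTorch was built against and can see, which is not necessarily the same as every library sitting on disk.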
A nice thing about paperspace hourly instances is that they are actually charged by the millisecond, according to their docs. I.e. if you shut down after 1.1 hours you don’t get charged for 2 hours. Booting takes less than a minute so it’s worth shutting down when you’re not using it even for fairly short periods. Of course that runs the risk of it going out of stock since there are only a limited number of instances.
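To put numbers on the 1.1 hour case, here is a rough comparison of prorated billing against rounding up to whole hours. I'm using the $2.30/hour V100 rate mentioned earlier purely as an example; the function names are made up and this is not Paperspace's actual billing logic.

```python
import math


def prorated_cost(hours_used, hourly_rate):
    """Cost when you pay only for the time actually used."""
    return hours_used * hourly_rate


def rounded_up_cost(hours_used, hourly_rate):
    """Cost if every started hour were billed in full (the hypothetical alternative)."""
    return math.ceil(hours_used) * hourly_rate


rate = 2.30   # example rate in $/hour
used = 1.1    # hours before shutting down

print(f"prorated:   ${prorated_cost(used, rate):.2f}")    # prorated:   $2.53
print(f"rounded up: ${rounded_up_cost(used, rate):.2f}")  # rounded up: $4.60
```

So in this example shutting down at 1.1 hours saves about $2 compared with being billed for two full hours.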
Having gotten this install done, I shut down the instance so haven’t tried the fast.ai notebook yet. I might give that a quick try tonight but won’t get to really mess with it until the weekend.
When I created a Gradient notebook I didn’t have to install fastai, I used the fastai template provided by paperspace. It wasn’t perfect though as I still had to pull the latest fastai version, and I wasn’t able to update the environment without running into issues.
I’m now trying out a k80 on google cloud (still waiting for a paperspace VM). So far I’m a lot happier to have my own VM rather than a Gradient instance.
up and running, thanks
How much time did Paperspace take after you sent the answer for this message:
This instance type has not been enabled in your account yet. For some GPU types we require that you tell us a bit more about your use case before we enable access. This is designed to reduce fraud and thus keep our prices low.
Tell us a bit more about your use case and we will prioritize your request (required)
Did you ever get a response? I am having the same issue.
I pretty much gave up, it’s been 14 days. Maybe because I asked for a P5000.
I ended up reapplying after those 2 weeks of waiting and got my approval within the hour!
Sorry about the delay, we are in the process of moving the course to Gradient and we’re quite a bit backed up on regular VM approvals. FYI there is no wait for Gradient (no approval process).
I think what I did was I deleted the machine and created a new one without doing git pull and updating environment. That worked for me at that time. Not sure if there is another solution.
I’m all set up on my P5000 VM and happy with it! Was worth the wait.
It’s quite an upgrade over the K80 I have with Google. Thank you
thanks that helped