Hi Sanyam, thank you for your interest!
Primarily, I'm trying to reduce the barrier to entry for my colleagues developing deep learning models, specifically with the fast.ai library.
I’ve noticed that the people on my team (myself included) tend to innovate much more with the systems they can iterate on locally than with the ones that require bootstrapping a remote environment. If a system can’t be run locally, it tends to go into maintenance mode soon after the minimum viable product is released. I doubt this is limited to my experience - everyone works differently, and using a local machine is one way to accommodate the variety of working styles.
rCUDA (or its alternatives… are there any?) doesn’t seem like it would be trivial to set up. I can’t even find a way to download it outside of submitting a form, which is fine, just not something I’m used to. So I figured I’d ask prior to diving in.
By the way, some more specifics of my use case:
Everyone on my team has a Mac OS X laptop, which has everything we need for local Jupyter notebook-based development except a powerful GPU (e.g. for training a TabularLearner). I was hoping there could be a way to interface with a remote GPU without changing too much from a developer’s point of view. That is, they could write their code without caring whether the GPU was present on their machine. At this point, I think it’d even be worth jumping through a few hoops to set up access to a remote GPU, since I have a hunch there are benefits beyond this use case to having access to ephemeral compute resources.
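For what it's worth, the "write the code without caring where the GPU lives" pattern I have in mind is the standard device-agnostic idiom in PyTorch (fast.ai sits on top of this). A minimal sketch, not fast.ai-specific, with arbitrary example shapes:

```python
import torch

# Pick the GPU when one is available, otherwise fall back to the CPU.
# The rest of the code is identical either way.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = torch.nn.Linear(10, 2).to(device)        # move parameters to the device
batch = torch.randn(8, 10, device=device)        # allocate inputs on the same device
output = model(batch)                            # runs wherever `device` points
```

The appeal of something like rCUDA is that `torch.cuda.is_available()` would return `True` on the laptop itself, so even this one-line device switch becomes invisible to the developer.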
Some circuitous background on my motivation:
I think the preference for a local environment is much more psychological than technical. Technically speaking, there is nothing wrong with port-forwarding to a JupyterHub instance running on a beefy remote machine. In fact, it’s not even that expensive if you spin the machine down when not in use and have bootstrap scripts that can spin it back up within a few minutes of needing it. That’s the approach I started with several months ago: an ephemeral server with bootstrap scripts on an AWS spot instance.
This approach has been great for focused one-off tasks. Having access to a notebook environment proved especially useful for exploratory work, compared to relying on a REPL. Ultimately, though, the bootstrapped ephemeral server proved too brittle for maintaining an evolving, long-term project; developing in a local environment was far less limiting.
The obvious alternative is a semi-permanent EC2 instance, but we stepped away from maintaining long-lived servers long ago. Basically, there is no one on staff willing to administer them over the long term. Pets-vs-cattle arguments, etc…