Best practices for an ML development environment


There is an overload of software tools for Python and deep learning. I’d appreciate some best practices for setting up the development process.

Some questions that are boggling my mind:

  1. Is pipenv the way to go?

  2. Where does Docker come into play? What does it replace? Is it for development, or only for production? I’m still installing packages through pip or pipenv; should I just run Docker containers instead?

  3. When using a remote machine, how do I set up the IDE on my laptop? Do I SSH to the interpreter on the remote machine? Or should I write code locally, push to git, and pull on the remote? And how do I handle the data folder (I don’t want a copy on my laptop)?

Right now I end up doing a lot of editing in the terminal on the remote machine, then pushing to git and pulling locally…
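
To make the data folder question concrete, here is a minimal sketch of the kind of setup I have in mind: the code reads the data location from an environment variable, so the same repo could run on the laptop against a tiny sample and on the remote machine against the full data. The variable name `ML_DATA_DIR`, the `data/sample` fallback, and the helper `resolve_data_dir` are all made up for illustration; they don't come from any particular tool.

```python
import os
from pathlib import Path

# Hypothetical names: ML_DATA_DIR and data/sample are assumptions for this sketch.
ENV_VAR = "ML_DATA_DIR"
DEFAULT_SAMPLE_DIR = Path("data") / "sample"


def resolve_data_dir(env=None):
    """Return the data directory: the env var if set, else a small local sample."""
    env = os.environ if env is None else env
    value = env.get(ENV_VAR)
    if value:
        # On the remote machine this would point at the full dataset.
        return Path(value).expanduser()
    # On the laptop, fall back to a small sample kept next to the code.
    return DEFAULT_SAMPLE_DIR


if __name__ == "__main__":
    print(f"Using data from: {resolve_data_dir()}")
```

On the remote machine I would set `ML_DATA_DIR` once in the shell profile; on the laptop I would leave it unset. I'm not sure whether this is the usual way to do it, which is part of what I'm asking.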

I’d appreciate some examples of software development flows you guys are using.


Quick update from my end.

I’ve been testing AWS Cloud9 and it’s quite amazing. It’s running on my own remote machine, not on EC2.

The upside:

  • everything is developed directly on the remote machine, no local copy required.
  • easy to set up.
  • easy to deploy on AWS.

The downside: since all the development is on the remote machine, you need constant internet access.