Hi!
I recently came across the Azure Machine Learning service (https://azure.microsoft.com/en-us/blog/azure-machine-learning-service-a-look-under-the-hood/) and started playing around with it. It looks like a potentially very good tool.
In general, I like the idea of provisioning deep-learning-ready VMs with a line of Python code (scaling to a larger machine if I need more compute), keeping track of ML experiments (durations, graphs of performance metrics, etc.), and having hyperparameter tuning and model deployment made easier. Also, VMs are deactivated once a training/eval (experiment) run is complete, which should be cost efficient.
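For reference, the provisioning step looks roughly like the sketch below. It assumes the azureml-core (v1) Python SDK and a workspace config.json in the working directory. The VM size, the idle timeout, and both helper names are just illustrative choices of mine, not anything prescribed by the service.

```python
def autoscaling_cluster_settings(vm_size="STANDARD_NC6", max_nodes=1, idle_seconds=600):
    """Build keyword arguments for AmlCompute.provisioning_configuration.

    The defaults are illustrative assumptions (a single-GPU VM size and a
    10-minute idle timeout); pick whatever fits your quota and budget.
    """
    return {
        "vm_size": vm_size,
        "min_nodes": 0,  # lets the cluster scale down to zero nodes when idle
        "max_nodes": max_nodes,
        "idle_seconds_before_scaledown": idle_seconds,
    }


def provision_cluster(cluster_name, **overrides):
    """Create an autoscaling cluster in the workspace described by config.json."""
    # Imported lazily so the settings helper above works without the SDK installed.
    from azureml.core import Workspace
    from azureml.core.compute import AmlCompute, ComputeTarget

    ws = Workspace.from_config()
    config = AmlCompute.provisioning_configuration(
        **autoscaling_cluster_settings(**overrides)
    )
    target = ComputeTarget.create(ws, cluster_name, config)
    target.wait_for_completion(show_output=True)
    return target
```

Setting min_nodes to 0 is what should give the "deactivated once the run is complete" behaviour: as far as I understand it, idle nodes are released after the timeout and you stop paying for them.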
Microsoft has also published notebooks for running popular NLP models with PyTorch and TensorFlow (see https://github.com/Microsoft/AzureML-BERT), which is extra nice.
However, my first attempts (running an experiment with the PyTorch Jupyter notebook from the repo above) were not pleasant in terms of user experience. Executions sometimes kept running and I couldn’t figure out how to halt them properly. The logs shown in Azure (Jupyter) Notebooks were not very transparent, and I needed to use the Azure Portal to (try to) better understand what was going on with my run.
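For anyone hitting the same "run won't stop" problem, here is a minimal sketch of how runs might be halted from Python. It assumes the azureml-core (v1) SDK, where run objects expose get_status() and cancel(). The helper names and the set of statuses treated as "still active" are my own assumptions, so check them against the SDK version you have.

```python
# Statuses treated as "still active". This list is an assumption based on the
# status strings the v1 SDK reports; adjust it if your version differs.
ACTIVE_STATUSES = {
    "NotStarted", "Starting", "Provisioning", "Preparing", "Queued", "Running",
}


def cancel_active_runs(runs):
    """Cancel every run that has not finished yet and return the cancelled ids.

    `runs` can be any iterable of objects with get_status(), cancel() and an
    `id` attribute, e.g. what Experiment.get_runs() yields.
    """
    cancelled = []
    for run in runs:
        if run.get_status() in ACTIVE_STATUSES:
            run.cancel()
            cancelled.append(run.id)
    return cancelled


def cancel_experiment_runs(experiment_name):
    """Cancel all active runs of one experiment in the config.json workspace."""
    # Imported lazily so cancel_active_runs can be tried without the SDK installed.
    from azureml.core import Experiment, Workspace

    ws = Workspace.from_config()
    experiment = Experiment(ws, experiment_name)
    return cancel_active_runs(experiment.get_runs())
```

Calling run.get_status() in a notebook cell is also a quick way to answer the "has training even started?" question without going to the Portal.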
In essence, for now the Azure ML service seems to me to automate/abstract away some of the things needed to run experiments tidily (and potentially help along the “full stack of data science”), but at the cost of losing a more direct handle on basic controls (has training even started?) and with too much instability for now.
To be fair, the launch is fairly recent and the devs are updating the platform every day (see the release notes: https://docs.microsoft.com/en-us/azure/machine-learning/service/azure-machine-learning-release-notes), so things may improve noticeably in the future.
Does anyone have a similar (or better) experience to share? Maybe I am just doing things wrong…