I don’t have much experience with Colab / SageMaker, but that sounds quite cool for people who don’t have their machine set up or are only starting their journey.
I was planning on doing Docker setups for the new functionality, where you run a Docker container, it pulls the data for you, and you can open a Jupyter notebook, hit “run all cells”, and it works. Anyhow, maybe I will go with this idea, at least to some extent.
It sucks a bit that getting the data is so involved. For instance, the ImageNet data is on Kaggle, but I believe you need to log in or authenticate via the API to pull it… Probably for Colab / SageMaker the data can be preloaded? It would be nice to have some sort of canonical repository of the datasets used for the fastai lectures… CIFAR10, dogsvscats, IMDB, Pascal VOC, maybe even COCO, but I guess the legality of this comes into play… I have been downloading things from pjreddie.com; this gentleman is kind enough to host CIFAR10 and Pascal, so at least there’s that.
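For what it’s worth, the “container pulls the data for you” step could be as small as the sketch below. This is only a rough idea, not anything fastai ships: the function name `fetch_and_extract` is made up, it assumes the dataset is a plain `.tar` / `.tgz` archive reachable without a login, and it uses only the Python standard library.

```python
import tarfile
import urllib.request
from pathlib import Path


def fetch_and_extract(url: str, dest_dir: str) -> Path:
    """Download a tar archive from `url` and unpack it under `dest_dir`.

    Hypothetical helper for illustration. Only use it with archives you
    trust, since extracting a tar file writes whatever paths it contains.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    # Name the local archive after the last path segment of the URL.
    archive = dest / url.rstrip("/").split("/")[-1]

    # Skip the download if the archive is already there (e.g. on a re-run).
    if not archive.exists():
        urllib.request.urlretrieve(url, archive)

    # tarfile detects gzip compression automatically in the default mode.
    with tarfile.open(archive) as tar:
        tar.extractall(dest)

    return dest
```

A notebook’s first cell could then call something like `fetch_and_extract("https://example.com/cifar.tgz", "data")`, where the URL is a placeholder and not a real download location. Datasets that need a login (like the Kaggle ones) would still need a separate authentication step.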