Hi guys, we are working to build out a new docker container for v1 with PyTorch 1.0 here at Paperspace and would love some help/testing. For reference, we have been maintaining a docker image for the current fastai course here.
What would be a good smoke test to make sure that this is working correctly with all the latest libs? Do you see anything missing from the Dockerfile? Hopefully we can get a clean working container that can be generally used by people in this new course. Any feedback is much appreciated!
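As a starting point for a smoke test, here is a hedged sketch of a small script (the name `smoke_test.py` and the module list are my own suggestion, not an official fastai tool). It checks that the core stack imports cleanly and that the GPU is visible from inside the container:

```python
# smoke_test.py -- a suggested smoke test, not an official fastai script.
# Checks that the deep-learning stack imports cleanly and the GPU is visible.
import importlib


def check_modules(names):
    """Return {name: version, "unknown" if no __version__, None on import failure}."""
    results = {}
    for name in names:
        try:
            mod = importlib.import_module(name)
            results[name] = getattr(mod, "__version__", "unknown")
        except ImportError:
            results[name] = None
    return results


if __name__ == "__main__":
    report = check_modules(["torch", "torchvision", "fastai"])
    for name, version in report.items():
        print(f"{name}: {version or 'MISSING'}")
    try:
        import torch
        # Inside a correctly built container on a GPU machine this should be True
        print("CUDA available:", torch.cuda.is_available())
    except ImportError:
        print("torch not installed; skipping CUDA check")
```

Seeing the expected versions plus `CUDA available: True` would cover the basics; a fuller test could additionally run a tiny training loop for one epoch.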
For anyone else looking at this thread: CUDA 9.2 and 10.0 work great as base images, but CUDA 9.0 does not. It also seems to require a minimum NVIDIA driver version of 396.26 on the host, or it will hang at runtime (even if the container itself builds successfully).
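For illustration, a minimal Dockerfile along those lines might look like the sketch below. The base-image tag and the conda-based install are my assumptions, not the actual Paperspace file:

```dockerfile
# CUDA 9.2 or 10.0 base images work; CUDA 9.0 does not (see above).
# The host needs NVIDIA driver >= 396.26 or the container will hang at runtime.
FROM nvidia/cuda:10.0-cudnn7-runtime-ubuntu18.04

RUN apt-get update && apt-get install -y --no-install-recommends \
        wget ca-certificates && \
    rm -rf /var/lib/apt/lists/*

# Miniconda to manage Python and the fastai/PyTorch stack
RUN wget -q https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O /tmp/conda.sh && \
    bash /tmp/conda.sh -b -p /opt/conda && \
    rm /tmp/conda.sh
ENV PATH=/opt/conda/bin:$PATH

# The documented fastai v1 install channels
RUN conda install -y -c pytorch -c fastai fastai

EXPOSE 8888
CMD ["jupyter", "notebook", "--ip=0.0.0.0", "--allow-root"]
```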
I am currently using fastai on Paperspace, on a Linux 16 machine, and installed all the dependencies with conda.
I had the problem that installing fastai in editable mode (`pip install -e`) reinstalled a lot of dependencies already installed with conda (I was using the pip installed by conda).
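If the conda environment already has all the dependencies, one possible workaround (my suggestion, not something from the thread) is to tell pip not to resolve dependencies when doing the editable install:

```shell
# Assumes the fastai repo is already cloned and its deps are conda-managed;
# --no-deps stops pip from reinstalling anything conda already provides.
cd fastai
pip install -e . --no-deps
```

The trade-off is that pip will no longer warn you if a dependency is genuinely missing, so this only makes sense when conda is the source of truth for the environment.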
Anyway, fastai works, but a docker image would be much appreciated, to be able to change machines easily.
I also had the spacy bug.
I also tried the NVIDIA 410 driver, and it works great as well.
@tcapelle we have been maintaining a docker container here. It has worked well and if you find anything that would make sense to change you can definitely submit a PR.
This is what I’m currently using. Works for me! You have to register for the NVIDIA GPU Cloud (free) in order to get the batteries-included PyTorch image. Remove the last two rows if you don’t like the dark theme; I’m particular when it comes to themes.
And of course you have to customize the volumes in the docker-compose file for whatever you want to have in your container. The SCRATCH folder you can see in the volumes is a folder I use as a layer below git to keep all my code synced between the different machines I use. I’ve heard several people question my sanity for that, but I’ve actually had no corruption in several years now.
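As a sketch, the relevant part of such a docker-compose file might look like this. The service name, image tag, host paths, and mount points are placeholders of mine, not the author's actual file:

```yaml
version: "2.3"
services:
  fastai:
    image: fastai-image          # locally built image; placeholder tag
    runtime: nvidia              # expose the GPU to the container
    ports:
      - "8900:8888"              # host:container Jupyter port
    volumes:
      - ~/SCRATCH:/ws/SCRATCH    # shared scratch layer below git
      - ~/code/nbs:/ws/nbs       # notebooks
      - ~/data:/ws/data          # cached / persistent data files
```

File format 2.3 is used here because it supports the `runtime:` key needed for nvidia-docker2.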
Is it still working as of today ?
For some reason the line `conda install -y -c zegami libtiff-libjpeg-turbo` returns an error for me, something about conflicting versions.
With this configuration, all the files that you edit and want to save are created under one of the `/ws` folders. The design is that `nbs` contains the notebooks (potentially copied from the cloned fastai course repo), `lib` holds code I write to reuse in my notebooks, and `data` holds data files that I want to cache or otherwise persist across sessions.
The files are meant to be created and written inside JupyterLab, but they are persisted by the fastai docker service we’re creating. Committing/pushing them to GitHub is a separate manual step done on the host machine.
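For example, that host-side commit step might look like this (the paths are illustrative):

```shell
# On the host, not inside the container.
cd ~/code/analyze/nbs        # one of the host folders mounted under /ws
git add .
git commit -m "update notebooks"
git push
```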
For me, the root project directory is `~/code/analyze`. To bring up the docker environment, I do:

```shell
cd ~/code/analyze
sudo nvidia-docker build . -t fastai-image
sudo docker-compose up
# Then, in a browser, you will be able to see: localhost:8900
# I use a LocalForward entry in my ~/.ssh/config to view this from my laptop
```
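The LocalForward setup mentioned above could look roughly like this in `~/.ssh/config` on the laptop; the host alias, user, and address are placeholders:

```
Host analyze-box
    HostName 203.0.113.10            # remote machine's address (placeholder)
    User paperspace
    # Make Jupyter, published on the remote's port 8900,
    # reachable at localhost:8900 on the laptop
    LocalForward 8900 localhost:8900
```

After `ssh analyze-box`, opening `localhost:8900` in a local browser reaches the notebook server on the remote machine.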