Ready to use cross-platform environment for the course


#21

Really useful thanks.

I have been linking a folder on the server to the Docker container using -v. However, your Dockerfile creates a new user “docker” rather than sticking with the default root, and this user does not have the correct permissions on the host folder. I am sure this is resolvable, but why not just use the root user in the container? All the other container images I have tried use the root user.


(Anurag Goel) #22

Hi simoneva,

The decision to not use the root user was based on a lot of popular images also not using it. For example, keras-docker uses keras. Regardless, I was able to mount and edit a host volume using the image. Mind sharing the command and the environment (OS, Docker version) you’re using?


#23

I used the Amazon Linux AMI.
Docker version 1.12.6, build 7392c3b/1.12.6
I ran the container using this:
# fab is assumed to be Fabric 1.x's fabric.api module
volumes = ("-v /v1/.jupyter:/docker/.jupyter "
           "-v /v1:/host")
fab.run("docker run {volumes} -w=/host -p 8888:8888 -d -i "
        "--name notebook deeprig/fastai-course-1".format(**locals()))
fab.run("docker exec -d notebook jupyter notebook")

This asks for the password then gives a 404 error. If I docker exec into the container then do “ls -a” it says “permission denied”. If I “sudo ls -a” then it works.


(Anurag Goel) #24

In the second fab command (docker exec -d notebook jupyter notebook), is there a reason you’re running the notebook again? docker run in the previous command should already spawn a notebook on port 8888, so the second command just tries to do it again. When you exec into the container, are you doing it with exec -it /bin/bash?

As a workaround for now, I would add the option “-u root” to docker run. This will override the user in the image and run the notebook as root.
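
A minimal sketch of that workaround, assuming the same volumes, port, and container name as the command quoted earlier in the thread. The helper `docker_run_command` is hypothetical and only assembles the argument list; `-u`, `-v`, `-w`, `-p`, and `--name` are standard `docker run` options.

```python
def docker_run_command(image, volumes=(), user=None, workdir=None,
                       ports=(), name=None):
    """Build a `docker run` argument list (hypothetical helper).

    `user` is passed to -u, which overrides the user set in the image.
    """
    cmd = ["docker", "run", "-d", "-i"]
    if user is not None:
        cmd += ["-u", user]
    for host_dir, container_dir in volumes:
        cmd += ["-v", "%s:%s" % (host_dir, container_dir)]
    if workdir is not None:
        cmd += ["-w", workdir]
    for host_port, container_port in ports:
        cmd += ["-p", "%d:%d" % (host_port, container_port)]
    if name is not None:
        cmd += ["--name", name]
    cmd.append(image)
    return cmd


# Values below are copied from the command posted above; adjust to your setup.
cmd = docker_run_command(
    "deeprig/fastai-course-1",
    volumes=[("/v1/.jupyter", "/docker/.jupyter"), ("/v1", "/host")],
    user="root",
    workdir="/host",
    ports=[(8888, 8888)],
    name="notebook",
)
print(" ".join(cmd))
```

The printed string can be passed to fab.run or run directly in a shell. Running as root sidesteps the permission mismatch but does not change ownership of the host folder.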


(David Gutman) #25

For people with multiple GPUs using docker, another tidbit:

Set the NV_GPU environment variable before running nvidia-docker to assign specific GPUs to your container.

e.g.

NV_GPU=0 nvidia-docker run nvidia/cuda nvidia-smi

or

NV_GPU=0,1 nvidia-docker run nvidia/cuda nvidia-smi

If you run these commands, you will see only one GPU for the first command and two for the second.

One caveat: GPUs are renumbered inside the container. If you set NV_GPU=1, for example, that GPU will be GPU 0 within your container, and TensorFlow will see it as /gpu:0.
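
The renumbering can be sketched as follows. This is an illustration only: the function `container_gpu_names` is hypothetical, and it assumes NV_GPU holds integer host indices and that visible GPUs are renumbered from 0 in ascending host order. The actual ordering is decided by the driver and CUDA runtime.

```python
def container_gpu_names(nv_gpu):
    """Map host GPU indices listed in NV_GPU to TensorFlow device names
    as seen inside the container (assumes ascending renumbering from 0)."""
    host_ids = sorted(int(g) for g in nv_gpu.split(",") if g.strip())
    return dict((host_id, "/gpu:%d" % index)
                for index, host_id in enumerate(host_ids))


print(container_gpu_names("1"))    # {1: '/gpu:0'}
print(container_gpu_names("0,1"))  # {0: '/gpu:0', 1: '/gpu:1'}
```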


#26

@anurag, after executing the second instruction in your first post, I get a problem with Jupyter not having any kernel. Any ideas on how to fix it?

bash:
"[I 05:39:29.301 NotebookApp] Writing notebook server cookie secret to /home/docker/.local/share/jupyter/runtime/notebook_cookie_secret
[I 05:39:29.552 NotebookApp] Serving notebooks from local directory: /home/docker/fastai-courses/deeplearning1/nbs
[I 05:39:29.552 NotebookApp] 0 active kernels
[I 05:39:29.554 NotebookApp] The Jupyter Notebook is running at: http://0.0.0.0:8888/
[I 05:39:29.554 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[I 05:40:55.280 NotebookApp] 302 GET / (172.17.0.1) 5.69ms
[I 05:40:55.307 NotebookApp] 302 GET /tree? (172.17.0.1) 9.64ms
[I 05:41:01.839 NotebookApp] 302 POST /login?next=%2Ftree%3F (172.17.0.1) 6.07ms
[E 05:41:02.888 NotebookApp] Failed to load kernel spec: 'python2'"

Also thanks for the repo and good documentation.


(Anurag Goel) #27

@gnak I can’t reproduce the issue. Have you modified the Docker image at all?


#28

No, I have not…


#29

Hi all,

I’m running Arch Linux and use this Docker image (thanks for that!) on my local machine, which works flawlessly until vgg16.h5 is downloaded. If I then stop the container and try to start it again, I always get the following error (exit status 1):

Traceback (most recent call last):
  File "/opt/conda/bin/jupyter-notebook", line 6, in <module>
    sys.exit(notebook.notebookapp.main())
  File "/opt/conda/lib/python2.7/site-packages/jupyter_core/application.py", line 267, in launch_instance
    return super(JupyterApp, cls).launch_instance(argv=argv, **kwargs)
  File "/opt/conda/lib/python2.7/site-packages/traitlets/config/application.py", line 657, in launch_instance
    app.initialize(argv)
  File "<decorator-gen-7>", line 2, in initialize
  File "/opt/conda/lib/python2.7/site-packages/traitlets/config/application.py", line 87, in catch_config_error
    return method(app, *args, **kwargs)
  File "/opt/conda/lib/python2.7/site-packages/notebook/notebookapp.py", line 1290, in initialize
    super(NotebookApp, self).initialize(argv)
  File "<decorator-gen-6>", line 2, in initialize
  File "/opt/conda/lib/python2.7/site-packages/traitlets/config/application.py", line 87, in catch_config_error
    return method(app, *args, **kwargs)
  File "/opt/conda/lib/python2.7/site-packages/jupyter_core/application.py", line 243, in initialize
    self.migrate_config()
  File "/opt/conda/lib/python2.7/site-packages/jupyter_core/application.py", line 169, in migrate_config
    migrate()
  File "/opt/conda/lib/python2.7/site-packages/jupyter_core/migrate.py", line 241, in migrate
    with open(os.path.join(env['jupyter_config'], 'migrated'), 'w') as f:
IOError: [Errno 13] Permission denied: u'/home/docker/.jupyter/migrated'

I already tried googling but to no avail.
Did anyone else have such a problem, or am I doing something wrong with Docker?

Thank you! :slight_smile:
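
The traceback ends with Jupyter being unable to write to /home/docker/.jupyter. A minimal diagnostic sketch, assuming you can open a Python shell inside the container: it reports who owns a directory and whether the current user can write to it. The helper `describe_access` is hypothetical, and the thread does not confirm why the directory became unwritable.

```python
import os
import tempfile


def describe_access(path):
    """Report the owner of `path` and whether the current user can write to it."""
    st = os.stat(path)
    return {
        "path": path,
        "owner_uid": st.st_uid,
        "current_uid": os.geteuid(),
        "writable": os.access(path, os.W_OK),
    }


# Demonstrated on a fresh temporary directory so it runs anywhere;
# inside the container, pass "/home/docker/.jupyter" instead.
info = describe_access(tempfile.mkdtemp())
print(info)
```

If `writable` is False and `owner_uid` differs from `current_uid`, the directory belongs to another user, which would be consistent with the Errno 13 above.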


(ry101) #30

Hi,

I’m trying to use this image and am new to Docker.
I get: No such file or directory: 'data/dogscats/sample/train' when trying to run lesson1.

Do I need to do something other than running docker run -it -p 8888:8888 deeprig/fastai-course-1?


#31

I have the same exact problem. According to the README of the repository, you should do the following, but it’s not working for me: https://github.com/anurag/fastai-course-1#data-management

I’m running the following command:
docker run -it -p 8888:8888 -v /Users/obssd/data:/home/docker/data deeprig/fastai-course-1

But I keep getting the following error:
OSError: [Errno 2] No such file or directory: 'data/dogscats/sample/train'

… when running this line:
batches = vgg.get_batches(path+'train', batch_size=4)


(ry101) #32

Hi,

I think I succeeded in loading the pictures using

docker run -it -p 8888:8888 -v //D/courses/deep-udacity/fastai/courses/deeplearning1/nbs/data/dogscats/sample:/home/docker/data deeprig/fastai-course-1

Now the kernel keeps dying. Do I need a GPU for lesson1?


#33

I really don’t know what to do anymore. I have the data under /Users/obssd/data, but running
docker run -it -p 8888:8888 -v /Users/obssd/data:/home/docker/data deeprig/fastai-course-1
doesn’t work and gives the error I showed above :frowning:
I’ll try running this command on a Windows machine and see how it goes there.

Regarding your GPU question, I suppose that yes, you do need a GPU if you run the whole dataset rather than only the sample.


(Mike Moloch) #34

Did you change the path in your notebook to point to /home/docker/data as well? I used this docker image and I was getting this error but I was able to run the sample set after I changed my notebook.

This is the second line in the notebook
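
A minimal sketch of that change, assuming the host folder is mounted at /home/docker/data as in the commands above and contains dogscats/sample/train. The helper `dataset_path` and the constant `DATA_ROOT` are hypothetical names used for illustration; the trailing slash matters because the notebook builds paths as path+'train'.

```python
import os

# Assumption: container side of the -v mount used earlier in the thread.
DATA_ROOT = "/home/docker/data"


def dataset_path(data_root, sample=True):
    """Return the dogscats directory under data_root, with a trailing slash."""
    parts = [data_root, "dogscats"]
    if sample:
        parts.append("sample")
    return os.path.join(*parts) + "/"


path = dataset_path(DATA_ROOT)
# Check before training that the directory the notebook expects is present.
print("%s exists: %s" % (path + "train", os.path.isdir(path + "train")))
```

If the check prints False, the mount point or the folder layout on the host does not match what the notebook expects.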


#35

I don’t know how I missed this. Thank you so much for your help :slight_smile:


(Anurag Goel) #36

Thanks Mike. I’ve updated the README to add this reminder.


(Mike Moloch) #37

Hey Anurag, the Docker container worked perfectly on my local box with a 1050ti (I was having all kinds of problems getting it to work with native drivers, etc.). So kudos for this great work!

I also signed up for a Crestle account, and when I ran the Crestle notebook with the full dogscats data set, it took about 645 seconds. When I ran it in my nvidia-docker container on the 1050ti, it took about 620 seconds. I was expecting the Crestle GPU-backed notebook to run much, much faster, since it uses a dedicated Tesla K80. Any ideas on what I may be doing wrong with the Crestle notebook GPU enablement?

Thanks,

Mike


(Anurag Goel) #38

Do you see the same times on Crestle when you run it again? Sometimes the first run can take longer because it’s missing cached data.


(Mike Moloch) #39

Actually, I did not. I turned off the GPU notebook. Does this happen every time a notebook is GPU enabled? Because my workflow might be to work with samples with a cpu backend and then restart Jupyter with GPU when I want to do a full data run. I’ll try it again and note the timings etc.


(Mike Moloch) #40

OK, just a quick update: I did multiple back-to-back runs with different batch sizes on the dogscats data set. Crestle seems slightly better than the local 1050ti GPU, but the time stayed constant at around 570-580s regardless of batch size. I did not restart the notebook between runs, i.e., they were all one after another.

1) restart gpu
2) batch size 8, full run 568s
3) batch=8, 568s - loss: 0.2030 - acc: 0.9676 - val_loss: 0.0987 - val_acc: 0.9850
4) batch=32, 580s - loss: 0.1309 - acc: 0.9683 - val_loss: 0.0995 - val_acc: 0.9780
5) batch=20, 581s - loss: 0.1544 - acc: 0.9681 - val_loss: 0.0746 - val_acc: 0.9825
6) batch=20, 581s - loss: 0.1527 - acc: 0.9693 - val_loss: 0.1056 - val_acc: 0.9790
7) batch=64, 583s - loss: 0.1221 - acc: 0.9686 - val_loss: 0.0572 - val_acc: 0.9830
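
To put these numbers next to the earlier 620 seconds on the local 1050ti, here is a small arithmetic sketch. It assumes each line listed above is a separate run and uses only the timings quoted in this thread.

```python
# Crestle timings in seconds, taken from the runs listed above.
crestle_runs = [568, 568, 580, 581, 581, 583]
# Local 1050ti timing in seconds, from the earlier post.
local_seconds = 620.0

mean_seconds = sum(crestle_runs) / float(len(crestle_runs))
spread_seconds = max(crestle_runs) - min(crestle_runs)
speedup = local_seconds / mean_seconds

print("mean %.1f s, spread %d s, speedup %.2fx"
      % (mean_seconds, spread_seconds, speedup))
```

On these figures the Crestle runs average roughly 7% faster than the local run, and the 15-second spread across batch sizes is small compared with the total run time.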