Developer chat

This might be useful to some of you - just discovered it:

Switching Conda Environments in Jupyter

Besides the usual way of switching environments, which requires restarting Jupyter:

source activate env1
jupyter notebook
(Ctrl-C to kill jupyter)
source activate env2
jupyter notebook

You can install nb_conda_kernels, which provides a separate Jupyter kernel for each conda environment, along with the appropriate code to handle their setup. This makes switching conda environments as simple as switching the Jupyter kernel (e.g. from the kernel menu). And you don’t need to worry about which environment you started jupyter notebook from - just choose the right environment from the notebook.

source: https://stackoverflow.com/a/47262847/9201239

Thank you, @TheShadow29. I haven’t yet tried installing with miniconda. I have added conda update conda to the docs, which should take care of this situation.

If possible please test the new diagnostics function:

git pull
python -c 'import fastai; fastai.show_install(1)'

If possible, please test it on a CPU-only setup too.

We need it now to help with debugging install issues, and also it will be useful for dealing with functionality bug reports.

Thank you.

Dear Stas,

this is what I get on a CPU-only system:

platform info  : Darwin-17.7.0-x86_64-i386-64bit
python version : 3.6.6
fastai version : 1.0.5.dev0
torch version  : 1.0.0.dev20180921
cuda available: False
cuda version   : None
cudnn available: True
gpu count      : 0
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/XY/Downloads/fastai/fastai/torch_core.py", line 242, in show_install
    gpus = GPUtil.getGPUs()
  File "/Users/XY/anaconda3/lib/python3.6/site-packages/GPUtil/GPUtil.py", line 64, in getGPUs
    p = Popen(["nvidia-smi","--query-gpu=index,uuid,utilization.gpu,memory.total,memory.used,memory.free,driver_version,name,gpu_serial,display_active,display_mode", "--format=csv,noheader,nounits"], stdout=PIPE)
  File "/Users/XY/anaconda3/lib/python3.6/subprocess.py", line 709, in __init__
    restore_signals, start_new_session)
  File "/Users/XY/anaconda3/lib/python3.6/subprocess.py", line 1344, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'nvidia-smi': 'nvidia-smi'

It seems to need the nvidia-smi tool, which it does not find on this CPU-only machine.

I will check it later on my paperspace machine with GPU and post what I get there.

Best regards
Michael

Thanks a lot, @MicPie. I have made some extra tweaks and hopefully now you should get a clean output.

git pull
python -c 'import fastai; fastai.show_install(1)'

Thank you!

I’ve just added a module that will display test docstrings when running tests, so you’ll need to install it:

pip install pytest-pspec

I’ve added a basic end-to-end MNIST vision test that checks >98% accuracy after 1 epoch. It takes about 5 secs on a 1080ti. I think it’s a good idea to have at least one full integration test, although I’m open to using something else if the speed of this one is an issue for too many people. Or maybe there needs to be some easy way for particular people to disable it, if they don’t have a GPU.

Dear Stas,

I still get a similar error after the line with “torch gpu count”:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/MMP/Downloads/fastai/fastai/torch_core.py", line 271, in show_install
    gpus = GPUtil.getGPUs()
  File "/Users/MMP/anaconda3/lib/python3.6/site-packages/GPUtil/GPUtil.py", line 64, in getGPUs
    p = Popen(["nvidia-smi","--query-gpu=index,uuid,utilization.gpu,memory.total,memory.used,memory.free,driver_version,name,gpu_serial,display_active,display_mode", "--format=csv,noheader,nounits"], stdout=PIPE)
  File "/Users/MMP/anaconda3/lib/python3.6/subprocess.py", line 709, in __init__
    restore_signals, start_new_session)
  File "/Users/MMP/anaconda3/lib/python3.6/subprocess.py", line 1344, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'nvidia-smi': 'nvidia-smi'

Best regards
Michael

OK, GPUtil is a wrapper for nvidia-smi, and doesn’t handle the lack of it gracefully. I removed it. If you could kindly try a third time after git pull, hopefully it’ll work OK now. Thank you for your support, @MicPie.
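
For reference, here is a minimal sketch of how a missing nvidia-smi could be handled without a traceback. It is not the code that went into fastai; the function name query_gpu_names and its cmd parameter are made up for illustration. It assumes only that nvidia-smi accepts the same kind of --query-gpu/--format arguments shown in the traceback above.

```python
import shutil
import subprocess


def query_gpu_names(cmd="nvidia-smi"):
    """Return a list of GPU names, or [] when the tool is unavailable.

    Hypothetical helper for illustration; not part of fastai or GPUtil.
    """
    # shutil.which returns None when the executable is not on PATH,
    # which is the CPU-only case that raised FileNotFoundError above.
    if shutil.which(cmd) is None:
        return []
    try:
        result = subprocess.run(
            [cmd, "--query-gpu=name", "--format=csv,noheader"],
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            universal_newlines=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # The tool exists but could not be run, or it exited with an error.
        return []
    # One GPU name per non-empty output line.
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]
```

On a machine without the tool this returns an empty list, so the caller can print something like "no supported gpus found on this system" instead of failing.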

Do we want this enabled for make test though? This addition puts it into the detailed mode, whereas normally we want it to be compact.

Let’s experiment. What’s an easy way to tell torch to ignore my GPU?

Yes, we probably need to skip these kinds of tests on CPU by default and have an option to override that. Otherwise people won’t run the test suite. I have an old PC at the moment, so it’s very slow on CPU:

time py.test tests/test_vision.py
time CUDA_VISIBLE_DEVICES=" "  py.test tests/test_vision.py

w/  GPU: ~30 secs
w/o GPU: ~15 min

No, we don’t. Sorry, I didn’t realize it changes the detail level. Ideally I’d just like it to show the pspec-style names of failing tests. I’ll add figuring that out to the todo list.

I’d like devs to always run the integration test before pushing a non-trivial change.

I haven’t spent time learning about pytest yet but I’m sure there will be some way we can have different categories of tests.
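
One possible shape for the "skip on CPU by default, with an override" idea discussed above, as a rough sketch only. The function name should_run_slow and the RUN_SLOW_TESTS environment variable are assumptions made up for this example; nothing in the thread settles on them.

```python
import os


def should_run_slow(cuda_available, env=None):
    """Decide whether slow integration tests should run.

    cuda_available: result of a check such as torch.cuda.is_available().
    env: mapping of environment variables (defaults to os.environ).
    RUN_SLOW_TESTS is a hypothetical override name, not an existing setting.
    """
    env = os.environ if env is None else env
    # Run when a GPU is usable, or when the user explicitly opts in.
    return bool(cuda_available) or env.get("RUN_SLOW_TESTS") == "1"


# With pytest, the result could feed a skip marker, for example:
#   slow = pytest.mark.skipif(
#       not should_run_slow(torch.cuda.is_available()),
#       reason="slow integration test; set RUN_SLOW_TESTS=1 to force")
# and the integration test would then be decorated with @slow.
```

Keeping the decision in a plain function means it can be checked without a GPU, and the override lets anyone force the integration test on a CPU-only machine.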

One way: instead of adding addopts = --pspec to setup.cfg, the option would be passed at run time. We could make a new Makefile target, test-verbose or vtest or something similar, which would pass this argument in.

@stas most of the time I’m running just one test file. So I’m looking for something where “pytest file.ipynb” shows the full names of tests.

Just pushed a few commits that may be of general interest:

  • If you have a Learner called learn, you can now say doc(learn.fit) instead of needing doc(Learner.fit)
  • I’ve added a mention of doc in the docs’ index.html - it’ll show documentation in a preview window in Jupyter
  • Callbacks now have an order. Default is 0. Recorder is -10. If you want your callback to have a different order, just set its _order attribute
  • When creating an Image, you can now pass an ndarray and it’ll turn it into a tensor for you
  • Added rand_pad function that does basic padding and random cropping, as used for CIFAR10
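
To illustrate the callback ordering bullet above, here is a small sketch of how an _order attribute can drive the sequence in which callbacks are run. The Callback and Recorder classes below are simplified stand-ins written for this example, not the fastai classes, and sort_callbacks is a hypothetical helper. Only the default of 0 and the Recorder value of -10 come from the post.

```python
class Callback:
    """Stand-in base class; the default order is 0."""
    _order = 0

    def __init__(self, name):
        self.name = name


class Recorder(Callback):
    """Stand-in for a callback that should run before the others."""
    _order = -10


def sort_callbacks(callbacks):
    """Return callbacks sorted by _order, lowest first.

    sorted() is stable, so callbacks with equal _order keep their
    original relative position.
    """
    return sorted(callbacks, key=lambda cb: getattr(cb, "_order", 0))


late = Callback("late")
late._order = 5  # override on the instance to run after the defaults
ordered = sort_callbacks([late, Callback("default"), Recorder("recorder")])
print([cb.name for cb in ordered])
```

A lower _order sorts earlier, which is why a value of -10 would put the recorder ahead of callbacks left at the default.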
$ pytest --pspec tests/test_vision.py

I was just trying to find a way not to need to remember the long option, I guess an alias would do.

A few questions about tagging/version release process:

  1. I’m evaluating various solutions for version bumping and release tagging; it’s a bit complicated due to the timing of the version change and the tagging.

What do you think about this one?


It’ll install a few files into the project.
It is currently used by both pandas and matplotlib.

The idea is to get to this level of simplicity of new releases:

1: git tag 1.0
2: make release (which will include a commit of the version change)

So when you want a new version you tag the commit number you want.

I’ve been reworking the Makefile of fastprogress, so I thought I could use it as a guinea-pig, especially since it currently doesn’t have fastprogress.version and it won’t be a bad idea to add it.

And see #4 next where this approach might not work at all if we use release branches.

  2. Also, should we tag “v1.0.5”, or just “1.0.5”?

  3. And should we use annotated tags:

    git tag -a v1.0.5 -m "v1.0.5"
    

    vs. lightweight tags:

    git tag v1.0.5
    

https://git-scm.com/book/en/v2/Git-Basics-Tagging
sections:

  • Annotated Tags
  • Lightweight Tags
  4. Finally, we currently have a potential race condition in the release process.

How do we ensure no commits are made between (1) the moment the decision is made to turn the current code base into a new release, and (2) the moment the version is changed and committed into git, after which the normal commit process can resume? Basically, some kind of code freeze must happen for the duration of the release, which ideally should be pretty quick. But I’m not sure how this no-commit period can be enforced. And there could be problems with the pending release, so it might not be quick at all.

Alternatively, a release branch can be used, and any changes merged back into master upon successful release. The only thing I don’t understand about this method is the release tag, which seems to be wrong.

The release branch idea is (adapted from release-branches):

  1. create a release branch:
git checkout -b release-1.0.5 master
bump-version
git commit -a -m "Bumped version number to 1.0.5"
  2. make release

This may involve some extra commits if something needs to be fixed.

package(s) are uploaded

  3. merge the release branch back into master and delete it
git checkout master
git merge --no-ff release-1.0.5
git tag -a 1.0.5

The very last command of the last part is what doesn’t compute for me (git tag -a 1.0.5 in the master branch). If master’s HEAD has moved since release-1.0.5 was made, the tag will include changes that weren’t part of the release, so if someone checks out tag 1.0.5 it won’t be the same as the files that were released in step #2.

For example, if we use the approach of 1.0.6.dev0 to be able to tell whether a user uses a released 1.0.5 or the git HEAD version, we will never see version==1.0.5 in the master, since when we merge back there will be 1.0.6.dev0 already.
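
The 1.0.5 / 1.0.6.dev0 convention described above can be sketched in a few lines. This is only an illustration of the numbering scheme: the helper names next_dev_version and is_release are hypothetical, and the bump-version command in the steps above is not defined in this thread, so this is not a reconstruction of it. It assumes plain dotted integer versions such as 1.0.5.

```python
def next_dev_version(release):
    """Given a released version like '1.0.5', return '1.0.6.dev0'.

    Assumes the release string is made of dot-separated integers.
    """
    parts = [int(p) for p in release.split(".")]
    parts[-1] += 1  # bump the last component
    return ".".join(str(p) for p in parts) + ".dev0"


def is_release(version):
    """True for a released version, False for a .devN development version."""
    return ".dev" not in version


print(next_dev_version("1.0.5"))
print(is_release("1.0.5"), is_release("1.0.6.dev0"))
```

Under this scheme master always carries a .devN version, and the plain 1.0.5 string exists only on the release branch or tag, which is exactly the situation the paragraph above describes.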

Now it works nicely:

platform info : Darwin-17.7.0-x86_64-i386-64bit
python version : 3.6.6
fastai version : 1.0.6.dev0
torch version : 1.0.0.dev20180921
cuda available : False
cuda version : None
cudnn version : None
cudnn available: True
torch gpu count: 0
no supported gpus found on this system

I am happy to help. :slight_smile:

EDIT: Also works nicely on my paperspace machine:

platform info  : Linux-4.4.0-128-generic-x86_64-with-debian-stretch-sid
distro info    : Ubuntu 16.04 Xenial Xerus
python version : 3.6.6
fastai version : 1.0.6.dev0
torch version  : 1.0.0.dev20180928
nvidia driver  : 390.67
cuda version   : 9.0.176
cudnn version  : 7102
cudnn available: True
torch gpu count: 1
  [gpu0]
  name         : Quadro P4000
  total memory : 8119MB

Sun Oct  7 09:27:00 2018       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.67                 Driver Version: 390.67                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro P4000        Off  | 00000000:00:05.0 Off |                  N/A |
| 46%   32C    P0    28W / 105W |     10MiB /  8119MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

I like using release branches. The only bit I’ve done differently to the thing you linked to is that I don’t delete the release branch. Instead I tag it there. The benefit is that you can update that release branch with non-breaking fixes (tagging e.g. “1.0.5post1”), and merge those back into master. I don’t see the benefit of deleting the release branch.

I don’t see the point of the ‘v’. Is there some reason people add that?

Lightweight tags seem fine to me. I’m not an expert though on tags.

I haven’t tried it, but from the readme it looks like it would be a good choice. Thanks for the research!
