Improving/Expanding Functional Tests

Here are some notes that we might need later.

If we want global learn_vision, learn_text, etc. objects, we can pre-create them from conftest.py:


import pytest
from fastai.vision import *

@pytest.fixture(scope="session", autouse=True)
def learn_vision():
    path = untar_data(URLs.MNIST_TINY)
    data = ImageDataBunch.from_folder(path, ds_tfms=(rand_pad(2, 28), []), num_workers=2)
    data.normalize()
    learn = Learner(data, simple_cnn((3,16,16,16,2), bn=True), metrics=[accuracy, error_rate])
    learn.fit_one_cycle(3)
    return learn

Now, inside, say, tests/test_vision_train.py, the fixture can be accessed by name:

def test_accuracy(learn_vision):
    assert accuracy(*learn_vision.get_preds()) > 0.9

If we use:

@pytest.fixture(scope="session", autouse=True)

all global objects will be pre-created whether the tests need them or not, so we probably don’t want autouse=True. Without it, these objects will be created on demand.
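
For comparison, a minimal sketch of the on-demand variant (the fixture body is the same as above, it just drops autouse=True):

@pytest.fixture(scope="session")  # no autouse: built only when a test requests learn_vision
def learn_vision():
    ...  # same body as above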


There is a cosmetic issue with having learn_vision, learn_text, etc., since now we either have to spell out the full name:

def test_accuracy(learn_vision):
    assert accuracy(*learn_vision.get_preds()) > 0.9

or rename:

def test_accuracy(learn_vision):
    learn = learn_vision
    assert accuracy(*learn.get_preds()) > 0.9

Neither of these is great…

We want to be able to copy and paste code between tests quickly, and ideally the object should always be called learn (so calls are always learn.foo), especially since there are usually many such calls.


Unrelated to the above: with this approach, and even with the module-scoped variables (e.g. learn) that we currently use, there is an issue that tests can and will modify the object, so the next test doesn’t get the same object but a modified one. This can be very bad for tests, as it can mask serious issues.

Instead, we need some kind of clone() method that gives each test a clean copy of the object.
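
Here is a possible conftest.py sketch of that idea, which also takes care of the naming issue above: a function-scoped learn fixture hands each test a fresh copy of the session-scoped learner. copy.deepcopy is only a stand-in for a real clone() method (a Learner drags along a DataBunch and callbacks, so deep-copying it may be slow or may not be safe for every object):

import copy
import pytest

@pytest.fixture()
def learn(learn_vision):
    # each test gets its own clean copy, so mutations can't leak into the next test
    return copy.deepcopy(learn_vision)

and then every test can use the same name:

def test_accuracy(learn):
    assert accuracy(*learn.get_preds()) > 0.9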

Writing tests that exercise APIs with a large impact on GPU RAM (e.g. purge, load, destroy) has proven to be very problematic, since the memory utilization patterns vary greatly depending on:

  1. what has run just before the test, so changing the order of test execution changes the numbers
  2. the GPU card model

So at the moment I have to put all GPU tests into a single test function, which could make troubleshooting somewhat difficult.

But the biggest problem is that the asserted numbers are currently tied to the GPU card model I use and can’t be shared and tested by others unless they have the same model. Sure, we could prepare a few sets of these numbers based on common card models, but it’d be very difficult to maintain those sets as the tests and the code base evolve.

I already use a small tolerance in the checks (i.e. I don’t check exact numbers), but making it large would defeat the purpose of these tests, since we want to make sure that these special methods either actually reclaim memory or don’t leak it. For example, until recently the save/load sequence actually wasted more memory than if you hadn’t called it at all; that is fixed now, but we absolutely need tests that check for possible regressions and leaks.
It would have been easier to run with a large tolerance of, say, 100MB if the tests loaded huge models and data, but that’s not possible in the testing environment since it would slow things down a lot.
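
For illustration only, the kind of check I mean looks roughly like this (the function names and the tolerance value below are made up for this sketch; they are not the real GPUMemTrace / check_mem_expected helpers mentioned next):

import gc
import pytest
import torch

def gpu_mem_used_mb():
    # memory currently allocated by tensors on the default GPU, in MB
    return torch.cuda.memory_allocated() / 2**20

def assert_gpu_mem_expected(expected_mb, abs_tol_mb=10):
    gc.collect()
    torch.cuda.empty_cache()  # drop cached blocks so the reading reflects live allocations
    assert gpu_mem_used_mb() == pytest.approx(expected_mb, abs=abs_tol_mb)

The problem described above is that both expected_mb and a workable abs_tol_mb differ per card model and per test ordering.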

If you would like to see what I have been doing so far, skip to GPUMemTrace calls in https://github.com/fastai/fastai/blob/master/tests/test_basic_train.py

You can unskip the first part of test_memory to run these (unless you have the same card as I do, in which case it will just run).

You can uncomment:
#check_mem_expected = report_mem_real
earlier in the test module to print the memory usage instead of asserting on it.

If you have some good ideas on how to handle this problem, I’m all ears.

Until then I will be testing it only on my machine.

p.s. Perhaps the solution is to use large models and data sets, so that the memory usage numbers become much larger; then it would be possible to greatly increase the absolute tolerance without defeating the purpose of these tests, and thus support a wider range of (all?) card models. Those would become slow tests that are run only occasionally, and certainly before a new release.

@stas I can’t say I’ve had the time to go into all the details of the perf & load tests you are doing (and if I had, it would take me a moment to understand them), but splitting the normal, fast-running, always-running tests off from such larger tests, maybe with larger models, is in my humble opinion totally the way to go. So one has two sets of tests; maybe there is a flag or the like. One could also maybe use the fake class to have either small or larger models for such tests, and then of course the perf & load test scripts would simply run only before releases.

Having said that, such tests really add a lot of value (compared to the sometimes trivial assert tests). Great that you are helping out. If I had more time after the current doc_test work I would love to help, but so far I can only offer (hopefully) these ‘smart comments’ here :wink:

So my 2 cents summarised:

  • Maybe one just has to agree on one reference environment with Jeremy and Sylvain. So you say: we guarantee / run the load & perf tests on a typical, standard cloud like Google or Azure. Other setups might deviate, but we tested on a standard setup and people can adapt these tests to their specific settings (this could eventually become an addition to the docs).
  • I do wonder whether we might want to separate these complex tests out with a switch (so they run only before releases), and maybe even separate out the code or use naming conventions like test_perfload_[APIXYZ] (see the sketch after this list). For example, I could imagine that in the doc_test project it would be great if one could identify the perf & load tests by name, and @ashaw could potentially put them later into a separate Perf & load tests section.
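
As a concrete (hypothetical) sketch of such a switch: pytest makes this easy with a custom marker plus a command-line flag in conftest.py. The names --runslow and slow below are illustrative, not an existing fastai convention:

# conftest.py
import pytest

def pytest_addoption(parser):
    parser.addoption("--runslow", action="store_true", default=False,
                     help="run the slow perf & load tests")

def pytest_collection_modifyitems(config, items):
    if config.getoption("--runslow"): return
    skip_slow = pytest.mark.skip(reason="needs --runslow")
    for item in items:
        if "slow" in item.keywords: item.add_marker(skip_slow)

# test_perfload_learner.py (name following the proposed convention)
@pytest.mark.slow
def test_perfload_purge():
    ...

Then the regular pytest run skips these, and a pre-release run would use pytest --runslow.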

This is of course completely fantastic and complex work you are doing here! :v::+1::metal:

@stas I am afraid I will not be able to give this task the attention and effort it deserves, so I took my name off here: Dev Projects Index

I hope I can help finish and hand over the doc_test project, which is taking more effort than expected.

I might find time to show this testing work at my local meetup, and maybe some folks will then submit PRs. Hopefully the doc_test project will motivate others as well.

Thank you for starting and leading it so far, @Benudek!

@kdorichev

Hello, I would like to help write tests for the project. I’m actually rather new to Python development, but I would like to take a stab at it!

So far, I have followed the instructions for setting up the dev environment, but I’m not exactly sure how to get started. Can I choose a test under the Sorted Tasks, or should I request to be assigned a task?

Hey, that’s great! No worries about Python; this is a good way to learn!

So, basically you are free to pick what you like best! The above list was compiled with Sylvain back then and reflects the priorities at the time … but your preference is really also a top priority, so make your pick! What you should do, of course, is check where there is no test coverage yet. The above list might not be fully up to date (some items are done), so also feel free to update that wiki.

The process is typically: ask simple questions here, then submit a PR; most likely @stas will review it.

Did you see the awesome [test] link in the docs? After you register the test with this_tests, your ‘oeuvre’ will get published to the fast.ai community via the docs, which then link to your GitHub code. Some details are in the links above, and here is an announcement from Stas.
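
For anyone new to it, registration is just one call at the top of the test body. A minimal sketch, assuming this_tests is importable from fastai.gen_doc.doctest (the module the conftest traceback later in this thread imports from) and reusing the learn_vision fixture from the top of this thread:

from fastai.gen_doc.doctest import this_tests
from fastai.vision import *

def test_accuracy(learn_vision):
    this_tests(accuracy)  # registers this test against the accuracy API in the test registry
    assert accuracy(*learn_vision.get_preds()) > 0.9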

Thank you for your clarification! And yes, I’ve read about the test registry; it looks really neat.

So if I’m interested in writing tests for vision/models/unet.py, should I change the Status from Unassigned to Assigned and put my name down in the Developers column?

It seems pretty intuitive, but I just wanted to make sure this is the correct workflow.

Yes, that’s fine. And no worries if Python looks new to you: How to contribute to fastai [Discussion]

Regarding the workflow: the table is a little clumsy, so if you have other ideas we can adjust the process.

But essentially, for now, it’s just as you said. Have fun!

I wanted to add some callback tests to the framework.

ref:
https://docs.fast.ai/dev/test.html#quick-guide

I'm getting an error in "make test" - has anyone seen the same?

  • Step 1. Set up and check that you can run the test suite:

    git clone https://github.com/fastai/fastai
    cd fastai
    tools/run-after-git-clone # python tools\run-after-git-clone on windows
    pip install -e ".[dev]"
    make test # or pytest

(py3) suvasiss-MBP:fastai suvasismukherjee$ make test #
python setup.py --quiet test
warning: no previously-included files matching '__pycache__' found under directory '*'
warning: no files found matching 'conf.py' under directory 'docs'
warning: no files found matching 'Makefile' under directory 'docs'
warning: no files found matching 'make.bat' under directory 'docs'
ImportError while loading conftest '/Users/suvasismukherjee/fastai/fastai/tests/conftest.py'.
tests/conftest.py:15: in <module>
    from utils.mem import use_gpu
tests/utils/mem.py:4: in <module>
    from fastai.utils.mem import *
fastai/utils/mem.py:3: in <module>
    from ..imports.torch import *
fastai/imports/__init__.py:1: in <module>
    from .core import *
fastai/imports/core.py:2: in <module>
    import math, matplotlib.pyplot as plt, numpy as np, pandas as pd, random
../../Anaconda2ins/anaconda2/envs/py3/lib/python3.7/site-packages/matplotlib/pyplot.py:2372: in <module>
    switch_backend(rcParams["backend"])
../../Anaconda2ins/anaconda2/envs/py3/lib/python3.7/site-packages/matplotlib/pyplot.py:207: in switch_backend
    backend_mod = importlib.import_module(backend_name)
../../Anaconda2ins/anaconda2/envs/py3/lib/python3.7/importlib/__init__.py:127: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
../../Anaconda2ins/anaconda2/envs/py3/lib/python3.7/site-packages/matplotlib/backends/backend_macosx.py:14: in <module>
    from matplotlib.backends import _macosx
E   ImportError: Python is not installed as a framework. The Mac OS X backend will not be able to function correctly if Python is not installed as a framework. See the Python documentation for more information on installing Python as a framework on Mac OS X. Please either reinstall Python as a framework, or try one of the other backends. If you are using (Ana)Conda please install python.app and replace the use of 'python' with 'pythonw'. See 'Working with Matplotlib on OSX' in the Matplotlib FAQ for more information.

Have you considered following the instructions in the error message? I don’t think this is a fastai issue, but a matplotlib one on OSX.

Based on the error message, this is failing on your setup:

python -c "import matplotlib.pyplot"

Thanks.

That helped. Now I see the following:

The fastai test documentation doesn’t discuss these issues.

(py3) suvasiss-MBP:fastai suvasismukherjee$ python -c "import matplotlib.pyplot"
(py3) suvasiss-MBP:fastai suvasismukherjee$ make test
python setup.py --quiet test
warning: no previously-included files matching '__pycache__' found under directory '*'
warning: no files found matching 'conf.py' under directory 'docs'
warning: no files found matching 'Makefile' under directory 'docs'
warning: no files found matching 'make.bat' under directory 'docs'
ImportError while loading conftest '/Users/suvasismukherjee/fastai/fastai/tests/conftest.py'.
tests/conftest.py:16: in <module>
    from fastai.gen_doc.doctest import TestRegistry
fastai/gen_doc/__init__.py:1: in <module>
    from . import gen_notebooks, nbdoc, core, doctest, nbtest
fastai/gen_doc/gen_notebooks.py:2: in <module>
    import pkgutil, inspect, sys,os, importlib,json,enum,warnings,nbformat,re
E   ModuleNotFoundError: No module named 'nbformat'
make: *** [test] Error 4
(py3) suvasiss-MBP:fastai suvasismukherjee$

The fastai test documentation doesn’t discuss these issues

The fastai test documentation assumes users have a working Python environment and trusts that they can figure it out if theirs isn’t.

E   ModuleNotFoundError: No module named 'nbformat'

Have you followed the instructions, which say:

pip install -e ".[dev]"

which should have installed nbformat and any other modules required to run the test suite. Please review https://docs.fast.ai/dev/test.html#quick-guide again.

I had used "pip install -e ." (without [dev]). Now it works OK.

However, some tests are failing. Is this expected:

=================================== FAILURES ===================================
______________________ test_image_to_image_different_tfms ______________________

def test_image_to_image_different_tfms():
    this_tests(get_transforms)
    get_y_func = lambda o:o
    mnist = untar_data(URLs.COCO_TINY)
    x_tfms = get_transforms()
    y_tfms = [[t for t in x_tfms[0]], [t for t in x_tfms[1]]]
    y_tfms[0].append(flip_lr())
    data = (ImageImageList.from_folder(mnist)
            .split_by_rand_pct()
            .label_from_func(get_y_func)
            .transform(x_tfms)
            .transform_y(y_tfms)
            .databunch(bs=16))
  x,y = data.one_batch()

tests/test_vision_data.py:342:


fastai/basic_data.py:168: in one_batch
    try: x,y = next(iter(dl))
fastai/basic_data.py:75: in __iter__
    for b in self.dl: yield self.proc_batch(b)
../../Anaconda2ins/anaconda2/envs/py3/lib/python3.7/site-packages/torch/utils/data/dataloader.py:631: in __next__
    idx, batch = self._get_batch()
../../Anaconda2ins/anaconda2/envs/py3/lib/python3.7/site-packages/torch/utils/data/dataloader.py:610: in _get_batch
    return self.data_queue.get()
../../Anaconda2ins/anaconda2/envs/py3/lib/python3.7/multiprocessing/queues.py:94: in get
    res = self._recv_bytes()
../../Anaconda2ins/anaconda2/envs/py3/lib/python3.7/multiprocessing/connection.py:216: in recv_bytes
    buf = self._recv_bytes(maxlength)
../../Anaconda2ins/anaconda2/envs/py3/lib/python3.7/multiprocessing/connection.py:407: in _recv_bytes
    buf = self._recv(4)
../../Anaconda2ins/anaconda2/envs/py3/lib/python3.7/multiprocessing/connection.py:379: in _recv
    chunk = read(handle, remaining)

signum = 20
frame = <frame at 0x1281db228, file '/Users/suvasismukherjee/Anaconda2ins/anaconda2/envs/py3/lib/python3.7/multiprocessing/connection.py', line 379, code _recv>

def handler(signum, frame):
    # This following call uses `waitid` with WNOHANG from C side. Therefore,
    # Python can still get and update the process status successfully.
  _error_if_any_worker_fails()

E   RuntimeError: DataLoader worker (pid 44770) is killed by signal: Unknown signal: 0.

../../Anaconda2ins/anaconda2/envs/py3/lib/python3.7/site-packages/torch/utils/data/dataloader.py:274: RuntimeError
----------------------------- Captured stderr call -----------------------------
ERROR: Unexpected segmentation fault encountered in worker.

--------------------------- Captured stderr teardown ---------------------------
ERROR: Unexpected segmentation fault encountered in worker.
ERROR: Unexpected segmentation fault encountered in worker.
ERROR: Unexpected segmentation fault encountered in worker.

======== 1 failed, 258 passed, 17 skipped, 4 xfailed in 107.78 seconds =========
make: *** [test] Error 1

[dev] is what installs the group of optional dependencies.

pip install -e ".[dev]"

However, some tests are failing

Many tests are non-deterministic and occasionally fail. Does it still happen if you re-run?

I guess I should move this discussion into the testing thread.

RuntimeError: DataLoader worker (pid 44770) is killed by signal: Unknown signal: 0.

This is a pytorch issue and usually has to do with not having enough shared memory; please see:

Also, are you running the latest pytorch-1.0.1?

Finally, if none of the above helps, please see our macOS CI setup steps, which don’t have problems running the test suite: https://github.com/fastai/fastai/blob/master/azure-pipelines.yml#L315
Please check whether your setup varies and what’s different.

Hey, I’ve got some time on my hands and I’d like to try writing some tests. Can anyone suggest a task to work on? Are any of the unassigned tasks higher priority or easier than others? I’ve never written a test before but I do have a fair amount of experience with Python.

These tests were ordered by priority. The list might be outdated.

But also check what you find most motivating for you!

Let me know if you need help

@stas

I’m writing tests for vision.gan, and there seem to be a bunch of implicit requirements for the shapes of input images and values of certain parameters. Am I correct in assuming that image sizes need to be powers of 2? Is that normal or should I look more into it?
