Improving/Expanding Functional Tests

sgugger · December 24, 2018, 11:01am

I’ve changed the way fake data is created. It’s still synthetic data, since it’s there to test functions in callbacks or basic train facility, but it’s fully compatible with the fastai library.

Benudek · December 24, 2018, 12:10pm

ok, thx! Will have a look after NYE at the latest, to use it for some test classes.

stas · January 9, 2019, 1:35am

FYI, I moved tests/fakes.py to tests/utils/fakes.py - and please refactor any reusable functions into test util modules as you write tests. Thank you.

stas · January 9, 2019, 3:57am

added new tests:

callbacks/callback_fns: https://github.com/fastai/fastai/commit/7b6b127eff4dee74a90173656b991faee5af2449
metric as a custom class: https://github.com/fastai/fastai/commit/092661681603279b79b152209667900ec0088c53

I’m not quite sure how to update the listing as it doesn’t indicate parts that need to be covered.

Benudek · January 9, 2019, 8:20am

great, thx! It’s enough if you update here; I have updated the list above and added your name (not an ideal format for tracking, but good enough I guess)

I am working on more simple tests for the learner and hope to extend my PR soon … bloody day job eating up my time

Benudek · January 9, 2019, 8:17pm

@sgugger would you think there is a meaningful, or at least useful, test case for testing the split function on a linear model (as given in fakes.py)? I am aware of convnet and unet usages, for example.

Will check some other functions meanwhile, small (almost trivial) PR with some incremental improvements here:

Was thinking btw, if we have meaningful test cases we might want to paste them in the docs as examples also. Had looked here for example usages of split, hadn’t found them and upon searching the forum found image examples for split.

Benudek · January 10, 2019, 8:36am

@stas any ideas how, if at all, to test split on fake data in a useful / meaningful way?

stas · January 10, 2019, 9:25pm

FYI, added CaptureStdout context manager for a much more compact stdout capture and clearing:

github.com

fastai/fastai/blob/master/tests/utils/text.py#L18


# or contextlib.redirect_stdout) contains any temporary printed strings, followed by
# \r's. This helper function ensures that the buffer will contain the same output
# with and without -s in pytest, by turning:
# foo bar\r tar mar\r final message
# into:
# final message
# it can handle a single string or a multiline buffer
def apply_print_resets(buf):
    return re.sub(r'^.*\r', '', buf, 0, re.M)


class CaptureStdout():
    """ Context manager to capture stdout, clean it up and make it available via obj.out or str(obj).


Example:


with CaptureStdout() as cs:
    print("Secret message")
print(f"captured: {cs.out}")
# or via its stringified repr:
print(f"captured: {cs}")
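
For readers without the repository at hand, here is a minimal, self-contained sketch of how such a context manager could be put together with the standard library only. It is not the fastai implementation: the class name `CaptureStdoutSketch` is made up for illustration, and only the `.out` attribute and `str()` behaviour described above are mirrored.

```python
import io
import re
import sys


def apply_print_resets(buf):
    # Drop everything up to the last carriage return on each line,
    # so only the final, overwritten message survives.
    return re.sub(r'^.*\r', '', buf, flags=re.M)


class CaptureStdoutSketch:
    """Hypothetical stand-in: capture stdout and expose the cleaned text as .out."""

    def __enter__(self):
        self._old = sys.stdout
        self._buf = io.StringIO()
        sys.stdout = self._buf
        self.out = ''
        return self

    def __exit__(self, *exc):
        sys.stdout = self._old  # always restore the previous stdout
        self.out = apply_print_resets(self._buf.getvalue())
        return False  # do not swallow exceptions raised inside the block

    def __str__(self):
        return self.out


with CaptureStdoutSketch() as cs:
    print("foo bar\r tar mar\r final message")

captured = cs.out  # only the text after the last \r is kept
```

The cleaned buffer keeps the trailing newline that `print` adds, so comparisons in a test probably want `.strip()` or an explicit `\n`.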

Benudek · January 10, 2019, 9:29pm

oh good, I will look at it. Quite some tests could work over screen output easily.

But I was a little reluctant here and there to use it too much: imagine we change the wording of some trivial screen output and the tests fail. So I want to be a little careful; maybe we need a small strategy / best practices to ensure we do not add tests that break all the time.
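
One possible way to limit that kind of breakage is to assert only on the parts of the output that carry meaning (a number, a key token) rather than on the full wording. A rough sketch, where `report_accuracy` and its message format are invented purely for the example:

```python
import contextlib
import io
import re


def report_accuracy(acc):
    # Hypothetical function under test; its exact wording may change over time.
    print(f"Finished training, accuracy: {acc:.3f}")


buf = io.StringIO()
with contextlib.redirect_stdout(buf):
    report_accuracy(0.91234)
out = buf.getvalue()

# Brittle: breaks as soon as anyone touches the wording.
# assert out == "Finished training, accuracy: 0.912\n"

# More robust: only check that a sensible accuracy value is reported.
match = re.search(r'accuracy\W+([0-9.]+)', out, flags=re.I)
assert match is not None
acc = float(match.group(1))
assert 0.0 <= acc <= 1.0
```

This still fails if the word "accuracy" disappears from the message, which is arguably a change worth noticing.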

Good stuff, thanks !

stas · January 10, 2019, 9:30pm

I don’t know, I have never used this.

Probably it will be easier to write meaningful test cases by prioritizing writing tests for bug reports, so then you always have a meaningful test case (assuming the report included enough of a setup to reproduce it). I’m not opposing your systematic approach, but perhaps a lot of those methods will almost never be used, so why not wait till someone asks about it, reports it not working, etc. and meanwhile focus on the small sub-set of the code that’s really important, i.e. following the 80-20 Pareto principle.

Benudek · January 10, 2019, 9:32pm

fair point, yes, some of these might not add much value. Meanwhile I am mostly following the docs and simply testing what is described there.

stas · January 10, 2019, 9:32pm

I was in no way implying that it should be used as much as possible, it was just a refactoring step (prompted by your own refactoring recommendation) and I was sharing what was refactored. That’s all.

Benudek · January 10, 2019, 9:33pm

yes, get it! thx.

Benudek · January 11, 2019, 12:18am

and thx for your tips and corrections @stas ! worthwhile mentioning here

Benudek · January 20, 2019, 8:14pm

submitted a PR for inspecting and asserting over fit and fit_one_cycle, hope it makes sense.

All ears, if one can improve these tests here.

@sgugger

Benudek · January 27, 2019, 11:55am

Btw, there seem to be some regression tests broken in vision when running them locally with a fresh pull. I am getting into trouble with, among others, these lines in test_vision_data.py

from fastai.vision.data import verify_image
import PIL
***#import responses***
…
test_verify_image
…

So, while the individual tests of my changes work and the checks of my above PR work on Azure, locally I am getting errors.

Not sure what it is; I tried simply uncommenting the culprit lines, which didn’t solve it. Happy to help, if you give me some pointers.

@stas FYI

stas · January 27, 2019, 4:55pm

Please provide the output of the failing tests and any pertinent info that you think will help reproduce the problem. I have no problem running the test suite and neither does the Azure CI. You can check how the CI is set up: https://github.com/fastai/fastai/blob/master/azure-pipelines.yml (choose the entry that’s similar to yours and compare how yours is different).

Benudek · January 27, 2019, 6:04pm

ok, thx for the homework. I am pretty sure it’s a local problem and will check the yml setup. Let’s assume the issue is closed; otherwise I will come back with an analysis here.

thx @stas

stas · February 2, 2019, 1:50am

Do we have any testing-suitable datasets with variable image sizes? Or perhaps it’d be handy to add an autogenerator in fakes.py? Or perhaps take MNIST_TINY and make a variable image size copy of it by randomly cropping it, which might be handy for other testing? MNIST_TINY_VAR_SIZE?
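
To make the random-cropping idea concrete, here is a rough sketch on plain nested lists, so it runs without any image library. The helper `random_crop`, the 28x28 base image (the usual MNIST size) and the minimum side of 14 are all assumptions for illustration; nothing like this exists in fakes.py as far as this thread shows.

```python
import random


def random_crop(img, min_side, rng):
    # img is a list of rows (each a list of pixel values); returns a cropped copy.
    h, w = len(img), len(img[0])
    new_h = rng.randint(min_side, h)  # randint is inclusive on both ends
    new_w = rng.randint(min_side, w)
    top = rng.randint(0, h - new_h)
    left = rng.randint(0, w - new_w)
    return [row[left:left + new_w] for row in img[top:top + new_h]]


rng = random.Random(42)  # fixed seed keeps the generated test data reproducible

# Fake 28x28 single-channel image with values in 0..255
base = [[(r * 28 + c) % 256 for c in range(28)] for r in range(28)]

crops = [random_crop(base, 14, rng) for _ in range(8)]
sizes = [(len(c), len(c[0])) for c in crops]  # (height, width) of each crop
```

A real version would presumably write the crops back out as image files under a new dataset folder, so the usual data loading path is exercised.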

I have a half-baked test for tests/test_vision_data.py which works, but needs to also test on variable image size, it’s really a resize/collate_fn test:

def test_from_name_re_resize(path, capsys):
    fnames = get_files(path/'train', recurse=True)
    pat = r'/([^/]+)/\d+\.png$'
    # check 3 different size arg are supported and no warnings are issued
    for size in [14, (14,14), (14,20)]:
        data = ImageDataBunch.from_name_re(path, fnames, pat, ds_tfms=None, size=size)
        captured = capsys.readouterr()
        assert len(captured.err)==0, captured.err
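
As a quick illustration of what the label pattern picks out, here is a standalone check on made-up file names. It assumes the pattern is applied as a regex search over the full path string with the first group taken as the label; the paths are hypothetical and the dot before "png" is escaped here.

```python
import re

# Same idea as `pat` in the test: capture the parent folder name as the label
pat = r'/([^/]+)/\d+\.png$'

# Hypothetical file names in a train/<label>/<number>.png layout
fnames = ['mnist_tiny/train/3/7463.png', 'mnist_tiny/train/7/9243.png']

labels = [re.search(pat, f).group(1) for f in fnames]
```

Note that the pattern expects forward slashes, so paths would need to be in POSIX form for it to match.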

Benudek · February 2, 2019, 9:20am

cool @stas and thx a lot!

I wouldn’t know if we have such testing-suitable datasets, I really only scan the existing tests to answer such a question. Maybe @sgugger knows?

IMHO: All for extending fakes.py as we can reuse this then.