Improving/Expanding Functional Tests

stas · January 27, 2019, 4:55pm

Please provide the output of the failing tests and any pertinent info that you think will help reproduce the problem. I have no problem running the test suite, and neither does the Azure CI. You can check how the CI is set up at https://github.com/fastai/fastai/blob/master/azure-pipelines.yml (choose the entry that’s similar to your setup and compare how yours differs).

Benudek · January 27, 2019, 6:04pm

OK, thx for the homework. I’m pretty sure it’s a local problem and will check the yml setup. Let’s assume the issue is closed; otherwise I’ll come back with an analysis here.

thx @stas

stas · February 2, 2019, 1:50am

Do we have any testing-suitable datasets with variable image sizes? Or perhaps it’d be handy to add an autogenerator in fakes.py? Or perhaps we could take MNIST_TINY and make a variable image size copy of it by randomly cropping it, which might be handy for other testing too? MNIST_TINY_VAR_SIZE?

I have a half-baked test for tests/test_vision_data.py which works, but it also needs to test on variable image sizes; it’s really a resize/collate_fn test:

def test_from_name_re_resize(path, capsys):
    fnames = get_files(path/'train', recurse=True)
    # capture the parent folder name as the label; the dot before "png" is escaped
    pat = r'/([^/]+)/\d+\.png$'
    # check that 3 different size args are supported and no warnings are issued
    for size in [14, (14,14), (14,20)]:
        data = ImageDataBunch.from_name_re(path, fnames, pat, ds_tfms=None, size=size)
        captured = capsys.readouterr()
        assert len(captured.err)==0, captured.err
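To make the two inputs of the test above concrete, here is a minimal standalone sketch of what the pattern captures and of how the three `size` variants relate to each other. It assumes the pattern is applied as a search against the full path string, uses a version of the pattern with the dot before "png" escaped, and `label_from_fname` and `normalize_size` are hypothetical helper names, not fastai functions.

```python
import re

# Label pattern from the test, with the dot before "png" escaped.
PAT = re.compile(r'/([^/]+)/\d+\.png$')

def label_from_fname(fname):
    """Return the parent folder name, which serves as the class label."""
    match = PAT.search(str(fname))
    if match is None:
        raise ValueError(f'no label found in {fname!r}')
    return match.group(1)

def normalize_size(size):
    """Hypothetical helper: expand an int to a square pair, pass pairs through."""
    if isinstance(size, int):
        return (size, size)
    return tuple(size)

print(label_from_fname('mnist_tiny/train/3/7463.png'))
print([normalize_size(s) for s in [14, (14, 14), (14, 20)]])
```

The first two `size` values in the test describe the same square target, so only `(14,20)` exercises a non-square resize.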

Benudek · February 2, 2019, 9:20am

cool @Stas and thx a lot!

I don’t know whether we have such testing-suitable datasets; I really only scan the existing tests to answer such a question. Maybe @sgugger knows?

IMHO: All for extending fakes.py as we can reuse this then.
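As a rough illustration of the autogenerator idea, the sketch below writes a small folder of grayscale PNGs with random dimensions using only the standard library. Everything in it is an assumption made for illustration: the function names, the `train/<class>/<n>.png` layout, the class names and the size range are hypothetical, and this is not how fastai’s fakes.py is implemented.

```python
import random
import struct
import zlib
from pathlib import Path

def _png_chunk(kind, data):
    """One PNG chunk: length, type, data, then CRC over type + data."""
    crc = zlib.crc32(kind + data) & 0xffffffff
    return struct.pack('>I', len(data)) + kind + data + struct.pack('>I', crc)

def write_gray_png(path, width, height, rng):
    """Write an 8-bit grayscale PNG filled with random noise."""
    ihdr = struct.pack('>IIBBBBB', width, height, 8, 0, 0, 0, 0)
    # each scanline starts with filter byte 0 (no filtering)
    rows = b''.join(b'\x00' + bytes(rng.randrange(256) for _ in range(width))
                    for _ in range(height))
    png = (b'\x89PNG\r\n\x1a\n' + _png_chunk(b'IHDR', ihdr)
           + _png_chunk(b'IDAT', zlib.compress(rows)) + _png_chunk(b'IEND', b''))
    Path(path).write_bytes(png)

def make_var_size_dataset(root, classes=('3', '7'), n_per_class=4,
                          min_side=28, max_side=40, seed=0):
    """Create root/train/<class>/<n>.png files with random widths and heights."""
    rng = random.Random(seed)
    sizes = {}
    for cls in classes:
        folder = Path(root) / 'train' / cls
        folder.mkdir(parents=True, exist_ok=True)
        for n in range(n_per_class):
            w = rng.randint(min_side, max_side)
            h = rng.randint(min_side, max_side)
            fname = folder / f'{n}.png'
            write_gray_png(fname, w, h, rng)
            sizes[fname] = (w, h)
    return sizes

def read_png_size(path):
    """Read (width, height) back from the IHDR chunk."""
    return struct.unpack('>II', Path(path).read_bytes()[16:24])
```

A generator like this avoids a download in the test suite, at the cost of images that are pure noise rather than digits, so it would only suit tests that check shapes and warnings, not accuracy.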

stas · February 15, 2019, 8:04pm

116 posts were merged into an existing topic: Doc_test project

stas · February 2, 2019, 5:44pm

This dataset was created from the mnist_tiny dataset by randomly resizing each image to slightly more than 28x28 and padding with a black background, so the resulting images are somewhere between 28x28 and 34x41.

wget http://files.fast.ai/data/examples/mnist_tiny.tgz
tar -xvzf mnist_tiny.tgz
find mnist_tiny -type f -name "*png" -exec perl -le '$f=shift; $w=28+int rand 7; $h=28+int rand 14; $s="${w}x${h}"; qx[mogrify -resize $s -background black -gravity center -extent $s $f]' {} \;
mv mnist_tiny mnist_var_size_tiny
rm mnist_var_size_tiny/models/tmp.pth
tar -cvzf mnist_var_size_tiny.tgz mnist_var_size_tiny

It’s on S3 - thanks Jeremy!
https://s3.amazonaws.com/fast-ai-imageclas/mnist_var_size_tiny.tgz
and it’s in fastai datasets under URLs.MNIST_VAR_SIZE_TINY

now we can start using it in the tests
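For anyone who wants to check the size logic of the Perl one-liner above without running ImageMagick, here is a small Python sketch of the arithmetic only. It assumes that `-resize WxH` fits the image inside the box while keeping the aspect ratio and that `-extent` with `-gravity center` pads the remainder; the function names are hypothetical and the rounding may differ slightly from what mogrify does.

```python
import random

BASE = 28  # mnist_tiny images are 28x28

def random_target_size(rng):
    """Mirror `28+int rand 7` and `28+int rand 14` from the one-liner."""
    w = BASE + rng.randrange(7)    # 28..34
    h = BASE + rng.randrange(14)   # 28..41
    return w, h

def fit_and_pad(src_w, src_h, dst_w, dst_h):
    """Return (resized_w, resized_h, pad_x, pad_y) for an aspect-preserving
    resize into a dst_w x dst_h box followed by padding to the full box."""
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w = max(1, round(src_w * scale))
    new_h = max(1, round(src_h * scale))
    return new_w, new_h, dst_w - new_w, dst_h - new_h

rng = random.Random(0)
w, h = random_target_size(rng)
print((w, h), fit_and_pad(BASE, BASE, w, h))
```

Because the source images are square, the digit is scaled to the smaller of the two target sides and the black padding is added along the other axis.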

stas · February 10, 2019, 9:57pm

Hmm, this thread has been up for almost 2 months now and has produced many posts, but it doesn’t look like it has led to any new tests from users who weren’t already involved in this discussion. So we either need to do something differently, or abandon the efforts around test-writing enticements, systems, maps and guides and instead spend the few resources we have on actually writing tests. Just an observation…

PierreO · February 10, 2019, 10:27pm

Just my 2 cents, but even though I’m interested in contributing to this, I feel a bit overwhelmed by this long thread. I (probably mistakenly) am under the impression that I need to read it all in order to be able to contribute (also I’m not used to writing tests at all, but that’s on me). If the first wiki post is enough, maybe you could say so at the top of it? Maybe that’s just me…

Also there’s a lot of activity in the fastai part 1 category due to the recent opening of part 1 as a MOOC. Maybe it would be a good idea to point folks over there to this thread as well as the one about the documentation as good ways to get started if they wish to contribute to fastai.

Benudek · February 10, 2019, 10:36pm

Make a suggestion, Stas, whatever makes sense, e.g. close the thread or reorganize this task?

Initially the hope was indeed that others would pick up some tasks; maybe with too many posts this also gets hard to read for others (that’s also why I tried to put my code on GitHub).

Open to suggestions, whatever works best!

stas · February 10, 2019, 10:43pm

It should be just the top wiki post.

Please feel free to make suggestions or edit it so that it’s clearer and less intimidating.

You see, in general it’s very difficult for those who know to tell what is lacking. So your input is very helpful.

Personally, when I go from not knowing to knowing, I tend to document what brought me to knowing and typically share that with others.

Also there’s a lot of activity in the fastai part 1 category due to the recent opening of part 1 as a MOOC. Maybe it would be a good idea to point folks over there to this thread as well as the one about the documentation as good ways to get started if they wish to contribute to fastai.

How would you go about that? This thread is in dev projects - perhaps it should be moved to ‘users’ category so that it’s visible to all?

I pinned this topic for now.

stas · February 10, 2019, 10:45pm

In no way was I suggesting that we close this thread, just questioning whether we need to be doing things differently.

And then Pierre showed up and made it clear why it’s not going anywhere - see his post. I agree.

PierreO · February 10, 2019, 10:48pm

Sure, I’ll look at it tomorrow

Maybe by making a post in the part 1 (2019) category titled ‘Contributing to fastai’ or something like that, that explains the ways to get started with useful links to threads here in the forums and to parts of the documentation (I go back often to the ‘git notes’ page, I find it very useful and I think many don’t know it’s even there because it’s kinda hidden under the intimidating ‘Library development’ section).

stas · February 10, 2019, 10:50pm

Maybe by making a post in the part 1 (2019) category titled ‘Contributing to fastai’ or something like that, that explains the ways to get started with useful links to threads here in the forums and to parts of the documentation.

You feel that category is the right one because it’s hot and the users are more likely to notice it? I guess it makes sense, and then move it to part 2 when that goes hot?

I was thinking ‘fastai users’, but I think your idea is a much better one.

(I go back often to the ‘git notes’ page, I find it very useful and I think many don’t know it’s even there because it’s kinda hidden under the intimidating ‘Library development’ section).

Perhaps we should move it to the normal category then? It’s a fine line.

PierreO · February 10, 2019, 10:52pm

Yes, exactly. I think among all the new users there are at least a few who (like me not so long ago) would like to contribute but don’t know where to start, find it intimidating, and don’t know about the resources provided by fastai and the amazing, welcoming environment they would find.

Not sure it’s necessary; I think mentioning it in a post would be enough. Also, maybe it would be useful to have a pinned post at the top of the fastai users category about contributing, with relevant links? Instead of this one.

Benudek · February 10, 2019, 10:58pm

Yes, so I guess what I mean is: keep the task, reorganize the thread (or close it and make a new one) to make this more efficient. I am also fine with spending my little time on pure test scripting, and doing the fancier documentation work later.

If the thread is too overloaded, we could also take out this design idea and make a separate thread.

stas · February 10, 2019, 11:01pm

Let’s do it - would you like to start on it and fill out the parts that you do know and the way you’d have presented such a document now to a new contributor? Then others can improve on it. Perhaps the wiki should allow users to add questions in the form of:

XXX: how do I do foo?

I guess such a post could encompass contributing to both docs and tests, and not be test-specific.

The same applies to the doc improvement post: the intention is that all the todo information and the how-to are in the first wiki post. The thread is just for discussing work in progress.

Alternatively, we use a different approach where we create a thread with no discussion and just the todo lists and howtos, and leave the discussion thread for discussion? Not sure… I’m thinking how to make it look less intimidating to the users.

stas · February 10, 2019, 11:06pm

Let’s finish what we have started. By now you and I invested a lot of time into it. This feature is useful without any relationship to test writing enticement. I see its benefits primarily as a quick way to find which tests to look at to see some API in action.

Currently, I often grep the test suite to see examples, and it’s not the most efficient way because many APIs are used as part of another test.

Let’s not confuse this particular effort with the original purpose of this thread.

Benudek · February 10, 2019, 11:09pm

Yes, great. And thx for the help and guidance !

Benudek · February 10, 2019, 11:12pm

Test scripts are perfect for picking up small tasks.

PierreO · February 10, 2019, 11:12pm

Sure! I’ll write a draft tomorrow (it’s getting a bit late here in France) and submit it here before posting it in the part 1 category, if that works for you.

That’s a good idea. A bit like the way the lessons are organised: a wiki post that recaps everything for a lesson, and then another post for discussion of said lesson. If you decide to use such an approach, it would be best to have it set up before making the post about contributing in the part 1 category.