Test Registry project

Benudek · March 6, 2019, 8:16pm

happy to make a PR out of updating the json, if its not too urgent and we plan here a little (as I don’t know the specific case).

updating a json in python shouldnt be an issue I suppose, just read the file, parse and search whatever id you need, update and write back.

but you decide if helpful pls, because handing over the task and reviewing might be more work than doing it?

stas · March 6, 2019, 9:31pm

I was thinking about it and I think yes more or less what you said but even simpler by merging instead of replacing each entry at a time.

prepare the new registry dict as before
call merge_with_old_registry (new function which does):

a. Check if the old registry doesn’t exist return the new one
b. if it exists, load it and merge the new one onto old one (direction is important!) and return the resulting dict.
It should be a very compact 1-2 lines of list comprehension code by borrowing from clean_nb in tools/fastai-nbstripout
write the dict as before

So really, there is just an extra merge step.

And if we ever want to reset the registry to discard old entries, we just delete the registry file before running make test-full.

So it’s a very low-tech solution. The bonus is that we will be able to discard the special flag and test writers will get to update the registry right away.

Implementation-wise, It’s up to you, @Benudek - I thought you are taking time off so I wasn’t proposing anything for you, but if that has changed, you’re welcome to tackle this.

Benudek · March 7, 2019, 6:18am

ok, let me have a look at this, maybe can do over the weekend. thx Stas

stas · March 7, 2019, 8:02pm

@ashaw, found another glitch.

If I do:

rm fastai/test_api_db.json
make test-full

it fails in tests/test_gen_doc_nbtest.py::test_wrapped:

    def lookup_db(elt)->List[Dict]:
        "Finds `this_test` entries from test_api_db.json"
        db_file = Path(abspath(join(dirname( __file__ ), '..')))/DB_NAME
        if not db_file.exists():
>           raise Error(f'Could not find {db_file}. Please make sure it exists at this location or run `make test`')
E           NameError: name 'Error' is not defined

a chicken and egg problem there, since it requires the registry file to run make test (which generates it), and the latter shouldn’t depend on the former. It probably should skip that test dynamically if the file is not there.

Plus Error should probably be Exception?

ashaw · March 7, 2019, 10:16pm

Fixed on latest master - going to keep throwing an error for now until we know this module is a bit more stable.
I’d still like to know when db search fails

stas · March 7, 2019, 10:16pm

@ashaw, what is the logic for related tests and how does it help the user? e.g.:

https://docs.fast.ai/basic_train.html#fit_one_cycle

Tests found for fit_one_cycle :

pytest -sv tests/test_callback.py::test_callbacks_fit [source]
pytest -sv tests/test_train.py::test_fit_one_cycle [source]

Related tests:

pytest -sv tests/test_text_train.py::test_qrnn_works_with_no_split [source]
pytest -sv tests/test_tabular_train.py::test_empty_cont [source]
pytest -sv tests/test_text_train.py::test_qrnn_works_if_split_fn_provided [source]

I’m not sure how those are related and how this information is going to help the reader?

Thank you!

ashaw · March 7, 2019, 10:17pm

It’s just there for more examples of usage. If it doesn’t seem helpful we can take it out!

stas · March 7, 2019, 10:22pm

Oh, for sure, the check should stay there - Just needed it not to impact the test suite.

And since we have to revamp how the registry file is update (see comments above), being able to delete the registry and re-create it will become important.

Your fix looks good now - Thank you, @ashaw

stas · March 7, 2019, 10:26pm

My confusing was about trying to understand why they are related and why I should look at them.

Basically, it is saying: here are other tests where this api has been used, right?

Great! let’s call it that. and replace:

“Related tests:”

with:

“Other tests where fit_one_cycle is used:”

then it’s unambiguous and potentially useful.

What do you think?

ashaw · March 7, 2019, 10:27pm

Awesome I like that. Naming is hard.

stas · March 7, 2019, 10:29pm

Indeed. God bless the “edit” button.

ashaw · March 7, 2019, 10:31pm

Any thoughts on naming Direct tests?

Or should I just put them all under the Other tests where x is used umbrella

stas · March 7, 2019, 10:36pm

Meh for ‘Direct tests’, it’s ambiguous - the first group are direct tests too.

How about we start with an explicit ‘Other …’ and then improve down the road if someone finds a word to encapsulate this long phrase.

Moreover this is just a sample of those tests, e.g. fit_one_cycle is used probably in 50+ tests. So perhaps not to mislead again we use:

Some other tests where x is used

I know I made it even longer, but a lawyer would be proud of the choice!

stas · March 7, 2019, 10:42pm

The cool thing about the test registry is its unintentional features - I already found several tests that were testing the same thing in slightly different ways, so I merged them and as a result discovered a bug in fastai and a bug in our registry. Awesome!

ashaw · March 7, 2019, 10:43pm

Nice!! Useful already

ashaw · March 7, 2019, 10:44pm

To clarify, we should combine both Direct tests: and Related tests: into:
Some other tests where x is used:
Right?

Benudek · March 7, 2019, 10:46pm

yeeah, I like it and click around on the test button too. awesome work @ashaw!

Mostly as it serves a double purpose: testing is for testing and can serve as documentation.

I do wonder, if there will be more PRs on testing - crossing fingers!

stas · March 7, 2019, 10:46pm

Oh, sorry, I thought you were suggesting ‘Direct tests’ as a better name - I misunderstood you.

What are ‘direct tests’?

Looking at DataBunch - these are tests called on the object of the class that one may want to test, right?

ashaw · March 7, 2019, 10:52pm

Yeah none of it makes sense. I’ll just loop all of them into other.

Direct tests:
Looks to see if any def test_xxx: names contain the function name
test_DataBunch_xxx: -> matches DataBunch
test_DataBunch_oneitem -> matches DataBunch and DataBunch.oneitem

Related tests:
any time the function is called in the body

stas · March 7, 2019, 10:53pm

since function names can be random and worse “misnamed” probably yes the best is to lump them into “other”.