[UnitTests] Automatic parametrization over targets, with explicit opt-out #8010

Lunderberg · 2021-05-10T15:47:19Z

Implemented features for the python tests to automatically parametrize over enabled targets, and to explicitly list the targets that were skipped. PR includes testing framework changes, along with changes to a single test file (test_topi_relu.py) as a proof of concept.

~~Link to RFC~~ Link to RFC on tvm-rfcs, documenting differences in the testing style, advantages of the proposed style, and changes needed to use the new style.

tkonolige

Thanks @Lunderberg for this great PR! I think one major thing that would really improve the error handling situation would be a document that describes how to use the testing infrastructure. Something that we could point developers to when we are receiving PRs.

tkonolige · 2021-05-26T16:39:03Z

python/tvm/testing.py

            target_str,
        )
-        return {"llvm"}
+        return _get_targets("llvm")


Does this loop forever if llvm is not enabled?

Good catch, it would. Updating to check tvm.runtime.enabled('llvm'). If enabled, maintain current behavior. Otherwise, raise an exception.

tkonolige · 2021-05-26T16:41:49Z

python/tvm/testing.py

+            xfail_targets = set()
+
+        target_marks = []
+        for t in _get_targets():


Doesn't _get_targets filter out all non-runnable targets? So we are not including un-runnable targets here?

The updated implementation of _get_targets returns all targets without filtering, but marks each one as runnable or un-runnable. This allows enabled_targets() to maintain its current behavior of filtering out un-runnable targets, while _pytest_target_params can return all targets, marked with pytest.skipif to indicate which ones cannot run on the current platform.
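As a rough illustration of the annotate-instead-of-filter approach described above, here is a minimal, stdlib-only sketch. The function names and the dictionary layout are hypothetical and are not the actual tvm.testing code; the "skip" tag stands in for the pytest.skipif mark mentioned in the reply.

```python
def get_targets(requested, runnable):
    """Return every requested target, annotated rather than filtered."""
    return [{"target": t, "is_runnable": t in runnable} for t in requested]


def enabled_targets(requested, runnable):
    """Previous behavior preserved: only the targets that can run here."""
    return [t["target"] for t in get_targets(requested, runnable) if t["is_runnable"]]


def target_params(requested, runnable):
    """All targets, each tagged so a test runner can report skips explicitly."""
    return [
        (t["target"], "run" if t["is_runnable"] else "skip")
        for t in get_targets(requested, runnable)
    ]


requested = ["llvm", "cuda", "vulkan"]
runnable = {"llvm"}  # assumption: only llvm works on this machine

print(enabled_targets(requested, runnable))
print(target_params(requested, runnable))
```

Because both helpers read from the same annotated list, the filtering view and the explicit-skip view cannot drift apart.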

tkonolige · 2021-05-26T16:43:33Z

python/tvm/testing.py

+    Use this decorator when you want your test to be run over a
+    variety of targets and devices (including cpu and gpu devices).
+
+    Alternatively, a test that accepts the "target" and "dev" will


Maybe specify that you want to use parametrize_targets when you have a specific set of targets you want to run over. Otherwise users should not use the decorator. Also mention that exclude_targets may be a better option.

Good point, edited documentation to reflect new intended usage, and to recommend that exclude_targets or known_failing_targets should typically be used instead.

tkonolige · 2021-05-26T16:48:13Z

python/tvm/testing.py

+            metafunc.parametrize(names, value_sets, indirect=True)
+
+
+def fixture(func=None, *, cache_return_value=False):


Are you allowed to have an optional parameter before regular arguments? I think lint will not be happy with this one.

In python2 it would be an error, but in python3 it is allowed, and passes the linter both locally and on the CI. I did this intentionally so that cache_return_value would be a keyword-only argument. My goal is to make it as obvious as possible at the fixture-definition site whether a fixture is going to be cached or not. Mandating fixture(cache_return_value=True) makes that obvious, where fixture(True) may not be.
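For readers unfamiliar with a bare `*` in a signature, a small sketch of the pattern follows. This `fixture` is a toy stand-in that only records the flag; it is not the tvm.testing implementation, and the `cache_return_value` attribute it sets on the function is an assumption made for illustration.

```python
def fixture(func=None, *, cache_return_value=False):
    """Toy decorator usable as @fixture or @fixture(cache_return_value=True)."""

    def wrap(f):
        # Hypothetical bookkeeping: just remember the flag on the function.
        f.cache_return_value = cache_return_value
        return f

    # Bare @fixture passes the function directly; @fixture(...) passes nothing yet.
    return wrap if func is None else wrap(func)


@fixture
def uncached():
    return [1, 2, 3]


@fixture(cache_return_value=True)
def cached():
    return [4, 5, 6]


def some_setup():
    return None


# Everything after the bare `*` must be passed by keyword, so a second
# positional argument is rejected with a TypeError.
try:
    fixture(some_setup, True)
    positional_rejected = False
except TypeError:
    positional_rejected = True

print(uncached.cache_return_value, cached.cache_return_value, positional_rejected)
```

The only positional slot is `func`, which the decorator syntax fills, so the caching choice always has to be spelled out by name at the definition site.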

tkonolige · 2021-05-26T21:02:08Z

python/tvm/testing.py


-    >>> @tvm.testing.parametrize("llvm", "cuda")
+    >>> @tvm.testing.parametrize_targets
    >>> def test_mytest(target, dev):
    >>>     ...  # do something


I'd just remove this example as we want people to only use the decorator with arguments.

Sounds reasonable, and removed.

Lunderberg · 2021-06-01T15:21:46Z

Rebased on main to start CI again, now that the CI fix #8160 is in.

tkonolige

Thanks for all the hard work @Lunderberg!

Lunderberg · 2021-06-09T22:48:36Z

Added one more bugfix. The first implementation of removing fixture functions from module scope was a bit overzealous: it also removed any objects that implement __getattr__, such as caffe.layers.
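To show why a plain hasattr check misfires here, a minimal sketch follows. `LazyNamespace` and `FixtureMarker` are hypothetical stand-ins (for an object like caffe.layers and for pytest's FixtureFunctionMarker respectively), and the two helper functions are illustrative rather than the code in this PR.

```python
class LazyNamespace:
    """Answers for any attribute name, like objects with a catch-all __getattr__."""

    def __getattr__(self, name):
        return f"layer:{name}"


class FixtureMarker:
    """Hypothetical stand-in for the marker object pytest attaches to fixtures."""


def real_fixture():
    return 42


# Mimic what a fixture decorator would attach to the function.
real_fixture._pytestfixturefunction = FixtureMarker()


def looks_like_fixture_naive(obj):
    # False positive: any object with a catch-all __getattr__ passes this test.
    return hasattr(obj, "_pytestfixturefunction")


def looks_like_fixture_strict(obj):
    # Also check the type of the stored value, not just its presence.
    return isinstance(getattr(obj, "_pytestfixturefunction", None), FixtureMarker)


namespace = LazyNamespace()
print(looks_like_fixture_naive(namespace), looks_like_fixture_strict(namespace))
print(looks_like_fixture_naive(real_fixture), looks_like_fixture_strict(real_fixture))
```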

jwfromm

LGTM, thanks for these changes @Lunderberg. My only concern is that you've introduced a lot of new features but not all of them are used yet in the codebase. Do you think it's worth adding a meta-test file to make sure things like known_failing_targets work as expected?

Lunderberg · 2021-06-22T20:26:54Z

@jwfromm That's a good point. I had initially thought that there were few enough features that they could be implicitly tested by their use in other tests, but with the additional features that I added following discussion, it would be good to have dedicated tests for the testing features. I will add them.

tmoreau89

Great PR - LGTM! One note is that the error message link won't work until apache/tvm-rfcs#7 gets merged, but that should happen fairly soonish, so I don't see this as blocking. Given that only the relu tests are being ported to the parameterized tests, this is a low-risk merge.

Lunderberg · 2021-06-22T23:23:50Z

@jwfromm And added meta-tests for all the new functionality.

@tmoreau89 Good point, that was an intentional choice to point to the main branch of tvm-rfcs. I figured that since the main discussion was on the intended behavior, it would be likely that the two would be accepted or rejected together.

Thank you both for the reviews, and I think the only thing remaining is the CI.

tmoreau89 · 2021-06-23T00:44:31Z

python/tvm/testing.py

+    # Optional cls parameter in case a parameter is defined inside a
+    # class scope.
+    @pytest.fixture(params=values, ids=ids)
+    def as_fixture(*cls, request):


@Lunderberg looks like the linter was not too happy about the unused argument here, that's the only thing blocking CI

Whoops, modified to _cls which passes the linter when running locally.

tmoreau89 · 2021-06-23T00:44:40Z

python/tvm/testing.py

+
+        # Optional cls parameter in case a parameter is defined inside a
+        # class scope.
+        def fixture_func(*cls, request):


And same change made here.

jwfromm

Thanks for adding the tests, this is an excellent PR.

tmoreau89 · 2021-06-24T17:48:17Z

Thank you @tkonolige @jwfromm @Lunderberg the PR is now merged!

areusch · 2021-06-25T17:31:27Z

python/tvm/testing.py

+            try:
+                cached_value = cache[cache_key]
+            except KeyError:
+                cached_value = cache[cache_key] = func(*args, **kwargs)


is the exception case tested here?

e.g. what happens if func itself raises another exception?

If the fixture definition func raises an exception, then the exception gets passed on to pytest, and it gets treated as a failure to generate the fixture. These still result in the test failing, but are recorded as a failed setup. The test itself is never run in that case. This behavior is pytest's default, and is the same in both the cached and uncached versions of tvm.testing.fixture.

I don't have a unit test yet to verify this behavior, but I'll add one.

Unit test added in #8343
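The cache-miss path quoted in the diff above can be modeled in a few lines. This is a sketch under simplifying assumptions, not the tvm.testing code: the `cached` decorator, the use of positional arguments as the cache key, and `flaky_setup` are all hypothetical. It shows that when the wrapped function raises, nothing is stored and the exception simply propagates to the caller (pytest, in the real setting).

```python
def cached(func):
    cache = {}

    def wrapper(*args):
        try:
            value = cache[args]
        except KeyError:
            # If func raises here, the assignment never happens, so nothing
            # is stored and the exception propagates to the caller.
            value = cache[args] = func(*args)
        return value

    wrapper.cache = cache
    return wrapper


calls = []


@cached
def flaky_setup(n):
    calls.append(n)
    if len(calls) == 1:
        raise ValueError("setup failed")
    return n * 2


try:
    flaky_setup(3)
    first_call_raised = False
except ValueError:
    first_call_raised = True

cache_after_failure = dict(flaky_setup.cache)  # snapshot: still empty
second_result = flaky_setup(3)  # runs the setup again and succeeds
third_result = flaky_setup(3)   # served from the cache, no new call

print(first_call_raised, cache_after_failure, second_result, third_result, calls)
```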

areusch · 2021-06-25T17:34:26Z

python/tvm/testing.py

+    # numpy array as input, then calculates uses a slow method to
+    # compute a known correct output for that input.  Therefore,
+    # including a fallback for serializable types.
+    def get_cache_key(*args, **kwargs):


is this guaranteed to be deterministic? pickle.dumps and maybe hash with stuff like dicts might not be...though maybe the dict thing is fixed now?

in any case, tbh i think this is pretty complicated for a cache key function. since we are trying to use this with parameterizable test cases, can't we just whitelist types that have an obvious, stable conversion to a cache key, and then error on the rest? i am not going to ever run python tests/python/unittest/test_bar.py --param=<pickled data>.

For hash, it is guaranteed to be deterministic, but pickle.dumps is not. For numpy arrays, pickle.dumps is, but that isn't guaranteed across all types.

The difficult part here is that the caching should work for fixtures that are based on other fixtures. For example, consider the following case. If we want to cache correct_output, then the cache needs to be based on the input_data argument. I agree that I don't think anybody will ever input pickled data from the command line, but this pattern of comparing to the correct output feels like it would be pretty common.

    import numpy as np
    import tvm.testing

    arr_size = tvm.testing.parameter(1, 16, 256)

    @tvm.testing.fixture
    def input_data(arr_size):
        return np.random.uniform(size=arr_size)

    @tvm.testing.fixture
    def correct_output(input_data):
        # Slow reference computation; this is the fixture worth caching.
        return run_very_slow_method(input_data)

    def test_func(target, dev, input_data, correct_output):
        output = func(target, dev, input_data)
        tvm.testing.assert_allclose(output, correct_output)

The other scheme I considered was to look up which parameters were indirectly involved in computing a particular fixture and caching based on that parameter or parameters. In this case, correct_output is indirectly based on arr_size. However, that would have introduced a potential failure mode if correct_output is cached but input_data is not. In that case, the second target to use arr_size==1 would look up the cached version of correct_output associated with arr_size==1, but would generate a new random value for input_data. This felt like a worse failure mode than the current one of repeating the fixture setup, which is why I used pickle.dumps as the fallback.
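A minimal sketch of the "hash first, pickle as fallback" idea discussed above, keyed on the fixture's actual input values rather than on upstream parameters. This is an illustration under stated assumptions, not the function in the PR: the key layout, `slow_reference`, and `cached_reference` are hypothetical, and plain lists stand in for numpy arrays as the unhashable input. As noted in the thread, pickle.dumps is not guaranteed to be deterministic for every type, in which case the worst outcome is a cache miss and a repeated setup.

```python
import pickle


def get_cache_key(*args, **kwargs):
    """Build a dictionary key from arguments that may be unhashable."""

    def key_part(value):
        try:
            hash(value)
            return ("hashable", value)
        except TypeError:
            # Fallback for unhashable but serializable values, e.g. lists.
            return ("pickled", pickle.dumps(value))

    positional = tuple(key_part(a) for a in args)
    keyword = tuple((name, key_part(v)) for name, v in sorted(kwargs.items()))
    return (positional, keyword)


cache = {}
calls = []


def slow_reference(data):
    # Hypothetical expensive computation; records each real invocation.
    calls.append(list(data))
    return sum(data)


def cached_reference(data):
    key = get_cache_key(data)
    if key not in cache:
        cache[key] = slow_reference(data)
    return cache[key]


first = cached_reference([1, 2, 3])
second = cached_reference([1, 2, 3])  # equal content, different list object
third = cached_reference([4, 5, 6])

print(first, second, third, len(calls))
```

Keying on the input value means a freshly generated random input never matches a stale cached output, which is the failure mode the parameter-based scheme would have allowed.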

@tkonolige

…-out (apache#8010)

* [UnitTests] Explicitly list tests that were enabled by TVM_TEST_TARGETS but were skipped

  Previously, these were removed by a filter in tvm.testing._get_targets(), and weren't listed at all. With this change, they are instead removed by pytest.skipif, and show up as explicitly skipped tests in pytest's summary when using tvm.testing.parametrize_targets.

* [UnitTests] Automatic parametrize_targets for tests that use (target,dev)

  Should make it easier to convert tests from using tvm.testing.enabled_targets to use pytest's parametrized tests instead.

* [UnitTests] Added ability to explicitly exclude a target from a particular test

  Uses tvm_exclude_targets variable, which can be set (1) in the conftest.py to apply to a test directory, (2) in a test script to apply to that module, or (3) on an individual test function to apply to it. The @tvm.testing.exclude_targets decorator is provided for readability in case (3).

* [UnitTests] Refactored test_topi_relu.py to use pytest.mark.parametrize

* [UnitTests] Added tvm_known_failing_targets option for the unittests.

  Intended to mark tests that fail for a particular target, and are intended to be fixed in the future. Typically, these would result either from implementing a new test, or from an in-progress implementation of a new target.

* [UnitTests] Known failing targets now marked with xfail instead of skipif

* [UnitTests] Removed tvm_excluded_targets and tvm_known_failing_targets

  These were implemented to exclude or mark as failing an entire file or directory of tests. In https://discuss.tvm.apache.org/t/rfc-parametrized-unit-tests/9946/4, it was pointed out that the global variables would be vulnerable to typos in the names, resulting in the option being silently ignored. The decorators `@tvm.testing.exclude_targets` and `@tvm.testing.known_failing_targets` do not have this failure mode, and are the preferred version.

* [UnitTests] Added helper functions to tvm.testing.

  - tvm.testing.parameter() defines a parameter that can be passed to tests. Tests that accept more than one parameter are run for all combinations of parameter values.
  - tvm.testing.parameters() defines multiple sets of parameter values. Tests that accept more than one parameter are run once for each set of parameter values.
  - tvm.testing.fixture() is a decorator that defines setup code. The `cache=True` argument can be passed to avoid repeating expensive setup across multiple tests.

* [UnitTests] Bugfix for auto parametrizing of "target"

  Previously, if the @parametrize_targets were present, but had other @pytest.mark.parametrize after it, "target" would get parametrized a second time. Now, it checks more than just the closest "parametrize" marker.

* [UnitTests] Renamed "cache" argument of tvm.testing.fixture to "cache_return_value"

* [UnitTests] Minor updates to parametrized test implementation.

  As recommended by @tkonolige:
  - Avoid infinite loop if LLVM target isn't enabled
  - Update documentation for preferred use cases of tvm.testing.parametrize_targets, and recommended alternatives.

* [UnitTests] Minor updates to parametrized test implementation

  - Documentation, removed previous example usage of tvm.testing.parametrize_targets

* [UnitTests] Changed accidental use of pytest fixtures to a NameError.

  - Previously, a fixture function defined in a module was accessible through the global scope, and the function definition is accessible if a test function uses that name but fails to declare the fixture as a parameter. Now, it will result in a NameError instead.

* [UnitTests] More careful removal of fixture functions from module global scope.

  - Initial implementation only checked hasattr(obj, "_pytestfixturefunction") before removing obj, which gave false positives for objects that implement __getattr__, such as caffe.layers. Now, also check that the value contained is a FixtureFunctionMarker.

* [UnitTests] Copy cached values when using tvm.testing.fixture(cache_return_value=True)

  To avoid unit tests being able to influence each other through a shared cache, all cached fixtures are passed through copy.deepcopy prior to use.

* [UnitTests] Added meta-tests for tvm.testing functionality

Co-authored-by: Eric Lunderberg <elunderberg@octoml.ai>

Lunderberg force-pushed the unittests_explicit_skip branch from 57b495d to 89a3aaf Compare May 11, 2021 19:19

Lunderberg force-pushed the unittests_explicit_skip branch 3 times, most recently from ab4261e to 3e7fcb6 Compare May 25, 2021 22:02

tkonolige requested changes May 26, 2021

tkonolige reviewed May 26, 2021

Lunderberg force-pushed the unittests_explicit_skip branch from 19eb0e6 to 692c50b Compare June 1, 2021 15:20

Lunderberg mentioned this pull request Jun 3, 2021

[Vulkan][UnitTests] Compatibility fix for test_vulkan_unique() #8186

Merged

tkonolige approved these changes Jun 7, 2021

Lunderberg force-pushed the unittests_explicit_skip branch from 449fbf8 to 0da04a8 Compare June 18, 2021 21:27

Lunderberg mentioned this pull request Jun 18, 2021

[UnitTests] Parametrized Unit Tests apache/tvm-rfcs#7

Merged

jwfromm reviewed Jun 22, 2021

tmoreau89 approved these changes Jun 22, 2021

tmoreau89 reviewed Jun 23, 2021

Lunderberg force-pushed the unittests_explicit_skip branch from acfef75 to 68947c8 Compare June 23, 2021 03:45

Lunderberg added 10 commits June 23, 2021 09:33

[UnitTests] Automatic parametrize_targets for tests that use (target,…

4f1b976

…dev) Should make it easier to convert tests from using tvm.testing.enabled_targets to use pytest's parametrized tests instead.

[UnitTests] Refactored test_topi_relu.py to use pytest.mark.parametrize

740cf27

[UnitTests] Known failing targets now marked with xfail instead of sk…

3fbf78a

…ipif

[UnitTests] Bugfix for auto parametrizing of "target"

0a59bcf

Previously, if the @parametrize_targets were present, but had other @pytest.mark.parametrize after it, "target" would get parametrized a second time. Now, it checks more than just the closest "parametrize" marker.

[UnitTests] Renamed "cache" argument of tvm.testing.fixture to "cache…

da24d0f

…_return_value"

Lunderberg added 6 commits June 23, 2021 09:33

[UnitTests] Minor updates to parametrized test implementation.

657c74c

As recommended by @tkonolige: - Avoid infinite loop if LLVM target isn't enabled - Update documentation for preferred use cases of tvm.testing.parametrize_targets, and recommended alternatives.

[UnitTests] Minor updates to parametrized test implementation

775ecb8

- Documentation, removed previous example usage of tvm.testing.parametrize_targets

[UnitTests] Copy cached values when using tvm.testing.fixture(cache_r…

68afd65

…eturn_value=True) To avoid unit tests being able to influence each other through a shared cache, all cached fixtures are passed through copy.deepcopy prior to use.

[UnitTests] Added meta-tests for tvm.testing functionality

0f17d65

Lunderberg force-pushed the unittests_explicit_skip branch from 68947c8 to 0f17d65 Compare June 23, 2021 16:33

jwfromm approved these changes Jun 23, 2021

tmoreau89 merged commit 07701f2 into apache:main Jun 24, 2021

Lunderberg deleted the unittests_explicit_skip branch June 24, 2021 20:09

areusch reviewed Jun 25, 2021

Lunderberg mentioned this pull request Jun 29, 2021

[Topi][Unittests] Parametrized tests in test_topi_dense.py, split out gpu-independent implementations #8336

Merged

This was referenced Jul 28, 2021

[UnitTests] Apply correct requires_gpu() pytest marks for parametrized target #8542

Merged

[Onnx][UnitTests] Excluded additional onnx tests #8574

Closed
