
Create a macro which automates creation of benchmark test suites. #8104

Merged: 41 commits from prgn-auto-benchmark-test-mod into master on Feb 16, 2021

Conversation

@coriolinus (Contributor) commented Feb 11, 2021:

Example:

```diff
- #[cfg(test)]
- mod tests {
-	use super::*;
-	use crate::tests::{new_test_ext, Test};
-	use frame_support::assert_ok;
-
-	#[test]
-	fn test_benchmarks() {
-		new_test_ext().execute_with(|| {
-			assert_ok!(test_benchmark_accumulate_dummy::<Test>());
-			assert_ok!(test_benchmark_set_dummy::<Test>());
-			assert_ok!(test_benchmark_another_set_dummy::<Test>());
-			assert_ok!(test_benchmark_sort_vector::<Test>());
-		});
-	}
- }
+ impl_benchmark_test_suite!(
+     Module,
+     crate::tests::new_test_ext(),
+     crate::tests::Test,
+ );
```

Expansion

Excerpted from:

```sh
cargo expand -p pallet-example --features runtime-benchmarks --tests --lib benchmarking
```

```rust
    #[cfg(test)]
    mod tests {
        use super::*;
        use ::frame_benchmarking::frame_support::assert_ok;
        extern crate test;
        #[cfg(test)]
        #[rustc_test_marker]
        pub const test_benchmarks: test::TestDescAndFn = test::TestDescAndFn {
            desc: test::TestDesc {
                name: test::StaticTestName("benchmarking::tests::test_benchmarks"),
                ignore: false,
                allow_fail: false,
                should_panic: test::ShouldPanic::No,
                test_type: test::TestType::UnitTest,
            },
            testfn: test::StaticTestFn(|| test::assert_test_result(test_benchmarks())),
        };
        fn test_benchmarks() {
            crate::tests::new_test_ext().execute_with(|| {
                use ::frame_benchmarking::Benchmarking;
                for benchmark_name in Module::<crate::tests::Test>::benchmarks(true) {
                    let is = test_bench_by_name::<crate::tests::Test>(benchmark_name);
                    match is {
                        Ok(_) => (),
                        _ => {
                            if !false {
                                {
                                    ::std::rt::begin_panic_fmt(
                                        &::core::fmt::Arguments::new_v1_formatted(
                                            &["Expected Ok(_). Got "],
                                            &match (&is,) {
                                                (arg0,) => [::core::fmt::ArgumentV1::new(
                                                    arg0,
                                                    ::core::fmt::Debug::fmt,
                                                )],
                                            },
                                            &[::core::fmt::rt::v1::Argument {
                                                position: 0usize,
                                                format: ::core::fmt::rt::v1::FormatSpec {
                                                    fill: ' ',
                                                    align: ::core::fmt::rt::v1::Alignment::Unknown,
                                                    flags: 4u32,
                                                    precision: ::core::fmt::rt::v1::Count::Implied,
                                                    width: ::core::fmt::rt::v1::Count::Implied,
                                                },
                                            }],
                                        ),
                                    )
                                }
                            }
                        }
                    };
                }
            });
        }
    }
    fn test_bench_by_name<T>(name: &[u8]) -> Result<(), &'static str>
    where
        T: Config + frame_system::Config,
    {
        let name =
            sp_std::str::from_utf8(name).map_err(|_| "`name` is not a valid utf8 string!")?;
        match name {
            "accumulate_dummy" => test_benchmark_accumulate_dummy::<T>(),
            "set_dummy" => test_benchmark_set_dummy::<T>(),
            "another_set_dummy" => test_benchmark_another_set_dummy::<T>(),
            "sort_vector" => test_benchmark_sort_vector::<T>(),
            _ => Err("Could not find test for requested benchmark."),
        }
    }
```

Other Notes

  • I applied the "B3-noteworthy" tag because I think new public APIs like this are likely to be of interest. That said, I'm not experienced in picking any B tag other than "silent", so this might be a misclassification; I'd appreciate input as to the appropriate tag.

@coriolinus coriolinus added the A0-please_review, B3-apinoteworthy, and C1-low labels on Feb 11, 2021
@coriolinus coriolinus self-assigned this Feb 11, 2021
```rust
///
/// ## Arguments
///
/// The first argument, `new_test_ext`, must be the path to a function which takes no arguments
```
Member:

Taking no arguments is a bad assumption for this function unfortunately.

Take a look here:

https://github.com/paritytech/substrate/blob/master/frame/babe/src/benchmarking.rs#L76

One thing that will help you figure out if this PR satisfies the needs we have is to replace all of the benchmark test instances with your new macro and confirm things compile. You will have to be careful of such customization instances (although any really heavy customization we can just say is not supported by the macro and should be implemented manually)

Contributor Author:

Good catch! Working on it.

Contributor Author:

a6cf5d4 should resolve this.

```rust
$new_test_ext().execute_with(|| {
	use $crate::Benchmarking;
	for benchmark_name in Module::<$test>::benchmarks(true) {
		assert_ok!(test_bench_by_name::<$test>(benchmark_name));
```
Member:

I am concerned with this approach: if one of the benchmarks fails, it will be hard to identify which benchmark is the issue.

I would prefer we actually generate N tests in this macro where each benchmark gets its own entire function, and thus when it fails, it is easy to identify which test failed.
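
For illustration, the desired expansion would look something like this (a sketch using the benchmark names from the PR description, not generated code):

```rust
// Hypothetical per-benchmark expansion: each benchmark gets its own
// #[test] function, so a failure immediately names the benchmark.
#[cfg(test)]
mod tests {
	use super::*;
	use crate::tests::{new_test_ext, Test};
	use frame_support::assert_ok;

	#[test]
	fn bench_accumulate_dummy() {
		new_test_ext().execute_with(|| {
			assert_ok!(test_benchmark_accumulate_dummy::<Test>());
		});
	}

	#[test]
	fn bench_set_dummy() {
		new_test_ext().execute_with(|| {
			assert_ok!(test_benchmark_set_dummy::<Test>());
		});
	}

	// ...and so on, one function per benchmark.
}
```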

Member:

This is critical for debugging during benchmark development.


Contributor Author:

> I would prefer we actually generate N tests in this macro where each benchmark gets its own entire function

I'd prefer that too. Unfortunately, it's hard: we don't have a simple way to iterate over the names of the benchmarks within the macro scope which generates the test module, which is why we introduce the test_bench_by_name function at all.

I'll think some more about how to accomplish this.

Contributor:

Moving the assert to inside test_bench_by_name should be equivalent to what was there before, no?

Contributor Author:

No: a failing assertion within test_bench_by_name still fails the single test without testing anything else, or propagating information about which test failed.

A short-term hack for identifying failing benchmarks would be to simply print each benchmark's name to stdout: you can identify the failing benchmark because its name will be the last one printed. On success, cargo test just swallows stdout.
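
A minimal sketch of that hack, applied to the generated loop from the expansion above (illustrative only):

```rust
for benchmark_name in Module::<Test>::benchmarks(true) {
	// Print before running: on failure, the last name printed identifies
	// the culprit; on success, `cargo test` captures and discards stdout.
	println!("{}", sp_std::str::from_utf8(benchmark_name).expect("benchmark names are utf8"));
	assert_ok!(test_bench_by_name::<Test>(benchmark_name));
}
```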

However, doing this properly involves building the functionality of impl_benchmark_test_suite directly into the benchmarks macro; there just isn't another way that I can see to generate individual test cases without requiring the user to explicitly enumerate the generated test functions. Obviously, explicitly enumerating those is worse than the status quo: it retains the boilerplate, but makes it less clear what is being accomplished.

If I have to substantially modify the benchmarks macro, I'd like to rewrite it as a procedural macro, because that's just much easier to understand than the ~900 lines of interlinked recursive macros-by-example which currently exist. However, that feels like an enormous job; it'll take me a fair while to complete.

Contributor Author:

Before I start on a huge project rewriting the benchmarks macro, what do you think of cb6a33d? It fails at the goal of generating a single test case per benchmark, but it succeeds at the goals of running all the tests, providing a list of failing tests, and reporting what the failures were.
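
For reference, the collect-then-report pattern I mean is roughly this (a sketch, not necessarily the commit's exact code):

```rust
// Run every benchmark even when one fails, recording each failure with
// the benchmark it came from, then fail the single test at the end.
let mut failures = Vec::new();
for benchmark_name in Module::<Test>::benchmarks(true) {
	if let Err(err) = test_bench_by_name::<Test>(benchmark_name) {
		failures.push((String::from_utf8_lossy(benchmark_name).into_owned(), err));
	}
}
if !failures.is_empty() {
	for (name, err) in &failures {
		println!("failed: {}: {}", name, err);
	}
	panic!("{} benchmark tests failed", failures.len());
}
```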

Contributor:

it is hard for me to tell without a UI test, but I can be fine with it

Contributor:

Hmm, actually a lot of benchmark code simply does assert and unwrap instead of returning an error. For those, maybe we can print the name of the test before its execution.

Contributor Author:

Good point. fd71312

@shawntabrizi (Member) commented:

Generally looks great. I have some feedback here that needs some tweaks, but should be easy enough.

I would also say I expect this PR to go and update all instances of benchmarks in FRAME to use this macro now. This will help you identify that indeed this macro has all the features that are needed.

@gui1117 gui1117 added the A5-grumble label and removed the A0-please_review label on Feb 12, 2021
Explicitly hand the Module to the test suite

Much better practice than depending on it showing up implicitly in
the namespace.
```rust
($bench_module:tt, $new_test_ext:path, $test:path) => {
	#[cfg(test)]
	mod tests {
		use super::{test_bench_by_name, $bench_module};
```
Contributor:

test_bench_by_name is still used implicitly by the macro.

Contributor Author:

Actually, I'm not sure that we want to make this explicit: the function is ultimately generated by the benchmarks! macro, and isn't visible to the user unless they do something like cargo doc --all-features. Passing it in explicitly gives the user more flexibility in terms of putting the impl_benchmark_test_suite invocation somewhere other than the benchmarks invocation, but if we require the user to provide it, it'll just be noise an easy majority of the time.

What do you think of this: we could add an optional parameter $path_to_benchmarks_invocation:path which defaults to super. We then use $path_to_benchmarks_invocation::test_bench_by_name;. It's still kind of implicit, but when the user needs to deal with it, it's less abstruse; they just fill in the path to the module in which they invoked benchmarks!.
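
Sketched out, that might look like the following; `benchmarks_path` is a hypothetical keyword name, and the module path is simplified to a single `ident` (which also matches the keyword `super`):

```rust
macro_rules! impl_benchmark_test_suite {
	// Default arm: assume `benchmarks!` was invoked in the parent module.
	($bench_module:ident, $new_test_ext:expr, $test:path $(,)?) => {
		impl_benchmark_test_suite!(
			$bench_module, $new_test_ext, $test, benchmarks_path = super,
		);
	};
	// Explicit arm: import the generated helper from the named module.
	($bench_module:ident, $new_test_ext:expr, $test:path, benchmarks_path = $path:ident $(,)?) => {
		#[cfg(test)]
		mod benchmark_tests {
			use $path::test_bench_by_name;
			// ...generated test body as before...
		}
	};
}
```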

Contributor:

Another idea would be to move impl_benchmark_test_suite inside the benchmarks macro itself, as an optional final argument. I don't really know what is best.

Contributor Author:

ebf64f0 has what I was talking about. I think that's more usable than requiring the end-user to specify a generated function.

```rust
// just iterate over the `Benchmarking::benchmarks` list to run the actual implementations.
#[macro_export]
macro_rules! impl_benchmark_test_suite {
	($bench_module:tt, $new_test_ext:path, $test:path) => {
```
Contributor:

Suggested change:

```diff
-($bench_module:tt, $new_test_ext:path, $test:path) => {
+($bench_module:ident, $new_test_ext:path, $test:path) => {
```

Maybe we can use ident instead of tt. Also, here I guess we can't make it a path, because later we do $bench_module::<$test>::benchmarks and the macro fails to concatenate the path somehow, doesn't it? If that is the reason, then we could also consider moving to a procedural macro.
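
For reference, a standalone illustration of the limitation (not this PR's code): with `$module:path` the matched fragment is a closed path AST node, so the transcriber can't splice `::<$test>::benchmarks` onto the end of it, while the `ident` version below expands fine:

```rust
macro_rules! call_benchmarks {
	($module:ident, $test:path) => {
		$module::<$test>::benchmarks(true)
	};
}
```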

Contributor Author:

That's precisely why we can't use path. I'll try it with ident. I'd prefer not to move to a procedural macro, if possible.

@gui1117 (Contributor) commented Feb 12, 2021:

Why not move to a procedural macro? I actually prefer them; they're easier to extend and have less weird stuff like this. cc @shawntabrizi

Contributor Author:

My main objection to procedural macros is that they're a heavy hammer: they take much longer to develop (at my level of experience) than macros-by-example. There is also some weirdness about requiring their own specially-configured crate. It doesn't feel worth it for a macro as small as impl_benchmark_test_suite currently is.

However, as I wrote in #8104 (comment), the only way I can see to get a single test case per benchmark is to merge the functionality into the benchmarks macro. Once I'm at that level of rewriting, I agree: it's worth the time to refactor the existing functionality into a procedural macro because doing so will be easier than substantially modifying a huge recursive macro by example such as currently exists.

Contributor:

My 2 cents is that once you get the initial scaffolding there, proc-macros are well worth the extra time in the long run. For this use case, I'd admit that I would personally go with them, but I can totally agree that it is simpler to start with a simple macro_rules.

coriolinus and others added 11 commits February 12, 2021 13:12
* bench_module is `ident` not `tt`

Co-authored-by: Guillaume Thiolliere <gui.thiolliere@gmail.com>

* allow end users to specify arguments for new_test_ext

This turned out to be surprisingly easy. On reflection, it turns out
that of course the compiler can't eagerly evaluate the function call,
but needs to paste it in everywhere desired.

* Revert "bump impl_version"

This reverts commit 0209e4d.

* suppress warning about ignored profiles

unfortunately, setting the profile here doesn't do anything; we'd
need to set it in every leaf package anyway. However, as this was
just making the default explicit anyway, I think it's safe enough
to remove entirely.

* impl_benchmark_test_suite for Multisig

These tests fail identically with and without the change, so the change
seems unlikely to be the origin of the failures.

* impl_benchmark_test_suite for Proxy

Fails identically with and without this change.

* impl_benchmark_test_suite for session

It turns out to be important to be able to exclude items marked
`#[extra]` sometimes. Who knew?

* impl_benchmark_test_suite for utility

Note that benchmark tests fail identically before and after this change.
```diff
@@ -536,6 +536,9 @@ benchmarks! {
	}
}
```

```rust
// can't use impl_benchmark_test_suite for this module because the tests depend on
```
Contributor:

can't we pass in a custom exec name instead of assuming execute_with?

Contributor Author:

Yes, we can; it took a while to figure out the appropriate syntax to use optional arguments in macros, but f1859ee should implement precisely that.
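
The usual trick is a forwarding arm which fills in the default; a simplified sketch (not f1859ee's exact code):

```rust
macro_rules! impl_suite {
	// No `exec_name` given: default to `execute_with`.
	($module:ident, $ext:expr, $test:path $(,)?) => {
		impl_suite!($module, $ext, $test, exec_name = execute_with);
	};
	// Full arm: `$exec` names the method called on the externalities value.
	($module:ident, $ext:expr, $test:path, exec_name = $exec:ident $(,)?) => {
		#[test]
		fn test_benchmarks() {
			$ext.$exec(|| {
				// ...run the benchmarks as before...
			});
		}
	};
}
```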

enable optional keyword argument: exec_name

Took a _lot_ of macro-wrangling to get the functionality that I want,
but now you have the option to pass in

```rust
impl_benchmark_test_suite!(
	Elections,
	crate::tests::ExtBuilder::default().desired_members(13).desired_runners_up(7),
	crate::tests::Test,
	exec_name = build_and_execute,
);
```

and have it expand out properly. A selected fragment of the expansion:

```rust
        fn test_benchmarks() {
            crate::tests::ExtBuilder::default()
                .desired_members(13)
                .desired_runners_up(7)
                .build_and_execute(|| {
```
```rust
}
impl_benchmark_test_suite!(
	Elections,
	crate::tests::ExtBuilder::default().desired_members(13).desired_runners_up(7),
```
Member:

👍

Comment on lines +1004 to +1006
```rust
/// There is an optional fifth argument, with keyword syntax: `extra = true` or `extra = false`.
/// By default, this generates a test suite which iterates over all benchmarks, including those
/// marked with the `#[extra]` annotation. Setting `extra = false` excludes those.
```
Member:

Dunno why we would want to exclude the extra benchmarks from tests?

Contributor Author:

It was necessary to replicate current behavior when swapping the implementation here:

```rust
impl_benchmark_test_suite!(
	Module,
	crate::mock::new_test_ext(),
	crate::mock::Test,
	extra = false,
);
```

The existing tests didn't test the extra benchmarks. Those extra benchmarks happen to fail. In order to maintain a passing test suite for pallet-session, I had to make it possible to exclude the extras.

Member:

So you add some extra macro argument because some tests are failing instead of fixing the failing tests?

@shawntabrizi (Member) commented Feb 16, 2021:

I agree that failing tests should have been fixed and this option should be removed

@coriolinus coriolinus merged commit d054593 into master Feb 16, 2021
@coriolinus coriolinus deleted the prgn-auto-benchmark-test-mod branch February 16, 2021 09:01
KalitaAlexey pushed a commit to KalitaAlexey/substrate that referenced this pull request Jul 9, 2021
Create a macro which automates creation of benchmark test suites. (paritytech#8104)

* Create a macro which automates creation of benchmark test suites.

* bump impl_version

* allow unused on test_bench_by_name

* use proper doctest ignore attribute

* Explicitly hand the Module to the test suite

Much better practice than depending on it showing up implicitly in
the namespace.

* explicitly import what we need into `mod tests`

* bench_module is `ident` not `tt`

Co-authored-by: Guillaume Thiolliere <gui.thiolliere@gmail.com>

* allow end users to specify arguments for new_test_ext

This turned out to be surprisingly easy. On reflection, it turns out
that of course the compiler can't eagerly evaluate the function call,
but needs to paste it in everywhere desired.

* enable explicitly specifying the path to the benchmarks invocation

also enable optional trailing commas

* Revert "bump impl_version"

This reverts commit 0209e4d.

* list failing benchmark tests and the errors which caused the failure

* harden benchmark tests against internal panics

* suppress warning about ignored profiles

unfortunately, setting the profile here doesn't do anything; we'd
need to set it in every leaf package anyway. However, as this was
just making the default explicit anyway, I think it's safe enough
to remove entirely.

* impl_benchmark_test_suite for assets

* impl_benchmark_test_suite for balances

* impl_benchmark_test_suite for bounties

* impl_benchmark_test_suite for Collective

* impl_benchmark_test_suite for Contracts

* impl_benchmark_test_suite for Democracy

* don't impl_benchmark_test_suite for Elections-Phragmen

* impl_benchmark_test_suite for Identity

Note that Identity tests currently fail. They failed in an identical
way before this change, so as far as I'm concerned, the status quo is
good enough for now.

* impl_benchmark_test_suite for ImOnline

* impl_benchmark_test_suite for indices

For this crate also, the test suite fails identically with and without
this change, so we can say that this change is not the cause of the
tests' failure to compile.

* impl_benchmark_test_suite for lottery

* impl_benchmark_test_suite for merkle-mountain-range

* impl_benchmark_test_suite for Multisig

These tests fail identically with and without the change, so the change
seems unlikely to be the origin of the failures.

* impl_benchmark_test_suite for offences

* impl_benchmark_test_suite for Proxy

Fails identically with and without this change.

* impl_benchmark_test_suite for scheduler

* impl_benchmark_test_suite for session

It turns out to be important to be able to exclude items marked
`#[extra]` sometimes. Who knew?

* impl_benchmark_test_suite for staking

* impl_benchmark_test_suite for system

* impl_benchmark_test_suite for timestamp

* impl_benchmark_test_suite for tips

* impl_benchmark_test_suite for treasury

* impl_benchmark_test_suite for utility

Note that benchmark tests fail identically before and after this change.

* impl_benchmark_test_suite for vesting

* fix wrong module name in impl_benchmark_test_suite in Offences

* address line length nits

* enable optional keyword argument: exec_name

Took a _lot_ of macro-wrangling to get the functionality that I want,
but now you have the option to pass in

```rust
impl_benchmark_test_suite!(
	Elections,
	crate::tests::ExtBuilder::default().desired_members(13).desired_runners_up(7),
	crate::tests::Test,
	exec_name = build_and_execute,
);
```

and have it expand out properly. A selected fragment of the expansion:

```rust
        fn test_benchmarks() {
            crate::tests::ExtBuilder::default()
                .desired_members(13)
                .desired_runners_up(7)
                .build_and_execute(|| {
```

* get rid of dead code

Co-authored-by: Guillaume Thiolliere <gui.thiolliere@gmail.com>