Create a macro which automates creation of benchmark test suites. #8104
Conversation
frame/benchmarking/src/lib.rs
Outdated
///
/// ## Arguments
///
/// The first argument, `new_test_ext`, must be the path to a function which takes no arguments
Taking no arguments is unfortunately a bad assumption for this function.
Take a look here:
https://github.com/paritytech/substrate/blob/master/frame/babe/src/benchmarking.rs#L76
One thing that will help you figure out whether this PR satisfies our needs is to replace all of the existing benchmark test instances with your new macro and confirm everything compiles. You will have to be careful with customization like this (although for any really heavy customization we can simply say it is not supported by the macro and should be implemented manually).
Good catch! Working on it.
a6cf5d4 should resolve this.
frame/benchmarking/src/lib.rs
Outdated
$new_test_ext().execute_with(|| {
  use $crate::Benchmarking;
  for benchmark_name in Module::<$test>::benchmarks(true) {
    assert_ok!(test_bench_by_name::<$test>(benchmark_name));
I am concerned with this approach: if one of the benchmarks fails, it will be hard to identify which benchmark is the problem.
I would prefer that we actually generate N tests in this macro, where each benchmark gets its own function, so that when one fails it is easy to identify which test failed.
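For illustration, a rough sketch of the kind of expansion being asked for, hedged and not taken from this PR: one `#[test]` function per benchmark, so a failure names the offending benchmark directly. The benchmark names `transfer` and `set_balance` and the `crate::mock` paths are hypothetical; `test_benchmark_*` stands for the per-benchmark helpers the `benchmarks!` macro already generates.

```rust
#[cfg(test)]
mod benchmark_tests {
    use super::*;
    use crate::mock::{new_test_ext, Test};
    use frame_support::assert_ok;

    #[test]
    fn bench_transfer() {
        new_test_ext().execute_with(|| {
            // A failure here is reported against `bench_transfer` specifically.
            assert_ok!(test_benchmark_transfer::<Test>());
        });
    }

    #[test]
    fn bench_set_balance() {
        new_test_ext().execute_with(|| {
            assert_ok!(test_benchmark_set_balance::<Test>());
        });
    }
}
```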
This is critical for debugging during benchmark development.
> I would prefer we actually generate N tests in this macro where each benchmark gets its own entire function
I'd prefer that too. Unfortunately, it's hard: we don't have a simple way to iterate over the names of the benchmarks within the macro scope that generates the test module, which is why we introduce the `test_bench_by_name` function at all.
I'll think some more about how to accomplish this.
Moving the assert to inside `test_bench_by_name` should be equivalent to what was there before, no?
No: a failing assertion within `test_bench_by_name` still fails the single test without testing anything else, or propagating information about which benchmark failed.
A short-term hack for identifying failing benchmarks would be to simply print the current benchmark to stdout: you can identify the failing benchmark because its name will be the last one printed. On success, `cargo test` will just swallow stdout.
However, doing this properly involves building the functionality of `impl_benchmark_test_suite` directly into the `benchmarks` macro; there just isn't another way that I can see to generate individual test cases without requiring the user to explicitly enumerate the generated test functions. Obviously, explicitly enumerating those is worse than the status quo: it retains the boilerplate, but makes it less clear what is being accomplished.
If I have to substantially modify the `benchmarks` macro, I'd like to rewrite it as a procedural macro, because that would be much easier to understand than the roughly 900 lines of interlinked recursive macros-by-example that currently exist. However, that feels like an enormous job; it will take me a fair while to complete.
Before I start on a huge project rewriting the `benchmarks` macro, what do you think of cb6a33d? It fails at the goal of generating a single test case per benchmark, but it succeeds at running all the tests and providing a list of the failing tests along with what the failures were.
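For context, a hedged sketch of that approach as a fragment of the generated test body (not necessarily what cb6a33d actually does): run every benchmark, collect the failures, and only assert at the end so the report lists all of them. `$test`, `Module`, and `test_bench_by_name` are the names used in the diff above.

```rust
// Fragment of the generated test body, not standalone code.
let mut failures = Vec::new();
for benchmark_name in Module::<$test>::benchmarks(true) {
    // Record the failure and keep going, rather than aborting on the first error.
    if let Err(err) = test_bench_by_name::<$test>(benchmark_name) {
        failures.push((String::from_utf8_lossy(benchmark_name).into_owned(), err));
    }
}
assert!(failures.is_empty(), "failing benchmarks: {:#?}", failures);
```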
It is hard for me to tell without a UI test, but I can be fine with it.
Hmm, actually a lot of code in benchmarks simply asserts and unwraps instead of returning errors.
For those, maybe we can print the name of the test before its execution.
Good point. fd71312
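A minimal sketch of that idea as a fragment of the generated loop (hedged; the actual commit may differ):

```rust
// Fragment of the generated test body, not standalone code.
for benchmark_name in Module::<$test>::benchmarks(true) {
    // Print before running: if the benchmark asserts or unwraps internally and
    // panics, its name is the last line printed. `cargo test` only shows stdout
    // for failing tests, so passing runs stay quiet.
    println!("running benchmark: {}", String::from_utf8_lossy(benchmark_name));
    assert_ok!(test_bench_by_name::<$test>(benchmark_name));
}
```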
Generally looks great. I have some feedback here that needs some tweaks, but it should be easy enough. I would also say I expect this PR to go and update all instances of benchmark tests in FRAME to use this macro now. This will help you confirm that this macro indeed has all the features that are needed.
Much better practice than depending on it showing up implicitly in the namespace.
frame/benchmarking/src/lib.rs
Outdated
($bench_module:tt, $new_test_ext:path, $test:path) => {
  #[cfg(test)]
  mod tests {
    use super::{test_bench_by_name, $bench_module};
`test_bench_by_name` is still used implicitly by the macro.
Actually, I'm not sure that we want to make this explicit: the function is ultimately generated by the `benchmarks!` macro, and isn't visible to the user unless they do something like `cargo doc --all-features`. Passing it in explicitly gives the user more flexibility in terms of putting the `impl_benchmark_test_suite` invocation somewhere other than the `benchmarks` invocation, but if we require the user to provide it, it'll just be noise the large majority of the time.
What do you think of this: we could add an optional parameter `$path_to_benchmarks_invocation:path` which defaults to `super`. We then use `$path_to_benchmarks_invocation::test_bench_by_name;`. It's still somewhat implicit, but when the user does need to deal with it, it's less abstruse; they just fill in the path to the module in which they invoked `benchmarks!`.
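To make this concrete, here is a runnable toy of just the defaulting mechanism, with entirely hypothetical names (this is not the real macro): an optional trailing path argument that defaults to `super` and is used to locate the generated helper. The path is captured as repeated `ident` segments rather than a single `path` fragment, since (as the `tt`/`ident` discussion later in this thread runs into) appending further `::` segments after a `path` fragment doesn't work.

```rust
macro_rules! impl_test_suite_sketch {
    // No path given: assume the helper lives in the parent of the module where
    // this macro is invoked, i.e. `super`.
    () => {
        impl_test_suite_sketch!(super);
    };
    ($($helper_path:ident)::+) => {
        #[test]
        fn generated_test() {
            // Call the helper from wherever the caller said it lives.
            assert!($($helper_path)::+::test_bench_by_name("dummy"));
        }
    };
}

mod benchmarking {
    // Stand-in for the helper the `benchmarks!` macro would generate.
    pub fn test_bench_by_name(_name: &str) -> bool {
        true
    }

    mod tests {
        // Default form: expands to `super::test_bench_by_name(...)`.
        impl_test_suite_sketch!();
    }
}

mod elsewhere {
    mod tests {
        // Explicit form: point at the module where the helper actually lives.
        impl_test_suite_sketch!(crate::benchmarking);
    }
}
```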
Another idea would be to move `impl_benchmark_test_suite` inside the benchmark macro itself as an optional final argument.
I don't really know what is best.
ebf64f0 has what I was talking about. I think that's more usable than requiring the end-user to specify a generated function.
frame/benchmarking/src/lib.rs
Outdated
// just iterate over the `Benchmarking::benchmarks` list to run the actual implementations.
#[macro_export]
macro_rules! impl_benchmark_test_suite {
  ($bench_module:tt, $new_test_ext:path, $test:path) => {
- ($bench_module:tt, $new_test_ext:path, $test:path) => {
+ ($bench_module:ident, $new_test_ext:path, $test:path) => {
Maybe we can use `ident` instead of `tt`.
Also, I guess we can't make it a `path` here because later we do `$bench_module::<$test>::benchmarks`, and the macro fails to concatenate the path somehow, doesn't it?
If that is the reason, then we could also consider moving to a procedural macro.
That's precisely why we can't use `path`. I'll try it with `ident`. I'd prefer not to move to a procedural macro, if possible.
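For illustration, a minimal self-contained example (hypothetical types, not the benchmarking framework) of the composition that an `ident` fragment allows here:

```rust
use core::marker::PhantomData;

#[allow(dead_code)]
struct Module<T>(PhantomData<T>);

impl<T> Module<T> {
    fn benchmarks(_extra: bool) -> Vec<&'static str> {
        vec![]
    }
}

#[allow(dead_code)]
struct Test;

macro_rules! list_benchmarks {
    // An `ident` fragment can be followed by further `::` tokens in the
    // expansion, so `$bench_module::<$test>::benchmarks(..)` works; with a
    // `path` fragment in that position the extra segments cannot be appended,
    // which is the concatenation failure described above.
    ($bench_module:ident, $test:path) => {
        $bench_module::<$test>::benchmarks(true)
    };
}

fn main() {
    // Expands to `Module::<Test>::benchmarks(true)`.
    let names = list_benchmarks!(Module, Test);
    assert!(names.is_empty());
}
```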
Why not move to procedural? I actually prefer them; they are easier to extend and have less weird stuff like this.
cc @shawntabrizi
My main objection to procedural macros is that they're a heavy hammer: they take much longer to develop (at my level of experience) than macros-by-example. There is also some awkwardness about requiring their own specially-configured crate. It doesn't feel worth it for a macro as small as `impl_benchmark_test_suite` currently is.
However, as I wrote in #8104 (comment), the only way I can see to get a single test case per benchmark is to merge the functionality into the `benchmarks` macro. Once I'm at that level of rewriting, I agree: it's worth the time to refactor the existing functionality into a procedural macro, because doing so will be easier than substantially modifying a huge recursive macro-by-example such as currently exists.
My 2 cents is that once you get the initial scaffolding there, proc-macros are well worth the extra time in the long run. For this use case, I'll admit that I would personally go with them, but I can totally agree that it is simpler to start with a simple `macro_rules`.
Co-authored-by: Guillaume Thiolliere <gui.thiolliere@gmail.com>
This turned out to be surprisingly easy. On reflection, of course the compiler can't eagerly evaluate the function call; it needs to paste the expression in everywhere it is used.
also enable optional trailing commas
This reverts commit 0209e4d.
Unfortunately, setting the profile here doesn't do anything; we'd need to set it in every leaf package anyway. However, as this was just making the default explicit, I think it's safe enough to remove entirely.
These tests fail identically with and without the change, so the change seems unlikely to be the origin of the failures.
Fails identically with and without this change.
It turns out to be important to be able to exclude items marked `#[extra]` sometimes. Who knew?
Note that benchmark tests fail identically before and after this change.
@@ -536,6 +536,9 @@ benchmarks! {
	}
}

// can't use impl_benchmark_test_suite for this module because the tests depend on
Can't we pass in a custom exec name instead of assuming `execute_with`?
Yes, we can; it took a while to figure out the appropriate syntax for optional arguments in macros, but f1859ee should implement precisely that.
Took a _lot_ of macro-wrangling to get the functionality that I want, but now you have the option to pass in

```rust
impl_benchmark_test_suite!(
    Elections,
    crate::tests::ExtBuilder::default().desired_members(13).desired_runners_up(7),
    crate::tests::Test,
    exec_name = build_and_execute,
);
```

and have it expand out properly. A selected fragment of the expansion:

```rust
fn test_benchmarks() {
    crate::tests::ExtBuilder::default()
        .desired_members(13)
        .desired_runners_up(7)
        .build_and_execute(|| {
```
}
impl_benchmark_test_suite!(
  Elections,
  crate::tests::ExtBuilder::default().desired_members(13).desired_runners_up(7),
👍
/// There is an optional fifth argument, with keyword syntax: `extra = true` or `extra = false`.
/// By default, this generates a test suite which iterates over all benchmarks, including those
/// marked with the `#[extra]` annotation. Setting `extra = false` excludes those.
Dunno why we would want to exclude the extra benchmarks from tests?
It was necessary to replicate current behavior when swapping the implementation here (substrate/frame/session/benchmarking/src/lib.rs, lines 172 to 177 at f3d2f8f):
impl_benchmark_test_suite!(
  Module,
  crate::mock::new_test_ext(),
  crate::mock::Test,
  extra = false,
);
The existing tests didn't cover the extra benchmarks, and those extra benchmarks happen to fail. In order to maintain a passing test suite for `pallet-session`, I had to make it possible to exclude the extras.
So you add an extra macro argument because some tests are failing, instead of fixing the failing tests?
I agree that the failing tests should have been fixed and that this option should be removed.
…ritytech#8104)

* Create a macro which automates creation of benchmark test suites.
* bump impl_version
* allow unused on test_bench_by_name
* use proper doctest ignore attribute
* Explicitly hand the Module to the test suite

  Much better practice than depending on it showing up implicitly in the namespace.

* explicitly import what we need into `mod tests`
* bench_module is `ident` not `tt`

  Co-authored-by: Guillaume Thiolliere <gui.thiolliere@gmail.com>

* allow end users to specify arguments for new_test_ext

  This turned out to be surprisingly easy. On reflection, it turns out that of course the compiler can't eagerly evaluate the function call, but needs to paste it in everywhere desired.

* enable explicitly specifying the path to the benchmarks invocation

  also enable optional trailing commas

* Revert "bump impl_version"

  This reverts commit 0209e4d.

* list failing benchmark tests and the errors which caused the failure
* harden benchmark tests against internal panics
* suppress warning about ignored profiles

  unfortunately, setting the profile here doesn't do anything; we'd need to set it in every leaf package anyway. However, as this was just making the default explicit anyway, I think it's safe enough to remove entirely.

* impl_benchmark_test_suite for assets
* impl_benchmark_test_suite for balances
* impl_benchmark_test_suite for bounties
* impl_benchmark_test_suite for Collective
* impl_benchmark_test_suite for Contracts
* impl_benchmark_test_suite for Democracy
* don't impl_benchmark_test_suite for Elections-Phragmen
* impl_benchmark_test_suite for Identity

  Note that Identity tests currently fail. They failed in an identical way before this change, so as far as I'm concerned, the status quo is good enough for now.

* impl_benchmark_test_suite for ImOnline
* impl_benchmark_test_suite for indices

  For this crate also, the test suite fails identically with and without this change, so we can say that this change is not the cause of the tests' failure to compile.

* impl_benchmark_test_suite for lottery
* impl_benchmark_test_suite for merkle-mountain-range
* impl_benchmark_test_suite for Multisig

  These tests fail identically with and without the change, so the change seems unlikely to be the origin of the failures.

* impl_benchmark_test_suite for offences
* impl_benchmark_test_suite for Proxy

  Fails identically with and without this change.

* impl_benchmark_test_suite for scheduler
* impl_benchmark_test_suite for session

  It turns out to be important to be able to exclude items marked `#[extra]` sometimes. Who knew?

* impl_benchmark_test_suite for staking
* impl_benchmark_test_suite for system
* impl_benchmark_test_suite for timestamp
* impl_benchmark_test_suite for tips
* impl_benchmark_test_suite for treasury
* impl_benchmark_test_suite for utility

  Note that benchmark tests fail identically before and after this change.

* impl_benchmark_test_suite for vesting
* fix wrong module name in impl_benchmark_test_suite in Offences
* address line length nits
* enable optional keyword argument: exec_name

  Took a _lot_ of macro-wrangling to get the functionality that I want, but now you have the option to pass in

  ```rust
  impl_benchmark_test_suite!(
      Elections,
      crate::tests::ExtBuilder::default().desired_members(13).desired_runners_up(7),
      crate::tests::Test,
      exec_name = build_and_execute,
  );
  ```

  and have it expand out properly. A selected fragment of the expansion:

  ```rust
  fn test_benchmarks() {
      crate::tests::ExtBuilder::default()
          .desired_members(13)
          .desired_runners_up(7)
          .build_and_execute(|| {
  ```

* get rid of dead code

Co-authored-by: Guillaume Thiolliere <gui.thiolliere@gmail.com>