Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[1 of 2] Fix & re-enable perf profiler test: update Java test program to better match test expectations. #904

Merged
merged 9 commits into from
Feb 27, 2023

Conversation

etep
Copy link

@etep etep commented Feb 22, 2023

Summary: In preparation for re-enabling the profiler test cases, we update the Java test program such that its leaf functions do the exact amount of work expected. Previously, we computed two different Fibonacci numbers F27 and F52 with the expectation that F52 would require 2x the work. Unfortunately, there was some variance between JVMs and test runs, and this caused test flakiness.

In this patch, we simplify the test program. The new (and renamed) leaf functions simply count by some increment. The increment is chosen such that a certain leaf function does twice the counting and hence twice the work.

In a future diff, to reduce test run time, we can remove some of the Java test cases in a different PR.

Relevant Issues: #719

Type of change: /kind bug fix.

Test Plan: Tested locally using --runs_per_test=32. All tests passed. Also verified test flakiness by running tests without the fix.

Pete Stevenson added 3 commits February 22, 2023 10:12
Signed-off-by: Pete Stevenson <jps@pixielabs.ai>
Signed-off-by: Pete Stevenson <jps@pixielabs.ai>
Signed-off-by: Pete Stevenson <jps@pixielabs.ai>
@etep etep requested a review from a team February 22, 2023 19:42
Comment on lines 455 to 456
// TODO(jps/oazizi): This test is flaky.
TEST_F(PerfProfileBPFTest, DISABLED_GraalVM_AOT_Test) {
Copy link
Member

@ddelnano ddelnano Feb 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to remove the DISABLED_ prefix and this TODO, right? Can you confirm that your testing was performed with this applied?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Testing was done using a different branch that re-based the next diff (which removes DISABLED_) onto this diff.

Signed-off-by: Pete Stevenson <jps@pixielabs.ai>
Pete Stevenson added 2 commits February 22, 2023 12:43
Signed-off-by: Pete Stevenson <jps@pixielabs.ai>
Signed-off-by: Pete Stevenson <jps@pixielabs.ai>
@etep etep marked this pull request as draft February 22, 2023 20:45
@etep
Copy link
Author

etep commented Feb 22, 2023

Converting to draft while the tests run.

@etep etep changed the title [1 of 2] Fix & re-enable perf profiler tests: update Java test program to better match test expectations. [1 of 2] Fix & re-enable perf profiler test: update Java test program to better match test expectations. Feb 22, 2023
// Unfortunately, different JVMs inline functions in different ways. Some of them would inline
// leaf1x and leaf2x and as a result, we were unable to find the expected symbols.

public static long leaf1x() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason we can't just sleep for 5 seconds in one and 10 seconds in the other?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does sleep just de-schedule the process? If so, then we probably wouldn't actually see our process doing much at all. Also, this gives us control over the symbols (i.e. versus whatever specific impl. of sleep comes with the multitude of JVMs that we may test with).

Copy link
Member

@JamesMBartlett JamesMBartlett Feb 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose sleep wouldn't quite work but something like:

long startTime = System.currentTimeMillis();

while ((System.curentTimeMillis() - startTime) < 5*1000) {
}

seems a lot simpler to me. As it is there's m, n, s, i, j, and k and its quite hard to understand what's going on.

In terms of the stack traces, couldn't we just ignore anything below leaf1x or leaf2x in the stack when doing assertions.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We removed some of the seemingly useless code and added a better comment to explain what remains.

In more detail, we ran a few tests to evaluate different versions of ProfilerTest.java.

  1. we tested using the call to System.currentTimeMillis().
  2. we tested both with and without the "dead" and "useless" code in the loop.

For experiment (1) we found that use of the system call caused stack traces to get "broken": when the stack trace is sampled inside of the clock system call, it does not get connected to its caller, i.e. the system call to clock is not connected to its calling function leaf1x or leaf2x. The test expectations were met, but with fewer samples and more variance.

For experiment (2), we removed the "prime number mod" in the loop. We found that some JVMs were able to statically analyze the very simple code and create a highly optimized representation of the function. For example, without the "dead code" in the loop, for our "azul-zulu-alipine" image, we found that leaf1x and leaf2x did the same amount of work and that they had a 10x speedup versus their "harder to optimize version" that we are merging in this PR. We want the "harder to optimize" version because we want leaf2x and leaf1x to do the work so that our test expectation is met.

Pete Stevenson added 3 commits February 22, 2023 16:01
Signed-off-by: Pete Stevenson <jps@pixielabs.ai>
Signed-off-by: Pete Stevenson <jps@pixielabs.ai>
Signed-off-by: Pete Stevenson <jps@pixielabs.ai>
@etep etep marked this pull request as ready for review February 24, 2023 00:22
Copy link
Member

@JamesMBartlett JamesMBartlett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like a legitimate issue that stack traces are broken on Java with anything that has system calls. I also don't think its a rare case, so likely many users are seeing missing stacktraces (for the functions they care about) when there are system calls involved. But for now I think this is fine to get the test working again.

@vihangm vihangm merged commit f510a50 into pixie-io:main Feb 27, 2023
RagalahariP pushed a commit to RagalahariP/pixie that referenced this pull request Mar 23, 2023
… to better match test expectations. (pixie-io#904)

Summary: In preparation for re-enabling the profiler test cases, we
update the Java test program such that its leaf functions do the exact
amount of work expected. Previously, we computed two different Fibonacci
numbers F27 and F52 with the expectation that F52 would require 2x the
work. Unfortunately, there was some variance between JVMs and test runs,
and this caused test flakiness.

In this patch, we simplify the test program. The new (and renamed) leaf
functions simply count by some increment. The increment is chosen such
that a certain leaf function does twice the counting and hence twice the
work.

In a future diff, to reduce test run time, we can remove some of the
Java test cases in a different PR.

Relevant Issues: pixie-io#719

Type of change: /kind bug fix.

Test Plan: Tested locally using --runs_per_test=32. All tests passed.
Also verified test flakiness by running tests without the fix.

---------

Signed-off-by: Pete Stevenson <jps@pixielabs.ai>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants