Change significance to be determined by IQR fencing #996

rylev · 2021-09-08T12:36:31Z

This changes how we define (and subsequently implement in code) a "significant" test result to a more formal and less arbitrary mechanism (described below). Additionally, the documentation is updated to reflect this change.

Before
Until now, we've used a simple threshold: a 0.2% change for non-"dodgy" test cases (i.e., test cases which we've determined not to have some sort of historical noise) and a 0.8% change for "dodgy" test cases.

After
Significance is defined as being an outlier when compared with historical data. We use interquartile range fencing to determine whether a given result is an outlier.

IQR fencing uses this formula:

interquartile_range = Q3 - Q1
result > Q3 + (interquartile_range * 1.5)
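
As a rough illustration of the formula above, here is a minimal Rust sketch of the upper-fence check. It is not the code from this PR: the function names (`quantile`, `upper_fence`, `is_significant`), the use of linearly interpolated quartiles, and the choice to treat an empty history as "not significant" are all assumptions made for the example. What the historical values represent is also left open here.

```rust
/// Value at quantile `q` (between 0.0 and 1.0) of an ascending-sorted,
/// non-empty slice, using linear interpolation between neighbouring points.
/// (Assumption: the PR may compute quartiles differently.)
fn quantile(sorted: &[f64], q: f64) -> f64 {
    let pos = q * (sorted.len() - 1) as f64;
    let lo = pos.floor() as usize;
    let hi = pos.ceil() as usize;
    sorted[lo] + (sorted[hi] - sorted[lo]) * (pos - lo as f64)
}

/// Upper IQR fence: Q3 + (interquartile_range * 1.5).
/// Returns `None` when there is no historical data to compare against.
fn upper_fence(history: &[f64]) -> Option<f64> {
    if history.is_empty() {
        return None;
    }
    let mut sorted = history.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let q1 = quantile(&sorted, 0.25);
    let q3 = quantile(&sorted, 0.75);
    let interquartile_range = q3 - q1;
    Some(q3 + interquartile_range * 1.5)
}

/// A result counts as significant when it lies above the upper fence.
/// (Assumption: with no history, nothing is flagged.)
fn is_significant(history: &[f64], result: f64) -> bool {
    match upper_fence(history) {
        Some(fence) => result > fence,
        None => false,
    }
}
```

For a history of `[1.0, 2.0, 3.0, 4.0, 5.0]`, this gives Q1 = 2.0 and Q3 = 4.0, so the fence sits at 7.0 and only results above 7.0 would be flagged.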

rylev · 2021-09-08T14:57:17Z

I'd like to hold off on merging this for now since it makes a lot more comparisons show up as "definitely relevant". We should address that first.

Change significance to be determined by IQR fencing

d07ca42

rylev requested a review from Mark-Simulacrum September 8, 2021 12:36

Mark-Simulacrum approved these changes Sep 8, 2021


rylev added 4 commits September 9, 2021 12:17

Calculate magnitude based on amount over threshold and size of change

384f918

Add ability to use old significance scheme

42db59e

Fix bug where we were calculating mean delta incorrectly

da65e82

Update docs

b351e06

rylev merged commit cd2cd93 into rust-lang:master Sep 9, 2021

rylev deleted the significance branch September 9, 2021 12:41

rylev mentioned this pull request Sep 10, 2021

Introduce new "severity" column in comparison #997

Closed
