Move all bot health images into an images/ directory
TBR=eyaich@chromium.org

Bug: 842232
Change-Id: I01d6db483807f9e7840434f586e608b8d92a490a
Reviewed-on: https://chromium-review.googlesource.com/1062285
Reviewed-by: Charlie Andrews <charliea@chromium.org>
Commit-Queue: Charlie Andrews <charliea@chromium.org>
Cr-Commit-Position: refs/heads/master@{#559191}
Charlie Andrews authored and Commit Bot committed May 16, 2018
1 parent 47cf703 commit 3156913
Showing 17 changed files with 13 additions and 13 deletions.
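The change itself is mechanical: each screenshot used by the bot health sheriffing docs moves into an images/ subdirectory, and every Markdown reference to it gains an images/ prefix. Below is a minimal sketch of how such a migration could be scripted; the regex and the use of shutil.move rather than `git mv` are assumptions made for illustration, not part of this commit.

```python
#!/usr/bin/env python3
# Illustrative sketch only: move doc screenshots into an images/ directory and
# rewrite Markdown image references to match. The regex and shutil.move
# (rather than `git mv`) are assumptions for the example, not from this commit.
import pathlib
import re
import shutil

DOCS_DIR = pathlib.Path("docs/speed/bot_health_sheriffing")
IMAGES_DIR = DOCS_DIR / "images"
# Matches ![alt](foo.png) but skips references that already start with images/.
IMAGE_REF = re.compile(r"(!\[[^\]]*\]\()(?!images/)([^)/]+\.png)(\))")

def migrate():
    IMAGES_DIR.mkdir(exist_ok=True)
    # Move every .png sitting next to the docs into images/.
    for png in DOCS_DIR.glob("*.png"):
        shutil.move(str(png), str(IMAGES_DIR / png.name))
    # Prefix bare image references in each Markdown file with images/.
    for md in DOCS_DIR.glob("*.md"):
        text = md.read_text()
        updated = IMAGE_REF.sub(r"\g<1>images/\g<2>\g<3>", text)
        if updated != text:
            md.write_text(updated)

if __name__ == "__main__":
    migrate()
```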
12 changes: 6 additions & 6 deletions docs/speed/bot_health_sheriffing/how_to_access_test_logs.md
@@ -10,29 +10,29 @@ When trying to understand a failure, it can be useful to inspect the test logs w

Usually, Sheriff-o-matic will include a link directly in the alert to the most recent instance of the test failure. Just click on "shard #0 (failed)".

-![Sheriff-o-matic shard #0 failed link](som_shard_0.png)
+![Sheriff-o-matic shard #0 failed link](images/som_shard_0.png)

### Accessing the logs for older failures

From the Sheriff-o-matic alert, click the "Examine" link to access a list of recent runs on the given bot.

-![Sheriff-o-matic open examine pane](som_examine.png)
+![Sheriff-o-matic open examine pane](images/som_examine.png)

If the failure spans multiple platforms, you can select which platform you care about from the top of the new pane.

-![Sheriff-o-matic choose bot from examine pane](som_examine_choose_bot.png)
+![Sheriff-o-matic choose bot from examine pane](images/som_examine_choose_bot.png)

This new pane shows all recent runs, not just runs where this benchmark failed, so it's often useful to use Ctrl+F and search for the failing benchmark name to highlight these runs. Once you've found the run you want the logs for, click the build number for that run.

-![Sheriff-o-matic choose build number](som_choose_build_number.png)
+![Sheriff-o-matic choose build number](images/som_choose_build_number.png)

Once at the build page listing all steps, we need to find the test step that failed. Start by selecting to show "Non-Green Only" steps at the top of the page.

-![Sheriff-o-matic choose non-green only](som_choose_non_green_only.png)
+![Sheriff-o-matic choose non-green only](images/som_choose_non_green_only.png)

After doing this, search for your benchmark's name (in this case, "v8.browsing_desktop") until you find a step with a name like "<benchmark_name> on <platform>". Click the "shard #0 (failed)" link below that step to open the logs.

-![Sheriff-o-matic choose shard #0 failed link from test steps](som_test_steps_shard_0.png)
+![Sheriff-o-matic choose shard #0 failed link from test steps](images/som_test_steps_shard_0.png)
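As a concrete illustration of that naming convention, the hedged sketch below filters a list of build step names for ones shaped like "<benchmark_name> on <platform>"; the step names themselves are invented for the example.

```python
import re

# Hypothetical step names as they might appear on a build page; only the
# "<benchmark> on <platform>" naming convention comes from the doc above.
step_names = [
    "bot_update",
    "v8.browsing_desktop on Intel GPU on Linux",
    "system_health.common_desktop on Intel GPU on Linux",
    "gsutil upload",
]

benchmark = "v8.browsing_desktop"
pattern = re.compile(r"^%s on .+" % re.escape(benchmark))

for name in step_names:
    if pattern.match(name):
        print("candidate step:", name)
```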

## Navigating log files

@@ -10,17 +10,17 @@ First, decide which of your duplicate alerts is going to be your "main" alert. I

Next, select all of the other alerts that you want to merge into this main alert. If there are any existing alert groups among these other alerts, you need to click "Ungroup all" and check all of the boxes in the dialog to ensure that the alerts are out of their existing groups.

-![Sheriff-o-matic ungroup all](ungroup_all.png)
+![Sheriff-o-matic ungroup all](images/ungroup_all.png)

-![Sheriff-o-matic bulk ungroup dialog](bulk_ungroup.png)
+![Sheriff-o-matic bulk ungroup dialog](images/bulk_ungroup.png)

Once you've ungrouped all the alerts being merged into the main alert and selected all alerts being merged, including your primary alert, click the "Group all" button at the top of the screen.

-![Sheriff-o-matic group all](group_all.png)
+![Sheriff-o-matic group all](images/group_all.png)

If the alert group still has an auto-generated name, it's a good idea to give it a name that clarifies the problem.

-![Changing the group name in Sheriff-o-matic](change_group_name.png)
+![Changing the group name in Sheriff-o-matic](images/change_group_name.png)

Lastly, it may be necessary to broaden the scope of the existing bug. If, for example, the previous alert had the name "system_health.common_desktop failing on Mac Retina Perf" and you merged in a duplicate alert on Mac Air Perf, you should change the bug and alert names to "system_health.common_desktop failing on multiple Mac builders" and simultaneously add a comment to the bug to the effect of "I'm broadening this bug because the issue is also causing failures on Mac Air Perf".
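The renaming rule in that example can be made concrete with a small sketch; alert_title is a hypothetical helper built around the doc's own example strings, not part of Sheriff-o-matic.

```python
# Hedged sketch of the broadening rule from the example above; alert_title is
# a hypothetical helper, not part of Sheriff-o-matic.
def alert_title(benchmark, builders):
    if len(builders) == 1:
        return f"{benchmark} failing on {builders[0]}"
    # Once several builders are affected, broaden rather than enumerate.
    if all(b.startswith("Mac") for b in builders):
        return f"{benchmark} failing on multiple Mac builders"
    return f"{benchmark} failing on multiple builders"

print(alert_title("system_health.common_desktop", ["Mac Retina Perf"]))
# system_health.common_desktop failing on Mac Retina Perf
print(alert_title("system_health.common_desktop",
                  ["Mac Retina Perf", "Mac Air Perf"]))
# system_health.common_desktop failing on multiple Mac builders
```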

4 changes: 2 additions & 2 deletions docs/speed/bot_health_sheriffing/how_to_snooze_an_alert.md
@@ -4,8 +4,8 @@ After addressing an alert, the next step is to snooze it. Snoozing an alert hi

To snooze an alert, click the "Snooze" button for the specified alert and then enter the number of minutes you'd like to snooze the alert for.

-![Snooze an alert](snooze_alert.png)
+![Snooze an alert](images/snooze_alert.png)

-![Alert snooze dialog](snooze_alert_dialog.png)
+![Alert snooze dialog](images/snooze_alert_dialog.png)

Generally, you should snooze an alert for however long you think it will take for your fix to take effect and a successful run to complete, removing the alert from Sheriff-o-matic. If you're unsure of how long this should be, 24 hours (1440 minutes) is usually a safe default.
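Since the snooze dialog takes minutes, an estimate made in hours needs converting first; a trivial sketch of that arithmetic:

```python
# Trivial sketch: the snooze dialog takes minutes, so convert an estimate made
# in hours. 24 hours -> 1440 minutes, the suggested default above.
def snooze_minutes(hours):
    return int(hours * 60)

print(snooze_minutes(24))  # 1440
```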
2 changes: 1 addition & 1 deletion docs/speed/bot_health_sheriffing/what_test_is_failing.md
@@ -4,7 +4,7 @@ The first step in addressing a test failure is to identify what stories are fail

The easiest way to identify these is to use the [Flakiness dashboard](https://test-results.appspot.com/dashboards/flakiness_dashboard.html#testType=blink_perf.canvas), which is a high-level dashboard showing test passes and failures. (Sheriff-o-matic tries to automatically identify the failing stories, but is often incorrect and therefore can't be trusted.) Open up the flakiness dashboard and select the benchmark and platform in question (pulled from the SOM alert) from the "Test type" and "Builder" dropdowns. You should see a view like this:

-![The flakiness dashboard](flakiness_dashboard.png)
+![The flakiness dashboard](images/flakiness_dashboard.png)

Each row represents a particular story and each column represents a recent run, listed with the most recent run on the left. If the cell is green, then the story passed; if it's red, then it failed. Only stories that have failed at least once will be listed. You can click on a particular cell to see more information like revision ranges (useful for launching bisects) and logs.

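To make the grid semantics concrete, here is a hedged sketch that classifies stories from rows like the dashboard's, most recent run first; the story names and results are invented for illustration.

```python
# Illustrative sketch of reading a flakiness-dashboard-style grid: each row is
# a story, each entry a recent run with the most recent first ("P" = passed,
# "F" = failed). The story names and results below are made up.
results = {
    "browse:news:cnn": ["F", "F", "P", "P", "P"],
    "browse:social:twitter": ["P", "F", "P", "F", "P"],
    "load:search:google": ["P", "P", "P", "P", "P"],
}

for story, runs in results.items():
    if runs[0] == "F":
        status = "failing in the most recent run"
    elif "F" in runs:
        status = "flaky: failed at least once recently"
    else:
        status = "passing"
    print(f"{story}: {status}")
```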
