Commit c7bc05f5 authored by Dave Tu, committed by Commit Bot

Update speed docs for Pinpoint.

Bug: catapult:#4159, catapult:#4162
Change-Id: I074cada6d35fa3b0908e8a8d9451e228746db47f
Reviewed-on: https://chromium-review.googlesource.com/922701
Commit-Queue: David Tu <dtu@chromium.org>
Reviewed-by: Annie Sullivan <sullivan@chromium.org>
Reviewed-by: Simon Hatch <simonhatch@chromium.org>
Cr-Commit-Position: refs/heads/master@{#537409}
parent 7dfb9b6d
# Bisecting Performance Regressions
[TOC]
## What are performance bisects?
The perf tests on chromium's continuous build are very long-running, so we
cannot run them on every revision. Further, separate repositories like v8
and skia sometimes roll multiple performance-sensitive changes into chromium
at once. For these reasons, we need a tool that can bisect the root cause of
performance regressions over a CL range, descending into third_party
repositories as necessary. The service that does this is called
[Pinpoint](https://pinpoint-dot-chromeperf.appspot.com/).
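Conceptually, a performance bisect is similar to running a manual `git bisect`
over the chromium range with a script that builds and benchmarks each candidate
revision. A rough sketch of that manual analogue (the helper script and its
threshold are hypothetical; Pinpoint additionally repeats noisy runs and can
descend into rolled third_party repositories):

```
# Manual analogue of a performance bisect (sketch only).
# check_perf.sh is a hypothetical script that builds Chrome, runs the
# benchmark, and exits non-zero when the metric is worse than a chosen
# threshold.
git bisect start <bad-commit> <good-commit>
git bisect run ./check_perf.sh
git bisect reset
```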
## Starting a perf bisect
Performance bisects are integrated with the
[Chrome Performance Dashboard](https://chromeperf.appspot.com/alerts) and
[monorail](https://bugs.chromium.org/p/chromium/issues/list). Users kick off
perf bisects on the perf dashboard and view results in monorail.
You can kick off perf bisect from performance graphs on the perf dashboard for
any test that runs on the
[chromium.perf waterfall](https://ci.chromium.org/p/chromium/g/chromium.perf/builders).
### To get to a graph, use one of the following methods:
### To kick off a bisect from the graph:
1. Click on a data point in the graph.
2. In the tooltip that shows up, click the `BISECT` button.
3. Make sure to enter a Bug ID in the dialog that comes up.
4. Click the `CREATE` button.
![Bisecting on a performance graph](images/bisect_graph.png)
![The bisect dialog](images/bisect_dialog.png =100)
### What are all the boxes in the form?
* **Bisect bot**: The name of the configuration in the perf lab to bisect on.
This has been prefilled to match the bot that generated the graph as
closely as possible.
* **Metric**: The metric of the performance test to bisect. This defaults to
the metric shown on the graph. It shows a list of other related metrics
(for example, if average page load time increased, the drop down will show
a list of individual pages which were measured).
* **Bug ID**: The bug number in monorail. It's very important to fill in
this field, as this is where bisect results will be posted.
* **Start commit**: The chromium commit pos to start bisecting from. This
is prefilled by the dashboard to the start of the revision range for the
point you clicked on. You can set it to an earlier commit position to
bisect a larger range.
* **End commit**: The chromium commit pos to bisect to. This is prefilled
by the dashboard to the end of the revision range for the point you clicked
on. You can set it to a later commit pos to bisect a larger range.
* **Story Filter**: This is a flag specific to
[telemetry](https://github.com/catapult-project/catapult/blob/master/telemetry/README.md).
It tells telemetry to only run a specific test case, instead of running all
the test cases in the suite. This dramatically reduces bisect time for
large test suites. The dashboard will prefill this box based on the graph
you clicked on. If you suspect that test cases in the benchmark are not
    independent, you can try bisecting with this box cleared (a sketch of the
    equivalent local telemetry invocation follows this list).
* **Performance or functional**: use "performance" to bisect on a performance
metric, or "functional" to bisect on a test failure or flake.
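For reference, the Story Filter corresponds to telemetry's story selection. A
minimal local sketch of running just one story, assuming a chromium checkout
and a local release build (the benchmark and story names are only examples,
and the `--story-filter` flag takes a regex over story names):

```
# Run a single story from a benchmark instead of the whole suite.
# The benchmark and story names here are illustrative.
tools/perf/run_benchmark run system_health.common_desktop \
    --browser=release \
    --story-filter='load:search:google'
```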
## Interpreting the results
The bisect bot will output a comment on the bug when the bisection is complete. See
[Understanding the Bisect Results](addressing_performance_regressions.md#Understanding-the-bisect-results)
for details on how to interpret the results.
### Traces and stdout
On the Job result page, there is a line chart. Each dot represents a commit, and the bisect culprits are shown as flashing dots. Clicking on a dot reveals some colored bars; each bar represents one benchmark run. Click on one of the runs to see trace links, and click on the `task_id` link to see the stdout.
![Trace links](images/pinpoint-trace-links.png)
# Perf Try Bots
[TOC]
## What is a perf try job?
Chrome has a performance lab with dozens of device and OS configurations. You
can run performance tests on an unsubmitted CL on these devices using
[Pinpoint](https://pinpoint-dot-chromeperf.appspot.com). The CL is run against
tip-of-tree both with and without the patch applied.
## Supported platforms
The platforms available in the lab change over time. To see the currently supported platforms, click the "configuration" dropdown on the dialog.
## Why perf try jobs?
* All of the devices exactly match the hardware and OS versions in the perf
continuous integration suite.
* The devices have the "maintenance mutex" enabled, reducing noise from
background processes.
* The devices are instrumented with BattOrs for power measurements.
* Some regressions take multiple repeats to reproduce, and Pinpoint
automatically runs multiple times and aggregates the results.
* Some regressions reproduce on some devices but not others, and Pinpoint will
run the job on multiple devices.
## Starting a perf try job
Visit [Pinpoint](https://pinpoint-dot-chromeperf.appspot.com) and click the perf try button in the bottom right corner of the screen.
![Pinpoint Perf Try Button](images/pinpoint-perf-try-button.png)
You should see the following dialog popup:
![Perf Try Dialog](images/pinpoint-perf-try-dialog.png)
**Build Arguments**| **Description**
--- | ---
Bug ID | (optional) A bug ID. Pinpoint will post updates on the bug.
Gerrit URL | The patch you want to run the benchmark on. Patches in dependent repos (e.g. v8, skia) are supported.
Bot | The device type to run the test on. All hardware configurations in our perf lab are supported.
<br>
**Test Arguments**| **Description**
--- | ---
Benchmark | A telemetry benchmark. E.g. `system_health.common_desktop`<br><br>All the telemetry benchmarks are supported by the perf trybots. To get a full list, run `tools/perf/run_benchmark list`<br><br>To learn more about the benchmarks, you can read about the [system health benchmarks](https://docs.google.com/document/d/1BM_6lBrPzpMNMtcyi2NFKGIzmzIQ1oH3OlNG27kDGNU/edit?ts=57e92782), which test Chrome's performance at a high level, and the [benchmark harnesses](https://docs.google.com/spreadsheets/d/1ZdQ9OHqEjF5v8dqNjd7lGUjJnK6sgi8MiqO7eZVMgD0/edit#gid=0), which cover more specific areas.
Story | (optional) A specific story from the benchmark to run.
Extra Test Arguments | (optional) Extra arguments for the test. E.g. `--extra-chrome-categories="foo,bar"`<br><br>To see all arguments, run `tools/perf/run_benchmark run --help` (see also the local sketch after these tables)
**Values Arguments**| **Description**
--- | ---
Chart | (optional) Please ignore.
TIR Label | (optional) Please ignore.
Trace | (optional) Please ignore.
Statistic | (optional) Please ignore.
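The test arguments map directly onto telemetry's `run_benchmark` command line,
so you can preview them locally before submitting a try job. A sketch, assuming
a chromium checkout and a local release build (the benchmark name and trace
categories are illustrative):

```
# List every benchmark the trybots can run.
tools/perf/run_benchmark list

# Show all supported test arguments.
tools/perf/run_benchmark run --help

# Example local run with try-job-style arguments.
tools/perf/run_benchmark run system_health.common_desktop \
    --browser=release \
    --extra-chrome-categories="foo,bar"
```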
## Interpreting the results
### Detailed results
On the Job result page, click the "Analyze benchmark results" link at the top. See the [metrics results UI documentation](https://github.com/catapult-project/catapult/blob/master/docs/metrics-results-ui.md) for more details on reading the results.
### Traces
On the Job result page, there is a chart containing two dots. The left dot represents HEAD and the right dot represents the patch. Clicking on the right dot reveals some colored bars; each bar represents one benchmark run. Click on one of the runs to see trace links.
![Trace links](images/pinpoint-trace-links.png)