Closed Bug 1150152 Opened 8 years ago Closed 7 years ago

Generate and persist performance series alerts in perfherder

Categories

(Tree Management :: Perfherder, defect)

x86
macOS
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: wlach, Assigned: wlach)

References

Details

Attachments

(3 files, 3 obsolete files)

We should generate summary alerts in treeherder/perfherder. This will allow us to tightly integrate the alerts with various views of the performance data (annotated graphs, etc.) and paves the way towards eventually integrating the alertmanager interface inside treeherder/perfherder itself.

My current plan is to more or less reuse the current graphserver algorithm. For a first iteration, I would be happy providing some kind of JSON feed of alerts, that can be compared against what's currently being produced by graphserver. That will cover the scope of this bug. After that's finished, we can file followups to reach full feature parity with graphserver alerts.
Depends on: 1174877
Commit pushed to master at https://github.com/mozilla/treeherder

https://github.com/mozilla/treeherder/commit/8baa73b9b4dfa679056516bdfb6b8dabcbd97ed1
Bug 1150152 - Management command for generating perf regression lists in CSV

This is an interim test script to allow us to prove out that we're still
generating the same performance regressions as graphserver (roughly). I
suspect it'll still be useful once we have proper alerts inserted into
the database, however.
I just pushed a testing script that allows us to generate a list of alerts in CSV format. I didn't bother asking for review as it doesn't change any production functionality (and wanted to be able to use it in production to produce testing data for :jmaher asap)
Attached file perfalerts.csv (obsolete) —
Hey Joel, here's the set of summary alerts that are detected by the graphserver code that I imported into the tree. I believe it is using the same settings as graphserver.

It's a CSV file, header information on the fields is on the top. Should be easy to import into a spreadsheet or whatever.
Flags: needinfo?(jmaher)
these are the alerts from alert manager.  I know if I am on us holiday and a weekend, I might not get to this before next Monday.  Of course this could wait until then.

Ideally we need to match the fields up as much as possible.
Flags: needinfo?(jmaher)
Attached file perfalerts.csv (obsolete) —
perfalerts with platform info!! LOL
Attachment #8627810 - Attachment is obsolete: true
a couple things:
1) how to deal with reverse tests?  what is a + or a - in the percent field?


I am not sure the best way to query this.  I am thinking something like this:
for type in [regression, improvement]:
    for branch in branches:
        for revision in branch[revisions]:
            phAlerts = phQueryAlerts(type, branch, revision)
            gsAlerts = gsQueryAlerts(type, branch, revision)
            compareAlerts(phAlerts, gsAlerts)


compareAlerts(phAlerts, gsAlerts):
    for phTest in phAlerts:
        for gsTest in gsAlerts:
            if len(gsTest) != len(phTest):
                print '-- gsTest != phTest' 
            else:
                print '++ gsTest'


essentially comparing on the revision axis vs per test, etc.
waiting for perfherder data to be less duplicated- next week lets regenerate our data and try again.
Attached file alerts-jul14-2015.csv (obsolete) —
Alerts from July 14 with de-duplicated data
Attachment #8628949 - Attachment is obsolete: true
Attached file alerts-jul17.csv
There was yet another case where we could have duplicated datapoints due to the result entries being in a different order (see the followups in bug 1182282). Here's a regenerated set with that fixed.
Attachment #8633530 - Attachment is obsolete: true
looking over this we have:
* half of the alerts are 100% match with graph server
* 30% are just slightly off (in % or revisions)
* 20% are duplicates handled differently or tests which we need to work on

some next steps:
1) verify test by test that we have valid data (look at the summary data).  For example pageloader style tests where we calculate the geomean are not 100% identical- lets understand it
2) fix v8, kraken, canvasmark to use a proper summarization of the subtests
3) resolve xperf and related metrics
4) look at counters- verify we can generate alerts for those

I think once we have done these 4 things, we can mark this bug as done and assume that our data and alert generation code is in parity with graph server.
for the xperf data and counters, we will need to do a small database refactor.  This is on the wlach queue for next week.  The main work here is to define what a subtest is vs a summary.  Currently we only alert on summaries, but counters would either need to have a summary or we need to redefine the data to be alertable.
Blocks: 1184966
No longer depends on: 1184966
No longer blocks: 1184966
Summary: Generate series summary alerts in perfherder → Generate and persist performance series alerts in perfherder
Blocks: 1201154
Attached file PR for persistence
Attachment #8658918 - Flags: feedback?(jmaher)
Comment on attachment 8658918 [details]
PR for persistence

I would like to know if the amount_pct is +-?  Also how do we know if it is reverse or not (e.g. improvement or regression?)

does result_set_id give us the ability to query the revision, branch, platform, buildtype?

lastly will counters fall into the same bucket with suite_name: "tp5o_main_RSS" or something like that?

overall this is a great starting point!
Attachment #8658918 - Flags: feedback?(jmaher) → feedback+
(In reply to Joel Maher (:jmaher) from comment #14)
> Comment on attachment 8658918 [details]
> PR for persistence
> 
> I would like to know if the amount_pct is +-?  Also how do we know if it is
> reverse or not (e.g. improvement or regression?)

Its value is taken directly from the alerts code.

We currently don't keep any metadata on whether higher or lower values are better per performance series, so you need to do client side detection of that. We should fix this, see bug 1158233.

> does result_set_id give us the ability to query the revision, branch,
> platform, buildtype?

Revision, yes. Branch is implied by using the project endpoint. For platform and buildtype, you'd currently need to look that up via the appropriate endpoints (performance signature and option collection)

We could potentially put this information directly in the alerts, at the cost of a bit of extra bandwidth and CPU on the server side.

> lastly will counters fall into the same bucket with suite_name:
> "tp5o_main_RSS" or something like that?

Yeah, we'll just whatever suite they belong to in the alert.
what are the next steps here?
(In reply to Joel Maher (:jmaher) from comment #16)
> what are the next steps here?

There are two ways forward that have been proposed:

1. Try to import this data into alert manager.
2. Try to build a new UI around this API and use it side-by-side with alert manager until we're satisfied with it.

I'm personally kind of inclined to just do (2) right away. I'm not sure if there'd be a lot of value in putting perfherder alerts into alertmanager.
if we have an API then 1 or 2 will be fine- I would be happy to help contribute to a simple UI in perfherder as well.
This part is done, more or less. Just refining the sheriffing workflow now.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.