Perfherder should use (only) full filtered values for subtest summaries



Tree Management
2 years ago
2 years ago


(Reporter: wlach, Assigned: wlach)




(2 attachments, 1 obsolete attachment)

Right now perfherder ingests the subtest values produced from talos that are in some cases only partially filtered.

From the replicates for each subtest, we usually drop the first 5, then come up with a mean/median/max/min from that. Then there's the additional step of producing something called 'filtered', which in most cases is just the median, but in some cases is a custom calculation on the subtest to make it more meaningful (i.e. for v8_7, we run a subset of the suite summarization on the subtest summary). For these cases, the mean/median/max/min won't be very useful, since they'll be on a completely different scale. To make things even more confusing, we also re-store 'filtered' as 'value' for v8_7 and dromaeo. 

After discussing things with :jmaher and :avih, I think the way forward is to produced one single "subtest summary" value from Talos (what we currently called 'filtered') and (only) process that and store it in Perfherder. If people want data on a finer layer of granularity, they should just look at the raw replicate values from the test.

Here's where these numbers are produced in talos:

And processed in treeherder:

Proposed way forward:

1. Modify perfherder to ingest "filtered" and (only) store that per test as a single object "value". This will entail rewriting all the existing subtests data (to extract out "median" and storing that as "value", where it doesn't exist already).
2. Modify talos to only produce "value" for subtests.
I'm going to take a pass at fixing this a bit sooner than expected (i.e. now) for perfherder at least, as it turns out that this change means we can get a way with using a slightly simpler data model, which will simplify bug 1192976 a bit.
Assignee: nobody → wlachance
Created attachment 8656151 [details] [review]
PR for perfherder implementation of fix

This fixes things on the perfherder side-- we probably eventually want to simplify things on the talos end as well (i.e. just producing a "value" summary value for subtests, instead of "filtered" and all the rest).

I've tested this locally in a vagrant environment and it works great.

Note that this will result in performance data that came before it to "disappear" from the graphs and compare view. It's not really gone, it's just not visible because old data doesn't have the "value" property. I considered writing a migration script, but it would be hard to create one that doesn't have race conditions. I think I'd prefer just to convert the old data over as part of the script in bug 1192976, which will be much easier (it's just a small modification to the conversion script I've written). We can then carefully stage landing this *just* before bug 1192976, so people won't know what hit them. ;)
Attachment #8656151 - Flags: review?(jmaher)
Attachment #8656151 - Flags: feedback?(avihpit)
Duplicate of this bug: 1191711
Comment on attachment 8656151 [details] [review]
PR for perfherder implementation of fix

r+ with test files updated.
Attachment #8656151 - Flags: review?(jmaher) → review+

Comment 5

2 years ago
Comment on attachment 8656151 [details] [review]
PR for perfherder implementation of fix

Overall looks good to me.

TBH I'd think it'd been a simpler modification to just add a "filtered" field to all the min/max/etc values, and then later on use this field instead of using "median" (as I believe that's what's been used so far).

But it's fine as is as well.
Attachment #8656151 - Flags: feedback?(avihpit) → feedback+
(In reply to Avi Halachmi (:avih) from comment #5)
> TBH I'd think it'd been a simpler modification to just add a "filtered"
> field to all the min/max/etc values, and then later on use this field
> instead of using "median" (as I believe that's what's been used so far).

If you mean changing what we produce by talos, yes, I agree that's a better approach. We'll do that later, see the bug description.

Comment 7

2 years ago
NO, I meant at your patch. The patch doesn't touch the talos output, but the patch could maybe be simpler if it only added a "filtered" field and use that, while keeping all the other fields as is.

But I trust you to know better than me what's the best change is, code wise.


2 years ago
Depends on: 1192976
Created attachment 8660953 [details] [diff] [review]
Simplify talos output

This should remove a bunch of numbers we're not using from talos output, leaving only 'filtered' and 'value' (filtered is just for backwards compatibility with perfherder, to avoid chicken & egg problem where we can't update perfherder to accept value before talos adds it).
Attachment #8660953 - Flags: review?(jmaher)
try run:
Comment on attachment 8660953 [details] [diff] [review]
Simplify talos output

Review of attachment 8660953 [details] [diff] [review]:

this is great, although in this same area of code we could do the units of measurement as well.
Attachment #8660953 - Flags: review?(jmaher) → review+
Created attachment 8662068 [details] [diff] [review]
Updated commit message

Updated commit message, carrying over r+
Attachment #8660953 - Attachment is obsolete: true
Attachment #8662068 - Flags: review+
See try run in comment 9. Also checked results in perfherder, this looks good to go:[try,bd72d04511c657c5c5040f1633fe73642fcdcb3b,1]&series=[try,6233768d21c48fe3ca4c494be32f40d9c00fdb46,1]&highlightedRevisions=cf586bdbd7d4
Keywords: checkin-needed

Comment 13

2 years ago
Keywords: checkin-needed
Last Resolved: 2 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.