Perfherder Compare should not allow (or highly warn) when comparing a try build against a mozilla-central build
Categories
(Tree Management :: Perfherder, enhancement, P1)
Tracking
(Not tracked)
People
(Reporter: jaws, Assigned: bacasandrei)
References
Details
(Keywords: good-first-bug, Whiteboard: [lang=js])
Updated•6 years ago
Comment 1•6 years ago
:jmaher Is there a reason why mozilla-central and try can't be reliably compared?
Comment 2•6 years ago
There is only one case, the Windows xperf fileIO metrics; otherwise, IIRC, all other data points are comparable.
That said, I suspect what is going on here is the default compare against the last 2 days on mozilla-central. What happens is that values often change (infra, tooling, tests, browser), and the range of values on mozilla-central can be misleading compared to a single revision, which has a much smaller range of normal. For example, say a startup test improved by 5%, but your try push doesn't have that fix; now it looks like you have a 5% regression (or something between 0 and 5%, depending on how the data points align).
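The startup-test example above can be sketched numerically. This is a minimal illustration with made-up numbers, not Perfherder's actual comparison code:

```javascript
// Hypothetical startup-time samples (ms). mozilla-central picked up a 5%
// improvement during the 2-day window, but the try push branched before it.
const mcBeforeFix = [1000, 1010, 990];
const mcAfterFix = [950, 955, 945];
const tryPush = [1000, 1005, 995];

const mean = (xs) => xs.reduce((a, b) => a + b, 0) / xs.length;

// A compare over the whole 2-day m-c range mixes both regimes.
const mcMean = mean([...mcBeforeFix, ...mcAfterFix]); // 975
const tryMean = mean(tryPush); // 1000

const deltaPct = ((tryMean - mcMean) / mcMean) * 100;
console.log(`apparent change: +${deltaPct.toFixed(1)}%`); // looks like a
// regression even though the try push changed nothing relevant
```

Depending on how many pre- and post-fix points land in the window, the apparent regression falls anywhere between 0 and 5%.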
The reason there is a range on m-c is that the more data points we collect, the more accurate our detection of a regression/improvement becomes. Typically, having 6 data points on either side of a compare gives enough accuracy to detect almost all changes. We recommend this on try, but cheat a bit in compare view when comparing against m-c, where we use a ~24 hour window to get 6 data points.
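The value of ~6 data points follows from how the noise on a mean shrinks with sample size. A small sketch with hypothetical retrigger values (not Perfherder's statistics code):

```javascript
// Hypothetical retrigger values for one noisy test (ms).
const samples = [103, 96, 101, 98, 104, 99];

const mean = samples.reduce((a, b) => a + b, 0) / samples.length;
// Sample variance (n - 1 denominator), then standard error of the mean.
const variance =
  samples.reduce((a, x) => a + (x - mean) ** 2, 0) / (samples.length - 1);
const stderr = Math.sqrt(variance) / Math.sqrt(samples.length);

// With n = 6 the standard error is ~2.4x smaller than a single point's
// noise (stdev / sqrt(6)), so a few-percent change stands out from jitter
// where a 1-2 point compare would lose it.
console.log(mean, stderr);
```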
There are a few alternatives I see to this, given the current state of noisy tests:
- do a before/after try push
- retrigger m-c push to have more data points
- we could be smart here and do this on nightlies and at least give people a better representation; this would reduce the effects of the larger ranges over 2 days, but it would still yield some issues.
- being smarter, we could analyze each signature or the generated alerts to hint that an improvement/regression might be a side effect of another issue on the tree.
Does this help?
Comment 3•6 years ago
Thanks Joel. It sounds like we'll always get the best results when comparing try pushes with/without the patch, and that maybe a warning as suggested by :jaws would be a good idea.
I wonder if we could provide an option to automatically trigger a secondary try push without the patch(es) applied when performance tests are selected. This could make it much easier for developers wanting to test the impact on performance.
Comment 4•6 years ago
Hello,
I would like to help with this bug.
Would an alert highlighting the unreliability be sufficient?
It does not break the current flow and still generates the comparison, but the user would have to acknowledge that the upcoming results might be unreliable.
The alert would also point to this bug for more information.
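The proposed check could be quite small. A sketch with hypothetical names (the real Compare view code lives in Treeherder, and this is not its actual API):

```javascript
// Warn when exactly one side of the comparison is a try push: try
// revisions branch from older m-c parents, so the 2-day m-c range may
// contain changes the try push never had.
function shouldWarnUnreliableComparison(originalProject, newProject) {
  const isTry = (repo) => repo === 'try';
  return isTry(originalProject) !== isTry(newProject);
}

// Example: comparing try against mozilla-central triggers the alert,
// while try-vs-try or m-c-vs-m-c does not.
console.log(shouldWarnUnreliableComparison('try', 'mozilla-central'));
```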
Comment 5•6 years ago
Hi Edouard, this part of the application is currently undergoing a conversion from Angular to React (see bug 1509216) and it would be tricky to coordinate. Is there another good-first-bug you'd be interested in working on instead (that is not for Compare views/routes)?
Comment 6•6 years ago
Hello Sarah,
No problem. I will search for another one.
What will happen with this bug then?
Thank you.
Comment 7•6 years ago
This bug is marked as a P3 - so it's not a high priority. You can check back in a few weeks to see if bug 1509216 is resolved/fixed, and if so (and it isn't assigned to someone else) then you can ask to work on it.
Comment 8•6 years ago
I have just been bitten by this issue while working on bug 1588710.
Having a simple warning ("you should not compare a try job with m-c") would have saved me a bunch of time.
Updated•5 years ago
Comment 9•5 years ago