Closed
Bug 1284370
Opened 9 years ago
Closed 9 years ago
9.89 - 48.25% tp5o responsiveness (windowsxp) regression on push bb16dc5e693602f19ade6b212d8babd6d038b918 (Thu Jun 30 2016)
Categories
(Core :: JavaScript Engine, defect)
Tracking
()
RESOLVED
WONTFIX
Tracking | Status | |
---|---|---|
e10s | ? | --- |
People
(Reporter: wlach, Unassigned)
References
Details
(Keywords: perf, regression, talos-regression)
Hi Nick, could you please have a look at this? I haven't done any retriggers but there don't seem to be any other likely culprits for this regression. Strangely this only seems to occur on Windows XP -- Windows 7 is fine.
I have marked this as blocking 1283245 but have not confirmed that this is related to that, and not one of the other bugs in the push. We can experiment with backing out specific patches and seeing what the effect is but I thought I'd file the bug first to see if you had any initial thoughts.
--
Talos has detected a Firefox performance regression from push bb16dc5e693602f19ade6b212d8babd6d038b918. As author of one of the patches included in that push, we need your help to address this regression.
Summary of tests that regressed:
tp5o responsiveness windowsxp opt e10s - 28.97% worse
tp5o responsiveness windowsxp pgo e10s - 48.25% worse
tp5o responsiveness windowsxp pgo - 9.89% worse
tp5o responsiveness windowsxp opt e10s - 37.76% worse
You can find links to graphs and comparsion views for each of the above tests at: https://treeherder.mozilla.org/perf.html#/alerts?id=1672
On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the Talos jobs in a pushlog format.
To learn more about the regressing test(s), please see: https://wiki.mozilla.org/Buildbot/Talos/Tests
For information on reproducing and debugging the regression, either on try or locally, see: https://wiki.mozilla.org/Buildbot/Talos/Running
*** Please let us know your plans within 3 business days, or the offending patch(es) will be backed out! ***
Our wiki page outlines the common responses and expectations: https://wiki.mozilla.org/Buildbot/Talos/RegressionBugsHandling
Reporter | ||
Updated•9 years ago
|
Flags: needinfo?(nfitzgerald)
Updated•9 years ago
|
tracking-e10s:
--- → ?
Comment 1•9 years ago
|
||
Posting snippets from IRC for posterity while I dig in further:
> 10:31 AM <terrence> fitzgen: I would not assume off hand that the new code is either slower or less fair than our old code
> 10:31 AM <terrence> fitzgen: I think it's quite likely that the test is encoding a dependency on a broken [condition variable] implementation
> 10:32 AM <terrence> fitzgen: our best bat may be to compare actual performance numbers across platforms to see if it's an actual slowdown or just if the wakeup order changed
> 10:34 AM <terrence> fitzgen: note that the regression is from 5 to 7, 7 to 10 and 45 to 49... presumably milliseconds, so in absolute terms it's not a catastrophe for our remaining 2% winxp users even if we do decide to take a real slowdown
> 10:34 AM <terrence> fitzgen: also, it's only on this one test
> 10:35 AM <terrence> fitzgen: given how much we use [condition variables], I'd expect every test to be slower if it was not a wakeup order issue
> 10:35 AM <terrence> fitzgen: and given that every winxp [condition variable] impl I've read has been less fair than our new impl, I highly doubt that the new code is objectively worse
Flags: needinfo?(nfitzgerald)
Comment 2•9 years ago
|
||
Is the definition of tp5o in the separate talos repo/package? The units seem to be ms or latency when switching tabs, or something like that? Can I get access to the talos repo/package?
Comment 3•9 years ago
|
||
(In reply to Nick Fitzgerald [:fitzgen] [⏰PDT; UTC-7] from comment #2)
> Is the definition of tp5o in the separate talos repo/package? The units seem
> to be ms or latency when switching tabs, or something like that? Can I get
> access to the talos repo/package?
The definition is at the main repo, but the test files are not. You need tp5n.zip . see https://wiki.mozilla.org/Buildbot/Talos/Tests#tp5n_pages_set
Comment 4•9 years ago
|
||
Looking at [0] and [1], the results seem to be bimodal, which might give credence to the this-could-be-wakeup-ordering-related hypothesis: each mode could be the result of a different wakeup ordering, one which happens to benefit the test more.
[0] https://treeherder.mozilla.org/perf.html#/graphs?series=%5Bmozilla-inbound,22b942243cce9b43b263b83c473af3256a138e58,1,1%5D&series=%5Bfx-team,22b942243cce9b43b263b83c473af3256a138e58,1,1%5D&series=%5Bmozilla-central,22b942243cce9b43b263b83c473af3256a138e58,1,1%5D&selected=%5Bmozilla-inbound,22b942243cce9b43b263b83c473af3256a138e58,33890,31099793,1%5D
[1] https://treeherder.mozilla.org/perf.html#/graphs?timerange=1209600&series=%5Bmozilla-inbound,22b942243cce9b43b263b83c473af3256a138e58,1%5D&series=%5Bmozilla-central,22b942243cce9b43b263b83c473af3256a138e58,0%5D&series=%5Bfx-team,22b942243cce9b43b263b83c473af3256a138e58,0%5D&selected=%5Bmozilla-inbound,22b942243cce9b43b263b83c473af3256a138e58%5D
> 10:32 AM <terrence> fitzgen: our best bat may be to compare actual performance numbers across platforms to see if it's an actual slowdown or just if the wakeup order changed
What follows is a breakdown between winxp and other platforms running these tests.
## tp5o responsiveness opt e10s:
https://treeherder.mozilla.org/perf.html#/graphs?series=%5Bmozilla-central,22b942243cce9b43b263b83c473af3256a138e58,1,1%5D&series=%5Bmozilla-central,17776dbfe42bfdd7473bf1712f41ff2fdafcf4bb,1,1%5D&series=%5Bmozilla-central,3ece050aac95021a51363205ef7747cce7a62b24,1,1%5D&series=%5Bmozilla-central,bf41e17491286132034748a5f95035a0aaf50458,1,1%5D
winxp looks like it went from in the middle of the herd to the slowest of the platforms running tp5o responsiveness pgo, but by a very thin margin and within others' range of scores.
## tp5o responsiveness pgo:
https://treeherder.mozilla.org/perf.html#/graphs?series=%5Bmozilla-central,69600f98d411b5fde1ea437bb7e2d19ec9696b62,1,1%5D&series=%5Bmozilla-central,77bb16a7e7821467fe581797abe86744ad1519f7,1,1%5D&series=%5Bmozilla-central,26940bc490dde1433819e0eed0a7a6258cb9bc88,1,1%5D&series=%5Bmozilla-central,7809f0aef254efd820d1d3fdb711262797f85e64,1,1%5D
ditto
## tp5o responsiveness pgo e10s:
https://treeherder.mozilla.org/perf.html#/graphs?series=%5Bmozilla-central,5ec2754e398f6f440316bd82ff738cb21ba9ff70,1,1%5D&series=%5Bmozilla-central,cb7337b04c6d9c8e9e0c83272887d05ef94c0de6,1,1%5D&series=%5Bmozilla-central,3715ed80f45aebf89c53be5eceb7698f14f9493d,1,1%5D&series=%5Bmozilla-central,3cc5ddc12d27748f13e6ac101f094dfca2df1470,1,1%5D
Same story, but winxp is even closer to the others than before.
Comment 5•9 years ago
|
||
(In reply to Nick Fitzgerald [:fitzgen] [⏰PDT; UTC-7] from comment #4)
Given that it seems fairly likely that the tp5o responsiveness tests were sensitive to condition variable wakeup order, and that winxp's performance isn't non-normative compared to other platforms, I think we could take the regression.
Terrence: what do you think?
Flags: needinfo?(terrence)
Comment 6•9 years ago
|
||
FYI, in case this helps: I did some bisection on try for tp5o responsiveness opt e10s, and the results:
https://treeherder.mozilla.org/perf.html#/graphs?series=%5Btry,22b942243cce9b43b263b83c473af3256a138e58,1,1%5D&zoom=1467660068591.113,1467660436000,6.265769992628058,11.622603053274668
Point to this commit (in the changeset of 9 pushes) causing the actual perf regression:
c62dc1a368c8 Nick Fitzgerald — Bug 1283245 - Use js::Mutex and js::ConditionVariable instead of PRLock and PRCondVar in HelperThreads; r=terrence
Comment 7•9 years ago
|
||
(In reply to Nick Fitzgerald [:fitzgen] [⏰PDT; UTC-7] from comment #5)
> (In reply to Nick Fitzgerald [:fitzgen] [⏰PDT; UTC-7] from comment #4)
>
> Given that it seems fairly likely that the tp5o responsiveness tests were
> sensitive to condition variable wakeup order, and that winxp's performance
> isn't non-normative compared to other platforms, I think we could take the
> regression.
>
> Terrence: what do you think?
The PGO numbers, in particular the PGO e10s number (i.e. what we ship), show winXP ahead of win7 as often as not, full stop. There is no real regression here, so we can close. That said, I also think this level of regression would be worth shipping on a niche platform given the other gains that we are getting from the change.
Status: NEW → RESOLVED
Closed: 9 years ago
Flags: needinfo?(terrence)
Resolution: --- → WONTFIX
Comment 8•9 years ago
|
||
thanks for figuring this out and coming to a resolution!
Updated•9 years ago
|
Component: Untriaged → JavaScript Engine
Product: Firefox → Core
Version: unspecified → Trunk
You need to log in
before you can comment on or make changes to this bug.
Description
•