Talos TART regressions from Spidermonkey deadlock fixes

RESOLVED WONTFIX

Status

()

RESOLVED WONTFIX
5 years ago
5 years ago

People

(Reporter: mbrubeck, Assigned: bhackett)

Tracking

({perf, regression})

26 Branch
perf, regression
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox26 affected, firefox27 affected)

Details

(Reporter)

Description

5 years ago
It looks like this changeset (Bug 916753, Bug 916531, Bug 916504) probably regressed the TART benchmark by about 3%-9% on Windows and Linux:
http://hg.mozilla.org/mozilla-central/rev/4bcf9b261b94

These regressions were seen when the patch first landed on inbound, and also when it was backported to aurora; the changeset above is the only one common to both regression ranges:
https://groups.google.com/d/topic/mozilla.dev.tree-management/0QRQ-VSAlt0/discussion
https://groups.google.com/d/topic/mozilla.dev.tree-management/jvJQv6dvdGI/discussion
Brian - should we backout the landings mentioned in comment 0 or do you have another idea?
Assignee: general → bhackett1024
tracking-firefox26: ? → +
(Reporter)

Comment 2

5 years ago
I don't think we should just back these out; in a tradeoff between stability and a small perf win, we want stability.  We should fix the perf only if it's possible to do it without reintroducing the deadlocks.
(Assignee)

Comment 3

5 years ago
I don't know why that changeset would regress anything.  To try to confirm I did a couple try pushes:

With bug 916753 backed out: https://tbpl.mozilla.org/?tree=Try&rev=60545e8645ff
With all three bugs backed out: https://tbpl.mozilla.org/?tree=Try&rev=22cf26346c80

The datazilla links on the try pushes are all broken, and I don't know how else to see what the tart numbers are.
(Reporter)

Comment 4

5 years ago
You can see the TART result in the footer of TBPL when you click on one of the "s" jobs.  You can use these compare-talos links to compare each of the Try pushes to their parent revision (72681e08a35d):

http://perf.snarkfest.net/compare-talos/index.html?oldRevs=72681e08a35d&newRev=60545e8645ff&submit=true
http://perf.snarkfest.net/compare-talos/index.html?oldRevs=72681e08a35d&newRev=22cf26346c80&submit=true

I retriggered each of the TART jobs so we can get more datapoints.  If you reload the compare-talos links after the new runs are finished (in about 20 minutes), then they will have more useful TART data.
(Reporter)

Comment 5

5 years ago
Backing out bug 916753 alone had no significant effect, but backing out all three bugs improved TART results by about 2-4%.

Comment 6

5 years ago
(In reply to Matt Brubeck (:mbrubeck) from comment #4)
> You can see the TART result in the footer of TBPL when you click on one of
> the "s" jobs.  You can use these compare-talos links to compare each of the
> Try pushes to their parent revision (72681e08a35d):
> 
> http://perf.snarkfest.net/compare-talos/index.
> html?oldRevs=72681e08a35d&newRev=60545e8645ff&submit=true
> http://perf.snarkfest.net/compare-talos/index.
> html?oldRevs=72681e08a35d&newRev=22cf26346c80&submit=true

MattN's got a slightly updated version of compare talos, which shows a bit more info (on the details page - esp. old/new absolute values, with some specific prettyfying of TART results).

Here are the same comparisons on Matt's page:

Backing out bug 916753 only:
http://compare-talos.mattn.ca/?oldRevs=72681e08a35d&newRev=60545e8645ff&server=graphs.mozilla.org&submit=true

Backing out all 3 bugs:
http://compare-talos.mattn.ca/?oldRevs=72681e08a35d&newRev=22cf26346c80&server=graphs.mozilla.org&submit=true

If we look at the highest TART regression on the second link, which is for Ubuntu 64 which shows as 4% improvement (but is our suspected regression since "new" is a backout of all 3 bugs suspected in causing the regression), at the "details" page:

http://compare-talos.mattn.ca/breakdown.html?oldTestIds=29702891,29818013,29818173&newTestIds=29806647,29817955,29818855&testName=tart&osName=Ubuntu%2064&server=graphs.mozilla.org

We see that most changes are at the ".error" subtests, and are of relatively small absolute magnitude. These .error values are the difference between the designated animation duration (250ms) to the actual measured animation duration.

Since all these .error changes are well under 1ms, it makes them very much negligible. Even if they were 5ms I would still consider them not meaningful enough.

The details of the other platform in this comparison (ubuntu 32, win xp/7/8) have similar patterns.

Bottom line: if these comparisons indeed represent the regression which we suspect (do they?), then they're absolutely meaningless and they should not block landing of these bugs.

Comment 7

5 years ago
(In reply to Avi Halachmi (:avih) from comment #6)
> Bottom line: if these comparisons indeed represent the regression which we
> suspect (do they?), then they're absolutely meaningless and they should not
> block landing of these bugs.

WRT TART only. I didn't inspect comparisons of other talos tests.

Comment 8

5 years ago
WONTFIX?
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Flags: needinfo?(mbrubeck)
Resolution: --- → FIXED

Comment 9

5 years ago
Oops, didn't mean to resolve it myself...
(Reporter)

Comment 10

5 years ago
Sounds good.  Thanks for the additional investigation!
Flags: needinfo?(mbrubeck)
Resolution: FIXED → WONTFIX
tracking-firefox26: + → ---
You need to log in before you can comment on or make changes to this bug.