Open Bug 916859 Opened 6 years ago Updated 6 years ago

Find out why UX branch regressed on TART at merge changeset c375e7bc34b3

Categories

(Firefox :: Tabbed Browser, defect)

x86
Windows XP
defect
Not set

ASSIGNED

People

(Reporter: mconley, Assigned: mconley)

References

(Blocks 1 open bug)

Details

(Whiteboard: [Australis:P-])

Attachments

(3 obsolete files)

I noticed a blip on our backprocessed TART performance data on the UX branch - on May 17th, TART was measured at 3, and on May 18th TART was measured at 3.265.

As usual, the details paint a more interesting story:

http://compare-talos.mattn.ca/breakdown.html?oldTestIds=272&newTestIds=273&testName=tart&osName=Windows%205.1%20%28ix%29&server=graphs.mattn.ca

So something happened in these changesets that hurt TART on the UX branch for XP pretty deeply:

http://hg.mozilla.org/projects/ux/pushloghtml?fromchange=cb56ba326fa7&tochange=19fac4398eb0
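
For scale, the blip reported above (3 on May 17th, 3.265 on May 18th) can be turned into a relative change. This is only a minimal sketch; the helper name `percent_change` is made up for illustration and is not part of Talos or compare-talos.

```python
def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent.

    In this bug an increase in the TART number is treated as a regression,
    so a positive result means "got worse".
    """
    return (new - old) / old * 100.0


# Values quoted in the report: May 17th vs. May 18th on the UX branch.
change = percent_change(3.0, 3.265)
print(f"{change:.2f}%")
```

With the two numbers from the report this comes out a little under 9%.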
Running TART on our loaner XP machines reveals this regression as being caused by this merge changeset:

http://hg.mozilla.org/projects/ux/rev/c375e7bc34b3

The individual changesets in that merge can be viewed here:

http://hg.mozilla.org/projects/ux/pushloghtml?changeset=c375e7bc34b3
Summary: Find out why UX branch regressed on TART between changeset cb56ba326fa7 and 19fac4398eb0 → Find out why UX branch regressed on TART at c375e7bc34b3
Surprise - I was all wrong. I just finished a build of c375e7bc34b3 to start bisecting with it, and it turns out that the regression isn't there.

Which means, incredibly, that it looks like it was introduced in c44ad65f296d, which is very much a UX-only patch:

http://hg.mozilla.org/projects/ux/rev/c44ad65f296d

I'm going to re-run these tests a few more times to confirm.
Summary: Find out why UX branch regressed on TART at c375e7bc34b3 → Find out why UX branch regressed on TART at c44ad65f296d
Attached file dz-tart-c44ad65f296d.txt (obsolete) —
TART results for c44ad65f296d
Attached file dz-tart-c375e7bc34b3.txt (obsolete) —
TART results for c375e7bc34b3
Attached file dz-tart-c44ad65f296d.txt (obsolete) —
Hm, so I think I know what the problem is here. For some reason, the builds on my Windows box are faster than the builds from my laptop's VM. So comparing them against one another isn't useful. :/

These last two logs are from builds both from my Windows Box. I no longer see much of a difference between them. Hrm.
Attachment #806257 - Attachment is obsolete: true
(In reply to Mike Conley (:mconley) from comment #5)
> the builds
> on my Windows box are faster than the builds from my laptop's VM. So
> comparing them against one another isn't useful. :/
> 
> These last two logs are from builds both from my Windows Box. I no longer
> see much of a difference between them. Hrm.

Assuming the VM is running Windows XP (and the historical TART data we collected, which shows the regression, is also from XP), there might be more than just a speed difference between the Windows box and the VM. E.g. the XP runs could have a different gfx configuration that exposes a regression which isn't visible on a Win7/8 gfx configuration.

So as far as TART runs are concerned, we started with XP, and we should stick to XP.

IIRC recent regular Talos TART runs on Win7/8 don't show as much of a regression relative to m-c as the XP runs do?
So I'm reverting the bug summary, because a recent set of try pushes[1] and compare-talos[2] is throwing our regression finding into question. Basically, I pushed those two changesets, cb56ba326fa7 and 19fac4398eb0, to try, to verify that the regression exists there.

And we didn't see it. Which is really confusing.

I'm going to rerun the associated Nightlies on the XP loaners with MattN's script about 5 times, and then we'll see if the regression is still visible in the measurements, or if this was a one-off glitch.

[1]: https://tbpl.mozilla.org/?tree=Try&rev=c537e0cea4c3 and https://tbpl.mozilla.org/?tree=Try&rev=75be33752a4f
[2]: http://compare-talos.mattn.ca/?oldRevs=c537e0cea4c3&newRev=75be33752a4f&server=graphs.mozilla.org&submit=true . Check out the details - we're seeing a slight difference in numbers here, but it's really within the noise range (0.3ms). Not nearly as dramatic as the compare-talos details we were seeing in comment 1.
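
One way to make "within the noise range" concrete is to compare the difference in means against the spread of repeated runs. The sketch below is only an illustration under assumptions: the sample values are invented (they are not from the graph server), and the `within_noise` helper and its threshold `k` are hypothetical, not something compare-talos does.

```python
from statistics import mean, stdev


def within_noise(old_runs, new_runs, k=2.0):
    """Return True if the difference in means is at most k times the
    larger of the two sample standard deviations."""
    delta = abs(mean(new_runs) - mean(old_runs))
    noise = max(stdev(old_runs), stdev(new_runs))
    return delta <= k * noise


# Invented numbers, five runs each, purely for illustration.
baseline = [3.00, 3.10, 2.95, 3.05, 3.02]
similar = [3.05, 3.12, 2.98, 3.08, 3.01]    # small shift, comparable spread
regressed = [3.25, 3.30, 3.22, 3.28, 3.27]  # clear shift

print(within_noise(baseline, similar))
print(within_noise(baseline, regressed))
```

Rerunning each build about 5 times, as planned above, is what gives a usable estimate of that spread in the first place.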
Summary: Find out why UX branch regressed on TART at c44ad65f296d → Find out why UX branch regressed on TART between changeset cb56ba326fa7 and 19fac4398eb0
Attachment #806258 - Attachment is obsolete: true
Attachment #806265 - Attachment is obsolete: true
Ok, Avi and I debugged this.

We got our talos slaves to run the UX Nightlies from May 17th and 18th 5 more times, and send their data to the graph server. We examined the data from the graph server:

http://compare-talos.mattn.ca/?oldRevs=cb56ba326fa7&newRev=19fac4398eb0&server=graphs.mattn.ca&submit=true

and we're seeing a regression of roughly 5-6% across the board. However, we see a similar regression a day or two earlier on m-c.

It's strange that this regression wasn't more visible with the try pushes. Might be PGO voodoo. Anyhow, we're going to keep this bug open, but try to find other points where *only* UX regressed.
Try results are in.

Was there a regression in this set of patches? Hell yeah, compare-talos confirms when comparing cb56ba326fa7 to 19fac4398eb0.

http://compare-talos.mattn.ca/?oldRevs=1395516e45e3&newRev=7a68e6db8abe&server=graphs.mozilla.org&submit=true

This looks like the big one. Apparently, we're seeing a regression of something like 8% on our total TART score.

Here's the breakdown:

http://compare-talos.mattn.ca/breakdown.html?oldTestIds=29728495&newTestIds=29680005&testName=tart&osName=Windows%20XP&server=graphs.mozilla.org
These try pushes seem to put the blame squarely on c375e7bc34b3, which was the merge changeset we originally suspected.

The changes in that merge landed on m-c between the May 16th and May 17th Nightly. That's visible on this graph here:

http://graphs.mattn.ca/graph-local.html#tests=[[293,1,11],[293,59,11]]&sel=1367661417214.5864,1369990288054.5227,2.9714119183820555,3.3395553353452554&displayrange=365&datatype=running

Then, after the merge from m-c, UX regressed with that changeset, but the change was more dramatic - it pushed UX's TART score above m-c's. You can see that in the UX Nightlies from May 17th to May 18th.

So the questions are:

Was what caused the regression in m-c the same thing that caused the regression in UX?
  If so, why was UX more affected than m-c, and what can we do about that?
  If not, what in that merge changeset affected UX only?
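
A rough way to approach these questions is to compare the per-subtest change on both branches and flag the subtests where UX moved noticeably more than m-c. This is a sketch only: the subtest names and numbers are placeholders, and `ux_specific` and `margin_pct` are hypothetical names, not part of any existing tool.

```python
def pct(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


def ux_specific(mc_old, mc_new, ux_old, ux_new, margin_pct=1.0):
    """Return subtests where UX regressed at least margin_pct percentage
    points more than m-c did over the same merge."""
    flagged = {}
    for name in ux_old:
        if name not in mc_old:
            continue  # no m-c counterpart to compare against
        ux_delta = pct(ux_old[name], ux_new[name])
        mc_delta = pct(mc_old[name], mc_new[name])
        if ux_delta - mc_delta >= margin_pct:
            flagged[name] = (round(ux_delta, 1), round(mc_delta, 1))
    return flagged


# Placeholder data: both branches regress on subtest_a by the same amount,
# but only UX regresses further on subtest_b.
mc_old = {"subtest_a": 3.0, "subtest_b": 4.0}
mc_new = {"subtest_a": 3.15, "subtest_b": 4.2}
ux_old = {"subtest_a": 3.0, "subtest_b": 4.0}
ux_new = {"subtest_a": 3.15, "subtest_b": 4.4}

flagged = ux_specific(mc_old, mc_new, ux_old, ux_new)
print(flagged)
```

Subtests that show up in the result would be candidates for "UX was more affected"; an empty result would point toward the same cause on both branches.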
Summary: Find out why UX branch regressed on TART between changeset cb56ba326fa7 and 19fac4398eb0 → Find out why UX branch regressed on TART at merge changeset c375e7bc34b3
I started to bisect the changesets in the m-c merge (starting at b30552dbb013, and ending at cb242a1cccb2), but strangely, I'm not seeing much in the way of TART regression there...

b30552dbb013: https://tbpl.mozilla.org/?tree=Try&rev=b67026e7dda3
cb242a1cccb2: https://tbpl.mozilla.org/?tree=Try&rev=f9077ab83120

compare-talos: http://compare-talos.mattn.ca/?oldRevs=b67026e7dda3&newRev=f9077ab83120&server=graphs.mozilla.org&submit=true

I'm going to re-check the range.
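
For reference, bisecting only works if the two endpoints actually differ, which is why the range needs re-checking first. A minimal sketch of the search itself, assuming the first changeset is known good and the last is known bad; the changeset labels and the `is_bad` predicate are placeholders for "build it, run TART, compare against a threshold".

```python
def first_bad(changesets, is_bad):
    """Binary-search an ordered list of changesets for the first bad one.

    Assumes changesets[0] is good, changesets[-1] is bad, and that every
    changeset after the first bad one is also bad.
    """
    lo, hi = 0, len(changesets) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(changesets[mid]):
            hi = mid
        else:
            lo = mid
    return changesets[hi]


# Placeholder labels; in practice these would be the revisions in the merge.
revs = [f"rev{i}" for i in range(8)]
bad = set(revs[5:])  # pretend the regression lands in rev5
culprit = first_bad(revs, lambda r: r in bad)
print(culprit)
```

If the endpoints measure the same, as they appear to in the try pushes above, no choice of midpoint can narrow anything down.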
So I ran 5 more passes of the m-c Nightlies for May 16th, and they seem to support the claim that the regression occurred between May 16th and May 17th:

http://graphs.mattn.ca/graph-local.html#tests=[[293,1,11],[293,59,11]]&sel=1368008792710.6133,1369799924810.1665,1.6176470588235294,3.860294117647059&displayrange=365&datatype=running