Closed Bug 898504 Opened 11 years ago Closed 11 years ago

7% PGO libxul linker memory usage regression in July 2013

Categories

(Core :: General, defect)

25 Branch
x86
Windows 8
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla25

People

(Reporter: mbrubeck, Unassigned)

References

Details

(Keywords: regression)

There was a 7% (230MB) regression in Windows PGO linker memory usage on 2013-07-08.  This cancels out all the improvement from bug 871712. :(  The upward trend since then looks alarming too.

Regression range:
http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=f521b97fa75f&tochange=d78dd819dcb3

Newsgroup thread:
https://groups.google.com/d/topic/mozilla.dev.tree-management/9kK0DkHU0Sc/discussion
(In reply to Matt Brubeck (:mbrubeck) from comment #0)
> This cancels out all the improvement from bug 871712.

Actually, *this* regression only cancels out about half that improvement, but other smaller regressions since bug 871712 have canceled out the other half.
(In reply to Matt Brubeck (:mbrubeck) from comment #2)
> From combining inbound and mozilla-central data, the regression happened
> either in this range:
> http://hg.mozilla.org/integration/mozilla-inbound/
> pushloghtml?fromchange=e0d965ec66fa&tochange=d87b950c7a6f
> 
> or this one:
> http://hg.mozilla.org/integration/mozilla-inbound/
> pushloghtml?fromchange=f521b97fa75f&tochange=8e1f9400edde

Hmm, nothing in either of these two ranges jumps out as something which could be at fault here.
733698d74ed9 was a bad push.  Here's the correct one:
https://tbpl.mozilla.org/?tree=Try&rev=904165b79f31
Based on the Try pushes, the regression range is narrowed to bug 870407 and bug 886526.  Since the former just removed dead Mac-only code, the latter must be our culprit:

http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=bb4f5079e009%20&tochange=4feb9e5a7b68

Next step: Should we back out bug 886526?  Any theories about why it caused this regression?
Blocks: 886526
If bug 886526 is the culprit, bug 886526 deserves to be backed out on grounds it changed the functionality of the build system (i.e. the patch was wrong), not for regressing PGO perf.

I have a hard time believing bug 886526 could be the culprit. I triggered another build on that push so we have N>1 sample size.
After retriggers on both the "before" and "after" changesets, it still looks like the regression range from comment 6 is correct.

Bug 886526 does not back out cleanly.  It looks like the backout should be trivial to merge, but I don't have time to do it right now.  If someone who knows that code could back out the change, that would be great.
> Bug 886526 does not back out cleanly.  It looks like the backout should be
> trivial to merge, but I don't have time to do it right now.  If someone who
> knows that code could back out the change, that would be great.

it wouldn't suprise me if just backing out isn't enough and we need to add IS_COMPONENT in another place or two because of directory additions, but I can deal with it today.


(In reply to Gregory Szorc [:gps] from comment #7)
> If bug 886526 is the culprit, bug 886526 deserves to be backed out on
> grounds it changed the functionality of the build system (i.e. the patch was
> wrong), not for regressing PGO perf.
> 
> I have a hard time believing bug 886526 could be the culprit. I triggered
> another build on that push so we have N>1 sample size.

Its not immediately obvious to me how, but I suppose its possible we pass a static lib to the linker more than once as a result of this change, which would certainly explain using more memory.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla25
You need to log in before you can comment on or make changes to this bug.