Closed Bug 1138616 Opened 9 years ago Closed 9 years ago

Don't track addref and release counts in the bloat log

Categories

(Core :: XPCOM, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla39
Tracking Status
firefox38 --- fixed
firefox39 --- fixed

People

(Reporter: mccr8, Assigned: mccr8)

References

Details

(Keywords: dev-doc-complete)

Attachments

(3 files, 3 obsolete files)

On OSX debug builds, according to bug 1137963 we spend at least 75% of our time getting locks for the bloat log.  There's some work there to make the locks themselves faster, but even with that, it seems absurd to track every single AddRef and Release.  It shouldn't be too hard to create a new logging mode that only logs the creation and destruction, and it would be much, much faster.
Really, we should just get rid of these addrefed and released counts.  I can't imagine anybody has cared about them in quite some time.
The important part of the addref/release data was being able to do refcount balancing by logging the stacks of the addrefs/releases to find out where the missing release was (or sometimes, the missing addref). That's less needed now that we have a lot less manual refcounting in our tree, but I doubt it's gone completely.
Oh, I am well-familiar with refcount logging.  I've used it just last week.  I'm just talking about mNewStats.mAddRefs and mNewStats.mReleases, which tracks the total number of addrefs and releases for each class, and is totally separate from refcount logging.
Assignee: nobody → continuation
If I'm breaking the bloat log format anyways, I might as well do bug 1073594 first to clear out the cruft.
Depends on: 1073594
Looking at the profiles in the other bug, it is actually only NS_LogCtor and NS_LogDtor that show up in the log, so speeding up NS_LogAddRef and NS_LogRelease won't help.  I guess there's just some non-refcounted classes we just create and destroy a lot.  Still, it would probably be good to improve this.
In some unscientific local OSX testing running the XPConnect mochitest-plain tests, eliminating the holding of the log for the non-ctor/dtor cases in NS_LogArref/Release reduced the running time from 18s to 15s, which isn't too bad.  I'm not entirely sure this is worth the complexity, if speeding up the locks closes most of this gap.
Summary: Implement a lightweight bloat log that does not track every addref and release → Don't track addref and release counts in the bloat log
I'm not sure if I want to land this or not, but I might as well upload what I have for posterity.
Unscientifically, based on my single OSX 10.6 run against a trunk a few days out of date to what I'm comparing it to, it seems to reduce most of the Mochitest runs by about 10 minutes each, as well as R.  No apparent affect on C or J.  No apparent effect on Linux32.

10.6: https://treeherder.mozilla.org/#/jobs?repo=try&revision=62cb20f5fb98
L32: https://treeherder.mozilla.org/#/jobs?repo=try&revision=40493ce7e99b
See Also: → 1137397
I guess I do maybe see the OSX 10.6 C time go from 16 minutes to 12 minutes, so that's pretty good.

One drawback here is that we'll lose the detection of classes that do weird things with their refcounts, like if they release the object a bunch of extra times.  But I think that's probably better handled on an individual object basis rather than class-wide.  Bug 1062479 is an intermittent case of something going weird with refcounting, that is only being detected by this.  But it hasn't really lead to anything useful.
Attachment #8571618 - Attachment is obsolete: true
Attachment #8571619 - Attachment is obsolete: true
Attachment #8571620 - Attachment is obsolete: true
Attachment #8572768 - Flags: review?(dbaron)
Checking the refcounts first is slightly more efficient, plus it matches what we do for the other logs.
Attachment #8572770 - Flags: review?(dbaron)
This is a little weird, but I don't think it is too bad.
Attachment #8572772 - Flags: review?(dbaron)
Comment on attachment 8572772 [details] [diff] [review]
part 3 - Add a faster bloatlog-only mode.

I think the NS_LogCOMPtr{AddRef,Release} tests could be testing against FullLogging, and that probably makes more sense given that NS_LogAddRef and NS_LogRelease are usually testing against FullLogging.

r=dbaron with that
Attachment #8572772 - Flags: review?(dbaron) → review+
(In reply to David Baron [:dbaron] (UTC-8) from comment #14)
> I think the NS_LogCOMPtr{AddRef,Release} tests could be testing against
> FullLogging, and that probably makes more sense given that NS_LogAddRef and
> NS_LogRelease are usually testing against FullLogging.
Good point.  I didn't notice they weren't even using the bloat log.  Fixed.

Thanks for the fast review.  Hopefully this will help a bit with the infra load issues we've been having this week.

remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/9f2706b1e676
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/000bf130c3cf
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/dfaf613b7be5
Please can we uplift this anywhere and everywhere it applies to?
OSX 10.6 debug, times in minutes on a few pushes before and a few after my landing:
  without: M1 38, M2 34, M3 70
  with: M1 27, M2 24, M3 43

M4 and M5 seem similar.  For the other tests, the times were too inconsistent to really draw any conclusion.
(In reply to Ryan VanderMeulen [:RyanVM UTC-5] from comment #17)
> Please can we uplift this anywhere and everywhere it applies to?

Yeah, I'll let it sit for another day.  There's a number of patches here (and in the bug it blocks) but it should only affect debug builds so it shouldn't be super risky.
Flags: needinfo?(continuation)
Comment on attachment 8572772 [details] [diff] [review]
part 3 - Add a faster bloatlog-only mode.

This approval request is for all three patches in this bug.

Approval Request Comment
[Feature/regressing bug #]: N/A
[User impact if declined]: No direct user impact. This speeds up OSX debug builds, which will help with our OSX tester capacity issues.  I think it saves like an hour of machine time across all our tests on a single platform.
[Describe test coverage new/current, TreeHerder]: This code runs very frequently, which is why making it faster helps. ;)
[Risks and why]: Low risk.  This only affects debug builds, and it has been on trunk for 4 days.  It is less of a big deal to get it on beta, though if that's turning into a long-lived b2g branch that would be good.
[String/UUID change made/needed]: None.
Flags: needinfo?(continuation)
Attachment #8572772 - Flags: approval-mozilla-beta?
Attachment #8572772 - Flags: approval-mozilla-aurora?
Depends on: 1122249
Comment on attachment 8572772 [details] [diff] [review]
part 3 - Add a faster bloatlog-only mode.

Cancelling for now due to bug 1122249.
Attachment #8572772 - Flags: approval-mozilla-beta?
Attachment #8572772 - Flags: approval-mozilla-aurora?
Blocks: 1125998
I notice that this bug is closed but there are still outstanding requests to uplift the patch to m-b and m-a.  I understand from talking to RyanVM that the intermittent oranges in bug 1122249 are restricted to 10.8.  So could this patch be uploaded to m-a and I can enable 10.10 tests on that branch as well?  At the same time I'll disable the 10.8 tests on m-a.
Flags: needinfo?(ryanvm)
Bug resolution tracks landing on mozilla-central. Based on the Try run I ran today in bug 1122249, I think we can at least go ahead and request Aurora approval on this again so we're not blocking 10.10 switching on it. I don't think we can uplift this to beta until the leaks are figured out, however.
Flags: needinfo?(ryanvm)
It is extremely unlikely that the leaks are due to this bug.  It is more of a tree cleanliness issue.
Comment on attachment 8572772 [details] [diff] [review]
part 3 - Add a faster bloatlog-only mode.

Approval Request Comment
[Feature/regressing bug #]: N/A
[User impact if declined]: This makes OSX debug tests much faster, which is needed to enable 10.10 tests on Aurora.
[Describe test coverage new/current, TreeHerder]: This code runs extremely frequently in all debug builds.
[Risks and why]: Low.  It caused an intermittent leak, but I'm pretty sure that is just exposing an existing issue.
[String/UUID change made/needed]: none
Attachment #8572772 - Flags: approval-mozilla-aurora?
Oh, and that approval request is for all of the patches.
Blocks: 1139002
Comment on attachment 8572772 [details] [diff] [review]
part 3 - Add a faster bloatlog-only mode.

Approving for uplift to aurora since this is relatively low risk and it should go along with bug 1073594.
Attachment #8572772 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
I need to update the docs about this still.
Flags: needinfo?(continuation)
I removed the obsolete stuff and updated the example to something current.
https://developer.mozilla.org/en-US/docs/Mozilla/Performance/BloatView
Flags: needinfo?(continuation)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: