XPCOM refcount logging still doesn't work (in debug builds only)

RESOLVED DUPLICATE of bug 1307961

Status

()

Core
XPCOM
RESOLVED DUPLICATE of bug 1307961
a year ago
a year ago

People

(Reporter: Away for a while, Assigned: Away for a while)

Tracking

({regression})

52 Branch
regression
Points:
---

Firefox Tracking Flags

(firefox50 unaffected, firefox51 unaffected, firefox52 fix-optional, firefox53 fix-optional)

Details

Attachments

(1 attachment, 1 obsolete attachment)

(Assignee)

Description

a year ago
Bug 1316527 wasn't a complete fix.

STR:

1. export XPCOM_MEM_LOG_CLASSES=Timeout
2. Run a mochitest.

Result:
Assertion failure: !aCreate (If an object already has a serial number, we should be destroying it.), at /Users/ehsan/moz/src/xpcom/base/nsTraceRefcnt.cpp:584
Process 68766 stopped
* thread #1: tid = 0x1b7b93f, 0x0000000101d38020 XUL`GetSerialNumber(aPtr=0x0000000191c08b00, aCreate=true) + 160 at nsTraceRefcnt.cpp:584, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
    frame #0: 0x0000000101d38020 XUL`GetSerialNumber(aPtr=0x0000000191c08b00, aCreate=true) + 160 at nsTraceRefcnt.cpp:584
   581                                              HashNumber(aPtr),
   582                                              aPtr);
   583    if (hep && *hep) {
-> 584      MOZ_RELEASE_ASSERT(!aCreate, "If an object already has a serial number, we should be destroying it.");
   585      return static_cast<SerialNumberRecord*>((*hep)->value)->serialNumber;
   586    }
   587
(Assignee)

Comment 1

a year ago
Timeout uses MOZ_COUNT_CTOR: <http://searchfox.org/mozilla-central/rev/ef3cede9a5b050fffade9055ca274654f2bc719e/dom/base/Timeout.cpp#27>

NS_LogCtor() puts the pointer inside the gSerialNumbers hashtable.  Then NS_LogAddRef() gets called, and it asserts that the pointer cannot be found in the hashtable.

This whole notion that the first call to NS_LogAddRef() for an object can assert anything about whether the object exists is bogus due to MOZ_COUNT_CTOR, and we have no way of knowing whether the object is using MOZ_COUNT_CTOR or not.
(Assignee)

Comment 2

a year ago
Created attachment 8815810 [details] [diff] [review]
Make sure the XPCOM refcount logging deals with refcounted objects which use MOZ_COUNT_CTOR correctly

We really have three distinct cases here:
  * Where the caller knows a new serial no shouldn't be created.
  * Where the caller knows a new serial number should be created.
  * Where the caller can guess that a new serial no should probably be created.

When NS_LogAddRef() sees a refcount of 1, it should assume the third case, since
a previous call to NS_LogCtor() may have created the serial no before the first
call to AddRef().
Attachment #8815810 - Flags: review?(nfroyd)
(Assignee)

Comment 3

a year ago
Hmm, another issue is the reverse problem in NS_LogDtor.  It's possible for the last call to NS_LogRelease() to free up the serial number.
Comment on attachment 8815810 [details] [diff] [review]
Make sure the XPCOM refcount logging deals with refcounted objects which use MOZ_COUNT_CTOR correctly

Review of attachment 8815810 [details] [diff] [review]:
-----------------------------------------------------------------

The correct thing to do here is to change Timeout so that it doesn't use MOZ_COUNT_{D,C}TOR, since it already has leak logging through being reference counted.  We have a bug open on MOZ_COUNT_{D,C}TOR erroring when you use it in a refcounted class.
Attachment #8815810 - Flags: review?(nfroyd)
(Assignee)

Comment 5

a year ago
Created attachment 8815812 [details] [diff] [review]
Make sure the XPCOM refcount logging deals with refcounted objects which use MOZ_COUNT_CTOR correctly

We really have three distinct cases here:
  * Where the caller knows a new serial no shouldn't be created.
  * Where the caller knows a new serial number should be created.
  * Where the caller can guess that a new serial no should probably be created.

When NS_LogAddRef() sees a refcount of 1, it should assume the third case, since
a previous call to NS_LogCtor() may have created the serial no before the first
call to AddRef().
Attachment #8815812 - Flags: review?(nfroyd)
(Assignee)

Updated

a year ago
Attachment #8815810 - Attachment is obsolete: true
(Assignee)

Comment 6

a year ago
Is there a good reason to make people's debugging sessions crash while that bug is being fixed?

(And do you have a bug # handy by any chance?  Thanks!)
Flags: needinfo?(nfroyd)
I started writing a patch so that refcounted classes don't use MOZ_COUNT_CTOR, but there were like 100 classes or more classes that did that so I wasn't entirely sure I was right that that's the way things should be done.
(Assignee)

Comment 8

a year ago
(In reply to Andrew McCreight [:mccr8] from comment #7)
> I started writing a patch so that refcounted classes don't use
> MOZ_COUNT_CTOR, but there were like 100 classes or more classes that did
> that so I wasn't entirely sure I was right that that's the way things should
> be done.

I'm not disputing that this is the Right Way (TM) to do this.  I'm just saying that given the current reality where things use MOZ_COUNT_CTOR when they shouldn't, the assertions in the code are going to cause crashes for every single instance of someone doing something like comment 0.

(FWIW we already have code in the clang plugin for detecting refcounted objects, so making an analysis which would make such MOZ_COUNT_CTOR's not compile is super simple once we've fixed the occurrences.  Or perhaps we can employ a simpler static_assert based check to make sure the object doesn't have an AddRef/Release pair?)
status-firefox50: --- → unaffected
status-firefox51: --- → unaffected
status-firefox52: --- → affected
status-firefox53: --- → affected
Version: unspecified → 52 Branch
(In reply to :Ehsan Akhgari from comment #6)
> Is there a good reason to make people's debugging sessions crash while that
> bug is being fixed?

So they can fix their buggy code?</glib-answer>

> (And do you have a bug # handy by any chance?  Thanks!)

Bug 1307961.
Flags: needinfo?(nfroyd)
(Assignee)

Updated

a year ago
Summary: XPCOM refcount logging still doesn't work → XPCOM refcount logging still doesn't work (in debug builds only)
Ehsan told me we'd rather fix bug 1307961.
Per https://bugzilla.mozilla.org/show_bug.cgi?id=1307961#c21 and the fact that this regression is not affecting users, I'm marking this fix-optional for 52 and 53.
status-firefox52: affected → fix-optional
status-firefox53: affected → fix-optional
(Assignee)

Comment 12

a year ago
Let's just dupe this!
Status: NEW → RESOLVED
Last Resolved: a year ago
Resolution: --- → DUPLICATE
Duplicate of bug: 1307961
You need to log in before you can comment on or make changes to this bug.