Closed Bug 1457606 Opened 6 years ago Closed 6 years ago

Intermittent GECKO(9832) | Assertion failure: aInstanceSize == 0 || entry->GetClassSize() == aInstanceSize (Mismatched sizes were recorded in the memory leak logging table. The usual cause of this is having a templated class that uses MOZ_COUNT_{C,D}TOR

Categories

(Core :: XPCOM, defect, P5)

defect

Tracking

()

RESOLVED FIXED
mozilla62
Tracking Status
firefox-esr52 --- unaffected
firefox-esr60 --- wontfix
firefox60 --- wontfix
firefox61 --- wontfix
firefox62 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: froydnj)

References

(Blocks 1 open bug)

Details

(Keywords: assertion, intermittent-failure)

Attachments

(1 file)

Filed by: ncsoregi [at] mozilla.com

https://treeherder.mozilla.org/logviewer.html#?job_id=175982603&repo=mozilla-inbound

https://queue.taskcluster.net/v1/task/c7Et-XhrSF-ESFeOHRXAPg/runs/0/artifacts/public/logs/live_backing.log

16:31:36     INFO -  GECKO(9832) | [Parent 9832, Main Thread] WARNING: NS_ENSURE_TRUE(maybeContext) failed: file z:/build/build/src/xpcom/threads/nsThread.cpp, line 805
16:31:36     INFO -  GECKO(9832) | JavaScript error: resource://gre/modules/TelemetryController.jsm, line 461: Error: submitExternalPing - Submission is not allowed after shutdown, discarding ping of type: modules
16:31:37     INFO -  GECKO(9832) | Assertion failure: aInstanceSize == 0 || entry->GetClassSize() == aInstanceSize (Mismatched sizes were recorded in the memory leak logging table. The usual cause of this is having a templated class that uses MOZ_COUNT_{C,D}TOR in the constructor or destructor, respectively. As a workaround, the MOZ_COUNT_{C,D}TOR calls can be moved to a non-templated base class. Another possible cause is a runnable with an mName that matches another refcounted class.), at z:/build/build/src/xpcom/base/nsTraceRefcnt.cpp:461
16:31:37     INFO -  GECKO(9832) | #01: NS_LogAddRef [xpcom/base/nsTraceRefcnt.cpp:1086]
16:31:37     INFO -  GECKO(9832) | #02: mozilla::Runnable::AddRef() [xpcom/threads/nsThreadUtils.cpp:49]
16:31:37     INFO -  GECKO(9832) | #03: detail::ProxyRelease<nsISupports>(char const *,nsIEventTarget *,already_AddRefed<nsISupports>,bool) [xpcom/threads/nsProxyRelease.h:84]
16:31:37     INFO -  GECKO(9832) | #04: detail::ProxyReleaseChooser<1>::ProxyReleaseISupports(char const *,nsIEventTarget *,nsISupports *,bool) [xpcom/threads/nsProxyRelease.cpp:20]
16:31:37     INFO -  GECKO(9832) | #05: nsMainThreadPtrHolder<mozilla::dom::Promise>::~nsMainThreadPtrHolder<mozilla::dom::Promise>() [xpcom/threads/nsProxyRelease.h:293]
16:31:37     INFO -  GECKO(9832) | #06: nsMainThreadPtrHolder<mozilla::dom::Promise>::Release() [xpcom/threads/nsProxyRelease.h:315]
16:31:37     INFO -  GECKO(9832) | #07: `anonymous namespace'::GetLoadedModulesRunnable::`scalar deleting destructor'(unsigned int)
16:31:37     INFO -  GECKO(9832) | #08: mozilla::Runnable::Release() [xpcom/threads/nsThreadUtils.cpp:50]
16:31:37     INFO -  GECKO(9832) | #09: nsThread::ProcessNextEvent(bool,bool *) [xpcom/threads/nsThread.cpp:1117]
16:31:37     INFO -  GECKO(9832) | #10: NS_ProcessNextEvent(nsIThread *,bool) [xpcom/threads/nsThreadUtils.cpp:519]
16:31:37     INFO -  GECKO(9832) | #11: mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate *) [ipc/glue/MessagePump.cpp:364]
16:31:37     INFO -  GECKO(9832) | #12: MessageLoop::RunHandler() [ipc/chromium/src/base/message_loop.cc:320]
16:31:37     INFO -  GECKO(9832) | #13: MessageLoop::Run() [ipc/chromium/src/base/message_loop.cc:300]
16:31:37     INFO -  GECKO(9832) | #14: nsThread::ThreadFunc(void *) [xpcom/threads/nsThread.cpp:427]
16:31:37     INFO -  GECKO(9832) | #15: PR_NativeRunThread [nsprpub/pr/src/threads/combined/pruthr.c:406]
16:31:37     INFO -  GECKO(9832) | #16: static unsigned int pr_root(void *) [nsprpub/pr/src/md/windows/w95thred.c:138]
16:31:37     INFO -  GECKO(9832) | #17: ucrtbase.dll + 0x20369
16:31:37     INFO -  GECKO(9832) | #18: KERNEL32.DLL + 0x12774
16:31:37     INFO -  GECKO(9832) | #19: ntdll.dll + 0x70d51
16:31:37     INFO -  TEST-INFO | Main app process: exit 1
16:31:37     INFO -  Buffered messages finished
16:31:37    ERROR -  1063 ERROR TEST-UNEXPECTED-FAIL | dom/canvas/test/webgl-conf/generated/test_conformance__glsl__implicit__divide_int_mat3.vert.html | application terminated with exit code 1
(In reply to Robert O'Callahan (:roc) (email my personal email if necessary) from comment #3)
> We caught this bug in the Pernosco debugger:
> https://pernos.co/debug/lCJedFnbwZ_UqEv779KFZw/index.html
> Access instructions here:
> https://mana.mozilla.org/wiki/display/TS/Debugging+with+Pernosco

So what I've gotten out of this so far:

We're in the function ProxyRelease (called from NS_ProxyRelease<mozilla::dom::Promise>), assigning a newly-constructed ProxyReleaseEvent<T> to an nsCOMPtr<nsIRunnable> ev.  The nsCOMPtr calls mozilla::CancelableRunnable::AddRef, which calls mozilla::Runnable::AddRef, which is implemented using NS_IMPL_NAMED_ADDREF(mName).

mName in this case is "Promise".  That on its own seems suspicious since it seems likely that something else uses that name.  I also haven't looked at why that's the name in this case.

It seems like either (a) the ProxyReleaseEvent constructor shouldn't pass its own aName straight through to the CancelableRunnable constructor, but should instead pass "ProxyReleaseEvent" or some compound of "ProxyReleaseEvent" and aName, or (b) if we want that name for the telemetry, we shouldn't also use it for the refcount logging.

Beyond that, I still need to figure out Pernosco better, but I think the stack and data at the time of the assertion is probably sufficient to understand what's going on here.
Based on dbaron's comment, I used SearchFox to look for the string "Promise" in .cpp files, which revealed TelemetryImpl::GetLoadedModules():
  new nsMainThreadPtrHolder<Promise>("Promise", promise));

I guess that's where it is from. The dtor for nsMainThreadPtrHolder does calls |NS_ProxyRelease(mName, ...)|, which is bad because mName is the name of a type! (As dbaron points out, this call site is in the stack in comment 0.) I'm not sure how this doesn't fail all of the time.

I'm not sure what a "good" fix for this is. Maybe the fix is to pass in "nsMainThreadPtrHolder::mMainThreadEventTarget" instead of mName.

I have no idea what recent change could have caused this.
Component: Canvas: WebGL → XPCOM
Maybe this only fails when we use the main thread ptr holder off of the main thread, and this only happens very rarely? I'd be surprised by that, though.
And for the record, the original (and correct) Promise entry was created at the stack:

mozilla::dom::Promise::Create (aGlobal=0x70159c80, aRv@0x7ffca34fd820.TErrorResult<mozilla::binding_danger::AssertAndSuppressCleanupPolicy>mResult=NS_BINDING_SUCCEEDED, mExtramMessage=0x714319c42224, mJSExceptionasBits_=124533009031716, asDouble_=0.000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000615274815358086, debugView_payload47_=124533009031716, tag_=0, s_.payload_i32_=432284196, u32_=432284196, why_=432284196, mDOMExceptionInfo=0x714319c42224
, mMightHaveUnreportedJSException=false, mUnionState=HasNothing, _mOwningThread.mThread=0x66350c8844c0) at /builds/worker/workspace/build/src/dom/promise/Promise.cpp:96

RefPtr<mozilla::dom::Promise>::RefPtr (this=0x7ffca34fd468
, aRawPtr=0x147d717f3a30
) at /builds/worker/workspace/build/src/obj-firefox/dist/include/mozilla/RefPtr.h:112

RefPtr<mozilla::dom::Promise>::ConstRemovingRefPtrTraits<mozilla::dom::Promise>::AddRef (aPtr=0x147d717f3a30
) at /builds/worker/workspace/build/src/obj-firefox/dist/include/mozilla/RefPtr.h:395

mozilla::RefPtrTraits<mozilla::dom::Promise>::AddRef (aPtr=0x147d717f3a30
) at /builds/worker/workspace/build/src/obj-firefox/dist/include/mozilla/RefPtr.h:38

mozilla::dom::Promise::AddRef (this=0x147d717f3a30
) at /builds/worker/workspace/build/src/dom/promise/Promise.cpp:66

NS_LogAddRef (aPtr=0x147d717f3a30, aRefcnt=1, aClass=0x71431aa9571f
, aClassSize=48) at /builds/worker/workspace/build/src/xpcom/base/nsTraceRefcnt.cpp:1085

GetBloatEntry (aTypeName=0x71431aa9571f
, aInstanceSize=48) at /builds/worker/workspace/build/src/xpcom/base/nsTraceRefcnt.cpp:448

BloatEntry::BloatEntry (this=0x147d717f3a90
, aClassName=0x71431aa9571f, aClassSize=48) at /builds/worker/workspace/build/src/xpcom/base/nsTraceRefcnt.cpp:307
(In reply to Andrew McCreight [:mccr8] from comment #6)
> Based on dbaron's comment, I used SearchFox to look for the string "Promise"
> in .cpp files, which revealed TelemetryImpl::GetLoadedModules():
>   new nsMainThreadPtrHolder<Promise>("Promise", promise));

I used Pernosco to confirm that this *is* the source of the sequence of asynchronous events that leads to the assertion.
And the sequence of data flow for the stored name was:

GetLoadedModules
nsMainThreadPtrHolder<mozilla::dom::Promise>::nsMainThreadPtrHolder
..
nsMainThreadPtrHolder<mozilla::dom::Promise>::~nsMainThreadPtrHolder
NS_ProxyRelease
ProxyReleaseChooser<true>::ProxyRelease<mozilla::dom::Promise>
ProxyReleaseChooser<true>::ProxyReleaseISupports
detail::ProxyRelease<nsISupports>
ProxyReleaseEvent::ProxyReleaseEvent
..
CancelableRunnable::CancelableRunnable
..
Runnable::Runnable
It might be better to address this in nsMainThreadPtrHolder, but it's
also really helpful to have nsMainThreadPtrHolder provide meaningful
names for the main thread runnables.  So we'll address this in
TelemetryImpl itself by providing a more-unique name.
Attachment #8985015 - Flags: review?(alessio.placitelli)
Comment on attachment 8985015 [details] [diff] [review]
use an unambiguous name for GetLoadedModule's promise

Review of attachment 8985015 [details] [diff] [review]:
-----------------------------------------------------------------

Thank you!
Attachment #8985015 - Flags: review?(alessio.placitelli) → review+
Pushed by nfroyd@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/b398f60e4bb6
use an unambiguous name for GetLoadedModule's promise; r=Dexter
https://hg.mozilla.org/mozilla-central/rev/b398f60e4bb6
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla62
Assignee: nobody → nfroyd
Keywords: assertion
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: