Closed
Bug 956284
Opened 10 years ago
Closed 10 years ago
Fault in cycle collector: overflowing refcount
Categories
(Core :: DOM: Workers, defect)
Tracking
()
VERIFIED
FIXED
mozilla29
Tracking | Status | |
---|---|---|
firefox26 | --- | unaffected |
firefox27 | --- | unaffected |
firefox28 | + | verified |
firefox29 | + | verified |
firefox-esr24 | --- | unaffected |
b2g18 | --- | unaffected |
b2g-v1.1hd | --- | unaffected |
b2g-v1.2 | --- | unaffected |
b2g-v1.3 | --- | fixed |
b2g-v1.3T | --- | fixed |
b2g-v1.4 | --- | fixed |
People
(Reporter: whimboo, Assigned: ttaubert)
References
Details
(5 keywords, Whiteboard: [mozmill])
Attachments
(1 file)
726 bytes,
patch
|
khuey
:
review+
bajaj
:
approval-mozilla-aurora+
|
Details | Diff | Splinter Review |
By running our Mozmill tests for Firefox Aurora we noticed a lot of aborts caused by overflowing refcounts in Firefox: Report: http://mm-ci-master.qa.scl3.mozilla.com:8080/job/mozilla-aurora_functional/16324/console Message: Fault in cycle collector: overflowing refcount (ptr: 0x10fd748b0) ###!!! ABORT: cycle collector fault: file ../../../../xpcom/base/nsCycleCollector.cpp, line 1226 ###!!! ABORT: cycle collector fault: file ../../../../xpcom/base/nsCycleCollector.cpp, line 1226 So far this problem only seems to exist on OS X across different platform versions and always seem to happen on exit.
Reporter | ||
Updated•10 years ago
|
Whiteboard: [mozmill]
Comment 1•10 years ago
|
||
My guess is that we end up having refcnt 0 - 1 for some reason. Don't or didn't we have some worker related bug for this?
Reporter | ||
Comment 2•10 years ago
|
||
Also happens for holly branch builds with slightly different lines of code: http://mm-ci-master.qa.scl3.mozilla.com:8080/job/project_functional/233/console Fault in cycle collector: overflowing refcount (ptr: 0x10de89870) ###!!! ABORT: cycle collector fault: file ../../../../xpcom/base/nsCycleCollector.cpp, line 1248 ###!!! ABORT: cycle collector fault: file ../../../../xpcom/base/nsCycleCollector.cpp, line 1248
status-firefox27:
--- → ?
status-firefox28:
--- → affected
status-firefox29:
--- → affected
tracking-firefox28:
--- → ?
tracking-firefox29:
--- → ?
There's bug 937220.
Reporter | ||
Comment 4•10 years ago
|
||
Finally I got a crash report for the holly build: bp-fe34fe81-c94a-4e50-a419-62d612140103 Stack: 0 libmozalloc.dylib mozalloc_abort(char const*) memory/mozalloc/mozalloc_abort.cpp 1 XUL Abort /builds/slave/m-in-osx64-0000000000000000000/build/obj-firefox/x86_64/xpcom/base/../../../../xpcom/base/nsDebugImpl.cpp 2 XUL NS_DebugBreak /builds/slave/m-in-osx64-0000000000000000000/build/obj-firefox/x86_64/xpcom/base/../../../../xpcom/base/nsDebugImpl.cpp 3 XUL GCGraphBuilder::DescribeRefCountedNode(unsigned int, char const*) xpcom/base/nsCycleCollector.cpp 4 XUL nsDOMEventTargetHelper::cycleCollection::Traverse(void*, nsCycleCollectionTraversalCallback&) content/events/src/nsDOMEventTargetHelper.cpp 5 XUL nsXHREventTarget::cycleCollection::Traverse(void*, nsCycleCollectionTraversalCallback&) content/base/src/nsXMLHttpRequest.cpp 6 XUL mozilla::dom::workers::XMLHttpRequest::cycleCollection::Traverse(void*, nsCycleCollectionTraversalCallback&) dom/workers/XMLHttpRequest.cpp 7 XUL GCGraphBuilder::Traverse(PtrInfo*) xpcom/base/nsCycleCollector.cpp 8 XUL nsCycleCollector::MarkRoots(js::SliceBudget&) xpcom/base/nsCycleCollector.cpp 9 XUL nsCycleCollector::Collect(ccType, js::SliceBudget&, nsICycleCollectorListener*) xpcom/base/nsCycleCollector.cpp 10 XUL nsCycleCollector_collect(nsICycleCollectorListener*) xpcom/base/nsCycleCollector.cpp 11 XUL mozilla::CycleCollectedJSRuntime::OnGC(JSGCStatus) xpcom/base/CycleCollectedJSRuntime.cpp 12 XUL Collect /builds/slave/m-in-osx64-0000000000000000000/build/obj-firefox/x86_64/js/src/../../../../js/src/jsgc.cpp 13 XUL mozilla::dom::workers::WorkerPrivate::GarbageCollectInternal(JSContext*, bool, bool) dom/workers/WorkerPrivate.cpp 14 XUL (anonymous namespace)::GarbageCollectRunnable::WorkerRun(JSContext*, mozilla::dom::workers::WorkerPrivate*) dom/workers/WorkerPrivate.cpp 15 XUL mozilla::dom::workers::WorkerRunnable::Run() dom/workers/WorkerRunnable.cpp 16 XUL mozilla::dom::workers::WorkerPrivate::ProcessAllControlRunnablesLocked() dom/workers/WorkerPrivate.cpp 17 XUL mozilla::dom::workers::WorkerPrivate::DoRunLoop(JSContext*) dom/workers/WorkerPrivate.cpp 18 XUL (anonymous namespace)::WorkerThreadPrimaryRunnable::Run() dom/workers/RuntimeService.cpp 19 XUL nsThread::ProcessNextEvent(bool, bool*) xpcom/threads/nsThread.cpp 20 XUL NS_ProcessNextEvent(nsIThread*, bool) /builds/slave/m-in-osx64-0000000000000000000/build/xpcom/glue/nsThreadUtils.cpp 21 XUL mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) /builds/slave/m-in-osx64-0000000000000000000/build/obj-firefox/x86_64/ipc/glue/../../../../ipc/glue/MessagePump.cpp 22 XUL MessageLoop::Run() /builds/slave/m-in-osx64-0000000000000000000/build/obj-firefox/x86_64/ipc/chromium/../../../../ipc/chromium/src/base/message_loop.cc 23 XUL nsThread::ThreadFunc(void*) xpcom/threads/nsThread.cpp 24 libnss3.dylib _pt_root 25 libsystem_pthread.dylib libsystem_pthread.dylib@0x1899 26 libsystem_pthread.dylib libsystem_pthread.dylib@0x172a 27 libsystem_pthread.dylib libsystem_pthread.dylib@0x5fc9 28 libnss3.dylib null_signal_handler
Keywords: crashreportid
Reporter | ||
Comment 5•10 years ago
|
||
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #3) > There's bug 937220. So does it mean it is a dupe? Given the stack I would say so. If it helps I could try to find a pattern to get it reproduced.
Comment 6•10 years ago
|
||
This actually also looks like it could be a top mac crash, as it shows up overall 24 in the top crash list, which has to be fairly bad as it is OSX only. https://crash-stats.mozilla.com/report/index/ba3b3008-31b7-4310-b521-2e9d32140102
Depends on: 937220
Comment 7•10 years ago
|
||
Oops, the other bug talks about that already, never mind.
Updated•10 years ago
|
Component: XUL → DOM: Workers
Comment 8•10 years ago
|
||
What are the steps to reproduce this? Maybe that would help with a fix.
Group: core-security
Comment 9•10 years ago
|
||
Probably just a dupe, so I'm going to set it as sec-other.
Keywords: sec-other
Reporter | ||
Comment 10•10 years ago
|
||
I can see this on and off during our Mozmill tests. I don't have a pattern yet but I will try to nail it down, and hopefully get manual steps.
Reporter | ||
Comment 13•10 years ago
|
||
No,sorry. I'm completely filled up with work on bug 956315. I asked Cosmin to do the investigation on the test side via bug 959562. You might want to trigger him.
Flags: needinfo?(hskupin)
Reporter | ||
Comment 14•10 years ago
|
||
Could this bug in some way be related to bug 937220 which also has been started with Firefox 28?
Comment 15•10 years ago
|
||
I think there's a reasonable chance it is the same issue.
Comment 16•10 years ago
|
||
Henrik, have you made any headway with the investigation here? If not, can this be offloaded to someone else on your team?
Reporter | ||
Comment 17•10 years ago
|
||
Nope, sorry. I was hoping to get an update from bug 937220. Cosmin, would you be able to continue here in finding a minimzed testcase for our testrun?
Flags: needinfo?(cosmin.malutan)
Assignee | ||
Comment 18•10 years ago
|
||
I tried to bisect this and it looks like https://hg.mozilla.org/mozilla-central/rev/8196cbbcadcd from bug 935032 might be the culprit. Does that sound plausible or is my testcase broken?
Comment 19•10 years ago
|
||
That seems a little unlikely to me.
This is almost certainly a regression from bug 928312.
Assignee | ||
Comment 21•10 years ago
|
||
The refcount overflows for a XMLHttpRequest instance that tries to request "resource://gre/modules/workers/lz4_internal.js".
Assignee | ||
Comment 22•10 years ago
|
||
It looks like XMLHttpRequest::Unpin() is getting called even when XMLHttpRequest::MaybePin() fails. Should Unpin() check mRooted? MaybePin() bails out after calling mWorkerPrivate->AddFeature() unsuccessfully.
(In reply to Tim Taubert [:ttaubert] from comment #22) > It looks like XMLHttpRequest::Unpin() is getting called even when > XMLHttpRequest::MaybePin() fails. Should Unpin() check mRooted? > > MaybePin() bails out after calling mWorkerPrivate->AddFeature() > unsuccessfully. Mmm, that was something I had convinced myself couldn't happen. AddFeature fails because the worker is shutting down, presumably?
I think the fix is probably to change http://mxr.mozilla.org/mozilla-central/source/dom/workers/XMLHttpRequest.cpp#1642 to set aRv to a failure code before returning.
Assignee | ||
Comment 25•10 years ago
|
||
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #23) > Mmm, that was something I had convinced myself couldn't happen. AddFeature > fails because the worker is shutting down, presumably? Yes, this is all happening on shutdown. (In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #24) > I think the fix is probably to change > http://mxr.mozilla.org/mozilla-central/source/dom/workers/XMLHttpRequest. > cpp#1642 to set aRv to a failure code before returning. Yeah, I wondered why we check aRv even though MaybePin() doesn't touch it.
Great debugging work btw.
Assignee | ||
Comment 27•10 years ago
|
||
Let's give it a shot.
Attachment #8366748 -
Flags: review?(khuey)
Attachment #8366748 -
Flags: review?(khuey) → review+
Assignee | ||
Comment 28•10 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/7393aa7b33ad
Assignee | ||
Updated•10 years ago
|
Assignee: nobody → ttaubert
Status: NEW → ASSIGNED
Flags: needinfo?(cosmin.malutan)
Reporter | ||
Updated•10 years ago
|
Comment 29•10 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/7393aa7b33ad
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla29
Comment 30•10 years ago
|
||
I checked the latest tinderbox build and passed our tests. It didn't failed in 30 runs so I can say is fixed.
Reporter | ||
Comment 31•10 years ago
|
||
Thanks Cosmin! Lets see how the Nightly build from today will behave. If all is fine please mark this bug as verified, and also update status-firefox29 to verified.
Assignee | ||
Updated•10 years ago
|
Assignee | ||
Comment 32•10 years ago
|
||
Comment on attachment 8366748 [details] [diff] [review] 0001-Bug-956284-Set-error-code-for-MaybePin-if-AddFeature.patch [Approval Request Comment] Bug caused by (feature/regressing bug #): bug 928312 User impact if declined: Possible crash on shutdown. Testing completed (on m-c, etc.): First reports after landing are looking good. Risk to taking this patch (and alternatives if risky): Small patch with a low risk. String or IDL/UUID changes made by this patch: None.
Attachment #8366748 -
Flags: approval-mozilla-aurora?
Comment 33•10 years ago
|
||
I ran the tests that failed against Nightly build for 50 times and it passes.
Comment 34•10 years ago
|
||
This is where the work actually happened, so I'll set this to sec-high.
Updated•10 years ago
|
Keywords: regression
Updated•10 years ago
|
Attachment #8366748 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Assignee | ||
Comment 35•10 years ago
|
||
https://hg.mozilla.org/releases/mozilla-aurora/rev/c6ff9eed5e83
Updated•10 years ago
|
status-b2g18:
--- → unaffected
status-b2g-v1.1hd:
--- → unaffected
status-b2g-v1.2:
--- → unaffected
status-b2g-v1.3:
--- → fixed
status-b2g-v1.4:
--- → fixed
Updated•10 years ago
|
status-firefox-esr24:
--- → unaffected
Reporter | ||
Comment 36•10 years ago
|
||
No more crashes happened today for our Mozmill testruns for Aurora builds on OS X. So all is working fine!
Status: RESOLVED → VERIFIED
Updated•10 years ago
|
status-b2g-v1.3T:
--- → fixed
Updated•9 years ago
|
Group: core-security
Updated•8 years ago
|
Keywords: csectype-intoverflow
You need to log in
before you can comment on or make changes to this bug.
Description
•