Closed
Bug 1117977
Opened 10 years ago
Closed 10 years ago
nsScriptLoader still not thread-safe
Categories
(Core :: DOM: Core & HTML, defect)
Tracking
()
RESOLVED
FIXED
mozilla38
People
(Reporter: mccr8, Assigned: smaug)
Details
(Keywords: sec-moderate, Whiteboard: [adv-main38+])
Crash Data
Attachments
(2 files)
1.67 KB,
patch
|
mccr8
:
review+
|
Details | Diff | Splinter Review |
2.17 KB,
patch
|
Details | Diff | Splinter Review |
This is some kind of variant of bug 1111737. I get this in a local debug build from today's trunk.
STR:
1. Go to http://beta.unity3d.com/jonas/DT2/
2. While the Unity engine loading bar thing is still loading, exit the browser.
result: thread safety assertion.
[68842] WARNING: '!mMainThread', file /Users/amccreight/mc/xpcom/threads/nsThreadManager.cpp, line 404
[68842] WARNING: 'NS_FAILED(rv)', file /Users/amccreight/mc/xpcom/glue/nsThreadUtils.cpp, line 174
Hit MOZ_CRASH(nsScriptLoader not thread-safe) at /Users/amccreight/mc/dom/base/nsScriptLoader.cpp:178
#01: nsRunnable::Release()[/Users/amccreight/mc/obj-dbg.noindex/dist/NightlyDebug.app/Contents/MacOS/XUL +0xfc33e]
#02: js::HelperThread::handleParseWorkload()[/Users/amccreight/mc/obj-dbg.noindex/dist/NightlyDebug.app/Contents/MacOS/XUL +0x3926523]
#03: js::HelperThread::threadLoop()[/Users/amccreight/mc/obj-dbg.noindex/dist/NightlyDebug.app/Contents/MacOS/XUL +0x3925201]
#04: _pt_root[/Users/amccreight/mc/obj-dbg.noindex/dist/NightlyDebug.app/Contents/MacOS/libnss3.dylib +0x33c199]
#05: _pthread_body[/usr/lib/system/libsystem_pthread.dylib +0x1899]
#06: _pthread_struct_init[/usr/lib/system/libsystem_pthread.dylib +0x172a]
Process 68842 stopped
* thread #16: tid = 0x61e4bb, 0x00000001024792d5 XUL`nsScriptLoader::Release(this=<unavailable>) + 213 at nsScriptLoader.cpp:178, name = 'Analysis Helper', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
(lldb) bt
* thread #16: tid = 0x61e4bb, 0x00000001024792d5 XUL`nsScriptLoader::Release(this=<unavailable>) + 213 at nsScriptLoader.cpp:178, name = 'Analysis Helper', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
* frame #0: 0x00000001024792d5 XUL`nsScriptLoader::Release(this=<unavailable>) + 213 at nsScriptLoader.cpp:178
frame #1: 0x000000010249f1a1 XUL`(anonymous namespace)::NotifyOffThreadScriptLoadCompletedRunnable::~NotifyOffThreadScriptLoadCompletedRunnable() [inlined] nsRefPtr<nsScriptLoader>::~nsRefPtr() + 33 at nsRefPtr.h:59
frame #2: 0x000000010249f193 XUL`(anonymous namespace)::NotifyOffThreadScriptLoadCompletedRunnable::~NotifyOffThreadScriptLoadCompletedRunnable() [inlined] nsRefPtr<nsScriptLoader>::~nsRefPtr() at nsRefPtr.h:57
frame #3: 0x000000010249f193 XUL`(anonymous namespace)::NotifyOffThreadScriptLoadCompletedRunnable::~NotifyOffThreadScriptLoadCompletedRunnable() [inlined] (anonymous namespace)::NotifyOffThreadScriptLoadCompletedRunnable::~NotifyOffThreadScriptLoadCompletedRunnable(this=0x0000000114495d00) + 10 at nsScriptLoader.cpp:787
frame #4: 0x000000010249f189 XUL`(anonymous namespace)::NotifyOffThreadScriptLoadCompletedRunnable::~NotifyOffThreadScriptLoadCompletedRunnable() [inlined] (anonymous namespace)::NotifyOffThreadScriptLoadCompletedRunnable::~NotifyOffThreadScriptLoadCompletedRunnable(this=0x0000000114495d00, ptr=<unavailable>) at nsScriptLoader.cpp:787
frame #5: 0x000000010249f189 XUL`(anonymous namespace)::NotifyOffThreadScriptLoadCompletedRunnable::~NotifyOffThreadScriptLoadCompletedRunnable(this=0x0000000114495d00) + 9 at nsScriptLoader.cpp:787
frame #6: 0x000000010161433e XUL`nsRunnable::Release(this=<unavailable>) + 94 at nsThreadUtils.cpp:32
frame #7: 0x0000000104e3e523 XUL`js::HelperThread::handleParseWorkload(this=0x000000010bceed80) + 547 at HelperThreads.cpp:1190
frame #8: 0x0000000104e3d201 XUL`js::HelperThread::threadLoop(this=0x000000010bceed80) + 721 at HelperThreads.cpp:1373
frame #9: 0x000000010133c199 libnss3.dylib`_pt_root(arg=0x000000010032a370) + 281 at ptthread.c:212
frame #10: 0x00007fff88f6d899 libsystem_pthread.dylib`_pthread_body + 138
frame #11: 0x00007fff88f6d72a libsystem_pthread.dylib`_pthread_start + 137
frame #12: 0x00007fff88f71fc9 libsystem_pthread.dylib`thread_start + 13
Reporter | ||
Comment 1•10 years ago
|
||
from bug 1111737:
(In reply to Boris Zbarsky [:bz] from comment #6)
> What if the runnable never runs?
Maybe this is what happens? ;)
Reporter | ||
Comment 2•10 years ago
|
||
The NS_FAILED from the above log is in NS_DispatchToMainThread, so maybe OffThreadScriptLoaderCallback is calling NS_DispatchToMainThread, but it is too late in shutdown?
The line prior to the one I posted is
WARNING: YOU ARE LEAKING THE WORLD (at least one JSRuntime and everything alive inside it, that is) AT JS_ShutDown TIME. FIX THIS!
so we're somewhere in JS_ShutDown when the failure happens, probably in the part where it shuts down the threads.
Reporter | ||
Comment 3•10 years ago
|
||
Yeah, that failure is indeed in OffThreadScriptLoaderCallback. So we're shutting down the JS helper threads after the point where we can call NS_DispatchToMainThread. Yeah, okay, it looks like the failure is happening in nsThreadManager::GetMainThread, which is what that prior warning is. I'm not sure when that goes bad.
Reporter | ||
Comment 4•10 years ago
|
||
It looks like mMainThread gets cleared on this line in ShutdownXPCOM:
nsThreadManager::get()->Shutdown()
...then a bunch of lines later we call JS_ShutDown, which I'm guessing shuts down the helper threads, and then kablooey.
Reporter | ||
Comment 5•10 years ago
|
||
jst said if nothing else we could just intentionally leak if NS_DispatchToMainThread fails. Though we may have other lurking issues if we're shutting down JS threads so much after we're shutting down XPCOM thread utils.
Comment 6•10 years ago
|
||
fyi, bughunter has seen this crash for some time and it is in fact the top crash for bughunter. It isn't amenable to test case reduction however. I've been retesting all of the 43,000+ urls where bughunter reproduced a crash from October-December and have seen 3 sets of stacks similar to:
nsScriptLoader::Release nsRefPtr<nsScriptLoader>::~nsRefPtr nsRefPtr<nsScriptLoader>::~nsRefPtr 10743 times on osx 10.6, 10.8, 10.9 for beta,aurora,nightly for 612 urls. osx testing is only about 2/3 complete so the final number will end up being higher.
nsScriptLoader::Release nsRefPtr<nsScriptLoader>::~nsRefPtr<nsScriptLoader> `anonymous namespace'::NotifyOffThreadScriptLoadCompletedRunnable::~NotifyOffThreadScriptLoadCompletedRunnable `anonymous namespace'::NotifyOffThreadScriptLoadCompletedRunnable::`scalar deleting destructor' nsRunnable::Release 4707 times on windows 7 32bit and 64bit for beta,aurora,nightly for 867 urls. windows testing is complete.
nsScriptLoader::Release NotifyOffThreadScriptLoadCompletedRunnable::~NotifyOffThreadScriptLoadCompletedRunnable nsRunnable::Release OffThreadScriptLoaderCallback js::HelperThread::handleParseWorkload 179 times on rhel6 32bit and 64bit for beta, aurora, nighty for 179 urls. linux testing is complete.
If you would like the crash reports from minidump_stackwalk, please let me know and I'll attach them. I'm not sure how useful the urls would be since I've tried to reproduce this in the past and failed.
Assignee | ||
Comment 7•10 years ago
|
||
Bob, are those crashes shutdown crashes, or happening during normal browsing too?
Flags: needinfo?(bob)
Comment 8•10 years ago
|
||
Looking at the logs, it appeared that there were both shutdown and other crashes. I could reproduce about 25% of the time with a url on Windows 7 using the Spider extension I use to load urls. The bughunter logs originally showed a crash before attempting to shutdown with this url but when I tried it, the crashes were all on shutdown. I couldn't reproduce it running the page manually without the Spider extension though. So maybe this is an artifact from using the extension.
STR:
1. Install http://bclary.com/2004/07/10/spider/spider.xpi into a profile, e.g. test-profile
2. for i in {1..20}; do echo $i; if firefox -P test-profile -spider -url "http://www.fashion-woman.com/stil-i-moda/tendencii/modnaya-obuv-osen-zima-2014-2015/" -depth 0 -timeout 60 -wait 0 -start -quit 2>&1 | tee /tmp/crash.log | grep MOZ_CRASH; then break; fi; done
reproduced the crash on beta, aurora, nightly on windows 7 and osx 10.9 for me.
Flags: needinfo?(bob)
Assignee | ||
Comment 9•10 years ago
|
||
Wait, which crash happens on beta? I thought we had the thread safety check only on nightly.
Assignee | ||
Comment 10•10 years ago
|
||
Assignee: nobody → bugs
Attachment #8566172 -
Flags: review?(continuation)
Assignee | ||
Comment 11•10 years ago
|
||
Reporter | ||
Updated•10 years ago
|
Attachment #8566172 -
Flags: review?(continuation) → review+
Assignee | ||
Comment 12•10 years ago
|
||
Comment 13•10 years ago
|
||
Backed out for Nexus 5-L bustage. Of particular note, it's the only Android L-based build we have in automation.
https://hg.mozilla.org/integration/mozilla-inbound/rev/2b2f1d308c3a
https://treeherder.mozilla.org/logviewer.html#?job_id=6747192&repo=mozilla-inbound
Assignee | ||
Comment 14•10 years ago
|
||
bah, sorry.
Assignee | ||
Comment 15•10 years ago
|
||
Assignee | ||
Comment 16•10 years ago
|
||
Comment 17•10 years ago
|
||
(In reply to Olli Pettay [:smaug] (high review load) from comment #9)
> Wait, which crash happens on beta? I thought we had the thread safety check
> only on nightly.
I hit the same MOZ_CRASH on beta and aurora as well as nightly.
Comment 18•10 years ago
|
||
Status: NEW → RESOLVED
Closed: 10 years ago
status-firefox38:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla38
Comment 19•10 years ago
|
||
How far back does this go? Should this land on at least Beta?
status-b2g-v2.1:
--- → ?
status-b2g-v2.1S:
--- → ?
status-b2g-v2.2:
--- → affected
status-b2g-master:
--- → fixed
status-firefox37:
--- → ?
status-firefox-esr31:
--- → ?
Flags: needinfo?(bugs)
Assignee | ||
Comment 20•10 years ago
|
||
I think this goes all the way to Bug 906371. Have we seen any real world non-shutdown crashes related to this?
Flags: needinfo?(bugs)
Comment 21•10 years ago
|
||
I'm not sure who would answer that.
status-b2g-v1.4:
--- → wontfix
status-b2g-v2.0:
--- → wontfix
status-b2g-v2.0M:
--- → wontfix
status-firefox36:
--- → wontfix
Reporter | ||
Comment 23•10 years ago
|
||
Well, in comment 8, bc said that some of them showed up not at shutdown, but only with the addon, so maybe the addon was doing something weird. I think we can just let this one go.
Flags: needinfo?(continuation)
Updated•10 years ago
|
Reporter | ||
Comment 24•10 years ago
|
||
I came across a few crash report for this on crash-stats (for Aurora, where this did not land):
https://crash-stats.mozilla.com/report/index/294af7d6-85f3-43bd-8edd-0cf9e2150319
https://crash-stats.mozilla.com/report/index/a0f270f7-0eeb-45cc-b457-6e3652150322
https://crash-stats.mozilla.com/report/index/d5f0d2b2-400b-493e-9326-448612150317
They are, as expected, happening during XPCOM shutdown.
Crash Signature: [@ NS_CycleCollectorSuspect3 ]
Updated•10 years ago
|
Whiteboard: [adv-main38+]
Updated•9 years ago
|
Group: core-security → core-security-release
Updated•8 years ago
|
Group: core-security-release
Updated•6 years ago
|
Component: DOM → DOM: Core & HTML
You need to log in
before you can comment on or make changes to this bug.
Description
•