Closed
Bug 1336478
Opened 8 years ago
Closed 4 years ago
Crash in [@ mozilla::CycleCollectedJSContext::ProcessMetastableStateQueue]
Categories
(Core :: XPCOM, defect, P3)
Tracking
()
RESOLVED
WORKSFORME
People
(Reporter: philipp, Unassigned)
References
(Blocks 1 open bug)
Details
(Keywords: crash, regression)
Crash Data
This bug was filed from the Socorro interface and is
report bp-eac07850-af35-4e34-9040-963d72170203.
=============================================================
Crashing Thread (74)
Frame Module Signature Source
0 xul.dll mozilla::CycleCollectedJSContext::ProcessMetastableStateQueue(unsigned int) xpcom/base/CycleCollectedJSContext.cpp:1351
1 xul.dll mozilla::CycleCollectedJSContext::AfterProcessTask(unsigned int) xpcom/base/CycleCollectedJSContext.cpp:1387
2 xul.dll nsThread::ProcessNextEvent(bool, bool*) xpcom/threads/nsThread.cpp:1083
3 xul.dll NS_ProcessNextEvent(nsIThread*, bool) xpcom/glue/nsThreadUtils.cpp:311
4 xul.dll mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) ipc/glue/MessagePump.cpp:338
5 xul.dll MessageLoop::RunHandler() ipc/chromium/src/base/message_loop.cc:225
6 xul.dll MessageLoop::Run() ipc/chromium/src/base/message_loop.cc:205
7 xul.dll nsThread::ThreadFunc(void*) xpcom/threads/nsThread.cpp:465
8 nss3.dll _PR_NativeRunThread nsprpub/pr/src/threads/combined/pruthr.c:397
9 nss3.dll pr_root nsprpub/pr/src/md/windows/w95thred.c:95
10 ucrtbase.dll _o__CIpow
11 kernel32.dll BaseThreadInitThunk
12 ntdll.dll __RtlUserThreadStart
13 ntdll.dll _RtlUserThreadStart
this cross-platform crash signature is regressing since firefox 51 and later with MOZ_RELEASE_ASSERT(!mDoingStableStates) that was added with bug 1292892. it's a rather low volume crash though (<0.1% of crashes in 51.0.1).
Correlations for Firefox Release
(99.12% in signature vs 00.08% overall) moz_crash_reason = MOZ_RELEASE_ASSERT(!mDoingStableStates)
(95.15% in signature vs 36.54% overall) reason = EXCEPTION_BREAKPOINT
(17.18% in signature vs 00.24% overall) Addon "ascsurfingprotectionnew@iobit.com" = true [61.90% vs 00.32% if startup_crash = 0]
(15.42% in signature vs 00.23% overall) Addon "ascsurfingprotectionnew@iobit.com" Version = 2.1.3 [55.56% vs 00.30% if startup_crash = 0]
(33.92% in signature vs 02.95% overall) GFX_ERROR "Failed 2 buffer db=" = true [46.95% vs 05.29% if process_type = content]
(25.11% in signature vs 00.58% overall) GFX_ERROR "Failed to create DIB section for a bitmap of size " = true [34.76% vs 01.89% if process_type = content]
(79.74% in signature vs 28.82% overall) Module "qasf.dll" = true [72.22% vs 37.14% if platform_pretty_version = Windows 10]
(31.28% in signature vs 03.12% overall) contains_memory_report = 1
Comment 1•8 years ago
(In reply to [:philipp] from comment #0)
> this cross-platform crash signature is regressing since firefox 51 and later
> with MOZ_RELEASE_ASSERT(!mDoingStableStates) that was added with bug
> 1292892. it's a rather low volume crash though (<0.1% of crashes in 51.0.1).
Bug 1292892 did not add that assert. CC'ing some people who may know more.
bz, smaug: see also bug 1312623.
Comment 3•8 years ago
I don't understand the stack trace from the initial comment. We don't have ProcessMetastableStateQueue more than once on the stack there.
Bug 1312623 looks like the issue which was fixed in
http://searchfox.org/mozilla-central/rev/f5077ad52f8b90183e73038869f6140f0afbf427/dom/media/MediaStreamGraph.cpp#1720-1723
Reporter | ||
Comment 5•8 years ago
This crash signature jumped up in volume again after the March release date. Is there anything more we could do about it?
Comment 6•8 years ago
Mass wontfix for bugs affecting firefox 52.
Comment 7•8 years ago
Nathan, can you help get this to someone who can investigate? This is a pretty dramatic jump, I think in 52.
Too late to fix for 53 but we could still shoot for 54.
You can see the graph here https://crash-stats.mozilla.com/signature/?signature=mozilla%3A%3ACycleCollectedJSContext%3A%3AProcessMetastableStateQueue&date=%3E%3D2017-01-15T02%3A34%3A06.000Z&date=%3C2017-04-15T02%3A34%3A06.000Z#graphs
Flags: needinfo?(nfroyd)
Comment 8•8 years ago
Andrew, do you have any ideas here? I'm kind of with smaug's comment 3: I don't understand how mDoingStableStates can be true here...at least, assuming that the stack is correct. We're not doing something weird, like using the same JS context on multiple threads, are we?
If you don't have any ideas, I guess I'll try looking at minidumps to see whether that stack is reasonable and what the value of mDoingStableStates might be. Maybe we're just looking at some weird memory corruption...?
Flags: needinfo?(nfroyd) → needinfo?(continuation)
Comment 9•8 years ago
Sorry, I don't know anything about this metastable state stuff.
Flags: needinfo?(continuation)
Comment 10•8 years ago
OK, I took a look at the crash from comment 0. The unwound stack looks reasonable from the stack memory in the crashdump, but I can't examine what the CycleCollectedJSContext looks like, as the crashdump doesn't have the range of memory containing the JS context.
I also took a look at some other crashes, which looked a little more promising:
https://crash-stats.mozilla.com/report/index/6acb7112-e2d3-41ac-9294-80c672170411
https://crash-stats.mozilla.com/report/index/54daad5e-bbf8-4c9c-8997-ac58f2170412
https://crash-stats.mozilla.com/report/index/d3d99b23-19d2-4969-afaa-4a6642170417
where it looks like we're processing from microtask checkpoints. Unfortunately, we're getting called from JIT code, so we can't see full stacks, but is it possible this code is getting called reentrantly?
Flags: needinfo?(bugs)
Comment 11•8 years ago
That is what the crash seems to hint at. Media code recently had a bug where it re-entered metastable state handling.
But the stack traces still look odd. What is that @0x0?
Flags: needinfo?(bugs)
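The re-entrancy hypothesis from comments 10 and 11 can be illustrated with a minimal sketch. This is not the Gecko source: `StableStateQueue`, `Add`, `Process`, and `ReentryTripsGuard` are hypothetical names, and an exception stands in for `MOZ_RELEASE_ASSERT(!mDoingStableStates)` so the failure can be observed instead of crashing the process.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-in for a queue guarded by a "doing stable states" flag.
// None of these names are taken from the Gecko source.
class StableStateQueue {
 public:
  void Add(std::function<void()> aTask) { mTasks.push_back(std::move(aTask)); }

  void Process() {
    // Stand-in for MOZ_RELEASE_ASSERT(!mDoingStableStates): throw instead of
    // crashing so that a caller can observe the failure.
    if (mDoingStableStates) {
      throw std::logic_error("Process() re-entered");
    }
    mDoingStableStates = true;

    // Take the pending tasks; anything added while they run waits for the
    // next call to Process().
    std::vector<std::function<void()>> tasks;
    tasks.swap(mTasks);
    try {
      for (auto& task : tasks) {
        task();
      }
    } catch (...) {
      mDoingStableStates = false;
      throw;
    }
    assert(mDoingStableStates);  // Only this frame may clear the flag.
    mDoingStableStates = false;
  }

 private:
  std::vector<std::function<void()>> mTasks;
  bool mDoingStableStates = false;
};

// Returns true if a task that calls back into Process() trips the guard.
bool ReentryTripsGuard() {
  StableStateQueue queue;
  queue.Add([&queue] { queue.Process(); });
  try {
    queue.Process();
  } catch (const std::logic_error&) {
    return true;
  }
  return false;
}
```

In this sketch the guard can only fire when a queued task, directly or indirectly, calls back into `Process()` on the same object. That matches the puzzlement in comment 3: the crashing stack shows the function only once, so the earlier frame would have to be hidden, for example behind JIT frames.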
Comment 12•8 years ago
I can currently trigger this consistently in 53.0.2 by trying to open a link to thewrap.com from r/movies
bp-96f7ea7e-8fed-421c-aed1-9c0a51170515
bp-4f6d823c-843b-4015-a544-6cd5a0170515
bp-06ddbcf1-05e1-478d-bbc6-6ebf00170515
Anything I can do to help debug?
Flags: needinfo?(nfroyd)
Flags: needinfo?(bugs)
Comment 14•8 years ago
Do you have _exact_ steps to reproduce?
What does r/movies mean?
So far I haven't managed to reproduce using FF 53.0.2 on linux
Comment 15•8 years ago
You do seem to have quite a few addons. Can you reproduce the issue without those addons?
Comment 16•8 years ago
(In reply to Olli Pettay [:smaug] from comment #14)
> What does r/movies mean?
Presumably he means https://www.reddit.com/r/movies/
Knowing if the issue reproduces without addons would be helpful.
Flags: needinfo?(nfroyd) → needinfo?(moz-ian)
Comment 17•8 years ago
(In reply to Olli Pettay [:smaug] from comment #14)
> Do you have _exact_ steps to reproduce?
>
> What does r/movies mean?
>
> So far I haven't managed to reproduce using FF 53.0.2 on linux
0. Have 53.0.2 with NoScript installed.
1. Go to https://www.reddit.com/r/movies/
2. (Middle) Click on the current link to http://www.thewrap.com/powers-boothe-emmy-winning-character-actor-dead-68/
(In reply to Olli Pettay [:smaug] from comment #15)
> You do seem to have quite a few addons. Can you reproduce the issue without
> those addons?
Safe-mode: cannot reproduce.
Disable just NoScript: cannot reproduce.
In fact just allowing scripts globally with NoScript still enabled seems to stop it happening.
The experience is basically identical to bug 1235183, which used to mean opening a facebook (or gfycat?) tab would crash the browser, and again seemed to be NoScript related.
Flags: needinfo?(moz-ian)
Comment 18•8 years ago
Still no luck reproducing, FF 53.0.2 on linux + NoScript.
Comment 19•8 years ago
oh, but I see it in the stack.
http://searchfox.org/mozilla-central/rev/484d2b7f51b7aed035147bbb4a565061659d9278/netwerk/protocol/http/nsHttpChannel.cpp#6268
Looks like NoScript spins the event loop during http-on-modify-request. That is not good, at all.
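To show why spinning the event loop at that point is a problem, here is a toy model of one plausible sequence. Everything in it is an assumption made for illustration: `ToyThread` and its methods are hypothetical, not Gecko APIs, and the sketch counts re-entries where the real code would hit the release assert.

```cpp
#include <deque>
#include <functional>
#include <utility>

// Hypothetical toy event loop; none of these names are Gecko APIs.
class ToyThread {
 public:
  void Dispatch(std::function<void()> aTask) {
    mEvents.push_back(std::move(aTask));
  }
  void RunInStableState(std::function<void()> aTask) {
    mStable.push_back(std::move(aTask));
  }

  // Runs one queued event, then the after-task hook. Returns false when idle.
  bool ProcessNextEvent() {
    if (mEvents.empty()) {
      return false;
    }
    std::function<void()> task = std::move(mEvents.front());
    mEvents.pop_front();
    task();
    AfterProcessTask();
    return true;
  }

  int ReentryCount() const { return mReentryCount; }

 private:
  void AfterProcessTask() {
    if (mDoingStableStates) {
      // The real code would assert here; the sketch only counts.
      ++mReentryCount;
      return;
    }
    mDoingStableStates = true;
    std::deque<std::function<void()>> pending;
    pending.swap(mStable);
    for (auto& task : pending) {
      task();
    }
    mDoingStableStates = false;
  }

  std::deque<std::function<void()>> mEvents;
  std::deque<std::function<void()>> mStable;
  bool mDoingStableStates = false;
  int mReentryCount = 0;
};

// Runs one event whose stable-state callback optionally spins a nested loop,
// standing in for an observer that spins the event loop.
int CountReentries(bool aSpinNestedLoop) {
  ToyThread thread;
  thread.Dispatch([&thread, aSpinNestedLoop] {
    thread.RunInStableState([&thread, aSpinNestedLoop] {
      if (aSpinNestedLoop) {
        thread.Dispatch([] {});  // Some unrelated pending event.
        while (thread.ProcessNextEvent()) {
        }
      }
    });
  });
  thread.ProcessNextEvent();
  return thread.ReentryCount();
}
```

In this model the after-task hook is re-entered only when the callback spins a nested loop. A callback that returns without spinning leaves the count at zero.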
Comment 20•8 years ago
Giorgio, this looks like a NoScript bug to me.
Is it spinning event loop at unsafe time?
Flags: needinfo?(g.maone)
Comment 21•8 years ago
(In reply to Olli Pettay [:smaug] from comment #20)
> Giorgio, this looks like a NoScript bug to me.
> Is it spinning event loop at unsafe time?
No it should not, unless very obscure, deprecated (disabled by default of course and going away very soon) ABE options are enabled.
Reporter, could you please try disabling ABE and/or sending me privately your NoScript Options>Export file?
Thanks.
Flags: needinfo?(g.maone)
Comment 22•8 years ago
(In reply to Giorgio Maone [:mao] from comment #21)
> (In reply to Olli Pettay [:smaug] from comment #20)
> > Giorgio, this looks like a NoScript bug to me.
> > Is it spinning event loop at unsafe time?
>
> No it should not, unless very obscure, deprecated (disabled by default of
> course and going away very soon) ABE options are enabled.
Ah. That sounds very much like it. I have... a few ABE rules.
> Reporter, could you please try disabling ABE and/or sending me privately
> your NoScript Options>Export file?
> Thanks.
Disabling ABE does seem to stop it, though re-enabling it doesn't bring the crash back (unless I restart the browser. Same thing happened when enabling scripts globally).
Emailed you the file.
Comment 23•8 years ago
(In reply to Ian Moody [:Kwan] from comment #22)
> Emailed you the file.
Thank you, please reset your "abe.siteEnabled" about:config preference (AKA NoScript Options>Advanced>ABE>Allow sites to push their own rulesets) to its default "false" value.
BTW, I don't think any production website has ever implemented its own server-side dynamic ABE rulesets, and that's exactly the deprecated option I was talking about ;)
Comment 27•7 years ago
I don't see any crashes on 57 in the past 2 weeks... and like, 12 for 56.
Comment 28•7 years ago
(In reply to Giorgio Maone [:mao] from comment #23)
> (In reply to Ian Moody [:Kwan] from comment #22)
>
> > Emailed you the file.
>
> Thank you, please reset your "abe.siteEnabled" about:config preference (AKA
> NoScript Options>Advanced>ABE>Allow sites to push their own rulesets) to its
> default "false" value.
> BTW, I don't think any production website has ever implemented its own
> server-side dynamic ABE rulesets, and that's exactly the deprecated option I
> was talking about ;)
For the record, this did seem to fix it, though it was hard to be certain because the STR had stopped working. When I turned the option back on for a while I'd occasionally get the crash, and after turning it off again I never did.
Not entirely sure what my reasoning would have been behind turning it on in the first place.
Updated•7 years ago
Severity: critical → normal
Priority: -- → P3
Comment 29•7 years ago
Signature report for mozilla::CycleCollectedJSContext::ProcessMetastableStateQueue
Showing results from 7 days ago
Operating System
Windows 10 957 30.9%
Windows 7 951 30.7%
Windows Vista 630 20.4%
Windows XP 497 16.1%
Windows 8.1 39 1.3%
Windows Serv03 10 0.3%
Linux 4 0.1%
OS X 10.13 2 0.1%
Windows 8 2 0.1%
Android 1 0.0%
Product
Firefox 52.7.3esr 2212 71.4% 1960
Firefox 52.6.0esr 385 12.4% 284
Firefox 52.7.2esr 144 4.6% 116
Firefox 52.3.0esr 49 1.6% 17
Firefox 52.7.0esr 37 1.2% 37
Firefox 52.5.0esr 29 0.9% 22
Firefox 52.4.0esr 20 0.6% 13
Firefox 52.5.3esr 19 0.6% 19
Firefox 52.0.2esr 15 0.5% 6
Firefox 52.1.2esr 15 0.5% 13
Firefox 52.2.0esr 15 0.5% 13
Firefox 59.0.2 3 0.1% 3
Firefox 56.0.2 3 0.1% 3
Firefox 56.0.1 1 0.0% 1
Firefox 56.0 1 0.0% 1
Firefox 55.0.3 1 0.0% 1
Firefox 54.0b6 1 0.0% 1
Firefox 54.0.1 4 0.1% 3
Firefox 54.0 1 0.0% 1
Firefox 53.0b8 3 0.1% 1
Firefox 53.0b2 1 0.0% 1
FennecAndroid 58.0.1 1 0.0% 1
Uptime
> 1 hour 2351 76.0%
15-60 min 480 15.5%
5-15 min 168 5.4%
1-5 min 61 2.0%
< 1 min 33 1.1%
Architecture
x86 3077 99.5%
amd64 15 0.5%
arm 1 0.0%
status-firefox59:
--- → affected
Summary: Crash in mozilla::CycleCollectedJSContext::ProcessMetastableStateQueue → Crash in [@ mozilla::CycleCollectedJSContext::ProcessMetastableStateQueue]
Updated•7 years ago
Blocks: RunWatchdogShutdownhang
Comment 30•7 years ago
This appears to still be lurking around on release at an extremely low frequency. Bug 1452416 should take care of the high crash rate on ESR52.
status-firefox60:
--- → ?
status-firefox61:
--- → ?
Comment 31•4 years ago
Closing because no crashes reported for 12 weeks.
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → WORKSFORME