Crash in mozilla::MediaShutdownManager::EnsureCorrectShutdownObserverState

RESOLVED WORKSFORME

Status


defect
P2
critical
RESOLVED WORKSFORME
3 years ago
Last year

People

(Reporter: philipp, Assigned: jwwang)

Tracking

({crash, regression, stale-bug})

49 Branch

Firefox Tracking Flags

(firefox50 wontfix, firefox51 wontfix, firefox52 wontfix, firefox-esr52 fix-optional, firefox53 unaffected)

Details

(crash signature)

Attachments

(1 attachment)

Reporter

Description

3 years ago
This bug was filed from the Socorro interface and is report bp-917383b6-2233-44e4-8a9f-a74d52161125.
=============================================================
Crashing Thread (0)
Frame 	Module 	Signature 	Source
0 	xul.dll 	mozilla::MediaShutdownManager::EnsureCorrectShutdownObserverState() 	dom/media/MediaShutdownManager.cpp:77
1 	xul.dll 	mozilla::MediaDecoder::MediaDecoder(mozilla::MediaDecoderOwner*) 	dom/media/MediaDecoder.cpp:621
2 	xul.dll 	mozilla::OggDecoder::OggDecoder(mozilla::MediaDecoderOwner*) 	obj-firefox/dist/include/OggDecoder.h:20
3 	xul.dll 	mozilla::InstantiateDecoder 	dom/media/DecoderTraits.cpp:590
4 	xul.dll 	mozilla::dom::HTMLMediaElement::InitializeDecoderForChannel(nsIChannel*, nsIStreamListener**) 	dom/html/HTMLMediaElement.cpp:3302
5 	xul.dll 	nsCOMPtr<nsIChannel>::nsCOMPtr<nsIChannel>(nsQueryInterface) 	obj-firefox/dist/include/nsCOMPtr.h:504
6 	mozglue.dll 	arena_dalloc_small 	memory/mozjemalloc/jemalloc.c:4609
7 	mozglue.dll 	je_free 	memory/mozjemalloc/jemalloc.c:6479
8 	xul.dll 	nsCOMPtr_base::~nsCOMPtr_base() 	obj-firefox/dist/include/nsCOMPtr.h:295
9 	nss3.dll 	md_UnlockAndPostNotifies 	nsprpub/pr/src/md/windows/w95cv.c:137
10 	nss3.dll 	PR_ExitMonitor 	nsprpub/pr/src/threads/prmon.c:215
11 	xul.dll 	mozilla::net::HttpChannelChild::DoOnStartRequest(nsIRequest*, nsISupports*) 	netwerk/protocol/http/HttpChannelChild.cpp:532
12 	xul.dll 	mozilla::net::HttpChannelChild::OnStartRequest(nsresult const&, mozilla::net::nsHttpResponseHead const&, bool const&, mozilla::net::nsHttpHeaderArray const&, bool const&, bool const&, unsigned int const&, nsCString const&, nsCString const&, mozilla::net::NetAddr const&, mozilla::net::NetAddr const&, unsigned int const&) 	netwerk/protocol/http/HttpChannelChild.cpp:463
13 	xul.dll 	mozilla::net::StartRequestEvent::Run() 	netwerk/protocol/http/HttpChannelChild.cpp:336
14 	xul.dll 	mozilla::net::ChannelEventQueue::RunOrEnqueue(mozilla::net::ChannelEvent*, bool) 	obj-firefox/dist/include/mozilla/net/ChannelEventQueue.h:133
15 	xul.dll 	mozilla::net::HttpChannelChild::RecvOnStartRequest(nsresult const&, mozilla::net::nsHttpResponseHead const&, bool const&, mozilla::net::nsHttpHeaderArray const&, bool const&, bool const&, unsigned int const&, nsCString const&, nsCString const&, mozilla::net::NetAddr const&, mozilla::net::NetAddr const&, short const&, unsigned int const&) 	netwerk/protocol/http/HttpChannelChild.cpp:384
16 	xul.dll 	mozilla::net::PHttpChannelChild::OnMessageReceived(IPC::Message const&) 	obj-firefox/ipc/ipdl/PHttpChannelChild.cpp:653
17 	xul.dll 	mozilla::dom::PContentChild::OnMessageReceived(IPC::Message const&) 	obj-firefox/ipc/ipdl/PContentChild.cpp:7392
18 	xul.dll 	mozilla::ipc::MessageChannel::DispatchAsyncMessage(IPC::Message const&) 	ipc/glue/MessageChannel.cpp:1661
19 	xul.dll 	mozilla::ipc::MessageChannel::DispatchMessageW(IPC::Message&&) 	ipc/glue/MessageChannel.cpp:1599
20 	xul.dll 	mozilla::ipc::MessageChannel::OnMaybeDequeueOne() 	ipc/glue/MessageChannel.cpp:1566
21 	xul.dll 	mozilla::detail::RunnableMethodImpl<bool ( mozilla::ipc::MessageChannel::*)(void), 0, 1>::Run() 	obj-firefox/dist/include/nsThreadUtils.h:764
22 	xul.dll 	mozilla::ipc::MessageChannel::DequeueTask::Run() 	obj-firefox/dist/include/mozilla/ipc/MessageChannel.h:569
23 	xul.dll 	nsThread::ProcessNextEvent(bool, bool*) 	xpcom/threads/nsThread.cpp:1076

This is a low-volume, cross-platform crash that has been regressing since Firefox 49. It looks like it's crashing in a code path last touched in bug 1274522.
Reports are marked with MOZ_RELEASE_ASSERT(((bool)(!!(!NS_FAILED_impl(rv))))) on Windows and MOZ_RELEASE_ASSERT(((bool)(__builtin_expect(!!(!NS_FAILED_impl(rv)), 1)))) on Linux/Android.
Comment hidden (mozreview-request)

Comment 2

3 years ago
mozreview-review
Comment on attachment 8815145 [details]
Bug 1320346 - dump |rv| when AddBlocker() fails to get more debugging info.

https://reviewboard.mozilla.org/r/96158/#review96316
Attachment #8815145 - Flags: review?(cpearce) → review+
Assignee

Comment 3

3 years ago
Thanks!
Assignee: nobody → jwwang
Keywords: leave-open

Comment 4

3 years ago
Pushed by jwwang@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/2e51142370f3
dump |rv| when AddBlocker() fails to get more debugging info. r=cpearce

Comment 5

3 years ago
Backout by cbook@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/51303e0e6ddf
Backed out changeset 2e51142370f3 for bustage
Comment hidden (mozreview-request)

Comment 8

3 years ago
Pushed by jwwang@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/19d1b588f308
dump |rv| when AddBlocker() fails to get more debugging info. r=cpearce
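
For readers following along, the idea behind the debugging patch is to replace a bare assertion with a crash reason that carries the failing result code. The snippet below is only a minimal sketch of that idea, not the landed change: `FormatBlockerFailure` is a hypothetical helper, and the format string is assumed from the crash reasons that appear in later reports on this bug.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper (not the actual patch): builds a crash-reason string
// with the failing result code appended in lowercase hexadecimal.
std::string FormatBlockerFailure(std::uint32_t rv) {
  char buf[64];
  int n = std::snprintf(buf, sizeof(buf),
                        "Failed to add shutdown blocker! rv=%x",
                        static_cast<unsigned>(rv));
  // The message is short, so it is expected to fit in the buffer.
  assert(n > 0 && static_cast<std::size_t>(n) < sizeof(buf));
  return std::string(buf, static_cast<std::size_t>(n));
}
```

In this sketch, calling `FormatBlockerFailure(0x80570021u)` yields the text `Failed to add shutdown blocker! rv=80570021`, which a crash-reporting macro could then record instead of the bare assertion text.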
Assignee

Updated

3 years ago
Flags: needinfo?(jwwang)
very low volume crash, wontfix for 50.1.0 release
Maybe we should uplift the debugging patch to get more information sooner?
Flags: needinfo?(jwwang)
Assignee

Comment 12

3 years ago
Comment on attachment 8815145 [details]
Bug 1320346 - dump |rv| when AddBlocker() fails to get more debugging info.

Approval Request Comment
[Feature/Bug causing the regression]:1320346
[User impact if declined]:this is a debugging patch to get more info about the crash.
[Is this code covered by automated tests?]:yes
[Has the fix been verified in Nightly?]:this is not a fix but a debugging patch.
[Needs manual test from QE? If yes, steps to reproduce]: no
[List of other uplifts needed for the feature/fix]:none
[Is the change risky?]:no
[Why is the change risky/not risky?]:this change is as simple as dumping some debugging info when the crash happens.
[String changes made/needed]:none
Flags: needinfo?(jwwang)
Attachment #8815145 - Flags: approval-mozilla-aurora?
Comment on attachment 8815145 [details]
Bug 1320346 - dump |rv| when AddBlocker() fails to get more debugging info.

help with debugging in aurora52
Attachment #8815145 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Assignee

Comment 15

2 years ago
https://crash-stats.mozilla.com/report/index/e503f6aa-a573-40ff-8b0a-f64832170118
MOZ_CRASH Reason = Failed to add shutdown blocker! rv=80570021

Not sure if it is a normal case that AddBlocker() could return rv=80570021?
Flags: needinfo?(nfroyd)
(In reply to JW Wang [:jwwang] [:jw_wang] from comment #15)
> https://crash-stats.mozilla.com/report/index/e503f6aa-a573-40ff-8b0a-f64832170118
> MOZ_CRASH Reason = Failed to add shutdown blocker! rv=80570021
> 
> Not sure if it is a normal case that AddBlocker() could return rv=80570021?

It's not a normal case; nsresult 0x80570021 is NS_ERROR_XPC_CI_RETURNED_FAILURE:

http://dxr.mozilla.org/mozilla-central/source/xpcom/base/ErrorList.h#642

which means that some createInstance call (likely somewhere in js-land) failed.  That's, uh, pretty weird.  I think the usual case for that happening is that you're trying to call createInstance at the wrong time, like during shutdown:

http://dxr.mozilla.org/mozilla-central/source/xpcom/components/nsComponentManager.cpp#1016

But AFAICT from the stacks in that crash report, that's not what's happening.

I don't have any good theories.  The only ones I can come up with are:

* Bad memory that bit-flipped part of the contract ID or similar, so we're not looking up the correct thing inside the component manager somehow.
* Some sort of corrupt installation where the async shutdown bits are gone...but I would think the crash would show up more often, or we'd run into other problems elsewhere.
Flags: needinfo?(nfroyd)
Assignee

Updated

2 years ago
Depends on: 1336345
Assignee

Comment 17

2 years ago
More error codes:

https://crash-stats.mozilla.com/report/index/6e946282-b89b-4ea9-91e4-193da2170127
Failed to add shutdown blocker! rv=8057001e (NS_ERROR_XPC_JS_THREW_STRING)

https://crash-stats.mozilla.com/report/index/6b1d821f-9f62-4076-a3f6-acc4c2170128
Failed to add shutdown blocker! rv=80520012 (NS_ERROR_FILE_NOT_FOUND)
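
The `rv` values in these reports can be split into a severity bit, a module number, and a per-module code. The sketch below assumes the usual nsresult layout (failure bit 31, module stored in the upper half with a base offset of 0x45, code in the low 16 bits); `DecodedResult` and `Decode` are hypothetical names, and the module numbers in the comments (18 for XPConnect, 13 for files) are recalled from ErrorList.h rather than taken from this bug. Note that the low 16 bits are hexadecimal, so 0x21 is code 33 and 0x1e is code 30.

```cpp
#include <cstdint>

// Hypothetical decoded view of a 32-bit nsresult value.
struct DecodedResult {
  bool failed;           // bit 31 set means failure
  std::uint16_t module;  // module number after removing the base offset
  std::uint16_t code;    // per-module error code (low 16 bits)
};

// Assumed base offset added to module numbers in the upper 16 bits.
constexpr std::uint32_t kModuleBaseOffset = 0x45;

constexpr DecodedResult Decode(std::uint32_t rv) {
  return DecodedResult{
      (rv & 0x80000000u) != 0,
      static_cast<std::uint16_t>(((rv >> 16) - kModuleBaseOffset) & 0x1fff),
      static_cast<std::uint16_t>(rv & 0xffff)};
}

// The three codes seen in this bug, checked at compile time.
static_assert(Decode(0x80570021u).module == 18, "XPConnect module (assumed)");
static_assert(Decode(0x80570021u).code == 33, "0x21 is 33 decimal");
static_assert(Decode(0x8057001eu).code == 30, "0x1e is 30 decimal");
static_assert(Decode(0x80520012u).module == 13, "files module (assumed)");
static_assert(Decode(0x80520012u).code == 18, "0x12 is 18 decimal");
```

Decoding this way makes it easier to look each value up in the error list by module and decimal code rather than by eyeballing the hex digits.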

The error codes don't make sense at all. It looks like memory corruption.
From the correlations, I wonder if this is add-on related:

(100.0% in signature vs 00.00% overall) moz_crash_reason = MOZ_RELEASE_ASSERT(((bool)(!!(!NS_FAILED_impl(rv)))))
(100.0% in signature vs 00.04% overall) Addon "微信网页版助手" = true
(100.0% in signature vs 00.04% overall) Addon "China Edition Tab Tweak" = true
(100.0% in signature vs 00.05% overall) Addon "China Edition Addons Manager" = true
(100.0% in signature vs 00.05% overall) Addon "Easy Screenshot" = true
(100.0% in signature vs 00.06% overall) useragent_locale = zh-CN
jw - see comment 18.  Perhaps that helps... this crashes across a variety of revisions and OS versions, so it's not all one or even a couple users.
Flags: needinfo?(jwwang)
Assignee

Comment 20

2 years ago
https://crash-stats.mozilla.com/report/index/6e946282-b89b-4ea9-91e4-193da2170127#tab-extensions

Is there any way to find an addon by its Extension Id? This crash doesn't seem to have any of the above addons installed, so there might be more than one addon causing memory corruption.
Flags: needinfo?(jwwang)
Mass wontfix for bugs affecting firefox 52.
Not seeing any hits with this signature in the last 3 months for 53+. ESR52 remains affected, albeit low-volume. I don't think this is worth tracking at this point.
This is an assigned P1 bug without activity in two weeks. 

If you intend to continue working on this bug for the current release/iteration/sprint, remove the 'stale-bug' keyword.

Otherwise we'll reset the priority of the bug back to '--' on Monday, August 28th.
Keywords: stale-bug
Mass change P1->P2 to align with new Mozilla triage process
Priority: P1 → P2
The most recent build that this occurred in was 20170324081508.
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → WORKSFORME
Removing leave-open keyword from resolved bugs, per :sylvestre.
Keywords: leave-open