Firefox 53 crashes in [@ mozilla::ipc::MessageChannel::AssertWorkerThread | mozilla::ipc::MessageChannel::CxxStackFrame::CxxStackFrame | mozilla::ipc::MessageChannel::Send | mozilla::layers::PLayerTransactionChild::SendShutdown ]

VERIFIED FIXED in Firefox 53

Status

()

Core
Audio/Video: Playback
--
critical
VERIFIED FIXED
2 years ago
a year ago

People

(Reporter: Virtual, Assigned: dvander)

Tracking

(5 keywords)

53 Branch
mozilla54
x86_64
Windows
crash, crashreportid, multiprocess, nightly-community, regression
Points:
---

Firefox Tracking Flags

(firefox-esr45 unaffected, firefox50 unaffected, firefox51 unaffected, firefox52 unaffected, firefox53+ verified, firefox54+ verified)

Details

(crash signature)

Attachments

(1 attachment)

I got crash in [@ mozilla::ipc::MessageChannel::AssertWorkerThread | mozilla::ipc::MessageChannel::CxxStackFrame::CxxStackFrame | mozilla::ipc::MessageChannel::Send | mozilla::layers::PLayerTransactionChild::SendShutdown ] when I was opening some amount (about 10) movies in tabs, only one was playing, other were inactive yet, after some time going to another tabs to see how movie downloading going, one of it crashes with the signature written above.

I want to add that this is with:
- e10s enabled
- GPU process (layers) enabled
- GPU process (media decoder) disabled [since I'm under diagnosing it for a other issue)


Crashlog report:
https://crash-stats.mozilla.com/report/index/58f9d934-644b-4777-85d1-5c9112170104


[Tracking Requested - why for this release]: Regression
Crash Signature: [@ mozilla::ipc::MessageChannel::AssertWorkerThread | mozilla::ipc::MessageChannel::CxxStackFrame::CxxStackFrame | mozilla::ipc::MessageChannel::Send | mozilla::layers::PLayerTransactionChild::SendShutdown ]
Has STR: --- → yes

Updated

2 years ago
Component: IPC → Graphics: Layers
Started with 20161221030226 build. Possible regression range: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=567894f026558e6dada617a3998f29aed06ac7d8&tochange=c36fbe84042debef0a5d58b7fc88185b401762ce

New crash regression that we will track for 53.
tracking-firefox53: ? → +
From the stack, it looks like MediaFormatReader is releasing some object off the main thread which causes a ShadowLayerForwarder to be destroyed, which tries to send a message, but it can't, because it is on the wrong thread.
Component: Graphics: Layers → Audio/Video: Playback
David, anything related to your recent work, or something that will get better with your recent work?
Flags: needinfo?(dvander)
In the commit range from comment 1, bug 1319992 looks like it touched related code. jya, could this crash be a regression from that bug? Thanks.

(In reply to Milan Sreckovic [:milan] from comment #3)
> David, anything related to your recent work, or something that will get
> better with your recent work?

I think the bug here is media code releasing a main thread object on some random other thread, rather than layers code failing to handle some case. That's just a hunch though.
Flags: needinfo?(jyavenard)

Updated

2 years ago
status-firefox50: --- → unaffected
status-firefox51: --- → unaffected
status-firefox-esr45: --- → unaffected
(In reply to Andrew McCreight [:mccr8] from comment #4)
> In the commit range from comment 1, bug 1319992 looks like it touched
> related code. jya, could this crash be a regression from that bug? Thanks.
> 
> (In reply to Milan Sreckovic [:milan] from comment #3)
> > David, anything related to your recent work, or something that will get
> > better with your recent work?
> 
> I think the bug here is media code releasing a main thread object on some
> random other thread, rather than layers code failing to handle some case.
> That's just a hunch though.

Agree, that's exactly what's happening. It's weird that the layers ref was "leaked" on the main thread without having forcefully called Disconnect. We try pretty hard not to do that.
Flags: needinfo?(dvander)
(In reply to Andrew McCreight [:mccr8] from comment #4)
> I think the bug here is media code releasing a main thread object on some
> random other thread, rather than layers code failing to handle some case.
> That's just a hunch though.

Which object must be released on the main thread? Is it possible to add some assertions to catch that?
It doesn't have to be released on the main thread, though that would fix this bug if media can do that. The intent is that the main thread has to call Destroy() before it releases its last reference.
https://hg.mozilla.org/mozilla-central/annotate/57ac9f63fc69/gfx/layers/ipc/LayerTransactionChild.cpp#l37

Do you mean LayerTransactionChild::Destroy() should be called on the main thread?
Yeah exactly. Very strange that it's not.
http://searchfox.org/mozilla-central/rev/8144fcc62054e278fe5db9666815cc13188c96fa/dom/media/MediaFormatReader.h#520

mKnowsCompositor will be released off the main thread if MediaFormatReader is deleted off the main thread.

http://searchfox.org/mozilla-central/search?q=RefPtr%3Clayers%3A%3AKnowsCompositor%3E+mKnowsCompositor%3B&case=false&regexp=false&path=

Seeing there are several places in media code that hold reference to KnowsCompositor, I would suggest ShadowLayerForwarder to provide a custom release function (NS_IMPL_RELEASE_WITH_DESTROY) to ensure the object is deleted on the main thread.
Keywords: crash
Flags: needinfo?(jyavenard)
Tracking 54+ for the same reason in Comment 1.
tracking-firefox54: ? → +
dvander - can you handle this, perhaps with jwwang's idea from the comment above?  THanks
Flags: needinfo?(dvander)
Matt should weigh in on that approach.
Flags: needinfo?(dvander) → needinfo?(matt.woodrow)

Updated

2 years ago
See Also: → bug 1334216
Assignee: nobody → dvander
Status: NEW → ASSIGNED
Flags: needinfo?(matt.woodrow)
Attachment #8833598 - Flags: review?(matt.woodrow) → review+

Comment 16

2 years ago
Pushed by danderson@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/843e450763b6
Only call LayerTransactionChild::Destroy on the main thread. (bug 1328633, r=mattwoodrow)

Comment 17

2 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/843e450763b6
Status: ASSIGNED → RESOLVED
Last Resolved: 2 years ago
status-firefox54: affected → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla54
Please request Aurora approval on this when you get a chance.
Flags: needinfo?(dvander)
Comment on attachment 8833598 [details] [diff] [review]
fix

Approval Request Comment
[Feature/Bug causing the regression]: bug 1319992
[User impact if declined]: potential crashes
[Is this code covered by automated tests?]: yes
[Has the fix been verified in Nightly?]: can't see any new nightly crashes on crash-stats so far
[Needs manual test from QE? If yes, steps to reproduce]: no
[List of other uplifts needed for the feature/fix]: no
[Is the change risky?]: no
[Why is the change risky/not risky?]: This simply delays a resource being freed, so that it happens on the correct thread. Can't think of any potential risks.
[String changes made/needed]:
Flags: needinfo?(dvander)
Attachment #8833598 - Flags: approval-mozilla-aurora?
Comment on attachment 8833598 [details] [diff] [review]
fix

Fix a crash. Aurora53+.
Attachment #8833598 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+

Comment 21

2 years ago
bugherderuplift
https://hg.mozilla.org/releases/mozilla-aurora/rev/9ee3f47f2a10
status-firefox53: affected → fixed
No crashes since Firefox 54.0a1 (2017-02-09) & Firefox 53 (2017-02-11) with [@ mozilla::dom::IdleRequest::IdleRun ] signature,
so I'm marking this as VERIFIED.
Status: RESOLVED → VERIFIED
status-firefox53: fixed → verified
status-firefox54: fixed → verified
You need to log in before you can comment on or make changes to this bug.