Crash in [@ mozilla::ipc::FatalError | mozilla::ipc::IProtocol::HandleFatalError | <name omitted> | mozilla::ipc::WriteIPDLParam<T>]

RESOLVED FIXED in Firefox 67

Status

defect
P2
critical
Rank:
15
RESOLVED FIXED
6 months ago
5 months ago

People

(Reporter: gsvelto, Assigned: mjf)

Tracking

({crash, regression})

unspecified
mozilla68
x86_64
All
Points:
---

Firefox Tracking Flags

(firefox-esr60 unaffected, firefox65 unaffected, firefox66 unaffected, firefox67 fixed, firefox68 fixed)

Details

(crash signature)

Attachments

(1 attachment)

This bug is for crash report bp-44b669cb-4552-4215-804c-262c90190224.

Top 10 frames of crashing thread:

0 XUL mozilla::ipc::FatalError ipc/glue/ProtocolUtils.cpp:259
1 XUL mozilla::ipc::IProtocol::HandleFatalError const ipc/glue/ProtocolUtils.cpp:440
2 XUL <name omitted> ipc/ipdl/LayersSurfaces.cpp:2872
3 XUL void mozilla::ipc::WriteIPDLParam<mozilla::RemoteVideoDataIPDL const&> ipc/ipdl/PRemoteDecoder.cpp:150
4 XUL void mozilla::ipc::WriteIPDLParam<mozilla::DecodedOutputIPDL const&> ipc/ipdl/PRemoteDecoder.cpp:630
5 XUL mozilla::PRemoteDecoderParent::SendOutput ipc/ipdl/PRemoteDecoderParent.cpp:119
6 XUL mozilla::RemoteVideoDecoderParent::ProcessDecodedData dom/media/ipc/RemoteVideoDecoder.cpp:199
7 XUL mozilla::MozPromise<nsTArray<RefPtr<mozilla::MediaData> >, mozilla::MediaResult, true>::ThenValue<mozilla::RemoteDecoderParent::RecvInput dom/media/ipc/RemoteDecoderParent.cpp:89
8 XUL mozilla::MozPromise<nsTArray<RefPtr<mozilla::MediaData> >, mozilla::MediaResult, true>::ThenValueBase::ResolveOrRejectRunnable::Run xpcom/threads/MozPromise.h:392
9 XUL mozilla::TaskQueue::Runner::Run xpcom/threads/TaskQueue.cpp:199

This seems like a new crash and there are some RDD entries on the stack. Tentatively filing it under Audio/Video: Playback.

Rank: 15
Priority: -- → P2

The crash reason for these is:
MOZ_CRASH(IPC FatalError in the parent process!)

I'm not sure what that means.

It looks like this crash has become more frequent recently. It is the top OSX crash on the 3/13 Nightlies.

Most of the crashes have the IPCFatalErrorMsg annotation set to "unknown union type". It's coming from here:

https://searchfox.org/mozilla-central/source/__GENERATED__/ipc/ipdl/PRemoteDecoder.cpp#618

The DecodedOutputIPDL union in the message must contain an unknown type. Maybe it's corrupted?

I've seen something like that before when somebody sent a union that wasn't fully initialized, but the only two places I see that create a DecodedOutputIPDL look okay to me.

This looks like a similar crash on Windows, with a more detailed stack:
bp-2ba68443-2d66-4afc-beee-30f5a0190312#tab-details

DecodedOutputIPDL is also on the stack, but there's actually a nested type, BufferDescriptor, that may be what is hitting the error. It is a field of the type SurfaceDescriptorBuffer.

The code around the SendOutput that is crashing is this:

SurfaceDescriptorBuffer sdBuffer;
Shmem buffer;
if (AllocShmem(image->GetDataSize(), Shmem::SharedMemory::TYPE_BASIC,
               &buffer) &&
    image->GetDataSize() == buffer.Size<uint8_t>()) {
   ...
}

RemoteVideoDataIPDL output(
    ...,
    video->mDisplay, image->GetSize(), sdBuffer, video->mFrameID);
Unused << SendOutput(output);

It looks like if the |if (...)| condition fails, we still send an uninitialized SurfaceDescriptorBuffer, which contains an uninitialized BufferDescriptor, which I think would hit the unknown type error. Failing to allocate shmem doesn't seem so unlikely.
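To make the suspected failure path concrete, here is a minimal standalone sketch. It is not the generated IPDL code: every name ending in Sketch, the allowAlloc switch, and the error-string return (instead of a fatal abort) are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an IPDL-style union: a default-constructed
// value carries the "none" tag until something is assigned to it.
struct BufferDescriptorSketch {
  enum class Type { None, RGB, YCbCr };
  Type type = Type::None;
};

// Hypothetical stand-in for SurfaceDescriptorBuffer.
struct SurfaceDescriptorBufferSketch {
  BufferDescriptorSketch desc;
  std::vector<std::uint8_t> data;
};

// Mimics a union serializer: only known tags can be written. It returns an
// error message instead of aborting so the sketch can be exercised safely.
std::optional<std::string> WriteDescriptor(const BufferDescriptorSketch& d) {
  switch (d.type) {
    case BufferDescriptorSketch::Type::RGB:
    case BufferDescriptorSketch::Type::YCbCr:
      return std::nullopt;  // serialized without error
    default:
      return std::string("unknown union type");
  }
}

// Stand-in for AllocShmem: fails whenever allowAlloc is false.
bool AllocSketch(std::size_t size, bool allowAlloc,
                 std::vector<std::uint8_t>* out) {
  if (!allowAlloc) {
    return false;
  }
  out->assign(size, 0);
  return true;
}

// Mirrors the shape of the code above: the descriptor is filled in only when
// allocation succeeds, but it is written unconditionally afterwards.
std::optional<std::string> SendFrameSketch(std::size_t size, bool allowAlloc) {
  SurfaceDescriptorBufferSketch sd;
  std::vector<std::uint8_t> buffer;
  if (AllocSketch(size, allowAlloc, &buffer) && buffer.size() == size) {
    sd.desc.type = BufferDescriptorSketch::Type::YCbCr;
    sd.data = std::move(buffer);
  }
  return WriteDescriptor(sd.desc);
}
```

In this sketch, SendFrameSketch(16, true) writes cleanly, while SendFrameSketch(16, false) yields the "unknown union type" message because the descriptor never left its default tag.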

Could you please take a look at this, Michael? It looks like a regression from bug 1500596. Thanks.

Blocks: 1500596
Flags: needinfo?(mfroman)
Keywords: regression
OS: macOS → All

Adding a Windows signature.

Crash Signature: [@ mozilla::ipc::FatalError | mozilla::ipc::IProtocol::HandleFatalError | <name omitted> | mozilla::ipc::WriteIPDLParam<T>] → [@ mozilla::ipc::FatalError | mozilla::ipc::IProtocol::HandleFatalError | <name omitted> | mozilla::ipc::WriteIPDLParam<T>] [@ mozilla::ipc::FatalError | mozilla::ipc::IProtocol::HandleFatalError | mozilla::ipc::WriteIPDLParam<T> ]

I thought I checked during initial dev work that we could send an empty SurfaceDescriptorBuffer, but maybe not. I will take a look.

Flags: needinfo?(mfroman)

Also strange that the spike started around the beginning of March when bug 1500596 landed mid-Feb.

(In reply to Michael Froman [:mjf] from comment #6)

I thought I checked during initial dev work that we could send an empty SurfaceDescriptorBuffer, but maybe not. I will take a look.

Confirmed that sending an empty SurfaceDescriptorBuffer will crash. Let me see what I can do to fix it.

Assignee: nobody → mfroman
  • Modify ProcessDecodedData to return a MediaResult.
  • RemoteDecoderParent::RecvInput and RemoteDecoderParent::RecvDrain
    both use error returned from ProcessDecodedData to call SendError.
  • RemoteVideoDecoderParent and RemoteAudioDecoderParent both return
    MediaResult with NS_ERROR_OUT_OF_MEMORY if AllocShmem fails in
    ProcessDecodedData (or if the returned buffer size is less than
    the requested size).
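The approach in the list above can be sketched roughly as follows. This is a simplified illustration, not the landed patch: ResultSketch stands in for MediaResult, ParentSketch for the decoder parent, and the counters and the allowAlloc switch are hypothetical helpers that replace the real SendOutput, SendError, and AllocShmem calls.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for MediaResult.
struct ResultSketch {
  bool ok;
  std::string message;
};

// Hypothetical stand-in for the decoder parent actor.
struct ParentSketch {
  bool allowAlloc = true;  // set to false to simulate an allocation failure
  int outputsSent = 0;     // counts what would be SendOutput calls
  int errorsSent = 0;      // counts what would be SendError calls

  // Stand-in for AllocShmem.
  bool Alloc(std::size_t size, std::vector<std::uint8_t>* out) {
    if (!allowAlloc) {
      return false;
    }
    out->assign(size, 0);
    return true;
  }

  // Returns an error instead of sending a half-initialized descriptor when
  // allocation fails or the buffer is smaller than requested.
  ResultSketch ProcessDecodedData(std::size_t size) {
    std::vector<std::uint8_t> buffer;
    if (!Alloc(size, &buffer) || buffer.size() < size) {
      return {false, "out of memory"};
    }
    ++outputsSent;  // output is sent only with a fully initialized buffer
    return {true, ""};
  }

  // The caller forwards any failure as an error message to the other side.
  void RecvInput(std::size_t size) {
    ResultSketch rv = ProcessDecodedData(size);
    if (!rv.ok) {
      ++errorsSent;
    }
  }
};
```

With allowAlloc left at true, RecvInput(16) increments outputsSent; with it set to false, no output is sent and errorsSent is incremented instead, so the uninitialized descriptor never reaches the serializer.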

Turns out this is not a regression from Bug 1500596. This was lurking since the original RDD code landed. It would be interesting to know why we're seeing more frequent failures to alloc Shmem.

Pushed by mfroman@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/beaf8e80224d
handle failure to alloc Shmem in RemoteVideoDecoderParent. r=jya
Status: NEW → RESOLVED
Closed: 5 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla68

Reports of this crash appear to have stopped after build id 20190319095054, so I'll file an uplift request.

Comment on attachment 9051896 [details]
Bug 1530305 - handle failure to alloc Shmem in RemoteVideoDecoderParent. r?jya!

Beta/Release Uplift Approval Request

  • Feature/Bug causing the regression: Bug 1471535
  • User impact if declined: Crash in RDD process when AllocShmem fails.
  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): This is a fairly small change (localized to 6 files): Adds a return param to ProcessDecodedData and then returns an error if AllocShmem fails to avoid returning an uninitialized SurfaceDescriptorBuffer down in DecodedOutputIPDL.
  • String changes made/needed: none
Attachment #9051896 - Flags: approval-mozilla-beta?

Comment on attachment 9051896 [details]
Bug 1530305 - handle failure to alloc Shmem in RemoteVideoDecoderParent. r?jya!

Fix for crashes in our Vorbis support. It landed on Nightly a week ago and seems to have stopped the crashes since then. It's a low-frequency crash, but we are early in the Beta cycle and the patch is small and well understood. Uplift approved for 67 beta 5, thanks.

Attachment #9051896 - Flags: approval-mozilla-beta? → approval-mozilla-beta+