Closed Bug 1365009 Opened 7 years ago Closed 7 years ago

Crash in [@ libxul.so@0xc4c762 | mozilla::layers::PWebRenderBridgeChild::SendDeleteCompositorAnimations ]

Categories

(Core :: Graphics: WebRender, defect, P3)

55 Branch
x86_64
Linux

Tracking

VERIFIED FIXED
mozilla55
Tracking Status
firefox-esr52 --- unaffected
firefox53 --- unaffected
firefox54 --- unaffected
firefox55 --- fixed

People

(Reporter: jan, Assigned: aosmond)

Details

(Keywords: crash, Whiteboard: gfx-noted)

Attachments

(1 file, 1 obsolete file)

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:55.0) Gecko/20100101 Firefox/55.0
Build ID: 20170515100238

Steps to reproduce:

today's 20170515100238: same STR as bug 1360613:
1. set gfx.webrender.enabled to true
2. Watch a youtube video, open new tab, switch back.
3. crash.

BUT new is that only the tab is crashing and not the whole browser.

One crash, but 5 crash reports from the same minute:

bp-e14faa3b-81d0-461c-9a25-c673e0170515 [@ mozalloc_abort | abort | webrender::frame::Frame::flatten_items ]
= bug 1363347 = Servo bug

bp-43795bf0-fe45-40f6-a9d4-0c5c90170515 [@ mozilla::layers::CompositorBridgeParent::RootLayerTreeId ]
= bug 1263200, fixed a year ago (!)

bp-f9902f96-04f4-4a69-8330-f14880170515 [@ mozilla::layers::PWebRenderBridgeChild::SendCreate ]
= bug 1350408, there is no activity on this bug, but there are still a couple of crashes with that signature

bp-9d8637ac-7557-4cfc-8c70-703f20170515 [@ libxul.so@0xc4c762 | mozilla::layers::PWebRenderBridgeChild::SendDeleteCompositorAnimations ]
= this contains my comment which I typed in the crashed tab: "watch youtube video, open tab, switch back. (bug 1360613) BUT NEW IS THAT ONLY THE TAB CRASHES AND NOT THE WHOLE BROWSER. this is very cool"
Crash Signature: [@ libxul.so@0xc4c762 | mozilla::layers::PWebRenderBridgeChild::SendDeleteCompositorAnimations ]
Has STR: --- → yes
OS: Unspecified → Linux
Hardware: Unspecified → x86_64
After this crash, I saw on about:support that my GPU process had been disabled at runtime, so I set layers.gpu-process.force-enabled to true. I am now going to visit more video sites, but this seems to be perfectly reproducible.
Summary: Crash in mozilla::layers::PWebRenderBridgeChild::SendDeleteCompositorAnimations → Crash in [@ libxul.so@0xc4c762 | mozilla::layers::PWebRenderBridgeChild::SendDeleteCompositorAnimations ]
I had layers.gpu-process.enabled;true, so I got this tab-only crash. With webrender enabled and the layers.gpu-process.enabled pref removed (is the GPU process disabled by default if there is no gpu-process pref in about:config?), my whole browser crashed (all 3 reports belong to bug 1363347). So I think it was good to manually add the layers.gpu-process.enabled;true pref, because then only the tab crashed.
I was able to reproduce this on Windows. Looks to be another fairly straightforward stale IPC problem.
Status: UNCONFIRMED → NEW
Crash Signature: [@ libxul.so@0xc4c762 | mozilla::layers::PWebRenderBridgeChild::SendDeleteCompositorAnimations ] → [@ libxul.so@0xc4c762 | mozilla::layers::PWebRenderBridgeChild::SendDeleteCompositorAnimations ] [ mozilla::ipc::MessageChannel::AssertWorkerThread | mozilla::ipc::MessageChannel::CxxStackFrame::CxxStackFrame | mozilla::ipc::MessageChannel::Send | mozill…
Ever confirmed: true
Crash Signature: [@ libxul.so@0xc4c762 | mozilla::layers::PWebRenderBridgeChild::SendDeleteCompositorAnimations ] [ mozilla::ipc::MessageChannel::AssertWorkerThread | mozilla::ipc::MessageChannel::CxxStackFrame::CxxStackFrame | mozilla::ipc::MessageChannel::Send | → [@ libxul.so@0xc4c762 | mozilla::layers::PWebRenderBridgeChild::SendDeleteCompositorAnimations ] [@ mozilla::ipc::MessageChannel::AssertWorkerThread | mozilla::ipc::MessageChannel::CxxStackFrame::CxxStackFrame | mozilla::ipc::MessageChannel::Send |
Assignee: nobody → aosmond
Status: NEW → ASSIGNED
Attachment #8867981 - Flags: review?(sotaro.ikeda.g)
Keywords: crash
Priority: -- → P3
Whiteboard: gfx-noted
Attachment #8867981 - Flags: review?(sotaro.ikeda.g) → review+
Should have moved the array clearing outside the if, since there is no point keeping the entries around if we can't send them.
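To illustrate the point about clearing outside the if, here is a minimal standalone sketch. It is not the actual patch: FakeBridge, DiscardQueue, channelOpen and Flush are hypothetical stand-ins, and only the name SendDeleteCompositorAnimations is taken from the crash signature (its real signature is not reproduced here). The idea is simply to send only while the channel is still open, but to drop the queued ids either way.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the IPC actor; NOT the real PWebRenderBridgeChild.
struct FakeBridge {
  bool channelOpen = true;     // assumed flag: false once the GPU process is gone
  std::vector<uint64_t> sent;  // records the ids that were actually "sent"

  bool IPCOpen() const { return channelOpen; }

  void SendDeleteCompositorAnimations(const std::vector<uint64_t>& ids) {
    sent.insert(sent.end(), ids.begin(), ids.end());
  }
};

// Hypothetical holder of the ids waiting to be discarded.
struct DiscardQueue {
  std::vector<uint64_t> pendingAnimationIds;

  void Flush(FakeBridge& bridge) {
    // Only talk to the other side if the channel is still usable.
    if (bridge.IPCOpen()) {
      bridge.SendDeleteCompositorAnimations(pendingAnimationIds);
    }
    // Clear unconditionally: if the ids cannot be sent, the state they
    // referred to is gone anyway, so there is no point keeping them.
    pendingAnimationIds.clear();
    assert(pendingAnimationIds.empty());
  }
};
```

With the clear inside the if, a closed channel would leave the ids queued, and they could be resent later over a new channel that never knew about them.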
Attachment #8867981 - Attachment is obsolete: true
Attachment #8868097 - Flags: review+
Pushed by aosmond@gmail.com: https://hg.mozilla.org/projects/graphics/rev/4e3d5e3fb742 Only discard images and compositor animations if GPU process is still available. r=sotaro
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
For the record, I believe the sequence of events that happened here is this:

(In reply to Darkspirit from comment #0)
> One crash, but 5 crash reports from the same minute:
>
> bp-e14faa3b-81d0-461c-9a25-c673e0170515 [@ mozalloc_abort | abort |
> webrender::frame::Frame::flatten_items ]
> = bug 1363347 = Servo bug

This was the first crash; the root cause is a bug in our webrender code. The GPU process dies, and the content process tries to reinitialize rendering. You can see that in the stack in bp-f9902f96-04f4-4a69-8330-f14880170515 (the third crash from comment 0).

> bp-43795bf0-fe45-40f6-a9d4-0c5c90170515 [@
> mozilla::layers::CompositorBridgeParent::RootLayerTreeId ]
> = bug 1263200, fixed a year ago (!)

When the content process tries to reinitialize, it uses the same layers id that it had before, but the GPU process has restarted and lost that state. So when it tries to use the CompositorBridgeParent for the layers id, there isn't one, and it causes this second crash (effectively a null pointer to `this` inside RootLayerTreeId()). This is also a crash in the GPU process.

> bp-f9902f96-04f4-4a69-8330-f14880170515 [@
> mozilla::layers::PWebRenderBridgeChild::SendCreate ]
> = bug 1350408, there is no activity on this bug, but there are still a
> couple of crashes with that signature

When the GPU process crashes the second time, the content process is blocked waiting for the reinit. The content process can't handle the GPU process crashing at this time, because it doesn't have any fallback mechanism, so it crashes too. That's what we see in this third crash, which is a content process crash during webrender reinit.

> bp-9d8637ac-7557-4cfc-8c70-703f20170515 [@ libxul.so@0xc4c762 |
> mozilla::layers::PWebRenderBridgeChild::SendDeleteCompositorAnimations ]
> = this contains my comment which I typed in the crashed tab: "watch youtube
> video, open tab, switch back. (bug 1360613) BUT NEW IS THAT ONLY THE TAB
> CRASHES AND NOT THE WHOLE BROWSER. this is very cool"

Not sure how exactly this one comes about, but my guess is that after the above crashes, we disable the GPU process and fall back to main-process webrender. The remaining content processes are told to reinit, and this one has some destroy calls that it tries to do over the old (now torn-down) IPC channel to the now-dead GPU process. Hence it crashes too.

So overall the actual bugs appear to be (A) the root cause of the chain, which is tracked as bug 1363347, (B) lack of proper fallback handling, which is not yet filed, but related to bug 1343345, and (C) one of a variety of bugs such as the one this patch fixed, where we make assumptions that don't hold during reinit.
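The second crash in that chain (a stale layers id used against a restarted GPU process) can be sketched in isolation as below. This is only an illustration under assumptions: FakeGpuProcessState, FakeCompositorBridge, Register and SimulateRestart are hypothetical names, and the real lookup code in Gecko is not reproduced here. It shows why a lookup by a remembered id has to tolerate the entry being gone after a restart instead of calling through a missing object.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical stand-in for a per-tree compositor object.
struct FakeCompositorBridge {
  uint64_t rootLayerTreeId;
  uint64_t RootLayerTreeId() const { return rootLayerTreeId; }
};

// Hypothetical model of the state a GPU process keeps per layers id.
class FakeGpuProcessState {
 public:
  void Register(uint64_t layersId, uint64_t rootId) {
    mBridges[layersId] = FakeCompositorBridge{rootId};
  }

  // A restarted process starts with no knowledge of previously issued ids.
  void SimulateRestart() { mBridges.clear(); }

  // Returns false instead of dereferencing a missing entry, which is the
  // "null `this`" situation described above.
  bool RootLayerTreeIdFor(uint64_t layersId, uint64_t* out) const {
    assert(out != nullptr);
    auto it = mBridges.find(layersId);
    if (it == mBridges.end()) {
      return false;  // stale id: the caller must fall back or re-register
    }
    *out = it->second.RootLayerTreeId();
    return true;
  }

 private:
  std::unordered_map<uint64_t, FakeCompositorBridge> mBridges;
};
```

A caller that remembered a layers id from before the restart gets false back and can take a fallback path, rather than crashing inside the accessor.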
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #7)
> (B) lack of proper fallback handling, which
> is not yet filed, but related to bug 1343345

I filed bug 1365264 for this.
Target Milestone: --- → mozilla55
today's 20170517100341. gpu-process was and is enabled. enabled webrender. Ctrl+F2: restart. Tried to make a youtube video fullscreen. Tab crash. closed some tabs.

Report ID / Date submitted

bp-9ca5d42d-5261-407c-b0a9-ab4600170517 17.05.17 19:18 [@ mozilla::ipc::IPCResult::Fail ]
= bug 1354198 = close tabs fast

bp-5fa075c9-daf8-466d-a4ba-d7b490170517 17.05.17 19:13 [@ libxul.so@0xc4869d | mozilla::layers::PWebRenderBridgeChild::SendReleaseCompositable ]
= has no bug

bp-e6e7dfe8-44d1-4a9c-aa8e-d13ba0170517 17.05.17 19:13 [@ libxul.so@0xc4869d | mozilla::layers::PWebRenderBridgeChild::SendDeleteCompositorAnimations ]
= has no bug

bp-fe45459a-6a44-4807-bb81-7a8be0170517 17.05.17 19:13 [@ mozalloc_abort | abort | webrender::frame::Frame::flatten_items ]
= bug 1363347 = patch in progress

bp-4f1f7650-fef2-4a40-941a-f6cc50170517 17.05.17 19:13 [@ libxul.so@0xc4869d | mozilla::layers::PWebRenderBridgeChild::SendDeleteCompositorAnimations ]
= has no bug, but the second part of the signature is identical to this bug

bp-fff4ffa2-a71a-4b54-a554-9c5f30170517 17.05.17 19:11 [@ libX11.so.6.3.0@0x4259b ]
= has no bug

-> The 2nd one should be this bug, or not? Should I create a new bug for them / are they fixed with this bug and just not integrated yet / is the patch from this bug integrated but has a problem?
[@ libxul.so@0xc4869d | mozilla::layers::PWebRenderBridgeChild::SendReleaseCompositable ]
[@ libxul.so@0xc4869d | mozilla::layers::PWebRenderBridgeChild::SendDeleteCompositorAnimations ]

-> [@ libX11.so.6.3.0@0x4259b ] is new and I had the only crash, should I directly file a bug for it?
[@ libxul.so@0xc4869d | mozilla::layers::PWebRenderBridgeChild::SendReleaseCompositable ]
[@ libxul.so@0xc4869d | mozilla::layers::PWebRenderBridgeChild::SendDeleteCompositorAnimations ]

These two have already been fixed on the graphics branch. I'll try to get them merged to mozilla-central tomorrow.

[@ libX11.so.6.3.0@0x4259b ]

Not sure about this, the crash stack looks weird. Feel free to file a bug, at least we'll see if other people run into it as well.
Status: RESOLVED → VERIFIED