Crash in [@ libxul.so@0xc4c762 | mozilla::layers::PWebRenderBridgeChild::SendDeleteCompositorAnimations ]

VERIFIED FIXED in Firefox 55

Status


P3
normal
VERIFIED FIXED
a year ago
a year ago

People

(Reporter: darkspirit, Assigned: aosmond)

Tracking

(Blocks: 1 bug, {crash})

55 Branch
mozilla55
x86_64
Linux
crash

Firefox Tracking Flags

(firefox-esr52 unaffected, firefox53 unaffected, firefox54 unaffected, firefox55 fixed)

Details

(Whiteboard: gfx-noted, crash signature)

Attachments

(1 attachment, 1 obsolete attachment)

(Reporter)

Description

a year ago
User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:55.0) Gecko/20100101 Firefox/55.0
Build ID: 20170515100238

Steps to reproduce:

With today's build (20170515100238), same STR as bug 1360613:
Set gfx.webrender.enabled to true.
Watch a YouTube video, open a new tab, switch back. The tab crashes. What is new is that only the tab crashes and not the whole browser.
One crash, but 5 crash reports from the same minute:

bp-e14faa3b-81d0-461c-9a25-c673e0170515 [@ mozalloc_abort | abort | webrender::frame::Frame::flatten_items ]
= bug 1363347 = Servo bug

bp-43795bf0-fe45-40f6-a9d4-0c5c90170515 [@ mozilla::layers::CompositorBridgeParent::RootLayerTreeId ]
= bug 1263200, fixed a year ago (!)

bp-f9902f96-04f4-4a69-8330-f14880170515 [@ mozilla::layers::PWebRenderBridgeChild::SendCreate ]
= bug 1350408, there is no activity on this bug, but there are still a couple of crashes with that signature

bp-9d8637ac-7557-4cfc-8c70-703f20170515 [@ libxul.so@0xc4c762 | mozilla::layers::PWebRenderBridgeChild::SendDeleteCompositorAnimations ]
= this contains my comment which I typed in the crashed tab: "watch youtube video, open tab, switch back. (bug 1360613) BUT NEW IS THAT ONLY THE TAB CRASHES AND NOT THE WHOLE BROWSER. this is very cool"
(Reporter)

Updated

a year ago
Blocks: 1311790, 1360613
Crash Signature: [@ libxul.so@0xc4c762 | mozilla::layers::PWebRenderBridgeChild::SendDeleteCompositorAnimations ]
Has STR: --- → yes
OS: Unspecified → Linux
Hardware: Unspecified → x86_64
(Reporter)

Comment 1

a year ago
After this crash, I saw on about:support that my GPU process was disabled by runtime, so I set layers.gpu-process.force-enabled to true. Now I am going to visit more video sites, but it seems that this is perfectly reproducible.
Summary: Crash in mozilla::layers::PWebRenderBridgeChild::SendDeleteCompositorAnimations → Crash in [@ libxul.so@0xc4c762 | mozilla::layers::PWebRenderBridgeChild::SendDeleteCompositorAnimations ]
(Reporter)

Comment 2

a year ago
I had layers.gpu-process.enabled;true , so I got this tab-only crash.

With webrender enabled and the layers.gpu-process.enabled pref removed (is the GPU process disabled by default if there is no gpu-process pref in about:config?), my whole browser crashed (all 3 reports belong to bug 1363347).

So I think it was good to manually add the layers.gpu-process.enabled;true pref because only the tab crashed then.
(Assignee)

Comment 3

a year ago
I was able to reproduce this on Windows. Looks to be another fairly straightforward stale IPC problem.
Status: UNCONFIRMED → NEW
Crash Signature: [@ libxul.so@0xc4c762 | mozilla::layers::PWebRenderBridgeChild::SendDeleteCompositorAnimations ] → [@ libxul.so@0xc4c762 | mozilla::layers::PWebRenderBridgeChild::SendDeleteCompositorAnimations ] [ mozilla::ipc::MessageChannel::AssertWorkerThread | mozilla::ipc::MessageChannel::CxxStackFrame::CxxStackFrame | mozilla::ipc::MessageChannel::Send | mozill…
Ever confirmed: true
(Assignee)

Updated

a year ago
Crash Signature: [@ libxul.so@0xc4c762 | mozilla::layers::PWebRenderBridgeChild::SendDeleteCompositorAnimations ] [ mozilla::ipc::MessageChannel::AssertWorkerThread | mozilla::ipc::MessageChannel::CxxStackFrame::CxxStackFrame | mozilla::ipc::MessageChannel::Send | mozill… → [@ libxul.so@0xc4c762 | mozilla::layers::PWebRenderBridgeChild::SendDeleteCompositorAnimations ] [@ mozilla::ipc::MessageChannel::AssertWorkerThread | mozilla::ipc::MessageChannel::CxxStackFrame::CxxStackFrame | mozilla::ipc::MessageChannel::Send | mozil…
(Assignee)

Comment 4

a year ago
Created attachment 8867981 [details] [diff] [review]
Only discard images and compositor animations if GPU process is still available, v1
Assignee: nobody → aosmond
Status: NEW → ASSIGNED
Attachment #8867981 - Flags: review?(sotaro.ikeda.g)
(Assignee)

Updated

a year ago
Keywords: crash
Priority: -- → P3
Whiteboard: gfx-noted
Attachment #8867981 - Flags: review?(sotaro.ikeda.g) → review+
(Assignee)

Comment 5

a year ago
Created attachment 8868097 [details] [diff] [review]
Only discard images and compositor animations if GPU process is still available, v2

I should have moved the array clearing outside the if, since there is no point keeping the entries around if we can't send them.
Attachment #8867981 - Attachment is obsolete: true
Attachment #8868097 - Flags: review+
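
For readers without the patch open, the idea described in comments 4 and 5 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code that landed: FakeBridge, FakeLayerManager, ipcOpen and the member names are hypothetical stand-ins, and the thrown exception merely models the fatal assertion seen when sending over a torn-down channel.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for an IPC actor such as the WebRender bridge child.
// None of these names are the real Gecko API.
struct FakeBridge {
  bool ipcOpen = true;                     // false once the GPU process is gone
  std::vector<uint64_t> sentAnimationIds;  // records what was actually sent

  // Models a Send* call: using a closed channel is treated as a fatal error.
  void SendDeleteCompositorAnimations(const std::vector<uint64_t>& ids) {
    if (!ipcOpen) {
      throw std::logic_error("send on closed IPC channel");
    }
    sentAnimationIds.insert(sentAnimationIds.end(), ids.begin(), ids.end());
  }
};

// Hypothetical owner of the pending "discard" list.
struct FakeLayerManager {
  FakeBridge* bridge;
  std::vector<uint64_t> discardedAnimationIds;

  void DiscardCompositorAnimations() {
    assert(bridge != nullptr);
    // Only send if the channel to the GPU process is still available (v1).
    if (bridge->ipcOpen && !discardedAnimationIds.empty()) {
      bridge->SendDeleteCompositorAnimations(discardedAnimationIds);
    }
    // Clear outside the if (v2): nothing is gained by keeping entries
    // that can no longer be sent.
    discardedAnimationIds.clear();
  }
};
```

With the guard in place, a discard requested after the channel has closed becomes a no-op that still empties the pending list, instead of a send on a dead channel.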

Comment 6

a year ago
Pushed by aosmond@gmail.com:
https://hg.mozilla.org/projects/graphics/rev/4e3d5e3fb742
Only discard images and compositor animations if GPU process is still available. r=sotaro
Status: ASSIGNED → RESOLVED
Last Resolved: a year ago
Resolution: --- → FIXED

Comment 7

a year ago
For the record, I believe the sequence of events that happened here is this:

(In reply to Darkspirit from comment #0)
> One crash, but 5 crash reports from the same minute:
> 
> bp-e14faa3b-81d0-461c-9a25-c673e0170515 [@ mozalloc_abort | abort |
> webrender::frame::Frame::flatten_items ]
> = bug 1363347 = Servo bug

This was the first crash, root cause is a bug in our webrender code. GPU process dies, and the content process tries to reinitialize rendering. You can see that in the stack in bp-f9902f96-04f4-4a69-8330-f14880170515 (the third crash from comment 0).

> bp-43795bf0-fe45-40f6-a9d4-0c5c90170515 [@
> mozilla::layers::CompositorBridgeParent::RootLayerTreeId ]
> = bug 1263200, fixed a year ago (!)

When the content process tries to reinitialize, it uses the same layers id that it had before, but the GPU process has restarted and lost that state. So when it tries to use the CompositorBridgeParent for the layers id, there isn't one, and it causes this second crash (effectively a null pointer to `this` inside RootLayerTreeId()). This is also a crash in the GPU process.
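
As a rough illustration of that failure mode (not the actual Gecko code, and not the fix for bug 1263200), the sketch below uses a hypothetical registry keyed by layers id. FakeCompositorParent, FakeRegistry and TryGetRootLayerTreeId are invented names. After a simulated restart the lookup returns null, and a caller that checks the result can fail gracefully instead of calling a method through a null pointer.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical stand-in for a per-tree compositor object; not the real class.
struct FakeCompositorParent {
  uint64_t rootLayerTreeId;
  uint64_t RootLayerTreeId() const { return rootLayerTreeId; }
};

// Hypothetical registry of compositor objects keyed by layers id,
// imagined as living in the GPU process.
class FakeRegistry {
 public:
  void Register(uint64_t layersId, uint64_t rootId) {
    mEntries[layersId] = FakeCompositorParent{rootId};
  }

  // Simulates a GPU process restart: all previously known ids are lost.
  void Reset() { mEntries.clear(); }

  // Returns nullptr when the layers id is unknown (e.g. stale after restart).
  const FakeCompositorParent* Lookup(uint64_t layersId) const {
    auto it = mEntries.find(layersId);
    return it == mEntries.end() ? nullptr : &it->second;
  }

 private:
  std::unordered_map<uint64_t, FakeCompositorParent> mEntries;
};

// Reports failure for a stale id instead of dereferencing a null pointer.
bool TryGetRootLayerTreeId(const FakeRegistry& registry, uint64_t layersId,
                           uint64_t* outRootId) {
  assert(outRootId != nullptr);
  const FakeCompositorParent* parent = registry.Lookup(layersId);
  if (!parent) {
    return false;
  }
  *outRootId = parent->RootLayerTreeId();
  return true;
}
```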

> bp-f9902f96-04f4-4a69-8330-f14880170515 [@
> mozilla::layers::PWebRenderBridgeChild::SendCreate ]
> = bug 1350408, there is no activity on this bug, but there are still a
> couple of crashes with that signature

When the GPU process crashes the second time, the content process is blocked waiting for the reinit. The content process can't handle the GPU process crashing at this time, because it doesn't have any fallback mechanism, so it crashes too. That's what we see in this third crash, which is a content process crash during webrender reinit.

> bp-9d8637ac-7557-4cfc-8c70-703f20170515 [@ libxul.so@0xc4c762 |
> mozilla::layers::PWebRenderBridgeChild::SendDeleteCompositorAnimations ]
> = this contains my comment which I typed in the crashed tab: "watch youtube
> video, open tab, switch back. (bug 1360613) BUT NEW IS THAT ONLY THE TAB
> CRASHES AND NOT THE WHOLE BROWSER. this is very cool"

Not sure how exactly this one comes about, but my guess is that after the above crashes, we disable the GPU process and fall back to main-process webrender. The remaining content processes are told to reinit, and this one has some destroy calls that it tries to do over the old (now torn-down) IPC channel to the now-dead GPU process. Hence it crashes too.

So overall the actual bugs appear to be (A) the root cause of the chain, which is tracked as bug 1363347, (B) lack of proper fallback handling, which is not yet filed, but related to bug 1343345, and (C) one of a variety of bugs such as the one this patch fixed, where we make assumptions that don't hold during reinit.

Comment 8

a year ago
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #7)
> (B) lack of proper fallback handling, which
> is not yet filed, but related to bug 1343345

I filed bug 1365264 for this.
status-firefox53: --- → unaffected
status-firefox54: --- → unaffected
status-firefox55: --- → fixed
status-firefox-esr52: --- → unaffected
Target Milestone: --- → mozilla55
(Reporter)

Comment 9

a year ago
With today's build (20170517100341):
The GPU process was and is enabled. I enabled webrender and restarted (Ctrl+F2: restart), then tried to make a YouTube video fullscreen. The tab crashed. I then closed some tabs.

Report ID 	 	 	 	 	Date Submitted
bp-9ca5d42d-5261-407c-b0a9-ab4600170517		17.05.17 19:18
[@ mozilla::ipc::IPCResult::Fail ]
= bug 1354198 = close tabs fast

bp-5fa075c9-daf8-466d-a4ba-d7b490170517		17.05.17 19:13
[@ libxul.so@0xc4869d | mozilla::layers::PWebRenderBridgeChild::SendReleaseCompositable ]
= has no bug

bp-e6e7dfe8-44d1-4a9c-aa8e-d13ba0170517		17.05.17 19:13
[@ libxul.so@0xc4869d | mozilla::layers::PWebRenderBridgeChild::SendDeleteCompositorAnimations ]
= has no bug

bp-fe45459a-6a44-4807-bb81-7a8be0170517		17.05.17 19:13
[@ mozalloc_abort | abort | webrender::frame::Frame::flatten_items ]
= bug 1363347 = patch in progress

bp-4f1f7650-fef2-4a40-941a-f6cc50170517		17.05.17 19:13
[@ libxul.so@0xc4869d | mozilla::layers::PWebRenderBridgeChild::SendDeleteCompositorAnimations ]
= has no bug, but the second part of the signature is identical to this bug

bp-fff4ffa2-a71a-4b54-a554-9c5f30170517		17.05.17 19:11
[@ libX11.so.6.3.0@0x4259b ]
= has no bug

-> Shouldn't the 2nd one be this bug? Should I file a new bug for these two, are they fixed by this bug and just not merged yet, or is the patch from this bug merged but has a problem?
[@ libxul.so@0xc4869d | mozilla::layers::PWebRenderBridgeChild::SendReleaseCompositable ]
[@ libxul.so@0xc4869d | mozilla::layers::PWebRenderBridgeChild::SendDeleteCompositorAnimations ]

-> [@ libX11.so.6.3.0@0x4259b ] is new, and mine is the only crash report with it. Should I file a bug for it directly?

Comment 10

a year ago
> [@ libxul.so@0xc4869d | mozilla::layers::PWebRenderBridgeChild::SendReleaseCompositable ]
> [@ libxul.so@0xc4869d | mozilla::layers::PWebRenderBridgeChild::SendDeleteCompositorAnimations ]

These two have already been fixed on the graphics branch. I'll try to get them merged to mozilla-central tomorrow.

> [@ libX11.so.6.3.0@0x4259b ]

Not sure about this, the crash stack looks weird. Feel free to file a bug, at least we'll see if other people run into it as well.
(Reporter)

Updated

a year ago
Status: RESOLVED → VERIFIED