
Crash in [@ mozilla::layers::WebRenderLayerManager::SetLayerObserverEpoch ]

NEW
Assigned to

Status

()

Core
Graphics: WebRender
P3
critical
9 days ago
2 days ago

People

(Reporter: darkspirit, Assigned: sotaro)

Tracking

(Depends on: 1 bug, Blocks: 2 bugs, {crash, nightly-community, regression})

Trunk
x86_64
Linux
crash, nightly-community, regression
Points:
---

Firefox Tracking Flags

(firefox-esr52 unaffected, firefox55 unaffected, firefox56 unaffected, firefox57 unaffected)

Details

(Whiteboard: [gfx-noted], crash signature)

Attachments

(1 attachment)

(Reporter)

Description

9 days ago
Nightly 57 x64 20170914100122 de_DE @ Debian Testing
Webrender + Layersfree + (sorry) GPU-Process

I get this crash signature (bp-ab3b5e99-2529-4ed7-9efd-b29450170914 for example) between bug 1397407 crashes.
Because I don't know whether the GPU process (which is indispensable for the moment) is the cause, I am filing this bug.
There are 25 crashes (from 6 installations) in Nightly 57, starting with build ID 20170907220212.
:nical, could you investigate, please?
status-firefox55: --- → unaffected
status-firefox57: unaffected → affected
status-firefox-esr52: --- → unaffected
Flags: needinfo?(nical.bugzilla)
Keywords: regression
Most likely the call at [1] is being made on a WebRenderLayerManager that was just created by the preceding call to mPuppetWidget->GetLayerManager(), so the layer manager doesn't have an mWrChild yet. The mWrChild is only created when Initialize() is called on the WebRenderLayerManager.

This code has been problematic for a long time now, because InternalSetDocShellIsActive often gets called pretty early during tab creation, before it has a layer manager properly hooked up. I think the best way to resolve this once and for all is to add a method on PuppetWidget to get the layer manager if it exists, *without* creating it. And then in InternalSetDocShellIsActive we add a null guard.

[1] http://searchfox.org/mozilla-central/rev/6326724982c66aaeaf70bb7c7ee170f7a38ca226/dom/ipc/TabChild.cpp#2567
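The proposal above (a getter that does not create the layer manager, plus a null guard at the call site) could look roughly like the sketch below. This is a self-contained model, not the Gecko code: `LayerManager`, `Widget`, `GetLayerManagerIfExists` and `SetDocShellIsActive` are hypothetical stand-ins for the real PuppetWidget and TabChild pieces.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

// Hypothetical stand-in for a layer manager; not the real Gecko class.
struct LayerManager {
  uint64_t observerEpoch = 0;
  void SetLayerObserverEpoch(uint64_t aEpoch) { observerEpoch = aEpoch; }
};

// Hypothetical stand-in for PuppetWidget.
class Widget {
 public:
  // Existing-style getter: lazily creates the layer manager on first use.
  LayerManager* GetLayerManager() {
    if (!mLayerManager) {
      mLayerManager = std::make_unique<LayerManager>();
    }
    return mLayerManager.get();
  }

  // Proposed-style getter: returns nullptr instead of creating anything.
  LayerManager* GetLayerManagerIfExists() const { return mLayerManager.get(); }

 private:
  std::unique_ptr<LayerManager> mLayerManager;
};

// Model of the activation path with a null guard.
// Returns true only if the epoch was forwarded to an existing layer manager.
bool SetDocShellIsActive(Widget& aWidget, uint64_t aEpoch) {
  LayerManager* lm = aWidget.GetLayerManagerIfExists();
  if (!lm) {
    return false;  // Too early in tab creation: nothing to notify yet.
  }
  lm->SetLayerObserverEpoch(aEpoch);
  return true;
}
```

With this shape, an early activation call becomes a no-op instead of forcing a half-initialized layer manager into existence.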
Flags: needinfo?(nical.bugzilla)
Assignee: nobody → bugmail
Blocks: 1386665
Whiteboard: [gfx-noted][wr-mvp][triage]
Status: NEW → ASSIGNED
Priority: -- → P1
Whiteboard: [gfx-noted][wr-mvp][triage] → [wr-mvp] [gfx-noted]
Target Milestone: --- → mozilla57
Actually this doesn't make sense. The SetLayerObserverEpoch() call happens inside an |if (mCompositorOptions)| block but both the places that set mCompositorOptions also initialize the layer manager. So something else must be going on here.
mystor, you've touched the docshell activation code recently. Any theories? If not I can put together a diagnostic patch to see if somehow we're getting mCompositorOptions set to something without the layer manager being created.
Oh, I see. All the crash reports have gfxCriticalNote annotations for [1]. Which means the WebRenderLayerManager initialization is failing (probably a dead/failed GPU process) but we carry on anyway, resulting in a fail when the docshell is activated.

[1] http://searchfox.org/mozilla-central/rev/6326724982c66aaeaf70bb7c7ee170f7a38ca226/gfx/layers/wr/WebRenderLayerManager.cpp#75
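To illustrate the failure mode described above, here is a minimal model of an initialization that fails, records a critical note, and leaves the bridge pointer null. All names (`BridgeChild`, `ModelLayerManager`, the `aBridgeAvailable` flag, the notes vector) are assumptions made for the sketch; only the note text is taken from the crash reports. The guard in `SetLayerObserverEpoch` is one possible way to avoid dereferencing the missing bridge, not necessarily what the real code does.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for WebRenderBridgeChild.
struct BridgeChild {
  uint64_t epoch = 0;
};

// Hypothetical model of a layer manager whose bridge creation can fail.
class ModelLayerManager {
 public:
  // aBridgeAvailable simulates whether the GPU process could create the bridge.
  bool Initialize(bool aBridgeAvailable,
                  std::vector<std::string>& aCriticalNotes) {
    if (!aBridgeAvailable) {
      aCriticalNotes.push_back("Failed to create WebRenderBridgeChild.");
      return false;  // mBridge stays null.
    }
    mBridge = std::make_unique<BridgeChild>();
    return true;
  }

  BridgeChild* WrBridge() const { return mBridge.get(); }

  // Guarded: refuses to touch a bridge that was never created.
  bool SetLayerObserverEpoch(uint64_t aEpoch) {
    if (!WrBridge()) {
      return false;
    }
    WrBridge()->epoch = aEpoch;
    return true;
  }

 private:
  std::unique_ptr<BridgeChild> mBridge;
};
```

A caller that ignores the return value of `Initialize()` ends up holding a manager with no bridge, which is the state the crash reports point to.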
For now this is Linux only so dropping the priority.
Blocks: 1386674
No longer blocks: 1386665
status-firefox57: affected → unaffected
Whiteboard: [wr-mvp] [gfx-noted] → [gfx-noted]
Target Milestone: mozilla57 → ---
Assignee: bugmail → nobody
Status: ASSIGNED → NEW
Priority: P1 → P3
(Assignee)

Updated

3 days ago
Assignee: nobody → sotaro.ikeda.g
(Assignee)

Comment 8

3 days ago
All the crash reports had the following error:

>  "GraphicsCriticalError 	|[C0][GFX1-]: Failed to create WebRenderBridgeChild. (t=0.345005) "
(Assignee)

Comment 9

3 days ago
Per comment 8, there is a case where WebRenderLayerManager::Initialize() fails to create the WebRenderBridgeChild.

  https://dxr.mozilla.org/mozilla-central/source/gfx/layers/wr/WebRenderLayerManager.cpp#71
(Assignee)

Comment 10

3 days ago
Created attachment 8910199 [details] [diff] [review]
patch - Add WrBridge() checks in WebRenderLayerManager
(Assignee)

Updated

3 days ago
See Also: → bug 1392316
(Assignee)

Updated

3 days ago
See Also: → bug 1391262
(Assignee)

Comment 11

3 days ago
Bug 1391262 seems to be a duplicate of this bug, although it might have a slightly different signature.
(Assignee)

Updated

3 days ago
Attachment #8910199 - Flags: review?(bugmail)
Comment on attachment 8910199 [details] [diff] [review]
patch - Add WrBridge() checks in WebRenderLayerManager

Review of attachment 8910199 [details] [diff] [review]:
-----------------------------------------------------------------

Do you understand why this crash might be happening? I feel like we should try and address the root cause - your patch will fix this particular instance of the crash but it will probably just crash later. If we actually have a tab that we're using with a WebRenderLayerManager on the content side but it's not attached to anything on the parent side that's kind of bad, and we should fall back to not using WebRender at all. But I don't know what kind of circumstances triggers this. It's possible somebody is just fiddling with random prefs on Linux and getting themselves in a bad state, which is why I didn't want to spend more time on this.
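As a rough illustration of the "fall back to not using WebRender" idea in the review above, the choice could be made once, at the point where initialization is attempted. This is only a sketch under stated assumptions: `Backend`, `ChooseBackend` and the `aTryInitWebRender` callback are hypothetical names, and the real decision in Gecko involves more state than a single boolean.

```cpp
#include <cassert>
#include <functional>

// Hypothetical set of rendering backends for the sketch.
enum class Backend { WebRender, Fallback };

// aTryInitWebRender is a hypothetical callback that returns true only if the
// WebRender bridge could actually be created. It is called only when
// WebRender was requested.
Backend ChooseBackend(bool aWebRenderRequested,
                      const std::function<bool()>& aTryInitWebRender) {
  if (aWebRenderRequested && aTryInitWebRender()) {
    return Backend::WebRender;
  }
  // Either WebRender was not requested or its initialization failed.
  return Backend::Fallback;
}
```

The point of the design is that a failed initialization changes which backend the tab uses, rather than leaving a WebRender layer manager around that is not attached to anything on the parent side.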
(Assignee)

Comment 13

2 days ago
Okay, I am going to look further into why the crash happened.
(Assignee)

Updated

2 days ago
Attachment #8910199 - Flags: review?(bugmail)
Depends on: 1401849