Closed Bug 1399850 Opened 4 years ago Closed 4 years ago

Crash in [@ mozilla::layers::WebRenderLayerManager::SetLayerObserverEpoch ]

Categories

(Core :: Graphics: WebRender, defect, P3)

x86_64
Linux
defect

Tracking

()

RESOLVED DUPLICATE of bug 1391262
Tracking Status
firefox-esr52 --- unaffected
firefox55 --- unaffected
firefox56 --- unaffected
firefox57 --- unaffected

People

(Reporter: jan, Assigned: sotaro)

References

(Blocks 2 open bugs)

Details

(Keywords: crash, nightly-community, regression, Whiteboard: [gfx-noted])

Crash Data

Attachments

(1 file)

Nightly 57 x64 20170914100122 de_DE @ Debian Testing
Webrender + Layersfree + (sorry) GPU-Process

I get this crash signature (for example bp-ab3b5e99-2529-4ed7-9efd-b29450170914) in between crashes from bug 1397407.
I'm filing this bug because I don't know whether the GPU process (which is indispensable for the moment) is the cause.
There are 25 crashes (from 6 installations) in nightly 57 starting with buildid 20170907220212.
:nical, could you investigate, please?
Flags: needinfo?(nical.bugzilla)
Keywords: regression
Most likely what's happening is that the call at [1] is being made on a WebRenderLayerManager that was just created by the preceding call to mPuppetWidget->GetLayerManager(), so the layer manager doesn't have an mWrChild yet. The mWrChild is only created when Initialize() is called on the WebRenderLayerManager.

This code has been problematic for a long time now, because InternalSetDocShellIsActive often gets called pretty early during tab creation, before it has a layer manager properly hooked up. I think the best way to resolve this once and for all is to add a method on PuppetWidget to get the layer manager if it exists, *without* creating it. And then in InternalSetDocShellIsActive we add a null guard.

[1] http://searchfox.org/mozilla-central/rev/6326724982c66aaeaf70bb7c7ee170f7a38ca226/dom/ipc/TabChild.cpp#2567
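
To make the proposal above concrete, here is a minimal sketch of an accessor that returns the layer manager only if it already exists, plus the null guard in the caller. All types and names (SketchWidget, SketchLayerManager, GetLayerManagerIfExists, SetDocShellIsActiveSketch) are hypothetical stand-ins for illustration, not the real PuppetWidget or TabChild code.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

// Hypothetical stand-in for a layer manager; not the real Gecko class.
class SketchLayerManager {
 public:
  void SetLayerObserverEpoch(uint64_t aEpoch) { mEpoch = aEpoch; }
  uint64_t Epoch() const { return mEpoch; }

 private:
  uint64_t mEpoch = 0;
};

// Hypothetical stand-in for the widget that owns the layer manager.
class SketchWidget {
 public:
  // Behaviour described in the comment: creates the manager on demand.
  SketchLayerManager* GetLayerManager() {
    if (!mLayerManager) {
      mLayerManager = std::make_unique<SketchLayerManager>();
    }
    return mLayerManager.get();
  }

  // Proposed accessor (assumed name): never creates, may return nullptr.
  SketchLayerManager* GetLayerManagerIfExists() const {
    return mLayerManager.get();
  }

 private:
  std::unique_ptr<SketchLayerManager> mLayerManager;
};

// Caller with the null guard; returns whether the epoch was forwarded.
bool SetDocShellIsActiveSketch(SketchWidget& aWidget, uint64_t aEpoch) {
  SketchLayerManager* lm = aWidget.GetLayerManagerIfExists();
  if (!lm) {
    return false;  // Too early in tab creation: nothing to notify yet.
  }
  lm->SetLayerObserverEpoch(aEpoch);
  return true;
}
```

With this shape, an early activation call is a no-op instead of implicitly creating a half-initialized manager.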
Flags: needinfo?(nical.bugzilla)
Assignee: nobody → bugmail
Whiteboard: [gfx-noted][wr-mvp][triage]
Status: NEW → ASSIGNED
Priority: -- → P1
Whiteboard: [gfx-noted][wr-mvp][triage] → [wr-mvp] [gfx-noted]
Target Milestone: --- → mozilla57
Actually this doesn't make sense. The SetLayerObserverEpoch() call happens inside an |if (mCompositorOptions)| block but both the places that set mCompositorOptions also initialize the layer manager. So something else must be going on here.
mystor, you've touched the docshell activation code recently. Any theories? If not I can put together a diagnostic patch to see if somehow we're getting mCompositorOptions set to something without the layer manager being created.
Oh, I see. All the crash reports have gfxCriticalNote annotations for [1], which means the WebRenderLayerManager initialization is failing (probably because of a dead or failed GPU process) but we carry on anyway, resulting in a crash when the docshell is activated.

[1] http://searchfox.org/mozilla-central/rev/6326724982c66aaeaf70bb7c7ee170f7a38ca226/gfx/layers/wr/WebRenderLayerManager.cpp#75
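
The failure mode described here can be sketched roughly as follows: initialization fails and records a critical note, the manager object still exists, and a later call touches the missing bridge unless it is guarded. The types below (SketchBridge, SketchWrLayerManager) and the aBridgeAvailable flag are assumptions made for illustration; only the note text and the WrBridge() accessor name are taken from this bug.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for WebRenderBridgeChild.
struct SketchBridge {
  uint64_t observerEpoch = 0;
};

// Hypothetical stand-in for WebRenderLayerManager.
class SketchWrLayerManager {
 public:
  // aBridgeAvailable simulates whether the GPU-process side could be reached.
  // On failure a critical note is recorded and the bridge stays null.
  bool Initialize(bool aBridgeAvailable,
                  std::vector<std::string>& aCriticalNotes) {
    if (!aBridgeAvailable) {
      aCriticalNotes.push_back("Failed to create WebRenderBridgeChild.");
      return false;
    }
    mBridge = std::make_unique<SketchBridge>();
    return true;
  }

  SketchBridge* WrBridge() const { return mBridge.get(); }

  // Guarded call: reports failure instead of dereferencing a null bridge.
  bool SetLayerObserverEpoch(uint64_t aEpoch) {
    if (!WrBridge()) {
      return false;
    }
    WrBridge()->observerEpoch = aEpoch;
    return true;
  }

 private:
  std::unique_ptr<SketchBridge> mBridge;
};
```

A guard like this only hides the symptom at one call site; the caller still needs to act on the false return from Initialize().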
For now this is Linux only so dropping the priority.
Blocks: stage-wr-next
No longer blocks: stage-wr-nightly
Whiteboard: [wr-mvp] [gfx-noted] → [gfx-noted]
Target Milestone: mozilla57 → ---
Assignee: bugmail → nobody
Status: ASSIGNED → NEW
Priority: P1 → P3
Assignee: nobody → sotaro.ikeda.g
All the crash reports had the following error.

>  "GraphicsCriticalError 	|[C0][GFX1-]: Failed to create WebRenderBridgeChild. (t=0.345005) "
As noted in comment 8, there are cases where WebRenderLayerManager::Initialize() fails to create the WebRenderBridgeChild.

  https://dxr.mozilla.org/mozilla-central/source/gfx/layers/wr/WebRenderLayerManager.cpp#71
See Also: → 1392316
See Also: → 1391262
Bug 1391262 seems to be a duplicate of this bug, although it might have a slightly different signature.
Attachment #8910199 - Flags: review?(bugmail)
Comment on attachment 8910199 [details] [diff] [review]
patch - Add WrBridge() checks in WebRenderLayerManager

Review of attachment 8910199 [details] [diff] [review]:
-----------------------------------------------------------------

Do you understand why this crash might be happening? I feel like we should try to address the root cause - your patch will fix this particular instance of the crash but it will probably just crash later. If we actually have a tab that we're using with a WebRenderLayerManager on the content side but it's not attached to anything on the parent side, that's kind of bad, and we should fall back to not using WebRender at all. But I don't know what kind of circumstances trigger this. It's possible somebody is just fiddling with random prefs on Linux and getting themselves into a bad state, which is why I didn't want to spend more time on this.
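
One possible shape for the fallback suggested in this review, as a rough sketch only: try to initialize the WebRender manager and, if that fails, hand back a non-WebRender manager instead of a half-initialized one. Every name here (SketchBackend, SketchManager, SketchBasicManager, CreateManagerWithFallback, aParentSideAttached) is hypothetical and does not correspond to actual Gecko code.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical backend labels for the sketch.
enum class SketchBackend { WebRender, Basic };

class SketchManager {
 public:
  virtual ~SketchManager() = default;
  virtual SketchBackend Backend() const = 0;
};

class SketchWebRenderManager final : public SketchManager {
 public:
  // aParentSideAttached simulates whether the parent side could be set up.
  bool Initialize(bool aParentSideAttached) {
    mReady = aParentSideAttached;
    return mReady;
  }
  SketchBackend Backend() const override { return SketchBackend::WebRender; }

 private:
  bool mReady = false;
};

class SketchBasicManager final : public SketchManager {
 public:
  SketchBackend Backend() const override { return SketchBackend::Basic; }
};

// Returns a usable manager: WebRender if it initializes, otherwise a fallback.
std::unique_ptr<SketchManager> CreateManagerWithFallback(
    bool aWantWebRender, bool aParentSideAttached) {
  if (aWantWebRender) {
    auto wr = std::make_unique<SketchWebRenderManager>();
    if (wr->Initialize(aParentSideAttached)) {
      return std::unique_ptr<SketchManager>(std::move(wr));
    }
    // Initialization failed: drop the WebRender manager and fall through.
  }
  return std::make_unique<SketchBasicManager>();
}
```

The point of the design is that callers never see a WebRender manager whose initialization failed, so they need no per-call bridge checks.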
Okay, I am going to look further into why the crash happened.
Attachment #8910199 - Flags: review?(bugmail)
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 1391262