Closed
Bug 1350408
Opened 8 years ago
Closed 7 years ago
Crash in mozilla::layers::PWebRenderBridgeChild::SendCreate
Categories
(Core :: Graphics: WebRender, defect, P3)
Tracking
()
RESOLVED
FIXED
mozilla56
Tracking | Status | |
---|---|---|
firefox-esr52 | --- | unaffected |
firefox54 | --- | unaffected |
firefox55 | - | disabled |
firefox56 | --- | fixed |
People
(Reporter: marcia, Assigned: aosmond)
References
(Depends on 1 open bug)
Details
(Keywords: crash, Whiteboard: [gfx-noted])
Crash Data
Attachments
(1 file, 1 obsolete file)
This bug was filed from the Socorro interface and is report bp-d5f3289f-d7d3-48a5-9d63-591ef2170324. ============================================================= Seen while looking at crash stats - crashes started using 20170324030205: http://bit.ly/2mZGT8I. Possible regression range based on crash stats: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=7513b3f42058e9bcf9950d4acf4647d4ad2240f0&tochange=01d1dedf400d4be413b1a0d48090dca7acf29637
Comment 1•8 years ago
|
||
I enabled gfx.webrender.enabled in today's Nightly and here are the results: 1. The browser doesn't start with NVIDIA 540m graphics. It shows the small crash-box during startup. Intel HD 3000 iGPU boots up fine. 2. The browser crashes when it's minimized then maximized. When I maximize it, the whole browser turns white and can only be stopped from the Task Manager. 3. Scrolling has jank. 4. Switching tabs is slow. When I open a new tab, sometimes it displays a spinning icon in center. 5. When I open a new tab successfully, the web page becomes completely red for a second before displaying the page. Here is the full list of crashes. There are two types of errors: mozilla::layers::PWebRenderBridgeChild::SendCreate and IPCError-browser | ShutDownKill bp-88eee366-3e02-41b1-92d5-0679d2170324 bp-265fc923-18f0-4f93-9c44-4a4e22170324 bp-9464269a-0d2a-4c7b-9de3-cb58c2170324 bp-69669d9a-0ca3-48aa-a187-f969a2170324 bp-d822ccc9-666c-4e1b-91f6-0f9ac2170324 bp-b93d58c5-bad0-413f-9c32-4420c2170324 bp-da697adb-8bc2-4825-b51f-cfffe2170324 bp-eb9d4e16-8b2c-487d-ac7c-d9b6e2170324 bp-188b9131-851b-401c-ad83-88f9d2170324 bp-e313eda1-172c-4929-9257-2c2682170324 bp-cec2f82a-b7b6-438e-96b9-14b872170324 bp-bdfd5cdd-f617-403c-8113-958f32170324 bp-cd12ffad-a54e-42e3-a82f-c1b292170324 bp-3ab09a72-4306-43cc-be6a-c82e12170324 bp-dc4e3911-ff85-4510-9225-c3b7c2170324 bp-0a857074-5a01-4230-ba0b-5e71b2170324 bp-f08e9d77-9e86-4ec9-8fc1-cdb882170324 bp-b549be60-c59c-46b7-8973-5a4192170324 bp-badf260a-a7bd-4eb3-836f-bb6aa2170324
Comment 2•8 years ago
|
||
As with bug 1350404, we can probably do a better job here to handle this failure. It looks like WrBridge() at [1] is null, which means the SendPWebRenderBridgeConstructor call returned null. [1] http://searchfox.org/mozilla-central/rev/7419b368156a6efa24777b21b0e5706be89a9c2f/gfx/layers/wr/WebRenderLayerManager.cpp#217
Comment 3•8 years ago
|
||
(In reply to smartfon.reddit from comment #1) > I enabled gfx.webrender.enabled in today's Nightly and here are the results: Thanks for the feedback. I'll (eventually) file separate bugs for these things. Right now we're still getting things stabilized.
Comment 4•8 years ago
|
||
In Comment 1 I made a mistake. The reverse is true. Nightly starts fine with NVIDIA but crashes during startup with Intel HD 3000. The top portion of these characters are not being rendered: f l 5 0
Updated•8 years ago
|
Assignee: bugmail → nobody
I'm also seeing these kind of crashes as secondary crash in the content process after the gfx process crashed without crash reports being created, it spawns winfault.exe instead.
Updated•8 years ago
|
Depends on: wr-stability
Comment 6•7 years ago
|
||
STR on Windows 10 x64 (no bug on Debian Testing): 1. about:preferences > "when Nightly starts: Show home page" 2. browser.tabs.warnOnClose;true (=default) 3. you have multiple tabs open 4. gpu process force-enabled, webrender enabled, hardware accelleration (force-)enabled where possible 5. close window with X 6. the popup which asks you if you want to close all tabs is completely black inside, it seems the gpu process crashes and restarts, you may press enter to proceed and close those tabs and the browser. Meldungs-ID Sendedatum bp-0d5df607-92be-423f-bc09-02a2b0170521 21.05.2017 15:01 bp-fc2d5f64-4374-4ddd-90cb-30d580170521 21.05.2017 15:01 = [@ mozilla::layers::PWebRenderBridgeChild::SendCreate ] = this bug bp-e5c01613-c63b-42e5-8adf-c28ff0170521 21.05.2017 15:01 = [@ mozilla::layers::CrossProcessCompositorBridgeParent::AllocPWebRenderBridgeParent ] = bug 1350404 bp-2fe29d7c-5579-4a47-a986-f75860170521 21.05.2017 15:01 = this bug
Comment 7•7 years ago
|
||
Another case where I saw this crash signature: Windows 10 x64: crash while doing a Radeon RX480 driver update. (After my whole desktop turned black for a second when windows activated the new driver). gpu process force-enabled, webrender enabled, hardware accelleration (force-)enabled where possible Meldungs-ID Sendedatum bp-897e9aee-9f80-435d-a90f-9acf90170521 21.05.2017 15:09 = [@ mozilla::gl::GLContextEGLFactory::Create ] = bug 1341478 <- Please consider to reopen. bp-8177c9fe-a27e-4313-9276-30e500170521 21.05.2017 15:08 = [@ mozilla::layers::PWebRenderBridgeChild::SendCreate ] = this bug
Comment 8•7 years ago
|
||
May 15: 4 crashes May 17: 2 crashes May 18: 1 crash May 19: 11 crashes May 20: 1 crash May 21: 25 crashes (so far) https://crash-stats.mozilla.com/signature/?signature=mozilla%3A%3Alayers%3A%3APWebRenderBridgeChild%3A%3ASendCreate&date=%3E%3D2017-05-14T16%3A11%3A00.000Z&date=%3C2017-05-21T16%3A11%3A00.000Z&_columns=date&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=reason&_columns=address&_columns=install_time&_sort=-date&page=1#graphs
Assignee | ||
Updated•7 years ago
|
Assignee: nobody → aosmond
Comment 9•7 years ago
|
||
We've had a big spike in Nightly 20170602030204. 87 crashes, making it the #3 Window topcrash. Crash address is always 0x8, which suggests a null pointer dereference.
Comment 10•7 years ago
|
||
This is sort of a generic crash signature, it happens if the GPU process crashes and then the content process tries to reinit and then crashes itself. I described this flow in bug 1365009 comment 7 for anybody interested. I suspect the root cause of this big spike is the shader compilation failure in bug 1370104 which will crash the GPU process.
Comment 11•7 years ago
|
||
Actually upon further investigation I don't see why the content process would fail in this way. I tried killing the GPU process on linux+webrender and it reinitialized fine. I'll try some more things.
Assignee | ||
Comment 12•7 years ago
|
||
At least some of them are related to this: https://crash-stats.mozilla.com/report/index/1b833ecd-df38-45c7-9937-a72dd0170604 Where the parent compositor is missing from the table. What I was thinking on trying is making the content process reinit fail gracefully on the IPC call itself, and reinitialize *that* process as well. But it doesn't fix why the parent compositor is missing, it *should* be added and from all of my local testing, it is.
Comment 13•7 years ago
|
||
(In reply to Andrew Osmond [:aosmond] from comment #12) > At least some of them are related to this: > > https://crash-stats.mozilla.com/report/index/1b833ecd-df38-45c7-9937- > a72dd0170604 > > Where the parent compositor is missing from the table. What I was thinking > on trying is making the content process reinit fail gracefully on the IPC > call itself, and reinitialize *that* process as well. But it doesn't fix why > the parent compositor is missing, it *should* be added and from all of my > local testing, it is. Agreed. I also tried on the latest windows nightly - I got GPU crashes from the shader compilation failures but then Firefox fell back to basic compositing and recovered. about:crashes lists 4 GPU process crashes and nothing else. So most likely it is a race condition somewhere that is causing the parent compositor to be missing from the table.
Comment 14•7 years ago
|
||
According to the documentation at [1] the compositor should get instantiated as part of the SendEnsureLayersConnected codepath. However RenderFrameParent::EnsureLayersConnected has a bunch of early exits and I'm wondering if we're bailing out one of them. If so then there's no fallback to ensure the compositor gets created. [1] http://searchfox.org/mozilla-central/rev/507743376d1ba753d14ab6b9305b7c6358570be8/gfx/ipc/GPUProcessManager.cpp#453-455
Comment 15•7 years ago
|
||
Tracking as a recent regression in 55.
status-firefox54:
--- → unaffected
tracking-firefox55:
--- → +
Comment 16•7 years ago
|
||
Note that this is an opt-in feature that's disabled by default on Nightly and cannot be enabled on any other channel. So feel free to track it, but these crashes should not jeopardize the upcoming merge. If you want to eliminate the crashes anyway to prevent it masking other problems, we can make it so WebRender remains off regardless of the pref, but I would prefer to not do that.
Comment 17•7 years ago
|
||
Crashes together with bug 1372880 when clicking on an webextension icon inside the address bar with extensions.webextensions.remote;true.
Assignee | ||
Comment 18•7 years ago
|
||
[Tracking Requested - why for this release]: This should only affect WebRender builds. Given WebRender will not be enabled in this release (nor 56), I don't think we should be tracking this for 55. I scanned the crash reports, and there are no cases of this happening on 55 beta (as one would expect).
Comment 19•7 years ago
|
||
This is the #2 topcrash on Linux in Nightly 20170623100152. Over the past 7 days there have been 76 occurrences across 9 installations.
Updated•7 years ago
|
Assignee | ||
Comment 20•7 years ago
|
||
Attachment #8886672 -
Flags: review?(bugmail)
Assignee | ||
Comment 21•7 years ago
|
||
try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=17beaa4233670c38f3cb37efb7b36b96133a1de6
Assignee | ||
Updated•7 years ago
|
Status: NEW → ASSIGNED
Priority: -- → P3
Whiteboard: [gfx-noted]
Comment 22•7 years ago
|
||
Comment on attachment 8886672 [details] [diff] [review] Gracefully handle failures in WebRenderLayerManager::Initialize to allow reinit., v1 Review of attachment 8886672 [details] [diff] [review]: ----------------------------------------------------------------- ::: gfx/layers/wr/WebRenderLayerManager.cpp @@ +709,5 @@ > > void > WebRenderLayerManager::SetLayerObserverEpoch(uint64_t aLayerObserverEpoch) > { > + if (WrBridge()->IPCOpen()) { Don't you need to check if (WrBridge()) also here? ::: widget/PuppetWidget.h @@ +182,5 @@ > LayersBackend aBackendHint = mozilla::layers::LayersBackend::LAYERS_NONE, > LayerManagerPersistence aPersistence = LAYER_MANAGER_CURRENT) override; > > // This is used after a compositor reset. > + bool RecreateLayerManager(const std::function<bool(LayerManager*)>& aFunc); This needs some documentation as to what aFunc is for. A better name might be good too.
Attachment #8886672 -
Flags: review?(bugmail) → review+
Assignee | ||
Comment 23•7 years ago
|
||
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #22) > Comment on attachment 8886672 [details] [diff] [review] > Gracefully handle failures in WebRenderLayerManager::Initialize to allow > reinit., v1 > > Review of attachment 8886672 [details] [diff] [review]: > ----------------------------------------------------------------- > > ::: gfx/layers/wr/WebRenderLayerManager.cpp > @@ +709,5 @@ > > > > void > > WebRenderLayerManager::SetLayerObserverEpoch(uint64_t aLayerObserverEpoch) > > { > > + if (WrBridge()->IPCOpen()) { > > Don't you need to check if (WrBridge()) also here? > No, but maybe yes if other things break in the future :). With this patch, we keep the old WebRenderLayerManager around longer than we used to, that's why I had to check if the IPC is still open here -- but we still have mWrChild, just a stale/closed one. > ::: widget/PuppetWidget.h > @@ +182,5 @@ > > LayersBackend aBackendHint = mozilla::layers::LayersBackend::LAYERS_NONE, > > LayerManagerPersistence aPersistence = LAYER_MANAGER_CURRENT) override; > > > > // This is used after a compositor reset. > > + bool RecreateLayerManager(const std::function<bool(LayerManager*)>& aFunc); > > This needs some documentation as to what aFunc is for. A better name might > be good too. Done.
Attachment #8886672 -
Attachment is obsolete: true
Attachment #8887466 -
Flags: review+
Comment 24•7 years ago
|
||
Pushed by aosmond@gmail.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/649a35887b30 Gracefully handle failures in WebRenderLayerManager::Initialize to allow reinit. r=kats
Comment 25•7 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/649a35887b30
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla56
Updated•7 years ago
|
status-firefox-esr52:
--- → unaffected
You need to log in
before you can comment on or make changes to this bug.
Description
•