Closed Bug 1628901 Opened 4 years ago Closed 4 years ago

WebRender unavailable by runtime: Failed to create new surface. "GP+[GFX1-]: DCompositionSurface::BeginDraw failed: 0x80070057"

Categories

(Core :: Graphics: WebRender, defect)

defect
Not set
normal

Tracking

()

VERIFIED FIXED
mozilla77
Tracking Status
firefox-esr68 --- unaffected
firefox74 --- unaffected
firefox75 --- unaffected
firefox76 --- unaffected
firefox77 + verified

People

(Reporter: cpeterson, Assigned: gw)

References

(Regression)

Details

(Keywords: regression)

Attachments

(2 files)

I noticed that my Firefox is no longer able to use WebRender. I have Fission enabled and started seeing the "You are running Fission enabled without WebRender" warnings on every window I open today. I'm running 77 Nightly on Windows 10. This is a regression because I've been using Fission and WebRender together for months now.

My gfx.webrender.all pref is set to true and about:support reports WebRender is Force enabled by pref, but it also says WebRender is unavailable by runtime: Failed to create new surface.

about:support lists a bunch of graphics errors like GP+[GFX1-]: DCompositionSurface::BeginDraw failed: 0x80070057 and WMF VPX video decoding is disabled due to a previous crash.

Attached file about:support
Flags: needinfo?(jbonisteel)

I also saw the following error while opening Gmail without Fission:

  • GP+[GFX1-]: DCompositionSurface::BeginDraw failed: 0x80070057

(In reply to Chris Peterson [:cpeterson] from comment #0)

and WMF VPX video decoding is disabled due to a previous crash.

This seems to be Bug 1570046.

regressionwindow-wanted because I think this bug is a regression in Nightly within the last 1-2 days.

When I ran mozregression using the Gmail STR from comment 2, it pointed to Bug 1627588:

28:47.93 INFO: No more integration revisions, bisection finished.
28:47.93 INFO: Last good revision: 3bc7f9b0801488d3d88be30e7789240993995959
28:47.94 INFO: First bad revision: c39512f89aedf7c0745137da9c8c09a54f9ab2e8
28:47.94 INFO: Pushlog:
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=3bc7f9b0801488d3d88be30e7789240993995959&tochange=c39512f89aedf7c0745137da9c8c09a54f9ab2e8

When the error happened, aDirtyRect.size in DCLayerTree::CreateEGLSurfaceForCompositionSurface() was (0, 0). It seems that the size must not be (0, 0) when calling BeginDraw().
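
For context on the error code: 0x80070057 is the standard Windows HRESULT for an invalid argument (E_INVALIDARG), which is consistent with BeginDraw() rejecting a zero-sized rect. Below is a minimal sketch of how the value decomposes under the documented HRESULT bit layout. The function name decode_hresult is hypothetical and used only for illustration; it is not part of any Firefox or Windows API.

```python
def decode_hresult(hr: int) -> dict:
    """Split a 32-bit HRESULT into its severity, facility, and code fields."""
    return {
        "failure": bool((hr >> 31) & 0x1),  # bit 31: 1 means failure
        "facility": (hr >> 16) & 0x7FF,     # bits 16-26: 7 is FACILITY_WIN32
        "code": hr & 0xFFFF,                # bits 0-15: 87 is ERROR_INVALID_PARAMETER
    }


fields = decode_hresult(0x80070057)
print(fields)
```

The facility and code fields show that this is a wrapped Win32 "invalid parameter" error rather than a device-lost or out-of-memory condition, which fits the observation that the rect passed in was empty.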

:gw, do you know when aDirtyRect.size could be (0, 0)?

Flags: needinfo?(gwatson)
Flags: needinfo?(jbonisteel)

It's not clear to me how that can happen, yet. I'll try to have a look at it tomorrow. If it's more urgent than that, feel free to request a back out of the patch that caused the regression.

Assignee: nobody → gwatson
Flags: needinfo?(gwatson)

Based on the regression window provided by Sotaro in Comment 5, I will update the flags for this issue.

Has Regression Range: --- → yes

Was bug 1627588 backed out? WebRender is working correctly for me today and I no longer see "GP+[GFX1-]: DCompositionSurface::BeginDraw failed: 0x80070057" errors in about:support either.

No, that bug hasn't been backed out, as far as I know. So you're no longer able to reproduce it at all?

I can't seem to repro this at all, either in Gmail or with Fission enabled. Are either of you still able to reproduce it?

Flags: needinfo?(sotaro.ikeda.g)
Flags: needinfo?(cpeterson)

(In reply to Glenn Watson [:gw] from comment #11)

No, that bug hasn't been backed out, as far as I know. So you're no longer able to reproduce it at all?

I was no longer able to reproduce, but then I tried to use mozregression to determine if there was a code fix (versus a configuration change on my computer). I wasn't able to reproduce with mozregression's Firefox builds, but I did see the error in my main browser window (open in the background while I was bisecting). Perhaps there is an intermittent problem if multiple Firefoxes try to use WebRender simultaneously?

Flags: needinfo?(cpeterson)

That seems unlikely, given the panic location / message, but maybe that does somehow trigger it. It may also depend in some way on page content and how it was scrolled, as well as window size / resolution.

I can still reproduce it with the latest Nightly and the latest m-c while loading Gmail.

Flags: needinfo?(sotaro.ikeda.g)

Previously, it was possible for a tile that had a valid scroll root
to have an empty valid (and dirty) rect due to the picture cache
clip rect, in some situations.

This could result in the tile not being tagged as off-screen, which
means it is added to the queue of tiles to be updated. On most
platforms this is benign, but the BeginDraw method of DirectComposition
fails if the dirty rect is empty.

This patch fixes the logic so that tiles that meet these conditions
are correctly tagged as not visible, and skipped from the update queue.
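
A rough illustration of the logic described above, written as a Python sketch rather than the actual WebRender (Rust) code. All names here (Rect, Tile, update_visibility, build_update_queue) are hypothetical. The only assumption is that a tile whose dirty rect is empty after clipping must be treated as not visible and kept out of the update queue, so that no draw call is ever issued with an empty rect.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Rect:
    x: int
    y: int
    w: int
    h: int

    def is_empty(self) -> bool:
        return self.w <= 0 or self.h <= 0

    def intersection(self, other: "Rect") -> "Rect":
        # Clamp to zero so a non-overlapping pair yields an empty rect.
        x0 = max(self.x, other.x)
        y0 = max(self.y, other.y)
        x1 = min(self.x + self.w, other.x + other.w)
        y1 = min(self.y + self.h, other.y + other.h)
        return Rect(x0, y0, max(0, x1 - x0), max(0, y1 - y0))


@dataclass
class Tile:
    rect: Rect
    is_visible: bool = True
    dirty_rect: Optional[Rect] = None


def update_visibility(tile: Tile, clip_rect: Rect) -> None:
    """Clip the tile and tag it as not visible if nothing is left to draw."""
    tile.dirty_rect = tile.rect.intersection(clip_rect)
    tile.is_visible = not tile.dirty_rect.is_empty()


def build_update_queue(tiles: List[Tile], clip_rect: Rect) -> List[Tile]:
    """Return only the tiles that still have a non-empty dirty rect."""
    queue = []
    for tile in tiles:
        update_visibility(tile, clip_rect)
        if tile.is_visible:
            queue.append(tile)
    return queue


tiles = [Tile(Rect(0, 0, 256, 256)), Tile(Rect(512, 0, 256, 256))]
queue = build_update_queue(tiles, Rect(0, 0, 300, 300))
print(len(queue))
```

In this sketch the second tile lies entirely outside the clip rect, so its dirty rect collapses to zero width and it never reaches the queue. A platform-specific call that rejects empty rects would therefore not see it.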

I still can't reproduce this locally. However, from auditing the patch that caused the regression, I think I can see a code path that could cause this. I attached a patch that would fix this, if it's the cause of the problem. Would you be able to test locally, and see if it fixes the problem for you?

The try run with artifacts, if useful:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=3f5f4e064129099d7987ddb8b5e0cca44399a9ef

Flags: needinfo?(sotaro.ikeda.g)
Flags: needinfo?(cpeterson)

(In reply to Glenn Watson [:gw] from comment #17)

I still can't reproduce this locally. However, from auditing the patch that caused the regression, I think I can see a code path that could cause this. I attached a patch that would fix this, if it's the cause of the problem. Would you be able to test locally, and see if it fixes the problem for you?

I could reproduce it on the latest m-c, and the patch addressed the problem for me!

Flags: needinfo?(sotaro.ikeda.g)
Pushed by gwatson@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/73632227ba00
Fix panic caused by calling BeginDraw with empty dirty rect. r=sotaro
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla77

(In reply to Glenn Watson [:gw] from comment #17)

I still can't reproduce this locally. However, from auditing the patch that caused the regression, I think I can see a code path that could cause this. I attached a patch that would fix this, if it's the cause of the problem. Would you be able to test locally, and see if it fixes the problem for you?

The try run with artifacts, if useful:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=3f5f4e064129099d7987ddb8b5e0cca44399a9ef

btw, I can no longer reproduce with official builds, so I didn't test this try build. Something on my machine must have changed.

Flags: needinfo?(cpeterson)
Flags: qe-verify+

I could not reproduce the initial issue using an old Nightly from 2020-04-20 in order to verify the fix. Sotaro Ikeda, can you please check if this is still reproducible for you using Firefox 77.0b9?

Flags: needinfo?(sotaro.ikeda.g)

(In reply to Bogdan Maris [:bogdan_maris], Release Desktop QA from comment #22)

Sotaro Ikeda, can you please check if this is still reproducible for you using Firefox 77.0b9?

It is not reproducible for me. The problem seems to be addressed.

Flags: needinfo?(sotaro.ikeda.g)

(In reply to Sotaro Ikeda [:sotaro] from comment #23)

(In reply to Bogdan Maris [:bogdan_maris], Release Desktop QA from comment #22)

Sotaro Ikeda, can you please check if this is still reproducible for you using Firefox 77.0b9?

It is not reproducible for me. The problem seems to be addressed.

Thanks so much for checking; marking the bug accordingly.

Status: RESOLVED → VERIFIED
Flags: qe-verify+
See Also: → 1643620