Closed Bug 1620147 Opened 9 months ago Closed 9 months ago

Graphics backend automatically falls back to Direct 3D11

Categories

(Core :: Graphics: WebRender, defect, P2)

Desktop
Windows 10
defect

Tracking

()

VERIFIED FIXED
mozilla76
Tracking Status
firefox-esr68 --- disabled
firefox73 --- disabled
firefox74 --- disabled
firefox75 --- verified
firefox76 --- verified

People

(Reporter: alice0775, Assigned: gw)

References

Details

(Keywords: crash, regression)

Attachments

(4 files)

If WebRender is forcibly enabled on an old, non-qualified GPU,
opening page [1] and scrolling makes the graphics backend automatically fall back to Direct 3D11 or Direct 3D11 (Advanced Layers).

[1] https://tc39.github.io/ecma262/ or large plain text file as attached

#1 Regression window (rendering glitch will occur)
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=887408ca310cdf614d5ddb6f10f012aa9f1cf003&tochange=cd2634c753b9b955aafc290d57f0cbcdf9fab688

#2 Regression window (falls back to Direct 3D11)
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=8d7f2651eed8d29a9ba7bf65094e4bc2061419ff&tochange=31e91b3d071e6b7a1453b8a89e2e60e60a7a6760

Attached file about:support

Confirmed, if you grab the scroll bar and just scroll down that ecma262 page, there's a big flash like the gpu process crashed and we enter D3D11 mode.

Verified on the Toronto Windows Testing Machine, with a GTX 950.

Need to investigate this closer, but it should probably be looked at with a high priority.

Blocks: wr-76
Status: UNCONFIRMED → NEW
Ever confirmed: true
Keywords: crash
Priority: -- → P2
Blocks: wr-75
No longer blocks: wr-76

Alexis, can you see if there's anything in about:crashes?

Flags: needinfo?(a.beingessner)

Sotaro can you take a look at this? Looks like something we should try and fix for 75 if possible.

Flags: needinfo?(sotaro.ikeda.g)
Attached file config
I may have a repro: go to this page http://www.jagregory.com/abrash-black-book
and click on Chapter 36.

 0:00.57 c:/code/1618939_crash/obj-x86_64-pc-mingw32\dist\bin\firefox.exe -attach-console -no-remote -wait-for-browser -profile c:\code\1618939_crash\obj-x86_64-pc-mingw32\tmp\profile-default
Crash Annotation GraphicsCriticalError: |[G0][GFX1-]: DCompositionSurface::BeginDraw failed: 0x80070057 (t=148.579) [GFX1-]: DCompositionSurface::BeginDraw failed: 0x80070057
Crash Annotation GraphicsCriticalError: |[G0][GFX1-]: DCompositionSurface::BeginDraw failed: 0x80070057 (t=148.579) |[G1][GFX1-]: DCompositionSurface::BeginDraw failed: 0x80070057 (t=148.579) [GFX1-]: DCompositionSurface::BeginDraw failed: 0x80070057
Crash Annotation GraphicsCriticalError: |[G0][GFX1-]: DCompositionSurface::BeginDraw failed: 0x80070057 (t=148.579) |[G1][GFX1-]: DCompositionSurface::BeginDraw failed: 0x80070057 (t=148.579) |[G2][GFX1-]: DCompositionSurface::BeginDraw failed: 0x80070057 (t=148.579) [GFX1-]: DCompositionSurface::BeginDraw failed: 0x80070057
Crash Annotation GraphicsCriticalError: |[G0][GFX1-]: DCompositionSurface::BeginDraw failed: 0x80070057 (t=148.579) |[G1][GFX1-]: DCompositionSurface::BeginDraw failed: 0x80070057 (t=148.579) |[G2][GFX1-]: DCompositionSurface::BeginDraw failed: 0x80070057 (t=148.579) |[G3][GFX1-]: DCompositionSurface::BeginDraw failed: 0x80070057 (t=148.579) [GFX1-]: DCompositionSurface::BeginDraw failed: 0x80070057
Crash Annotation GraphicsCriticalError: |[G0][GFX1-]: DCompositionSurface::BeginDraw failed: 0x80070057 (t=148.579) |[G1][GFX1-]: DCompositionSurface::BeginDraw failed: 0x80070057 (t=148.579) |[G2][GFX1-]: DCompositionSurface::BeginDraw failed: 0x80070057 (t=148.579) |[G3][GFX1-]: DCompositionSurface::BeginDraw failed: 0x80070057 (t=148.579) |[G4][GFX1-]: DCompositionSurface::BeginDraw failed: 0x80070057 (t=148.579) [GFX1-]: DCompositionSurface::BeginDraw failed: 0x80070057
Crash Annotation GraphicsCriticalError: |[G0][GFX1-]: DCompositionSurface::BeginDraw failed: 0x80070057 (t=148.579) |[G1][GFX1-]: DCompositionSurface::BeginDraw failed: 0x80070057 (t=148.579) |[G2][GFX1-]: DCompositionSurface::BeginDraw failed: 0x80070057 (t=148.579) |[G3][GFX1-]: DCompositionSurface::BeginDraw failed: 0x80070057 (t=148.579) |[G4][GFX1-]: DCompositionSurface::BeginDraw failed: 0x80070057 (t=148.579) |[G5][GFX1-]: DCompositionSurface::BeginDraw failed: 0x80070057 (t=148.579) [GFX1-]: DCompositionSurface::BeginDraw failed: 0x80070057
Crash Annotation GraphicsCriticalError: |[0]GP+[GFX1-]: DCompositionSurface::BeginDraw failed: 0x80070057 (t=151.446) |[1]GP+[GFX1-]: DCompositionSurface::BeginDraw failed: 0x80070057 (t=151.446) |[2]GP+[GFX1-]: DCompositionSurface::BeginDraw failed: 0x80070057 (t=151.446) |[3]GP+[GFX1-]: DCompositionSurface::BeginDraw failed: 0x80070057 (t=151.446) |[4]GP+[GFX1-]: DCompositionSurface::BeginDraw failed: 0x80070057 (t=151.446) |[5]GP+[GFX1-]: DCompositionSurface::BeginDraw failed: 0x80070057 (t=151.446) |[6][GFX1-]: Compositors might be mixed (5,3) (t=151.606) [GFX1-]: Compositors might be mixed (5,3)

###!!! [Child][MessageChannel::SendAndWait] Error: Channel error: cannot send/recv

No info in about:crashes unfortunately.

Edit: 0x80070057 == E_INVALIDARG.
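
For anyone decoding the value by hand: a minimal sketch that splits an HRESULT into its documented fields. The struct and function names are made up for illustration and are not Gecko or Windows SDK identifiers; only the bit layout (severity in bit 31, facility in bits 16-26, code in bits 0-15) comes from the HRESULT format itself.

```cpp
#include <cassert>
#include <cstdint>

// Fields of a Windows HRESULT, following the documented bit layout:
// bit 31 = severity (1 means failure), bits 16-26 = facility, bits 0-15 = code.
// HResultParts and DecodeHResult are hypothetical names for this sketch.
struct HResultParts {
    bool failed;
    std::uint32_t facility;
    std::uint32_t code;
};

constexpr HResultParts DecodeHResult(std::uint32_t hr) {
    return HResultParts{
        (hr >> 31) != 0,       // severity bit
        (hr >> 16) & 0x7FFu,   // 11-bit facility
        hr & 0xFFFFu,          // 16-bit code
    };
}

// 0x80070057: failure, facility 7 (FACILITY_WIN32), code 0x57 = 87
// (ERROR_INVALID_PARAMETER). That combination is the value named E_INVALIDARG.
static_assert(DecodeHResult(0x80070057u).failed, "severity bit is set");
static_assert(DecodeHResult(0x80070057u).facility == 7, "FACILITY_WIN32");
static_assert(DecodeHResult(0x80070057u).code == 87, "ERROR_INVALID_PARAMETER");
```

So the failure is an "invalid argument" class of error rather than a device loss, which points at the arguments passed to BeginDraw.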

gfxCriticalNote << "DCompositionSurface::BeginDraw failed: "
                << update_rect.left << ", " << update_rect.top << " to " << update_rect.right << ", " << update_rect.bottom << " "
                << gfx::hexa(hr);

DCompositionSurface::BeginDraw failed: 524288, 1441792 to 525312, 1442304 0x80070057

Size is 1024x512, so that checks out, sort of -- just positioned a bit far out :)

Assignee: nobody → bpeers
Assignee: bpeers → nobody

I can see why this is occurring. The size of the virtual surface we create is VIRTUAL_OFFSET * 2 and everything is centered around VIRTUAL_OFFSET. However, the y-coordinate is > 2 * VIRTUAL_OFFSET so it's outside the valid range.
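
To make the arithmetic concrete, here is a rough sketch that checks the logged update rect against a surface of size VIRTUAL_OFFSET * 2. The value 512 * 1024 for VIRTUAL_OFFSET is an assumption: the thread never states it, and it is only inferred from the logged x-coordinate (524288). The `Rect` type and helper names are likewise hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// ASSUMPTION: the thread does not give the value of VIRTUAL_OFFSET.
// 512 * 1024 is inferred from the logged x-coordinate and is illustrative only.
constexpr std::int64_t kVirtualOffset = 512 * 1024;
// Per the comment above, the virtual surface spans VIRTUAL_OFFSET * 2 per axis.
constexpr std::int64_t kVirtualSurfaceSize = kVirtualOffset * 2;

// Hypothetical rect type for this sketch (not a Gecko type).
struct Rect { std::int64_t left, top, right, bottom; };

constexpr std::int64_t Width(const Rect& r) { return r.right - r.left; }
constexpr std::int64_t Height(const Rect& r) { return r.bottom - r.top; }

// True when the whole rect lies inside [0, kVirtualSurfaceSize] on both axes.
constexpr bool FitsVirtualSurface(const Rect& r) {
    return r.left >= 0 && r.top >= 0 &&
           r.right <= kVirtualSurfaceSize && r.bottom <= kVirtualSurfaceSize;
}

// The update rect from the logged BeginDraw failure.
constexpr Rect kLoggedRect{524288, 1441792, 525312, 1442304};

static_assert(Width(kLoggedRect) == 1024 && Height(kLoggedRect) == 512,
              "the rect is one 1024x512 tile");
static_assert(kLoggedRect.top > kVirtualSurfaceSize,
              "y is beyond 2 * VIRTUAL_OFFSET");
static_assert(!FitsVirtualSurface(kLoggedRect),
              "the rect is outside the virtual surface");
```

Under that assumed offset, the tile sits 917504 pixels below the center (1441792 - 524288), while only 524288 pixels are available below it.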

A temporary fix might be to just make VIRTUAL_OFFSET much larger - I think the maximum virtual surface size is 16 * 1024 * 1024. This shouldn't have any adverse effects on performance or memory allocation. It's possible (but unlikely) it might affect rasterization accuracy and fuzziness. It's mostly a band-aid, but might be worth a try first.

Looking at the length of that page, I think we'll need a proper fix for this issue.

I think the fix is relatively straightforward, but will require a little bit of plumbing. Let me know if it's super urgent, and I might be able to take a look over the weekend, otherwise I'll start work on it now and finish it up first thing Monday morning.

The fix will basically be:

  • Allow create_surface in the compositor trait to specify a "virtual center".
  • WR will select a virtual center position such that it is within an acceptable coordinate range for DC, but will change the virtual center as rarely as possible to avoid unnecessary invalidations.
  • When the picture cache coordinates of the tile grid are outside the valid range for DC coordinates, WR will invalidate all tiles, re-create the surface and set a new "virtual center" point. This should be an extremely rare event.
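
The plan in the list above could be sketched roughly as follows. This is not the landed patch (the real change lives in WebRender's Rust code and the DC layer tree); `VirtualSurface`, `PlaceTile`, and the 512 * 1024 offset are all hypothetical, and only the y axis is modeled to keep it short.

```cpp
#include <cassert>
#include <cstdint>

// ASSUMPTION: illustrative value; the thread does not state VIRTUAL_OFFSET.
constexpr std::int64_t kVirtualOffset = 512 * 1024;

// Hypothetical per-surface state for this sketch (not WebRender's real type).
struct VirtualSurface {
    std::int64_t center_y = 0;  // picture-cache y that maps to kVirtualOffset
    int recreate_count = 0;     // times the surface had to be re-created
};

// Maps a tile's picture-cache position to a y-coordinate on the virtual
// surface. While the tile fits in [0, 2 * kVirtualOffset] the center is left
// alone, so no extra invalidation happens. Otherwise the center moves to the
// tile and the tile lands in the middle of the surface.
inline std::int64_t PlaceTile(VirtualSurface& s, std::int64_t tile_top,
                              std::int64_t tile_height) {
    const std::int64_t top = tile_top - s.center_y + kVirtualOffset;
    const std::int64_t bottom = top + tile_height;
    if (top >= 0 && bottom <= 2 * kVirtualOffset) {
        return top;  // in range: keep the current virtual center
    }
    // Out of range: a real implementation would invalidate all tiles and
    // re-create the surface here. The sketch only records that it happened.
    s.center_y = tile_top;
    ++s.recreate_count;
    return kVirtualOffset;
}
```

With these assumed numbers, a tile at picture-cache y = 917504 (the position that produced the logged failure) triggers exactly one re-center, and neighbouring tiles are then placed relative to the new center without further invalidation.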

(In reply to Glenn Watson [:gw] from comment #8)

I can see why this is occurring. The size of the virtual surface we create is VIRTUAL_OFFSET * 2 and everything is centered around VIRTUAL_OFFSET. However, the y-coordinate is > 2 * VIRTUAL_OFFSET so it's outside the valid range.

A temporary fix might be to just make VIRTUAL_OFFSET much larger - I think the maximum virtual surface size is 16 * 1024 * 1024. This shouldn't have any adverse effects on performance or memory allocation.

When I just tried a size of 16 * 1024 * 1024, the error did not happen at http://www.jagregory.com/abrash-black-book. But the bottom of the page did not render.

This adds support for tracking and invalidating tiles based on a
movable virtual offset.

This patch fixes the bug for me. It's the correct fix, but the implementation is quite hacky and not very well documented.

If anyone else has a chance to do some testing with this patch, it'd be much appreciated. You'll need to apply it on top of the Part 11/12 patches in bug 1579235 (though both those patches are on autoland and should merge to m-c soon).

Pending try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=e79335192bcf5e1326f61b21b2633dfe58b23004

Depending on urgency, we could either:
(a) Land as-is and fix up with follow up patches.
(b) Someone else can tidy up this patch tomorrow and land it on my behalf.
(c) I can tidy it up on Monday morning and get it landed then.

Any of those options are fine with me!

Assignee: nobody → gwatson

The urgency is not high. This isn't a recent regression.

Confirmed fixed with the patch for me \o/

(In reply to Jeff Muizelaar [:jrmuizel] from comment #13)

The urgency is not high. This isn't a recent regression.

OK, I'll tidy up the patch and land it on Monday then.

Flags: needinfo?(sotaro.ikeda.g)
Pushed by gwatson@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/2a28c94f3db2
Fix virtual surface coords being outside bounds. r=Bert,sotaro
Status: NEW → RESOLVED
Closed: 9 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla76

Comment on attachment 9131396 [details]
Bug 1620147 - Fix virtual surface coords being outside bounds.

Beta/Release Uplift Approval Request

  • User impact if declined: WebRender will panic on extremely long pages, causing GPU process to crash and restart.
  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: Yes
  • If yes, steps to reproduce: Bug contains repro steps with URL. Ensure that the GPU process doesn't crash (flicker) when navigating to those pages with WebRender enabled.
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): Relatively simple patch, only applies in limited cases, has been tested on nightly and manually.
  • String changes made/needed:
Attachment #9131396 - Flags: approval-mozilla-beta?
Flags: qe-verify+
QA Whiteboard: [qa-triaged]

Comment on attachment 9131396 [details]
Bug 1620147 - Fix virtual surface coords being outside bounds.

webrender fix, approved for 75.0b3

Attachment #9131396 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Flags: needinfo?(a.beingessner)

Confirmed issue with 75.0b1 and 75.0a1 (20200304161940) on Windows 10. However, there was no actual crash.
Fix verified with 76.0a1(20200311163942), waiting for beta3.

The laptop-config used for verification is: Asus - Intel Core m7-6y75 with Intel HD Graphics 515 gpu.

Checked with the 76.0b3 build from treeherder, but the compositing changed to Basic and the page is white while scrolling up/down.
There weren't any glitches like those noticed on the affected build.

Is this something we should file a separate bug for, or can it be patched up here?

Flags: needinfo?(gwatson)

I'm not quite sure I follow what you're saying above.

Without the fix, I'd expect to see the compositor switch to Basic (or D3D11) and you might see a flicker / glitch.
With the fix, the compositor should remain as WR the entire page.

Are you saying that even with the fix in the beta, you still see the compositor switch to Basic? Maybe the fix didn't make it in to b3?

Flags: needinfo?(gwatson)

Bugbug thinks this bug is a regression, but please revert this change in case of error.

Keywords: regression

Indeed, the issue still persists on 75.0b3.
With the following prefs enabled:

  • gfx.webrender.all
  • gfx.webrender.enabled

The attached test page still flickers.

Status: RESOLVED → REOPENED
Resolution: FIXED → ---

I cannot reproduce the problem anymore on Nightly 76.0a1 and Firefox 75.0b3 on Windows 10.

Cristian - can you provide clarification on whether or not you could reproduce the issue? See Glenn's questions in https://bugzilla.mozilla.org/show_bug.cgi?id=1620147#c23

Flags: needinfo?(cristian.fogel)

I downloaded 75.0b3 from beta.mozilla.org, I was no longer able to reproduce the issue that was fixed by this bug (I tested on the attached bug repro, and also the tc39 and abrash pages linked above).

The compositor remained as WR the entire time scrolling up and down, which matches what the original reporter says in c26. So I'll wait for further information from cfogel to work out what further we need to do here, if anything.

(In reply to Glenn Watson [:gw] from comment #23)

Without the fix, I'd expect to see the compositor switch to Basic (or D3D11) and you might see a flicker / glitch.

Yes, that was what I noticed with 75.0b3 on the mentioned device.

With the fix, the compositor should remain as WR the entire page.

That did not happen; it jumped to Basic.

Are you saying that even with the fix in the beta, you still see the compositor switch to Basic? Maybe the fix didn't make it in to b3?

Yep, can only assume that is what happened.
With 75.0b4 from what I see the issue is no longer present. Verified with a fresh profile and install.
Updating the status in regards to this.

Thank you for the input on this!

Flags: qe-verify+
Flags: needinfo?(cristian.fogel)
Status: REOPENED → RESOLVED
Closed: 9 months ago → 9 months ago
Resolution: --- → FIXED
Status: RESOLVED → VERIFIED