Open Bug 1694535 Opened 3 years ago Updated 3 years ago

Missing text in chrome UI, content, and "About Firefox" window

Categories

(Core :: Graphics: WebRender, defect)

Unspecified
Windows
defect

Tracking

()

Tracking Status
firefox88 --- affected

People

(Reporter: cpeterson, Unassigned)

References

(Blocks 1 open bug)

Details

Attachments

(4 files)

I enabled SW-WR yesterday. Since then I've seen a couple of intermittent instances where chrome UI is rendered incorrectly, such as missing pixels in a tab's title text or the "About Firefox" window missing all of its text. See the attached screenshot.

I also have fission.autostart enabled, if that matters.

Attached file about-support.txt
Attached image window_screenshot.jpg

This is not a SW-WR problem: I saw a similar problem today after I had disabled SW-WR.

The problem affects just one Firefox window. Reloading tabs in the affected window, or opening new tabs, did not force any graphics invalidation that would render the missing text. The tabs in other windows were unaffected, but about 30 seconds after I switched to test another window, I hit a tab crash:

[@ CContext::TID3D11DeviceContext_Map_<T> ] which is bug 1691344. I hit that crash almost every day.

Summary: sw-wr: missing text in "About Firefox" window → Missing text in chrome UI, content, and "About Firefox" window
Severity: -- → S3

I've seen this bug twice today. It seems to happen after I wake my Windows laptop from sleep.

I've now hit this bug four times today after waking my laptop.

Lee, could the increase in this bug's frequency be a regression from your fix for 24-bit SWGL bug 1687157? That fix is in mozilla-central's pushlog between yesterday's Nightly build 2021-03-08 and today's build 2021-03-09:

https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=419bc25a914812f1f3a828b04a61507fa459a114&tochange=5f0f6477c734369a72fec1211b608eb14d33bd4a

Flags: needinfo?(lsalzman)

(In reply to Chris Peterson [:cpeterson] from comment #4)

I've now hit this bug four times today after waking my laptop.

Lee, could the increase in this bug's frequency be a regression from your fix for 24-bit SWGL bug 1687157? That fix is in mozilla-central's pushlog between yesterday's Nightly build 2021-03-08 and today's build 2021-03-09:

https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=419bc25a914812f1f3a828b04a61507fa459a114&tochange=5f0f6477c734369a72fec1211b608eb14d33bd4a

The 24-bit depth fix affects neither performance nor memory usage, and given that this shows up on non-SW-WR, my gut feeling is that they are unrelated.

Flags: needinfo?(lsalzman)

Sotaro/Andrew, any ideas on what this might be?

Flags: needinfo?(sotaro.ikeda.g)
Flags: needinfo?(aosmond)

(In reply to Chris Peterson [:cpeterson] from comment #2)

[@ CContext::TID3D11DeviceContext_Map_<T> ] which is bug 1691344. I hit that crash almost every day.

The crash typically happens when an ID3D11Texture2D allocation fails (Bug 1696331, Bug 1696325). That suggests the PC might not have enough GPU memory, or that GPU memory was not freed soon enough. I am not sure about the missing text, though.

ID3D11Texture2D memory usage was increased recently by Bug 1694840.

(In reply to Sotaro Ikeda [:sotaro] from comment #7)

(In reply to Chris Peterson [:cpeterson] from comment #2)

[@ CContext::TID3D11DeviceContext_Map_<T> ] which is bug 1691344. I hit that crash almost every day.

The crash typically happens when an ID3D11Texture2D allocation fails (Bug 1696331, Bug 1696325). That suggests the PC might not have enough GPU memory, or that GPU memory was not freed soon enough. I am not sure about the missing text, though.

In any case, when an ID3D11Texture2D allocation failure happens, it seems better to fall back from WebRender (Software D3D11) to WebRender (Software).

I filed Bug 1697335 for comment 8, but I am not sure whether it would address this bug.

Depends on: 1697335
Flags: needinfo?(sotaro.ikeda.g)

(In reply to Chris Peterson [:cpeterson] from comment #4)

I've now hit this bug four times today after waking my laptop.

Lee, could the increase in this bug's frequency be a regression from your fix for 24-bit SWGL bug 1687157? That fix is in mozilla-central's pushlog between yesterday's Nightly build 2021-03-08 and today's build 2021-03-09:

https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=419bc25a914812f1f3a828b04a61507fa459a114&tochange=5f0f6477c734369a72fec1211b608eb14d33bd4a

That was the same build that bug 1696325 landed in.

(In reply to Chris Peterson [:cpeterson] from comment #3)

I've seen this bug twice today. It seems to happen after I wake my Windows laptop from sleep.

It sounds like you are no longer losing the GPU process to the crash in bug 1696325, and hence are no longer resetting your entire state. That would make the missing text issue easier to reproduce, rather than bug 1696325 or bug 1687157 having introduced it.

Flags: needinfo?(aosmond)

When you hit the issue, can you check about:support for the gfx critical log to see if there is anything of note?

Flags: needinfo?(cpeterson)

My best guess here is that we are hitting a device reset, which may have caused the allocation failures that sotaro has since fixed the handling for, and that we don't recover from it properly with SW-WR.

Also, can you confirm you still have a GPU process when this happens? If you disable the GPU process by setting layers.gpu-process.enabled to false, do you still see the problem?

(In reply to Andrew Osmond [:aosmond] from comment #13)

Also, can you confirm you still have a GPU process when this happens? If you disable the GPU process by setting layers.gpu-process.enabled to false, do you still see the problem?

I experienced the problem again and I do still have a GPU process. I will now try testing with layers.gpu-process.enabled = false for a few days.

Remote Processes

...
Type: GPU
Count: 1

And here is my Graphics Failure Log from about:support:

(#0): GP+[GFX1-]: SSP:Add 94489280705 init
(#143): GP+[GFX1-]: SSP:Add 94489280867 init
(#144): GP+[GFX1-]: SSP:Add 94489280868 init
(#145): GP+[GFX1-]: SSP:Add 94489280869 init
(#146): GP+[GFX1-]: SSP:Add 94489280870 init
(#147): GP+[GFX1-]: SSP:Add 94489280871 init
(#148): GP+[GFX1-]: SSP:Add 94489280872 init
(#149): GP+[GFX1-]: SSP:Add 94489280873 init
(#150): GP+[GFX1-]: SSP:Add 94489280874 init
(#151): GP+[GFX1-]: SSP:Add 94489280875 init
(#152): GP+[GFX1-]: SSP:Add 94489280876 init
(#153): GP+[GFX1-]: SSP:Add 94489280877 init
(#154): GP+[GFX1-]: SSP:Add 94489280878 init
(#155): GP+[GFX1-]: SSP:Add 94489280879 init
(#156): GP+[GFX1-]: SSP:Add 94489280880 init
(#157): GP+[GFX1-]: DataSourceSurface of SharedSurfaces does not exist for extId:94489280730

Flags: needinfo?(cpeterson)

I wonder if it is possible that a variation of the following occurs:

  1. Hit OOM as above, where we fail to map in a whole bunch of images.
  2. Fail a resource update as part of a display list that maps in the images. It might not map in the fonts either.
  3. Drop the display list update, so the fonts never get rendered. This also avoids the GPU process crash that would otherwise reset state and maybe redraw the popup.
  4. No subsequent updates come in (the popup is simple, I guess), so it remains in this state.
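
The sequence above can be illustrated with a toy model. This is only a sketch of the hypothesis, not Firefox or WebRender code: the `ToyScene` class, its `apply` method, and the resource names are all invented for illustration, and it assumes that an update is dropped as a whole when any resource it needs failed to map.

```python
class ToyScene:
    """Hypothetical stand-in for one window's retained contents."""

    def __init__(self):
        # Items from the last display list update that was applied.
        self.visible = []

    def apply(self, display_list, mapped):
        """Apply an update only if every resource it needs was mapped."""
        missing = [item for item in display_list if item not in mapped]
        if missing:
            # Steps 2-3: the update is dropped, previous contents stay.
            return False
        self.visible = list(display_list)
        return True


scene = ToyScene()
# An earlier update succeeds while memory is still available.
first_ok = scene.apply(["background"], mapped={"background"})
# Step 1: after OOM, the logo image fails to map, so the update that
# would also have drawn the text is dropped.
second_ok = scene.apply(
    ["background", "logo", "text"], mapped={"background", "text"}
)
# Step 4: no further updates arrive, so the text never appears.
print(first_ok, second_ok, scene.visible)
```

In this model nothing crashes and nothing retries, so the stale contents persist until some later update happens to succeed.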

Since there are no widespread reports, it sounds like this is related to OOM from virtual address space exhaustion. We reduced the likelihood with bug 1694480 but did not eliminate it.

(In reply to Chris Peterson [:cpeterson] from comment #14)

(In reply to Andrew Osmond [:aosmond] from comment #13)

Also, can you confirm you still have a GPU process when this happens? If you disable the GPU process by setting layers.gpu-process.enabled to false, do you still see the problem?

I experienced the problem again and I do still have a GPU process. I will now try testing with layers.gpu-process.enabled = false for a few days.

I hit this bug with layers.gpu-process.enabled = false the very next time I put my laptop to sleep.

Since there are no widespread reports, it sounds like this is related to OOM from virtual address space exhaustion. We reduced the likelihood with bug 1694480 but did not eliminate it.

I am running a 32-bit Firefox build on Win64 OS (to help dogfood 32-bit) and do run into quite a few OOMs in general.
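
For a rough sense of scale, here is a back-of-the-envelope sketch of why a 32-bit build runs out of address space sooner. The only hard fact used is that a 32-bit pointer can address at most 2^32 bytes. The 1920x1080 surface at 4 bytes per pixel is an assumed example size, and the helper `surfaces_that_fit` is hypothetical. The real limit is lower, because code, heaps, and fragmentation share the same address space.

```python
GIB = 1024 ** 3

# 4 GiB: everything a 32-bit pointer can address.
ADDRESS_SPACE_32BIT = 2 ** 32


def surfaces_that_fit(address_space_bytes, width, height, bytes_per_pixel=4):
    """Upper bound on same-sized surfaces mapped at once.

    Ignores every other allocation in the process, so it overestimates.
    """
    surface_bytes = width * height * bytes_per_pixel
    return address_space_bytes // surface_bytes


# Assumed example: full-HD surfaces with 4 bytes per pixel.
count = surfaces_that_fit(ADDRESS_SPACE_32BIT, 1920, 1080)
print(count)
```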

What were your about:support critical log contents after that?

(In reply to Andrew Osmond [:aosmond] from comment #17)

What were your about:support critical log contents after that?

It took a couple days to reproduce the problem with layers.gpu-process.enabled = false, but here is my Graphics Failure Log:

Failure Log
(#0) Error: Failed to lock ExternalImage for extId:124554051724
(#246) Error: SSP:Add 137438954104 init
(#247) Error: SSP:Add 137438954105 init
(#248) Error: SSP:Add 137438954106 init
(#249) Error: SSP:Add 137438954107 init
(#250) Error: SSP:Add 137438954108 init
(#251) Error: SSP:Add 137438954109 init
(#252) Error: SSP:Add 137438954110 init
(#253) Error: SSP:Add 137438954111 init
(#254) Error: SSP:Add 137438954112 init
(#255) Error: SSP:Add 137438954113 init
(#256) Error: SSP:Add 137438954114 init
(#257) Error: SSP:Add 137438954115 init
(#258) Error: SSP:Add 137438954116 init
(#259) Error: SSP:Add 137438954117 init
(#260) Error: DataSourceSurface of SharedSurfaces does not exist for extId:137438953963
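
To compare failure logs like the two in this bug, a small script can pull out the external image IDs. The following is a minimal sketch, assuming only the two line layouts pasted in this bug (`(#N): GP+[GFX1-]: ...` and `(#N) Error: ...`). The function names and the idea of cross-checking `extId` references against the visible `SSP:Add` entries are illustrative, not part of any Firefox tooling.

```python
import re

# Matches both layouts seen in this bug:
#   "(#0): GP+[GFX1-]: message"  and  "(#0) Error: message"
ENTRY_RE = re.compile(
    r"^\(#(?P<index>\d+)\):?\s+(?P<tag>GP\+\[GFX1-\]|Error):\s+(?P<message>.*)$"
)
ADD_RE = re.compile(r"^SSP:Add (\d+) init$")
EXT_ID_RE = re.compile(r"extId:(\d+)")


def parse_failure_log(text):
    """Return (index, message) pairs for every recognized log line."""
    entries = []
    for line in text.splitlines():
        match = ENTRY_RE.match(line.strip())
        if match:
            entries.append((int(match.group("index")), match.group("message")))
    return entries


def summarize(entries):
    """Collect added IDs and the extIds that other messages refer to."""
    added = set()
    referenced = []
    for _, message in entries:
        add = ADD_RE.match(message)
        if add:
            added.add(int(add.group(1)))
            continue
        ref = EXT_ID_RE.search(message)
        if ref:
            referenced.append(int(ref.group(1)))
    # IDs named in an error but with no SSP:Add entry in the visible log.
    unmatched = [ext_id for ext_id in referenced if ext_id not in added]
    return {"added": added, "referenced": referenced, "unmatched": unmatched}


SAMPLE = """\
(#0) Error: Failed to lock ExternalImage for extId:124554051724
(#258) Error: SSP:Add 137438954116 init
(#259) Error: SSP:Add 137438954117 init
(#260) Error: DataSourceSurface of SharedSurfaces does not exist for extId:137438953963
"""

summary = summarize(parse_failure_log(SAMPLE))
print(summary["unmatched"])
```

An unmatched ID only means that its `SSP:Add` entry is not in the pasted excerpt. Both logs here jump from entry #0 to a much later index, so the earlier entries are simply not shown.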

See Also: → 1708195