Closed Bug 1696039 Opened 4 years ago Closed 4 years ago

Intermittently distorted or missing glyphs on Adreno 330

Categories

(Core :: Graphics: WebRender, defect, P3)

ARM
Android
defect

Tracking

()

VERIFIED FIXED
89 Branch
Tracking Status
firefox88 --- wontfix
firefox89 --- verified

People

(Reporter: h.winnemoeller, Assigned: jnicol)

References

(Blocks 1 open bug)

Details

Attachments

(3 files)

Steps to reproduce

  1. Set gfx.webrender.all to true and gfx.webrender.allow-partial-present-buffer-age to false (see Bug 1695771) and restart Fenix.
  2. Open https://www.swr.de/swraktuell/rheinland-pfalz/zahl-der-corona-infizierten-in-rheinland-pfalz-100.html or https://www.swr.de/swraktuell/rheinland-pfalz/index.html or https://edition.cnn.com/

Expected result
Websites are displayed normally with all glyphs visible.

Actual result
Intermittently, glyphs are missing or distorted, see the attached images, red boxes have been added by me. gfx.webrender.allow-partial-present-buffer-age can be set to false (see Bug 1695771) or true. Thus, at least changing the flag does not regress this bug further.

Device Info
Vendor and Model: Sony Xperia Z2
OS version: Android 6.0.1
GPU model: Adreno 330 (WebRender Compositing)
Number of cores: 4 (Snapdragon 801)

Fenix version
Nightly 210301 17:01 (Build #2015796169)
AC: 74.0.20210301143120, 8a55899f2
GV: 88.0a1-20210301093612
AS: 72.1.0

Blocks: wr-adreno3xx

How frequently do you see this, and is it reliably reproducible?

If you enable gfx.webrender.batched-texture-uploads and restart, does that fix it?

Flags: needinfo?(h.winnemoeller)

While always loading https://www.swr.de/swraktuell/rheinland-pfalz/zahl-der-corona-infizierten-in-rheinland-pfalz-100.html, I did first try it on non-wr stable 86.1.1 (Build #2015794881), AC: 72.0.16, f4aa67a9c, GV: 86.0-20210222142601, AS: 67.2.0 for 10 times and force-reloading in between: no issues

Using Nightly 210310 17:03 (Build #2015797897), AC: 74.0.20210310112134, 7e3b6102a, GV: 88.0a1-20210309094921, AS: 72.1.0, and always force-reloading between steps, I saw the following results:

STR: (y = issue is present, n = issue is not present, all glyphs visible)

  1. y
  2. y (missing glyphs changed)
  3. y (missing glyphs changed)
  4. y (missing glyphs changed)
  • [background-foreground Fenix]
  1. n
  2. n
  3. n
  4. n
  5. n
  6. n
  1. y
  2. y (missing glyphs unchanged)
  3. y (missing glyphs unchanged)
  4. y (missing glyphs unchanged)
  5. y (missing glyphs unchanged)
  6. y (missing glyphs unchanged)
  7. y (missing glyphs unchanged)
  8. y (missing glyphs unchanged)
  9. y (missing glyphs unchanged)
  10. y (missing glyphs unchanged)
  • [background-foreground Fenix]
  1. y (other glyphs now missing)
  2. y (missing glyphs unchanged)
  • [background-foreground Fenix]
  1. y (other glyphs now missing)
  • [background-foreground Fenix]
  1. y (same glyphs still missing)

I did not find gfx.webrender.batched-texture-uploads but gfx.webrender.debug.batched-texture-uploads and set the latter to true and restarted Fenix. The results were the following:

  1. n
  2. n
  3. n
  4. n
  5. n
  6. n
  7. n
  8. n
  9. n
  10. n
  1. n
  2. n
  3. n
  4. n
  5. n

Thus, enabling gfx.webrender.debug.batched-texture-uploads seems to resolve the missing glyphs issue. I have tried all my regular websites and did not experience any additional issues.

Flags: needinfo?(h.winnemoeller)

Thanks. I suspect we might want to enable batched uploads anyway on Adreno 3xx for performance reasons, but I worry that the bug could still exist and affect users in the wild, just be much less likely to occur due to performing fewer uploads.

So I'm going to hold off on making that change for now as I'd like to figure out the underlying bug first. But feel free to keep that pref enabled for your own use.

You mention that the glyphs which are missing change occasionally when you refresh. Can you see any pattern to it? Does it happen to some glyphs more than others, or does it just seem random?

(In reply to Jamie Nicol [:jnicol] from comment #4)

You mention that the glyphs which are missing change occasionally when you refresh. Can you see any pattern to it? Does it happen to some glyphs more than others, or does it just seem random?

It seemed quite random which glyphs went missing but there was a pattern to the overall position of the issues. It only happened in the subelements of the website that contain graphs, not in the larger text parts. Also, I observed that some elements like "Landesweite
Neuinfektinsrate von 50" [sic] had all glyphs the moment it first was rendered, but then it went through some relayouting, jumped around and then was e.g. missing its "k"-glyph. So maybe an issue with elements that get loaded in by the website?

When Bug 1695912 and Bug 1507074 are resolved, I will start using WR as a daily driver and hopefully catch any remaining issues.

I've been using WR for the last 2 weeks with gfx.webrender.debug.batched-texture-uploads set to True and I have not seen any glyph issues although I was looking for them. I still may have missed some missing/distorted glyphs, but at least on this configuration on this device it seems to be quite rare.

Severity: -- → S4
Priority: -- → P3

Thanks for that information Henrik. I've been doing some profiling, however, and actually I'm not convinced that batched texture uploads help performance. They might in fact even hurt it slightly, though either way texture upload performance on Adreno 3xx is sub par...

So for now at least I think it's best to fix the bug with regular uploads rather than switching to batched uploads. Thankfully it didn't turn out to be very hard to figure out the issue.

I noticed that when using RGBA8 glyphs I could never reproduce the corruption, it only reproduced with (the default setting) R8 glyphs. The important difference here is the stride which we pack data in to the PBOs to upload to the texture. On Adreno we currently ensure that the stride is equivalent to 64 pixels (for performance reasons on later adreno devices). With R8 that is 64 bytes, with RGBA8 that is 256 bytes.

I verified that using R8 format glyphs and rounding to 128 bytes (or a higher power of 2) avoids the corruption, but 64 bytes or lower reproduces the corruption.

I also verified that the performance is unaffected by the stride on Adreno 3xx, unlike on Adreno 5xx and 6xx for which we added the 64-pixel alignment in the first place.

So basically, on Adreno 3xx we should align the stride to 128 bytes.

On Adreno 3xx we see intermittently corrupted or missing glyphs due to
texture uploads failing. This has only started occuring since glyphs
were switched to using R8 format textures, from RGBA8.

Due to performance reasons on other Adreno devices, we currently
ensure texture data has a stride aligned to 64 pixels. So the effect
of switching texture format is that the alignment switched from 256
bytes to 64 bytes.

It appears that PBO texture uploads must be performed with an
alignment of 128 bytes on Adreno 3xx, otherwise this corruption may
occur. Additionally, the 64 pixel requirement to hit the fast path
does not seem to apply to Adreno 3xx. Therefore this patch sets the
requirement to 128 bytes on Adreno 3xx, and leaves it as 64 pixels on
other Adreno devices.

It also renames optimal_pbo_stride to required_pbo_stride, to better
reflect the fact that this is now sometimes required for correctness
reasons, not just performance.

Assignee: nobody → jnicol
Status: NEW → ASSIGNED
Pushed by jnicol@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/5fe42b0b50c6 Align texture upload stride to multiples of 128 bytes on Adreno 3xx r=kvark
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → 89 Branch

So on Adreno 300, if you don't align the stride to 128 pixels you hit this bug, and if you do align the stride to 128 pixels you hit bug 1513185. What fun! :)

(Luckily bug 1513185 no longer applies to us as we no longer use texture arrays.)

Henrik, once this patch lands in Fenix nightly I'd be grateful if you could disable batched uploads and test whether this issue is indeed resolved. Thanks!

Flags: needinfo?(h.winnemoeller)
Blocks: 1705349

This has now landed in Fenix Nightly 210416 17:01 (Build #2015805001), AC: 75.0.20210415213638, 4ad86e3b2, GV: 89.0a1-20210415133200, AS: 74.0.1.

Steps to reproduce

  1. Set gfx.webrender.all to true and ensure that gfx.webrender.debug.batched-texture-uploads is set to false, restart Fenix.
  2. Open https://www.swr.de/swraktuell/rheinland-pfalz/zahl-der-corona-infizierten-in-rheinland-pfalz-100.html repeatedly and check for missing or distorted glyphs, force-reload between checks, restart Fenix after 5 runs.

After 3 complete runs and 15 checks, the glyphs are all rendered correctly. Great work! :)

Status: RESOLVED → VERIFIED
Flags: needinfo?(h.winnemoeller)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: