Open Bug 1513185 Opened 11 months ago Updated 3 months ago

Intermittently missing glyphs on Android on Adreno 3xx devices

Categories

(Core :: Graphics: WebRender, defect, P3)

65 Branch
Unspecified
Android
defect

Tracking

()

People

(Reporter: jnicol, Assigned: jnicol)

References

(Blocks 2 open bugs, )

Details

+++ This bug was initially created as a clone of Bug #1505508 +++

Original comment:
> These are both from the GeckoView example app with gfx.webrender.all=true
> on a Sony Z3C phone. Just started the app which will load mozilla.org by
> default, and scroll down. In this case the letter 'w' is missing from two
> different headlines. This seems to happen often but not every time, not sure why.

This was originally filed by kats as bug 1505508, observed on his Z3C (Adreno 330). On Adreno 4xx, 5xx and 6xx devices I noticed similar behaviour, and proceeded to investigate then fix that. That solution does not work for Adreno 3xx GPUs as they are missing the required extension. Additionally, the 3xx bug seems to have a slightly different cause, though I do believe it is also a driver bug when blitting between textures to resize the texture cache.

I have been investigating this the last few days, and there are a few issues going on, each slightly different than the cache-resizing blitting bug on 4xx and later (bug 1505508).

First, blitting from a texture array to the main framebuffer is completely broken on Adreno 3xx. So the texture cache debug view makes it look as if we hit the cache resizing blitting bug. But actually blitting between texture arrays works just fine on Adreno 3xx. So we can remove the renderbuffer workaround. In fact, it seems like blitting layer i actually blits layers i..n. Which means we're copying pixels a lot more times than we want to. So we might want to only ask it to blit layer 0 rather than 0..n. I'll file a bug for these changes. I'm not sure if there's currently anywhere we we actually need to blit from an array to the main framebuffer, but it's something we need to look out for.

The main problem here is that PBO uploads, where the stride is a multiple of 128 bytes, always write the data to the 0th layer of a texture array.

This explains why, at the time this was reported, only a small number of glyphs/images were corrupted (the ones with a width of a multiple of 32 pixels). And why the fix for bug 1498732 made things much worse: because we purposefully aligned the stride of all uploads so it took the fast path, but taking the fast path on 3xx means we hit this bug. This meant we started hitting this bug for basically everything in the cache (except standalone textures, or if an entry is lucky enough to be in the 0th layer.)

The solution from bug 1498732 can easily be adapted to work around this problem: for uploads with a bad width copy the data in to a PBO with an extra pixel per row, set GL_UNPACK_ROW_LENGTH, and upload. However, this will put us on the slow path, and the whole point of bug 1498732 was to set the stride so we hit the fast path. It might be that uploading with an aligned stride to a 2d non-array texture, then doing a GPU-side copy to the array, could be faster. I'll look in to that.

You need to log in before you can comment on or make changes to this bug.