Closed Bug 1983036 Opened 5 months ago Closed 3 months ago

ext-webbgis.lansstyrelsen.se - Rendering of map is broken on certain devices with GPU Exynos 2200 from AMD RDNA2

Categories

(Core :: Graphics: CanvasWebGL, defect)

ARM
Android
defect

Tracking

RESOLVED FIXED
146 Branch
Tracking Status
firefox145 --- fixed
firefox146 --- fixed

People

(Reporter: rbucata, Assigned: jnicol)

References

(Blocks 1 open bug)

Details

(Whiteboard: [webcompat-source:web-bugs])

Attachments

(2 files)

Environment:
Operating system: Android 15
Firefox version: Firefox Mobile 140.0

Steps to reproduce:
Map doesn't render properly in Firefox on Android. The upper right half (a triangle from top-left to top-right to bottom-right to top-left) is a mirror of the bottom left, but sideways. Chrome on Android works as you'd expect.

It is also much slower to zoom and pan in the map than in Chrome, and loading more detail after zooming is either broken or very slow.

It seemed to hog a lot of memory too: trying to use it caused another app (one that shows a screen overlay) to be killed. A little while later, the content process showing this site was most likely killed too, while in the foreground, as the page reloaded without interaction.

Actual Behavior:
Page not loading correctly

Notes:

  • Reproduces regardless of the status of ETP
  • Reproduces in firefox-nightly, and firefox-release
  • Does not reproduce in chrome

Created from https://github.com/webcompat/web-bugs/issues/166538

QA does not have the required devices or setup, but we think this is something worth investigating.

Component: Site Reports → Graphics: CanvasWebGL
Product: Web Compatibility → Core

The same thing happens here: https://storymaps.arcgis.com/stories/2042e23dc9664e279ac7e2460f479797

Device: Samsung Galaxy A55, Android 15
Firefox version: 143.0a1 (Build #2016108703), a678ec376450cf56285b4baa38afa6cdf31a847a
GV: 143.0a1-20250815084539
AS: 143.20250814050417

Jamie, ideas?

Flags: needinfo?(jnicol)
Duplicate of this bug: 1984119
Duplicate of this bug: 1988344

Seems to affect various Xclipse GPUs after the Samsung Android 15 update, and seems specific to websites that use the arcgis map library. (I haven't seen anything similar on any other pages.)

Xclipse GPUs use ANGLE on Vulkan as their OpenGL driver. I tried to determine whether this was a bug in ANGLE introduced in between the version used in Android 14 and Android 15. Android allows you to build a custom version of angle and use it as the GL driver for an app: https://chromium.googlesource.com/angle/angle.git/+/HEAD/doc/DevSetupAndroid.md

I couldn't determine the exact ANGLE version used in Android 14 and 15 on my device, but I built ANGLE from latest upstream as well as a version a few years old, and in both cases the bug reproduced.

This leads me to believe the bug lies in the underlying Vulkan driver as opposed to in ANGLE.

So far I've been unable to capture what the application is doing in RenderDoc or a similar tool, but I think that's probably the best way to continue investigating.

Flags: needinfo?(jnicol)

Apart from arcgis, I have seen a similar thing on the hitta.se map (https://hitta.se/kartan), but in the overlay elements rather than in the map itself. See screenshot.

Thanks Andreas, that does indeed look like the same issue.

As I said above, this must be a bug in the Android 15 Xclipse Vulkan driver, given that running a locally-built ANGLE on an Android 14 device does not reproduce the issue.

However, I found that running a very old version of ANGLE on an affected Android 15 device does in fact avoid the issue. I didn't test far enough back previously, but when attempting to capture the rendering in AGI I noticed it no longer reproduced, which led me to discover that AGI installs a very old ANGLE version on the device. A (not so) quick git bisect later, we find that this ANGLE commit causes us to hit the bug.

And sure enough, disabling robustness when creating our EGL context avoids the issue. We should look for a better workaround, however.

Not sure if at all relevant, but throwing it in here: https://issuetracker.google.com/issues/386749841

Minimal testcase: https://codepen.io/jamienicol/full/KwVXzrr

This should draw a triangle where the top corner is red, bottom-right is green, and bottom-left is blue. But it seems as if the vertex data for the color attribute of the final (bottom-left) vertex is incorrect, being rendered as either black or white.
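To illustrate why the final vertex is the one that breaks, here is a minimal sketch of one possible vertex buffer for such a triangle. The interleaved layout (2 position floats followed by 3 color floats) is an assumption made for illustration and is not taken from the codepen. With this layout, the last vertex's color attribute occupies the very last bytes of the buffer.

```python
import struct

# Hypothetical interleaved layout: 2 position floats + 3 color floats per vertex.
FLOAT_SIZE = 4
STRIDE = 5 * FLOAT_SIZE         # 20 bytes per vertex
COLOR_OFFSET = 2 * FLOAT_SIZE   # color follows the two position floats

vertices = [
    # x,    y,    r,   g,   b
    (0.0, 1.0, 1.0, 0.0, 0.0),    # top: red
    (1.0, -1.0, 0.0, 1.0, 0.0),   # bottom-right: green
    (-1.0, -1.0, 0.0, 0.0, 1.0),  # bottom-left: blue
]

# Pack the vertices as tightly as the application would upload them.
buffer = b"".join(struct.pack("<5f", *v) for v in vertices)


def color_byte_range(vertex_index):
    """Return the (start, end) byte range of a vertex's color attribute."""
    start = vertex_index * STRIDE + COLOR_OFFSET
    return start, start + 3 * FLOAT_SIZE


last_start, last_end = color_byte_range(len(vertices) - 1)
# The final color attribute ends exactly where the buffer ends.
print(last_end == len(buffer))
```

In this sketch there is no slack after the final attribute, which matches the symptom of only the last vertex's color being read incorrectly.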

I can also reproduce this in the equivalent java application, as long as the EGL context is created with the EGL_CONTEXT_OPENGL_ROBUST_ACCESS_BIT_KHR flag. So it's not due to anything our webgl code is inserting into the GL command stream (other than the fact we set the robust access flag)
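As a rough sketch of what toggling that flag means, the snippet below builds the attribute list that would be passed to eglCreateContext, with and without robust access. The constant values are the ones defined by EGL and EGL_KHR_create_context; the helper name `context_attribs` is hypothetical, and this is not the code from the Java test application or from Firefox.

```python
# Constants from EGL and the EGL_KHR_create_context extension.
EGL_NONE = 0x3038
EGL_CONTEXT_CLIENT_VERSION = 0x3098
EGL_CONTEXT_FLAGS_KHR = 0x30FC
EGL_CONTEXT_OPENGL_ROBUST_ACCESS_BIT_KHR = 0x0004


def context_attribs(client_version, robust_access):
    """Build an EGL_NONE-terminated attribute list for eglCreateContext.

    Hypothetical helper: only shows where the robust access bit goes.
    """
    attribs = [EGL_CONTEXT_CLIENT_VERSION, client_version]
    if robust_access:
        attribs += [EGL_CONTEXT_FLAGS_KHR,
                    EGL_CONTEXT_OPENGL_ROBUST_ACCESS_BIT_KHR]
    attribs.append(EGL_NONE)
    return attribs


# A context created from the first list would be expected to hit the bug on
# affected devices; one created from the second list would not.
with_robustness = context_attribs(3, robust_access=True)
without_robustness = context_attribs(3, robust_access=False)
```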

Chrome is not affected, and from a quick look it appears that they always allocate a larger buffer than the application requests, so they don't hit the mBufferWithUserSize code path in the ANGLE commit I mentioned in comment 8.

I've additionally filed an ANGLE bug here: https://issues.angleproject.org/issues/451733089

On devices with Samsung Xclipse GPUs running Android 15, we see broken
rendering due to attribute data residing at the end of a vertex buffer
not being read correctly. Ensuring we pad the size of vertex buffers
by an additional GL_MAX_VERTEX_ATTRIB_STRIDE bytes (i.e. definitely
enough space for one additional vertex) avoids the issue.
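The padding rule described in this commit message can be sketched as a small size calculation. This is an illustration only, not the landed patch: the function name and the `needs_padding` parameter are hypothetical, and 2048 is used as an example value for GL_MAX_VERTEX_ATTRIB_STRIDE rather than a value queried from a real device.

```python
def padded_buffer_size(requested_size, max_vertex_attrib_stride, needs_padding):
    """Return the allocation size for a vertex buffer.

    On affected devices the allocation is grown by one maximum stride, so
    that the last vertex's attributes no longer sit at the very end of the
    buffer. Other devices get exactly what the application asked for.
    """
    if not needs_padding:
        return requested_size
    return requested_size + max_vertex_attrib_stride


# Example: a 60-byte buffer with an assumed maximum stride of 2048 bytes.
affected = padded_buffer_size(60, 2048, needs_padding=True)
unaffected = padded_buffer_size(60, 2048, needs_padding=False)
```

Padding by the maximum stride is deliberately conservative: no single vertex can span more than that many bytes, so the extra space is always enough for one more vertex regardless of the layout the page uses.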

Assignee: nobody → jnicol
Status: NEW → ASSIGNED
Pushed by jnicol@mozilla.com: https://github.com/mozilla-firefox/firefox/commit/cc900f29e5c4 https://hg.mozilla.org/integration/autoland/rev/49539466967d Pad vertex buffers with space for an extra vertex on Xclipse GPUs. r=gfx-reviewers,nical
Pushed by sstanca@mozilla.com: https://github.com/mozilla-firefox/firefox/commit/db4f5b168b8a https://hg.mozilla.org/integration/autoland/rev/112a2823c71c Revert "Bug 1983036 - Pad vertex buffers with space for an extra vertex on Xclipse GPUs. r=gfx-reviewers,nical" as requested for landing it wrongly.

Reverted this as requested for landing it wrongly.

Flags: needinfo?(jnicol)
Pushed by jnicol@mozilla.com: https://github.com/mozilla-firefox/firefox/commit/71b4f75f6415 https://hg.mozilla.org/integration/autoland/rev/cd9fbbfa9e84 Pad vertex buffers with space for an extra vertex on Xclipse GPUs. r=gfx-reviewers,nical
Pushed by chorotan@mozilla.com: https://github.com/mozilla-firefox/firefox/commit/8fba39c9fd62 https://hg.mozilla.org/integration/autoland/rev/611863cc6b82 Revert "Bug 1983036 - Pad vertex buffers with space for an extra vertex on Xclipse GPUs. r=gfx-reviewers,nical" for causing bc failures on gl.rs

Backed out for causing bc failures on gl.rs

Oh whoops. The max stride variable isn't available in all versions of opengl, so querying it raises an error, which causes an assertion in webrender in debug builds.

I'll move querying the max vertex stride into the block which only executes on affected devices, as we know they support the variable.
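A minimal sketch of that change, using a stand-in GL object rather than the real webrender bindings. `FakeGl`, `get_integer`, and `vertex_buffer_padding` are hypothetical names; the point is only that the query is skipped entirely on devices that do not need the workaround, so an unsupported enum never raises a GL error there.

```python
GL_NO_ERROR = 0
GL_INVALID_ENUM = 0x0500
GL_MAX_VERTEX_ATTRIB_STRIDE = 0x82E5


class FakeGl:
    """Stand-in for a GL context: unknown enums record GL_INVALID_ENUM."""

    def __init__(self, supported):
        self.supported = supported  # maps enum -> integer value
        self.error = GL_NO_ERROR

    def get_integer(self, pname):
        if pname not in self.supported:
            self.error = GL_INVALID_ENUM
            return 0
        return self.supported[pname]


def vertex_buffer_padding(gl, is_affected_device):
    """Query the max stride only where the workaround is actually needed."""
    if not is_affected_device:
        return 0
    return gl.get_integer(GL_MAX_VERTEX_ATTRIB_STRIDE)


# An older GL version without the enum: no query is made, so no error.
old_gl = FakeGl({})
old_padding = vertex_buffer_padding(old_gl, is_affected_device=False)

# An affected device that supports the enum (2048 is an example value).
xclipse_gl = FakeGl({GL_MAX_VERTEX_ATTRIB_STRIDE: 2048})
xclipse_padding = vertex_buffer_padding(xclipse_gl, is_affected_device=True)
```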

Flags: needinfo?(jnicol)
Pushed by jnicol@mozilla.com: https://github.com/mozilla-firefox/firefox/commit/47a1818c6a30 https://hg.mozilla.org/integration/autoland/rev/883fdd0d9621 Pad vertex buffers with space for an extra vertex on Xclipse GPUs. r=gfx-reviewers,nical
Status: ASSIGNED → RESOLVED
Closed: 3 months ago
Resolution: --- → FIXED
Target Milestone: --- → 146 Branch
Blocks: 1994982

Comment on attachment 9520244 [details]
Bug 1983036 - Pad vertex buffers with space for an extra vertex on Xclipse GPUs. r?#gfx-reviewers

Beta/Release Uplift Approval Request

  • User impact if declined/Reason for urgency: Broken rendering on some webgl sites for users with Xclipse GPUs (many Samsung phones)
  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): Minor fix and isolated to specific devices.
  • String changes made/needed:
  • Is Android affected?: Yes
Attachment #9520244 - Flags: approval-mozilla-beta?
Attachment #9520244 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

It looks like the patch which was uplifted to beta was an old version of the phabricator revision (which also caused the same failures on autoland and was backed out as a result.) The bug is fixed in the latest version of the phabricator revision, can we re-land that one?

Flags: needinfo?(pascalc)
Flags: needinfo?(pascalc)