Closed Bug 1787520 Opened 3 months ago Closed 3 months ago

[Bug]: Possible Memory Corruption? (Google Pixel 6a = Mali-G78)

Categories

(Core :: Graphics: WebRender, defect)

ARM64
Android
defect

Tracking

()

VERIFIED FIXED
106 Branch
Tracking Status
firefox104 --- wontfix
firefox105 + verified
firefox106 + verified

People

(Reporter: kbrosnan, Assigned: jnicol)

References

(Blocks 1 open bug)

Details

(Keywords: correctness)

Attachments

(3 files)

From github: https://github.com/mozilla-mobile/fenix/issues/26661.

Steps to reproduce

  1. Open website https://misskey.io
  2. Log in and press some buttons, or just press the login button
  3. Get glitches

Expected behaviour

No glitches in website

Actual behaviour

Many glitches

https://user-images.githubusercontent.com/49326405/186905247-2dbe8242-45ab-42ed-a7d4-88224d5de7e5.mp4

Device name

Google Pixel 6a

Android version

Android 13

Firefox release type

Firefox

Firefox version

104.1.0

Device logs

No response

Additional information

This bug did not happen when I was using Android 12 on Pixel 6a. After my upgrade, this bug happened.
I've tried clearing cache, deleting user data, rebooting the device. None of it helped.

Change performed by the Move to Bugzilla add-on.

User posted a video of the problem https://user-images.githubusercontent.com/49326405/186905247-2dbe8242-45ab-42ed-a7d4-88224d5de7e5.mp4

The user noted that tapping on the login button reproduced the problem for them. I was not able to reproduce that behavior using a Pixel 6 Pro running Android 12. It is possible that this is an Android 13 only bug. Both devices use the same SoC and GPU.

Hi, I'm the original author of this issue on GitHub.
Tapping on the login button sometimes triggered this bug, but not as often as in the posted video.

Blocks: wr-mali
Hardware: Unspecified → ARM64
Summary: [Bug]: Possible Memory Corruption? → [Bug]: Possible Memory Corruption? (Google Pixel 6a = Mali-G78)

Seems like it's probably a bug in the Android 13 Mali driver. I've ordered a Pixel 6a to investigate when I return from PTO in a couple of weeks.

Since it's not super easy to reproduce and doesn't reproduce on every website, we can probably just live with it until then rather than switching affected users to swgl.

Flags: needinfo?(jnicol)

Assigning to Jamie for now, to look at after PTO.

Assignee: nobody → jnicol

Exact same issue on my Pixel 6a with GrapheneOS, tested on Misskey as well.

For further context, it seems to be an issue with the backdrop blur CSS effect and is not specific to Misskey; it also happens in Brave Search, for example.

(In reply to Jamie Nicol [:jnicol] PTO until 9/9/22 from comment #3)

> Seems like it's probably a bug in the Android 13 Mali driver. I've ordered a Pixel 6a to investigate when I return from PTO in a couple of weeks.
>
> Since it's not super easy to reproduce and doesn't reproduce on every website, we can probably just live with it until then rather than switching affected users to swgl.

Hi,
you should try swiping through this sample gallery:
https://xenforo.com/community/threads/media-gallery-2-2-lightbox-navigation-attachment-mirroring-and-more.183044/

My Pixel 6 Pro goes completely wild with flickering and colored artifacts in pixelated blocks, especially when swiping slightly up and down while in the gallery.

I had no issues with Chromium so far. Only with the fox.

Hello

I confirm having the same problem on my Pixel 6A running Android 13 and Firefox 104.1.10.

The bug can easily be reproduced (for me) by going to any Wikipedia article and browsing the pictures.

Here is a quick video demonstrating the issue. The corruption pattern is roughly the same as in the original author's video.

https://user-images.githubusercontent.com/19673370/189172033-45ca8800-f236-44ca-b188-f5c2680fa366.mp4

The visual glitches also appear on other websites, but I can consistently reproduce them on any Wikipedia article (with pictures) I tried.

For what it's worth, I don't remember having this issue with Android 12, but since I did not spend a long time with this phone on Android 12 I am not so sure.

I tested with various browsers (Samsung Browser, Opera, Brave, DuckDuckGo, Chrome) and there were no issues, but I had the issue with Firefox Nightly.

Do you think it could be a hardware problem?

Just unboxed my Pixel 6a, which is running Android 12 by default. Cannot reproduce, as expected. Attaching about:support for comparison purposes before I update the OS.

And here's the about:support on Android 13, where I can indeed reproduce.

The driver version is now v1.r36p0 rather than v1.r32p1. So I think we'll need to apply any workaround to r36 onwards. It's unclear at this point whether other Mali GPUs are affected or just the G78.
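
To illustrate the "r36 onwards" comparison, here is a minimal sketch of parsing and ordering Mali driver versions. It assumes the version appears as a whitespace-separated token of the form `v1.r36p0`, optionally followed by a build suffix; the type and function names, and the field name `major` for the leading `v1`, are hypothetical and not taken from the actual patch.

```rust
/// Parsed Mali driver version, e.g. "v1.r36p0" -> major 1, release 36, patch 0.
/// Field order matters: the derived `Ord` compares major, then release, then patch.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct MaliDriverVersion {
    major: u32,
    release: u32,
    patch: u32,
}

/// Returns the first token of the form `v<major>.r<release>p<patch>` found in
/// `version_string`, or `None` if there is no such token.
fn parse_mali_driver_version(version_string: &str) -> Option<MaliDriverVersion> {
    version_string.split_whitespace().find_map(parse_token)
}

/// Hypothetical helper: parses a single token such as "v1.r36p0-<suffix>".
fn parse_token(token: &str) -> Option<MaliDriverVersion> {
    let rest = token.strip_prefix('v')?;
    let (major, rest) = rest.split_once(".r")?;
    let (release, rest) = rest.split_once('p')?;
    // Assumption: anything after the patch digits is a build suffix and is ignored.
    let patch_digits: String = rest.chars().take_while(|c| c.is_ascii_digit()).collect();
    Some(MaliDriverVersion {
        major: major.parse().ok()?,
        release: release.parse().ok()?,
        patch: patch_digits.parse().ok()?,
    })
}
```

With this ordering, the Android 12 driver (v1.r32p1) compares as older than the Android 13 driver (v1.r36p0), so a check of the form `version >= v1.r36p0` selects only the newer one.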

This doesn't reproduce in RenderDoc, and I couldn't get AGI to work. Interestingly, setting up AGI also switched the device to use the ANGLE renderer, which does not reproduce the issue either. Uninstalling ANGLE made it reproduce again.

The upload method (any permutation of batched enabled or disabled, or PBOs enabled or disabled) makes no difference.

However, commenting out the invalidate_render_target() call in end_pass() makes the problem go away. My hunch is that if we invalidate a render target which is then reused in a later pass, its contents remain undefined rather than being correctly reinitialized in the later pass.

As this is purely an optimization, we can simply avoid it for now.
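
The idea of skipping the invalidation can be sketched as follows. This is not the WebRender code: `MockDevice`, `Renderer`, the `u32` target ids and the `supports_render_target_invalidate` flag are hypothetical stand-ins, used only to show that `end_pass()` can bail out before calling `invalidate_render_target()` without affecting anything else in the pass.

```rust
/// Hypothetical stand-in for the GPU device; it only records which targets
/// were invalidated so the behaviour can be observed.
#[derive(Default)]
struct MockDevice {
    invalidated: Vec<u32>,
}

impl MockDevice {
    fn invalidate_render_target(&mut self, target_id: u32) {
        self.invalidated.push(target_id);
    }
}

struct Renderer {
    device: MockDevice,
    /// Assumed flag: false on drivers where invalidation is suspected to
    /// leave reused render targets with undefined contents.
    supports_render_target_invalidate: bool,
}

impl Renderer {
    /// Called at the end of a render pass with the targets that are no longer needed.
    fn end_pass(&mut self, finished_targets: &[u32]) {
        if !self.supports_render_target_invalidate {
            // Invalidation is only a hint to the driver, so skipping it
            // costs performance at most, not correctness.
            return;
        }
        for &target_id in finished_targets {
            self.device.invalidate_render_target(target_id);
        }
    }
}
```

Constructing a `Renderer` with the flag set to `false` and calling `end_pass(&[1, 2])` leaves `device.invalidated` empty, while the same call with the flag set to `true` records both targets.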

Flags: needinfo?(jnicol)

On the Pixel 6 family devices we are seeing frequent image corruption
issues on some websites, which started following the Android 13
upgrade. This can be avoided by not invalidating the no-longer-needed
render targets at the end of each render pass. This is only an
optimization anyway, so is safe to skip.

The Android 13 update upgraded the Mali driver from version v1.r32p1
to v1.r36p0. As we did not encounter this bug prior to the Android 13
update, this patch only applies the workaround on driver versions
v1.r36p0 and above. It's possible other GPUs than the G78 are also
affected, but for now we limit the workaround to just this GPU. We can
re-evaluate if and when we receive bug reports from users on other
GPUs.
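
The gating described in this commit message could look roughly like the sketch below. The function name, the `(major, release, patch)` tuple representation, the substring match on the renderer name, and the choice to leave the optimization enabled when the driver version is unknown are all assumptions made for illustration; the landed patch may differ.

```rust
/// First driver version on which the corruption was observed: v1.r36p0,
/// written as (major, release, patch).
const FIRST_AFFECTED_DRIVER: (u32, u32, u32) = (1, 36, 0);

/// Returns true if render target invalidation should be skipped.
///
/// `renderer_name` is assumed to be the GL renderer string, and
/// `driver_version` the already-parsed driver version, if one was found.
fn should_skip_invalidate(renderer_name: &str, driver_version: Option<(u32, u32, u32)>) -> bool {
    // The workaround is limited to the Mali-G78 for now.
    let is_g78 = renderer_name.contains("Mali-G78");
    // Tuples compare lexicographically, so this means "v1.r36p0 and above".
    // Assumption: an unknown version does not trigger the workaround.
    let is_affected_driver = driver_version.map_or(false, |v| v >= FIRST_AFFECTED_DRIVER);
    is_g78 && is_affected_driver
}
```

Under these assumptions, a Mali-G78 on v1.r36p0 skips invalidation, while the same GPU on v1.r32p1, or a different Mali GPU on v1.r36p0, keeps the optimization.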

Pushed by jnicol@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/28680960f741
Avoid invalidating render targets on Mali-G78 devices. r=gfx-reviewers,jrmuizel

whoops, unused variable. relanding.

Flags: needinfo?(jnicol)
Pushed by jnicol@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/ce1c3f8552f7
Avoid invalidating render targets on Mali-G78 devices. r=gfx-reviewers,jrmuizel
Status: NEW → RESOLVED
Closed: 3 months ago
Resolution: --- → FIXED
Target Milestone: --- → 106 Branch

Since nightly and release are affected, beta will likely be affected too.
For more information, please visit auto_nag documentation.

This missed the RC cutoff for Fenix 105 already, unfortunately. That said, I'd like to get this in a dot release eventually (or RC respin should the need arise), so if we could try to verify the fix with tomorrow's Nightly build, that would be helpful.

The patch landed in nightly and beta is affected.
:jnicol, is this bug important enough to require an uplift?

  • If yes, please nominate the patch for beta approval.
  • If no, please set status-firefox105 to wontfix.

For more information, please visit auto_nag documentation.

Flags: needinfo?(jnicol)

Hi, are any of you able to install Firefox Nightly and see if things are working better now? We'd like to eventually get this fixed in a 105 point release and getting it verified on Nightly would be a big help for doing so.

Flags: needinfo?(robin.dpnt)
Flags: needinfo?(nonetrix)
Flags: needinfo?(bugzilla)

Between the time of my comment 1 and now I received the Android 13 update. Once I installed 13 I saw there were a lot of minor corruptions on animations and user profile photos on several sites. From a quick glance I am not seeing those corruptions on an up to date nightly.

Fixed for me on today's nightly. Assuming we want release uplift now rather than beta?

Flags: needinfo?(jnicol)

Comment on attachment 9294239 [details]
Bug 1787520 - Avoid invalidating render targets on Mali-G78 devices. r?#gfx-reviewers

Beta/Release Uplift Approval Request

  • User impact if declined: Frequent image corruption for Pixel 6 (pro/a) users
  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: Yes
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): Avoids calling a function that is purely an optimization, which avoids a driver bug
  • String changes made/needed:
  • Is Android affected?: Yes
Attachment #9294239 - Flags: approval-mozilla-release?
Flags: needinfo?(robin.dpnt)
Flags: needinfo?(nonetrix)
Flags: needinfo?(bugzilla)

Comment on attachment 9294239 [details]
Bug 1787520 - Avoid invalidating render targets on Mali-G78 devices. r?#gfx-reviewers

Per discussion with Product, we're going to respin the 105 RC builds to take this fix given the number of devices potentially affected.

Attachment #9294239 - Flags: approval-mozilla-release? → approval-mozilla-release+
Flags: qe-verify+

We tested the issue on Fenix RC 105.1.0, Nightly 106.0a1 (2022-09-15) and Focus RC 105.1.0, Nightly 106.0a1 (Build #362590509) on several Pixel 6 (Android 13) devices and the issue no longer occurs. We didn't find any glitches when browsing the https://misskey.io/ website.

Status: RESOLVED → VERIFIED
Flags: qe-verify+
See Also: → 1795614