Closed Bug 1941154 Opened 1 year ago Closed 7 months ago

Problems with UI and rendering websites with v134

Categories

(Core :: Graphics, defect, P3)

Firefox 134
All
Android
defect

RESOLVED FIXED
143 Branch
Tracking Status
firefox-esr128 --- unaffected
firefox-esr140 --- wontfix
firefox141 --- wontfix
firefox142 --- fixed
firefox143 --- fixed

People

(Reporter: syphyr, Assigned: jnicol)

References

(Regression)

Details

(Keywords: regression)

Attachments

(5 files)

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:134.0) Gecko/20100101 Firefox/134.0

Steps to reproduce:

Visit kernel.org; the website is not rendered properly (image attached). This happens with all current versions of Firefox on Android (stable, beta, nightly, and Focus). My Android device is a Samsung T813 tablet running Android Nougat (7.1). In addition, the avatars on github.com are not rendered properly: they look torn and flash when scrolling. The settings within some extensions, such as FoxyProxy, show artifacts with other versions of Firefox, and about:config now shows UI artifacts as well. None of these issues occurred with version 133.0.3. They all started after updating to v134, with all versions of Firefox.

Actual results:

The website looks pixelated and the background colors of the images are pixelated.

Expected results:

The websites should be rendered properly.

The user agent above is from my laptop, not my Android device. The problem does not happen on Linux, only on Android.

This problem may be specific to Android Nougat. This is my video card from log:

01-11 22:23:27.102 4288 4378 W ResourceType: Too many attribute references, stopped at: 0x01010099
01-11 22:23:27.122 4288 4378 I Adreno : QUALCOMM build : 422eead, I8e707a60c0
01-11 22:23:27.122 4288 4378 I Adreno : Build Date : 11/23/17
01-11 22:23:27.122 4288 4378 I Adreno : OpenGL ES Shader Compiler Version: XE031.09.00.04
01-11 22:23:27.122 4288 4378 I Adreno : Local Branch : mybranch29262620
01-11 22:23:27.122 4288 4378 I Adreno : Remote Branch : quic/LA.BR.1.3.6.c2_rb2.28
01-11 22:23:27.122 4288 4378 I Adreno : Remote Branch : NONE
01-11 22:23:27.122 4288 4378 I Adreno : Reconstruct Branch : NOTHING

01-12 00:41:26.315 11034 11386 W ActivityThread: ClassLoader.getResources: The class loader returned by Thread.getContextClassLoader() may fail for processes that host multiple applications. You should explicitly specify a context class loader. For example: Thread.setContextClassLoader(getClass().getClassLoader());
01-12 00:41:26.700 11034 11034 I GeckoSession: handleMessage GeckoView:PageStop uri=null
01-12 00:41:26.710 11034 11128 I SessionStorage/AutoSave: Save: Load finished
01-12 00:41:32.775 11034 11092 I SessionPrioritizationMiddleware: Update the tab 98b8105f-5ea7-408e-9166-3bec1bce2647 priority to DEFAULT
01-12 00:41:32.781 11034 11128 I SessionStorage/AutoSave: Save: New tab selected
01-12 00:41:32.941 11034 11034 W GeckoViewActivityContextDelegate: Activity context is null.
01-12 00:41:34.554 11034 11201 W OpenGLRenderer: Points are too far apart 4.000000
01-12 00:41:34.555 11034 11201 W OpenGLRenderer: Points are too far apart 4.005340
01-12 00:41:34.555 11034 11201 W OpenGLRenderer: Points are too far apart 4.002563
01-12 00:41:34.555 11034 11201 W OpenGLRenderer: Points are too far apart 4.000000
01-12 00:41:34.555 11034 11201 W OpenGLRenderer: Points are too far apart 4.000000
01-12 00:41:34.557 11034 11201 W OpenGLRenderer: Points are too far apart 4.000000
01-12 00:41:36.405 11034 11034 I MemoryController: onTrimMemory(20)
01-12 00:41:36.414 11034 11034 I FenixApplication: onTrimMemory(), level=20, main=true
01-12 00:41:36.514 11034 11034 I ContileTopSitesUpdater: Stopped periodic work to update Contile top sites
01-12 00:41:36.548 11034 11034 I service-pocket: Stopped periodic work to refresh content recommendations
01-12 00:41:36.742 11034 11034 I SessionStorage/AutoSave: Save: Background
01-12 00:41:36.799 865 889 D ConnectivityService: requestNetwork for uid/pid:10074/11034 NetworkRequest [ TRACK_DEFAULT id=63, [ Capabilities: INTERNET&NOT_RESTRICTED&TRUSTED&NOT_VPN] ]
01-12 00:41:36.821 11034 11145 I WM-WorkerWrapper: Worker result SUCCESS for Work [ id=8f3adb38-ee62-41ef-a8aa-286c49c2500b, tags={ mozilla.telemetry.glean.scheduler.PingUploadWorker,mozac_service_glean_ping_upload_worker } ]

I used logcat and opened kernel.org. Now I see a few lines from OpenGLRenderer.

It seems that " W OpenGLRenderer: Points are too far apart" is related to the scaling of SVG files.

I just checked, and Tor Browser also shows the "Points are too far apart" warning but does not have any rendering problems. So I guess that warning is not the cause.

The other suspect in the logcat is this:

01-12 00:41:20.950 11034 11102 D skia : --- SkAndroidCodec::NewFromStream returned null
01-12 00:41:20.950 11034 11126 D skia : --- SkAndroidCodec::NewFromStream returned null

Did the version of skia change in v134?

The version of Firefox that works does not show any debug warnings from skia.
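When comparing a working build against a broken one, it can help to reduce a logcat capture to the few tags under suspicion. The following is a minimal sketch, assuming the logcat layout seen in the excerpts above (date, time, PID, TID, level, tag, message). The helper name `filter_logcat` and the default tag set are illustrative and not part of any Android or Firefox tooling.

```python
import re

# Matches the logcat layout used in the excerpts above:
#   MM-DD HH:MM:SS.mmm  PID  TID  LEVEL  TAG: message
LOGCAT_LINE = re.compile(
    r"^(?P<date>\d\d-\d\d) (?P<time>\d\d:\d\d:\d\d\.\d{3})\s+"
    r"(?P<pid>\d+)\s+(?P<tid>\d+)\s+(?P<level>[VDIWEF])\s+"
    r"(?P<tag>.*?)\s*: (?P<msg>.*)$"
)


def filter_logcat(lines, tags=("skia", "OpenGLRenderer", "Adreno")):
    """Return (level, tag, message) tuples for lines whose tag is in `tags`."""
    hits = []
    for line in lines:
        match = LOGCAT_LINE.match(line.strip())
        if match and match.group("tag") in tags:
            hits.append(
                (match.group("level"), match.group("tag"), match.group("msg"))
            )
    return hits


# Sample lines copied from the log excerpts in this report.
sample = [
    "01-12 00:41:20.950 11034 11102 D skia : --- SkAndroidCodec::NewFromStream returned null",
    "01-12 00:41:34.554 11034 11201 W OpenGLRenderer: Points are too far apart 4.000000",
    "01-12 00:41:36.405 11034 11034 I MemoryController: onTrimMemory(20)",
]

for level, tag, msg in filter_logcat(sample):
    print(level, tag, msg)
```

Running the same filter over captures from a good and a bad build makes it easier to see which messages appear only in the bad one. As the rest of this thread shows, a message that appears only in the bad build is still not proof of the cause.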

If you encounter a situation where SkAndroidCodec::NewFromStream returns null, it typically indicates an issue with the input stream or the data contained within it. This method is used to create a codec from a stream of image data, and returning null suggests it was unable to successfully decode the image.

And my problem is that the images look like they are not decoded all the way... this really seems like the problem.

https://codingtechroom.com/question/troubleshooting-skandroidcodec-newfromstream-null-returns-in-android-development

Since this issue with skia has crept into all the flavors of Firefox on Android (stable, beta, nightly, and focus), should I create separate bug reports for all of those? Or, is this bug report enough to get all of them fixed?

Moving to Fenix product since the first post mentions all the versions and it would have a greater impact if it is reproducible on Fenix also.
Also adding QA to try to confirm/reproduce the bug.

Flags: qe-verify+
Product: Focus → Fenix
Attached image 1941154.png

Thanks, Mihai!

I've tested on the only devices with Android 7 I have available, and I was not able to reproduce this.
Tested on Firefox for Android 134.0.1, and Nightly 136.0a1, with a Sony Xperia Z5 Premium (Android 7.1.1), and on Huawei MediaPad M3 Lite (Android 7.0) tablet.
I've also tested on Focus 134.0.1, and I was not able to reproduce the issue.

Flags: qe-verify+

The Sony Xperia Z5 has an Adreno 430 GPU. The MediaPad M3 Lite has an Adreno 505 GPU. Everyone I have seen report this bug so far has a device with an Adreno 510 GPU. My Samsung T813 also has an Adreno 510 GPU. This bug should be reproducible on any variant of the Samsung Galaxy S2 Tablet.

I am also curious if those skia debug messages about the image decoding errors were shown when testing with these other devices, like this:
D skia : --- SkAndroidCodec::NewFromStream returned null

Would it be possible to also test with an Android Emulator?

Looking at the skia bugs, I see this one that could be related to this issue:
https://issues.skia.org/issues/361309711

The best suggestion I can think of to fix this image decoding problem with this specific Adreno GPU is to go back to the version of skia used in Firefox 133.0.3, which worked, and then backport the needed CVE fixes into that version. The recently fixed skia CVEs are very easy to backport when looking at Google's CVE bulletin. I've looked at the fixes for the latest skia CVEs and they are very small.

These are the skia CVE fixes that Google just landed in the android-security-12.1.0_r3 and android-security-12.1.0_r4 tags:

94d684dd-RESTRICT-AUTOMERGE-Avoid-potential-overflow-when-allocating-3D-mask-from-emboss-filter
fa512f87-pdf-Bounds-check-in-skiaallocfunc.patch
18bcdb2f-RESTRICT-AUTOMERGE-Check-for-size-overflow-before-allocating-SkMask-data.patch
ad726e15-Prevent-overflow-when-growing-an-SkRegions-RunArray.patch

Component: General → Browser Engine

Moving this to Browser Engine to be triaged

Component: Browser Engine → Graphics
Product: Fenix → Core
Duplicate of this bug: 1941647

The severity field is not set for this bug.
:bhood, could you have a look please?

For more information, please visit BugBot documentation.

Flags: needinfo?(bhood)

Is there any progress on this? This bug still exists in all versions of firefox on Android.

I also wanted to mention that this bug causes all instances of "gravatar" avatars to be distorted looking.

As a workaround, disabling hardware acceleration fixes the problem.

layers.acceleration.disabled = true

An even better workaround is this:

gfx.webrender.software = true

I'm not sure whether Skia is rejecting Android 7 or this GPU driver in particular (I'm not familiar with the Skia code), but we should handle fallback to software better if Skia doesn't support this device.

NeedInfo - Do we have a device like this for testing? Any thoughts on what we should do here?

Severity: -- → S2
Flags: needinfo?(bhood) → needinfo?(jnicol)
Priority: -- → P3

I don't think this has anything to do with Skia. I could be wrong, but it would surprise me, as we don't use Skia for rendering much web content.

Looking at the kernel.org screenshot, it looks like rendering the mask for box shadows hasn't worked, and we are then using a garbage mask to render the box shadow itself. You can see this on the yellow latest release box, but also around the edges of all the white boxes.

Syphyr, first of all, apologies that this has taken so long to look at. It would be really helpful if you could run mozregression. This is a tool which downloads and installs a series of versions of Firefox. For each one you answer whether it is good or bad, and eventually it narrows down which change to the code caused the bug.

You need to have adb debugging set up, which I presume you do since you have been looking at the logcat, and plug your tablet/phone into the computer. If your computer runs Windows then installing the GUI version is easiest; on Linux or macOS I prefer to use the command line. You can install the command line version with:

pip install --user mozregression

Then run it like so:

mozregression --good 133 --bad 134 --app gve

In the GUI version you need to click the scissor icon to start a new bisection, then click through several settings screens. Make sure you select "GVE" or "geckoview_example" as the application, and enter 133 and 134 as the good and bad versions respectively. Eventually it should provide you with a link to a list of changes. If you can post that here, that would be great!

Let me know if you need any help with that!

Flags: needinfo?(jnicol) → needinfo?(syphyr)

I appreciate the suggestion regarding mozregression. After bisecting, it seems this issue is actually not related to skia.

24:45.36 INFO: Last good revision: 0dcf81adac3321229840276141739058f97c72af
24:45.36 INFO: First bad revision: 5a011f959bd0866ccd9b47c75a4a021069c6f1e5
24:45.36 INFO: Pushlog:
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=0dcf81adac3321229840276141739058f97c72af&tochange=5a011f959bd0866ccd9b47c75a4a021069c6f1e5

This issue seems to be caused by one of the following changesets:

https://hg-edge.mozilla.org/integration/autoland/rev/5a011f959bd0866ccd9b47c75a4a021069c6f1e5
https://hg-edge.mozilla.org/integration/autoland/rev/c1a4592d3826fd5c2505bc42a7147a2ca3f30f20
https://hg-edge.mozilla.org/integration/autoland/rev/e167efc91f97de3fab75fc373b398702e822874b
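The pushlog link in this comment is derived from the two revisions that mozregression printed. A small sketch of that step, assuming the output lines and the autoland URL pattern shown above; `parse_bisection` and `pushlog_url` are illustrative helpers, not part of mozregression itself.

```python
import re
from urllib.parse import urlencode

# URL pattern taken from the pushlog link in this comment.
AUTOLAND_PUSHLOG = "https://hg.mozilla.org/integration/autoland/pushloghtml"


def parse_bisection(log_text):
    """Extract (last_good, first_bad) 40-character revisions from the log."""
    good = re.search(r"Last good revision: ([0-9a-f]{40})", log_text)
    bad = re.search(r"First bad revision: ([0-9a-f]{40})", log_text)
    if not (good and bad):
        raise ValueError("bisection result not found in log text")
    return good.group(1), bad.group(1)


def pushlog_url(last_good, first_bad, base=AUTOLAND_PUSHLOG):
    """Build the pushlog URL covering the range between the two revisions."""
    query = urlencode({"fromchange": last_good, "tochange": first_bad})
    return f"{base}?{query}"


# Output lines copied from the mozregression run above.
log = """\
24:45.36 INFO: Last good revision: 0dcf81adac3321229840276141739058f97c72af
24:45.36 INFO: First bad revision: 5a011f959bd0866ccd9b47c75a4a021069c6f1e5
"""

print(pushlog_url(*parse_bisection(log)))
```

The printed URL is the same as the pushlog link above, which lists the candidate changesets between the last good and first bad builds.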

Flags: needinfo?(syphyr)

That's great, thank you. This does indeed seem to be a problem with the alpha render targets used for rendering the box shadows, like I guessed above. Bug 1924736 makes sense: it was a refactoring of the logic for clearing render targets, but it must have accidentally introduced a behavioural change. We've seen issues before where clearing of render targets has to be done in a very specific way, otherwise we run into driver bugs such as this.

Could you also please attach your about:support information from the affected devices? Or, if it's easier, just comment: specifically I'm interested in the GPU and driver version for each one. If you search for "driver version" you should find it. Thanks. And which device did you use to run mozregression?

Flags: needinfo?(syphyr)
Keywords: regression
Regressed by: 1924736
Attached file support1.txt

Attached is about:support for my affected device (T813). Thanks.

Flags: needinfo?(syphyr)

I ran the command line version of mozregression on linux connected to my Samsung Galaxy S2 (T813).

Description: Model: SM-T813, Product: gts210vewifixx, Manufacturer: samsung, Hardware: qcom, OpenGL: Qualcomm -- Adreno (TM) 510 -- OpenGL ES 3.2 V@145.0 (GIT@I8e707a60c0)
Vendor ID: Qualcomm
Device ID: Adreno (TM) 510
Driver Version: OpenGL ES 3.2 V@145.0 (GIT@I8e707a60c0)

:nical, since you are the author of the regressor, bug 1924736, could you take a look?

Flags: needinfo?(nical.bugzilla)

Recent refactoring of render target initialization caused stale
contents of alpha render targets to be used as clip masks instead of
valid masks on Adreno 510 devices. Presumably we were fortunate not to
be hitting this previously, and it was regressed by a subtle
accidental behavioural change introduced by the refactor. Using quads
to clear alpha targets appears to be the most robust way of avoiding
it.

Assignee: nobody → jnicol
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true

That's great, thank you. I have a device (Redmi Note 3) on which I can reproduce the kernel.org corruption. However, for me it was regressed by a slightly different change (bug 1922323). This touches similar code, so presumably the underlying bug is the same.

We can avoid this issue by using draw calls rather than scissored glClear to clear alpha render targets.

Both your device and mine have an Adreno 510, but mine has driver version V@251 and yours has V@145. I'm going to apply the workaround to all Adreno 510 devices regardless of driver version.
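The gating decision described here (match on GPU model, ignore driver version) can be illustrated as follows. This is not the landed patch, which is in the Phabricator revision linked from this bug. It is only a sketch of the decision, assuming renderer strings in the format reported by about:support above. The names `parse_renderer` and `needs_quad_clear` are hypothetical, and the V@251 string is an assumed format for the second device.

```python
import re

ADRENO_MODEL = re.compile(r"Adreno \(TM\) (\d+)")
DRIVER_VERSION = re.compile(r"V@(\d+(?:\.\d+)?)")


def parse_renderer(renderer):
    """Return (adreno_model, driver_version); either may be None if absent."""
    model = ADRENO_MODEL.search(renderer)
    driver = DRIVER_VERSION.search(renderer)
    return (
        int(model.group(1)) if model else None,
        driver.group(1) if driver else None,
    )


def needs_quad_clear(renderer):
    """True if alpha targets should be cleared with draw calls (hypothetical)."""
    model, _driver = parse_renderer(renderer)
    # The driver version is deliberately ignored: both V@145 and V@251
    # were reported as affected in this bug.
    return model == 510


# Renderer string from the reporter's about:support output.
reporter = "Qualcomm -- Adreno (TM) 510 -- OpenGL ES 3.2 V@145.0 (GIT@I8e707a60c0)"
# Assumed format for the second affected device (driver V@251).
second_device = "Qualcomm -- Adreno (TM) 510 -- OpenGL ES 3.2 V@251.0"

print(parse_renderer(reporter), needs_quad_clear(reporter))
print(parse_renderer(second_device), needs_quad_clear(second_device))
```

Keying on the model alone trades precision for robustness: a driver release that no longer has the problem would still take the draw-call path, which is described in this bug as a rendering path known to work on other devices.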

Set release status flags based on info from the regressing bug 1924736

Status: ASSIGNED → RESOLVED
Closed: 7 months ago
Resolution: --- → FIXED
Target Milestone: --- → 143 Branch
Flags: needinfo?(nical.bugzilla)

The patch landed in nightly and beta is affected.
:jnicol, is this bug important enough to require an uplift?

Flags: needinfo?(jnicol)

Recent refactoring of render target initialization caused stale
contents of alpha render targets to be used as clip masks instead of
valid masks on Adreno 510 devices. Presumably we were fortunate not to
be hitting this previously, and it was regressed by a subtle
accidental behavioural change introduced by the refactor. Using quads
to clear alpha targets appears to be the most robust way of avoiding
it.

Original Revision: https://phabricator.services.mozilla.com/D260105

Attachment #9505732 - Flags: approval-mozilla-beta?

firefox-beta Uplift Approval Request

  • User impact if declined: Broken rendering on some pages for users with Adreno 510 GPUs (quite old, so a fairly low population, I would guess)
  • Code covered by automated testing: yes
  • Fix verified in Nightly: yes
  • Needs manual QE test: no
  • Steps to reproduce for manual QE testing: N/A
  • Risk associated with taking this patch: Low
  • Explanation of risk level: Isolated fix, only affects behaviour for users who may be affected by the original bug. Switches to a rendering path we know works on other devices, but there could be driver bugs on these GPUs that we aren't aware of
  • String changes made/needed: N/A
  • Is Android affected?: yes
Attachment #9505732 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

I have just confirmed that this issue is fixed with latest Firefox Nightly. Everything is working now. I really appreciate everyone involved that helped get this fixed.

I noticed that esr140 is flagged as "wontfix" and I was wondering if this branch could also include this fix so that this bug is not carried over to the future versions of Tor Browser.

Thanks everyone.

Flags: needinfo?(jnicol)