Closed Bug 1816299 Opened 1 year ago Closed 1 year ago

No response when displaying very long lines (with a lot of font fallback within the element)

Categories

(Core :: Graphics: Text, defect, P3)

Firefox 109
defect

Tracking

()

VERIFIED FIXED
112 Branch
Tracking Status
firefox-esr102 --- wontfix
firefox110 --- wontfix
firefox111 --- wontfix
firefox112 --- verified

People

(Reporter: bczhc0, Assigned: jfkthame)

References

(Regression)

Details

(Keywords: regression)

Attachments

(3 files)

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/109.0

Steps to reproduce:

  1. Go to the test page: https://gist.github.com/bczhc/5dc726c786febb6555df846714730b53/4e66adfb5d96be3233b21963de2bc4f7b5505b93

  2. Scroll to the bottom of the page

Actual results:

The page stops responding and shows a blank white screen.

Expected results:

The page displays the very long line correctly.

I tried many times, and the console log can vary.
Console log case 1:

ATTENTION: default value of option mesa_glthread overridden by environment.
[unhandlable oom] Failed to mmap, likely no more mappings available /build/firefox/src/firefox-109.0/memory/build/mozjemalloc.cpp : 1504ExceptionHandler::GenerateDump cloned child 2028190
ExceptionHandler::SendContinueSignalToChild sent continue signal to child
ExceptionHandler::WaitForContinueSignal waiting for continue signal...

In this case Firefox just hangs. It does not crash; it becomes completely unresponsive, and I have to manually send it a SIGINT to kill it.

Console log case 2:

ATTENTION: default value of option mesa_glthread overridden by environment.
[unhandlable oom] Failed to mmap, likely no more mappings available /build/firefox/src/firefox-109.0/memory/build/mozjemalloc.cpp : 1504Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.

In this case it crashes immediately, but no crash report is recorded. (about:crashes has no report)

Console log case 3:

ATTENTION: default value of option mesa_glthread overridden by environment.
ATTENTION: default value of option mesa_glthread overridden by environment.
ATTENTION: default value of option mesa_glthread overridden by environment.
ATTENTION: default value of option mesa_glthread overridden by environment.
out of memory: 0x0000000000080000 bytes requested
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.

It crashes, and no crash report is recorded.

Firefox on Android doesn't have this problem.

The Bugbug bot thinks this bug should belong to the 'Core::Panning and Zooming' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.

Component: Untriaged → Panning and Zooming
Product: Firefox → Core

Can repro hang/stutter + 1.8GB RAM use on Winx64 : https://share.firefox.dev/3YrT4O4
Edit : With IPC and allocations : https://share.firefox.dev/3HP4VPg
The page makes my mouse stutter and causes heavy GPU use.

Status: UNCONFIRMED → NEW
Component: Panning and Zooming → Graphics: Text
Ever confirmed: true
Flags: needinfo?(jfkthame)

2023-02-12T22:50:48.543000: DEBUG : Found commit message:
Bug 1757647 - Implement Windows 11 overlay scrollbars. r=cmartin

Put it behind a pref for nightly and early beta for now.

Differential Revision: https://phabricator.services.mozilla.com/D139987

2023-02-12T22:50:48.543000: DEBUG : Did not find a branch, checking all integration branches
2023-02-12T22:50:48.543000: INFO : The bisection is done.
2023-02-12T22:50:48.543000: INFO : Stopped

Flags: needinfo?(jfkthame) → needinfo?(emilio)
Regressed by: 1757647
Attached file about:support

I can reproduce slower rendering with overlay scrollbars on Linux as well, but bug 1757647 can't be the "real" regressor here, if there's one... On macOS the bug should reproduce much earlier than that (same on Linux or Windows with ui.useOverlayScrollbars=1).

It seems WRRenderBackend is having a hard time under webrender::texture_pack::guillotine::GuillotineAllocator::find_index_of_best_rect. Nical, Glenn, do you know if there's something to optimize there, or if there is something Gecko should do to avoid triggering that?

Flags: needinfo?(nical.bugzilla)
Flags: needinfo?(gwatson)
Flags: needinfo?(emilio)

Seems like the underlying issue might be the same as the GPU process part of bug 1800596.

See Also: → 1800596
Flags: needinfo?(gwatson)

This is slightly different from bug 1800596 because on that one there's only one textrun, but in this one we get a ton of different ones with tons of calls to TextDrawTarget::FillGlyphs. I think TextDrawTarget::FillGlyphs could trim glyphs (or glyph runs) that we know are fully outside of the current clip rect.

However, the glyph buffer is only a pair of point + glyph index. Lee, Jonathan, do you know how easy it would be to compute a bounding box for that, or whether there's one already available? Can the caller deal with this easily, or how do other draw targets optimize this? Presumably they just try to draw and fail?

Flags: needinfo?(lsalzman)
Flags: needinfo?(jfkthame)

ni?ing myself to try to take a closer look tomorrow.

Flags: needinfo?(emilio)

(In reply to Emilio Cobos Álvarez (:emilio) from comment #10)

This is slightly different from bug 1800596 because on that one there's only one textrun, but in this one we get a ton of different ones with tons of calls to TextDrawTarget::FillGlyphs.

(Terminology check: I'm curious (but haven't looked directly, yet).... for the long line of emoji in the example, do we actually have a huge number of separate text runs (gfxTextRun objects), or do we have a single gfxTextRun containing a huge number of separate glyph runs (because it constantly flips between text and emoji fonts)? I'd expect it to be the latter case: one gfxTextRun with a large array of GlyphRuns.)

We don't have bounding boxes for each of the glyph runs on hand, sadly.

gfxTextRun can return the bounding box of a range of glyphs, but doing so isn't free: it has to iterate over the range and accumulate the glyph bounds. Still, that's cheaper than pushing them all to webrender if they're just going to be clipped.

So we can improve the performance of this example by short-circuiting any glyph runs that will be entirely clipped, e.g. when gfxTextRun::Draw is looping over the glyph runs. E.g. on my MacBook, I'm seeing around 4fps when horizontally scrolling the long line of emoji in the example; but with the strawman patch[1] at https://treeherder.mozilla.org/jobs?repo=try&revision=5be7f2b24b0da6c0515f1bceaba1a282c34ef8fa, it achieves 60fps with only occasional stutter.

My concern, though, is that doing something like this will regress painting performance for cases where (most of) the text isn't outside the clip, in which case we'll still end up having to paint it all and we've just done this extra measurement work for no gain.

[1] Which - unsurprisingly - currently fails some reftests.
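
The short-circuiting idea described in this comment can be illustrated with a minimal C++ sketch. Everything here is a hypothetical stand-in (Rect, Glyph, GlyphRun, MeasureRun, DrawVisibleRuns are invented for illustration and are not Gecko types or the landed patch). It assumes each glyph's ink bounds are already known in the same coordinate space as the clip, and it only counts the runs that would be forwarded rather than drawing anything.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration only; not Gecko types.
struct Rect {
  float x, y, width, height;
  // True when the two rects share some area (edges touching does not count).
  bool Intersects(const Rect& o) const {
    return x < o.x + o.width && o.x < x + width &&
           y < o.y + o.height && o.y < y + height;
  }
};

struct Glyph {
  Rect inkBounds;  // area the glyph actually paints, in clip coordinates
};

struct GlyphRun {
  std::vector<Glyph> glyphs;  // consecutive glyphs that share one font
};

// Accumulates the union of the ink bounds of every glyph in the run.
// This is the measurement step that "isn't free": it is linear in the
// number of glyphs. Returns false for an empty run.
bool MeasureRun(const GlyphRun& run, Rect* out) {
  if (run.glyphs.empty()) return false;
  const Rect& first = run.glyphs[0].inkBounds;
  float left = first.x, top = first.y;
  float right = first.x + first.width, bottom = first.y + first.height;
  for (const Glyph& g : run.glyphs) {
    left = std::min(left, g.inkBounds.x);
    top = std::min(top, g.inkBounds.y);
    right = std::max(right, g.inkBounds.x + g.inkBounds.width);
    bottom = std::max(bottom, g.inkBounds.y + g.inkBounds.height);
  }
  *out = Rect{left, top, right - left, bottom - top};
  return true;
}

// Skips runs whose bounds lie entirely outside the clip and returns how
// many runs would be forwarded to the (hypothetical) draw target.
std::size_t DrawVisibleRuns(const std::vector<GlyphRun>& runs,
                            const Rect& clip) {
  std::size_t drawn = 0;
  for (const GlyphRun& run : runs) {
    Rect bounds;
    if (!MeasureRun(run, &bounds) || !bounds.Intersects(clip)) {
      continue;  // entirely clipped: nothing is sent downstream
    }
    ++drawn;  // a real implementation would hand the run over here
  }
  return drawn;
}
```

The sketch also makes the stated concern visible: when most runs do intersect the clip, MeasureRun still walks every glyph, so the measurement is pure overhead in that case.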

Flags: needinfo?(jfkthame)

Ah, interesting, I came up with something a bit similar, but computing the bounds as we fill in the GlyphBuffer, which might have lower performance impact perhaps?

https://treeherder.mozilla.org/jobs?repo=try&revision=eeb83f1610466691ceff97d041de3ba51488d221 is the current state of what I came up with, wdyt? (Assuming I haven't goofed any ltr/vertical-text boolean I think it should work)

Flags: needinfo?(emilio) → needinfo?(jfkthame)

It looks like you're computing bounds based on the glyph positions/advances and the font metrics; I think that won't necessarily work, because the "ink" of the glyphs may project beyond the origin & advance (in the inline direction) and/or beyond the font's ascent/descent metrics (in the block direction). So while much of the time those bounds will include everything that needs painting, that won't always hold true.
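
To make the overhang point concrete, here is a small C++ sketch along the inline axis only. The GlyphMetrics and Span types and the numbers in the usage note are invented for illustration; they are not taken from Gecko or from either try push. It compares a span derived from origin and advance with a span derived from the glyph's ink extents.

```cpp
// Hypothetical one-dimensional model (inline direction only).
struct Span {
  float start, end;
  // True when the two spans share some length.
  bool Overlaps(const Span& o) const {
    return start < o.end && o.start < end;
  }
};

struct GlyphMetrics {
  float origin;    // pen position of the glyph
  float advance;   // distance to the next glyph's origin
  float inkLeft;   // left ink edge relative to origin (may be negative)
  float inkRight;  // right ink edge relative to origin (may exceed advance)
};

// Span implied by position and advance alone.
Span AdvanceSpan(const GlyphMetrics& g) {
  return {g.origin, g.origin + g.advance};
}

// Span actually covered by the glyph's ink.
Span InkSpan(const GlyphMetrics& g) {
  return {g.origin + g.inkLeft, g.origin + g.inkRight};
}
```

For example, with a glyph at origin 100, advance 10, and ink extending from -3 to 14 relative to the origin, a clip covering 111 to 200 does not overlap the advance span (100 to 110) but does overlap the ink span (97 to 114). Culling on the advance span would drop a glyph that still has visible ink.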

Flags: needinfo?(jfkthame)
Severity: -- → S2
Priority: -- → P2
Flags: needinfo?(lsalzman)
Severity: S2 → S3
Priority: P2 → P3
Assignee: nobody → jfkthame
Status: NEW → ASSIGNED
See Also: → 1772994
Pushed by jkew@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/8ee1be14c009
Try to avoid sending entirely-clipped glyph runs to a TextDrawTarget. r=gfx-reviewers,lsalzman
Status: ASSIGNED → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → 112 Branch

Updating summary to more accurately indicate the case that's being handled here. A huge line in a single uniform font (as for example in bug 1800596) will still be an issue.

Summary: No response when displaying very long lines → No response when displaying very long lines (with a lot of font fallback within the element)
Flags: needinfo?(nical.bugzilla)

Set release status flags based on info from the regressing bug 1757647

Ehh I just tried Firefox nightly at rev e027953e2470621b104e21045bec0a568c2e93d7, the horizontal scrolling is smooth now, but when I select the text, it hangs and crashes again.

(In reply to bczhc0 from comment #21)

Ehh I just tried Firefox nightly at rev e027953e2470621b104e21045bec0a568c2e93d7, the horizontal scrolling is smooth now, but when I select the text, it hangs and crashes again.

Can you provide a link to a crash report for this? (See about:crashes; if there are multiple reports listed, you should be able to tell which is the relevant one by its date/time.)

Flags: needinfo?(bczhc0)

(In reply to Jonathan Kew [:jfkthame] from comment #22)

(In reply to bczhc0 from comment #21)

Ehh I just tried Firefox nightly at rev e027953e2470621b104e21045bec0a568c2e93d7, the horizontal scrolling is smooth now, but when I select the text, it hangs and crashes again.

Can you provide a link to a crash report for this? (See about:crashes; if there are multiple reports listed, you should be able to tell which is the relevant one by its date/time.)

Hello! I opened the reduced test-case and started to select the text, then Firefox hung. It behaves the same as I mentioned in comment (case 1 and 2), and about:crashes is empty.

Flags: needinfo?(bczhc0)

Firefox nightly rev e027953e24 on Windows doesn't encounter the problem (comment#21); text selection is OK. But on Linux X11 it's still buggy. Maybe someone can confirm this on Linux?

The patch landed in nightly and beta is affected.
:jfkthame, is this bug important enough to require an uplift?

  • If yes, please nominate the patch for beta approval.
  • If no, please set status-firefox111 to wontfix.

For more information, please visit auto_nag documentation.

Flags: needinfo?(jfkthame)

(In reply to bczhc0 from comment #24)

Firefox nightly rev e027953e24 on Windows doesn't encounter the problem (comment#21); text selection is OK. But on Linux X11 it's still buggy. Maybe someone can confirm this on Linux?

On my Linux machine, selection does feel quite sluggish but doesn't hang altogether.
Could you please file a new bug about this, with details of the system where it's happening? Thanks!

Flags: needinfo?(jfkthame)

(In reply to Release mgmt bot [:suhaib / :marco/ :calixte] from comment #25)

The patch landed in nightly and beta is affected.
:jfkthame, is this bug important enough to require an uplift?

  • If yes, please nominate the patch for beta approval.
  • If no, please set status-firefox111 to wontfix.

Let's give this a few days on Nightly to see if there's any more general performance impact, before making a call on this.
(Leaving the needinfo? flag for now.)

Flags: needinfo?(jfkthame)
Depends on: 1816902

I did some tests with the "reduced testcase". Removing either one of the style blocks shown in the two diffs below lets Firefox 110 handle the page without a hang, and text selection (Bug 1816927) also works.

--- -	2023-02-16 10:35:51.653287488 +0800
+++ github-gist.html	2023-02-16 10:35:44.593333951 +0800
@@ -521,10 +521,6 @@
         margin-bottom: var(--base-size-8, 8px) !important;
     }
 
-    .p-0 {
-        padding: 0 !important;
-    }
-
     :root {
         --primer-actionListContent-paddingBlock: var(--primer-control-medium-paddingBlock, 6px);
     }

Or

--- -	2023-02-16 10:58:02.629137167 +0800
+++ github-gist.html	2023-02-16 10:57:55.050001030 +0800
@@ -534,12 +534,6 @@
         padding: var(--primer-stack-padding-normal, 16px);
     }
 
-    .Box-body:last-of-type {
-        border-bottom-left-radius: var(--primer-borderRadius-medium, 6px);
-        border-bottom-right-radius: var(--primer-borderRadius-medium, 6px);
-        margin-bottom: calc(var(--primer-borderWidth-thin, 1px)*-1);
-    }
-
     :root {
         --primer-duration-fast: 80ms;
         --primer-easing-easeInOut: cubic-bezier(0.65, 0, 0.35, 1);

So in this case Firefox 110 (without the fix of this issue) has no problem handling the page. Maybe this issue needs more tests and research?

See Also: → 1817184
Flags: needinfo?(jfkthame)
See Also: → 1816927

(In reply to Jonathan Kew [:jfkthame] from comment #27)

(In reply to Release mgmt bot [:suhaib / :marco/ :calixte] from comment #25)

The patch landed in nightly and beta is affected.
:jfkthame, is this bug important enough to require an uplift?

  • If yes, please nominate the patch for beta approval.
  • If no, please set status-firefox111 to wontfix.

Let's give this a few days on Nightly to see if there's any more general performance impact, before making a call on this.
(Leaving the needinfo? flag for now.)

The patch here has been superseded by a better fix in bug 1817184. We might want to consider that one for uplift, if all looks good in a few days.

Flags: qe-verify+

I've reproduced the initial performance issue when using Nightly 111.0a1 (2023-02-12) on Ubuntu 22.04 while scrolling the test page. However I didn't manage to reproduce it on Windows 10 and Firefox didn't crash during that time.
While checking this on the latest Nightly 113.0a1 (2023-04-06) version, the scrolling was smooth and no crash occurred during testing.

@bczhc0, have you recently noticed Firefox lagging, loading slowly, or crashing when scrolling the test page?

Flags: needinfo?(bczhc0)

(In reply to Ina Popescu, Desktop QA from comment #30)

I've reproduced the initial performance issue when using Nightly 111.0a1 (2023-02-12) on Ubuntu 22.04 while scrolling the test page. However I didn't manage to reproduce it on Windows 10 and Firefox didn't crash during that time.
While checking this on the latest Nightly 113.0a1 (2023-04-06) version, the scrolling was smooth and no crash occurred during testing.

@bczhc0, have you recently noticed Firefox lagging, loading slowly, or crashing when scrolling the test page?

The latest 113.0a1 is not laggy. This issue seems to have been fixed in https://phabricator.services.mozilla.com/D171318, and I tested it at that time; see https://bugzilla.mozilla.org/show_bug.cgi?id=1818654#c2.

One more thing: this issue happened with Firefox on my Linux machine, but not with the same version of Firefox on Windows. See "Extra notes" in https://bugzilla.mozilla.org/show_bug.cgi?id=1816927#c0.

Flags: needinfo?(bczhc0)

Thank you for confirming.
Marking this as VERIFIED FIXED.

Status: RESOLVED → VERIFIED
Flags: qe-verify+