Nightly takes 4.5 minutes to display output on terser.org (with Polymer.js source). Chrome does it in a few seconds.
Categories: Core :: Layout, enhancement
Reporter: mayankleoboy1 (Unassigned)
Attachments: 5 files
STR:
1. Go to https://try.terser.org/
2. Copy-paste the attached sample input in the left pane (minified polymer.js source).
3. Wait for the output to appear in the right pane and become usable.
AR:
Nightly: https://share.firefox.dev/3Yhynpg (18s in JS, 4.5 minutes around Layout)
Chrome: https://share.firefox.dev/3Yuc9lm (About 18s in total)
Feel free to dupe to existing "slow layout" bugs.
Comment 1 • Reporter • 1 year ago
Comment 2 • Reporter • 1 year ago
Comment 3 • Reporter • 1 year ago
However, for a similarly-sized input, zero time is spent in layout: https://share.firefox.dev/4feOScL
(maybe the polymer source has some rarely-used fonts or something that triggers the large layout?)
Comment 4 • Reporter • 1 year ago
Profile: https://share.firefox.dev/407z1s4 (30s in JS, 1s in layout)
Updated • Reporter • 1 year ago
Comment 5 • 1 year ago
I can reproduce this on Windows, but on macOS it behaves much better for me: the right pane is usable pretty much as soon as the content appears (at around 15s in this profile). On Windows the content appears but then there's a huge delay before it is actually responsive.
Not sure at the moment why it's so different between platforms.
Comment 6 • Reporter • 1 year ago
If, in Nightly's font settings, I set "Proportional" to "Sans-serif", the demo responds much better: https://share.firefox.dev/40cAFsg / https://share.firefox.dev/4eT9maY
(This IMHO is important enough that I will ni? you so that it's not missed.)
Idle thought: what if changing the font could improve perf in other areas/benchmarks we care about...
Updated • Reporter • 1 year ago
Comment 7 • 1 year ago
That's really interesting, especially as the original "bad" profile seems to be spending its time dealing with bidi continuation frames, not directly with fonts. So I'm still unclear why it's so different -- either between platforms, or between default fonts -- as that shouldn't affect bidi resolution.
Comment 8 • Reporter • 1 year ago
Update: Your suspicion was correct. The sans-serif was a red herring; on more testing, I don't get the speedup.
What does seem to get the speedup is to run the demo *twice* in a single browser session. The catch is that you don't need to finish the first run.
In the first run, paste the text and wait. As soon as the output appears (but the page is still unusable), "end task" the content process/thread/whatever the Task Manager shows as using CPU. The terser tab will crash, but that's of course expected.
Now if you reopen terser and paste the text again, it will get the speedup.
Make of this what you will.
Edit: There also seems to be some effect from disabling the shared font-list. If I disable it, in ~80% of cases I get the speedup on the first run.
Comment 9 • Reporter • 1 year ago
Slightly aside: in the "Marker Table" tab, why are both the fast and slow profiles full of telemetry markers for "SYSTEM_FONT_FALLBACK_SCRIPT" and "FONT_CACHE_HIT" until the very end of the run? Does Firefox look up fonts in the cache continuously?
And there are no corresponding markers in the "Marker Chart" tab.
Comment 10 • 1 year ago
> What does seem to get the speedup is to run the demo *twice* in a single browser session. The catch is that you don't need to finish the first run.
Could this be caused by bug 1921477?
Comment 11 • Reporter • 1 year ago
(In reply to Gregory Pappas [:gregp] from comment #10)
> > What does seem to get the speedup is to run the demo *twice* in a single browser session. The catch is that you don't need to finish the first run.
> Could this be caused by bug 1921477?

Nope; I tested a build from 1 Oct 2024 and could still see the speedup on the second run.
Comment 12 • 1 year ago
(In reply to Mayank Bansal from comment #9)
> Slightly aside: in the "Marker Table" tab, why are both the fast and slow profiles full of telemetry markers for "SYSTEM_FONT_FALLBACK_SCRIPT" and "FONT_CACHE_HIT" until the very end of the run? Does Firefox look up fonts in the cache continuously?
These look ok to me. There's a lot of text in the document, and it includes some sections where a lot of font switching will happen on a per-character basis (because it has some large tables of quoted unicode characters, many of which won't be available in the primary font). So each time we reflow that content, a lot of font lookups will happen. Most of these hit a cached entry (as I'd expect), and overall they're not taking a significant amount of time.
One interesting thing you can see from the timestamps in the marker table is that these font lookups are not happening during the "hang". E.g. in the first profile there are a load of font fallback and cache lookups starting around 18s (this will be the first big reflow); then there's a long gap until 281s during which no font lookups are happening.
The problem is all the (apparently bidi-continuation-related) layout work that's happening between those two batches of font activity.
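The cache behaviour described above can be sketched with a toy model (this is illustrative only, not Gecko's actual implementation; the font names and coverage sets are invented):

```python
# Toy model of per-character font fallback with a cache (invented fonts
# and coverage sets; not Gecko's actual code).

# Hypothetical primary font: covers printable ASCII only.
PRIMARY = {"name": "primary", "covers": set(map(chr, range(0x20, 0x7F)))}
FALLBACKS = [
    {"name": "symbols", "covers": {"→", "★", "☃"}},
    {"name": "hebrew", "covers": {"א", "ב", "ג"}},
]

cache = {}                     # char -> resolved font name
stats = {"hit": 0, "miss": 0}  # roughly: FONT_CACHE_HIT vs. fallback search

def font_for_char(ch):
    """Return a font for ch, consulting the cache first."""
    if ch in cache:
        stats["hit"] += 1
        return cache[ch]
    stats["miss"] += 1
    font = PRIMARY["name"] if ch in PRIMARY["covers"] else next(
        (f["name"] for f in FALLBACKS if ch in f["covers"]), "last-resort")
    cache[ch] = font
    return font

# Reflowing the same text repeatedly mostly hits the cache, which is why
# the profiles are full of FONT_CACHE_HIT markers without much time cost.
text = 'var s = "→★☃אבג";' * 100
fonts = [font_for_char(ch) for ch in text]
print(stats)  # -> {'hit': 1686, 'miss': 14}
```

Only the first sighting of each character pays for a fallback search; every repeat is a cheap cache hit, matching the observation that the markers are plentiful but not where the time goes.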
Comment 13 • 1 year ago
OK, I'm starting to understand what's happening here. The polymerjs content includes Unicode characters that require font fallback. As we haven't preloaded all the character maps for all the fonts, we start doing that (asynchronously) in the parent process, and in the meantime the content process will do a best-effort rendering that might not use the "best" choice of font. When the parent process has finished loading the character maps, it triggers gfxPlatformFontList::UpdateFontList in the content process, to tell it to reflow now that font fallbacks can be resolved more fully.
Normally, this works pretty well. However, in this case the content has a hugely long amount of text that includes some RTL characters (and so bidi resolution and non-fluid continuation chains get involved), and when we attempt to reflow that frame tree, we hit a pathologically-bad case managing the long chain of continuations (see nsBidiPresUtils::RemoveBidiContinuation calling MakeContinuationFluid and MakeContinuationsNonFluidUpParentChain in the profile).
In a case like this where there are such long continuation chains, it would actually be far better to just throw away the old frame tree and reframe everything. If we simply change this line to pass NeedsReframe::Yes when the character map loading finishes, the problem largely disappears. Or we can do this only for contexts that involve bidi, by adding a BidiEnabled() condition in nsPresContext here.
Or another workaround is to disable the gfx.font_rendering.fallback.async pref. This avoids the second reflow of the huge, bidi-containing frame tree here, because we'll instead block during the initial reflow when we need the cmaps. (And this will make some cases perform significantly worse, so it's not a great solution either.)
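For anyone who wants to try that last workaround locally, the pref can be flipped in about:config, or persisted via user.js (a sketch, assuming the usual boolean pref semantics; the pref name is taken from this comment):

```
// user.js -- sketch: disable async font fallback (a workaround, not a fix)
user_pref("gfx.font_rendering.fallback.async", false);
```

Note the caveat above: this trades the post-cmap-load reflow for blocking during the initial reflow, so some pages will get significantly slower.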
All these thoughts are basically workarounds to try and avoid hitting the bad case here. But even if we hack the font-related stuff so that it doesn't trigger this reflow, there will be other scenarios that can trigger an equally-expensive reflow of this frame tree. The "real" solution would be to fix continuation-chain management (or the underlying structures we're using) such that the reflow we end up doing here doesn't get so expensive.
(Ah... I was able to cause a similar pathological bidi-involving reflow on macOS by resizing the terser.org window, after loading the polymerjs source. Not on every resize operation, but sometimes: I suspect it depends exactly where changes in line-breaks end up happening, in relation to the bidi continuations, or something like that. Anyhow, it confirms that it's the bidi continuation-chain reflow stuff that's really hurting us here, not font matching/fallback. The async font fallback just happens to be a trigger that can initiate a reflow.)
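A rough sketch of why long continuation chains hurt: if fixing up each continuation involves another walk along the chain (as the fluid/non-fluid repair helpers effectively do), the total reflow cost is quadratic in the chain length, whereas throwing the frames away and reframing is a single linear pass. A toy model (invented data structure, not Gecko's nsIFrame):

```python
# Toy model of continuation-chain fix-up cost (invented structure, not
# Gecko's nsIFrame). Each frame points at its next-in-flow.

class Frame:
    def __init__(self):
        self.next = None

def build_chain(n):
    """Build a singly-linked chain of n continuation frames."""
    head = Frame()
    f = head
    for _ in range(n - 1):
        f.next = Frame()
        f = f.next
    return head

def fixup_walks(head):
    """Simulate per-continuation fix-up: for each frame, walk the rest of
    the chain (standing in for the chain-repair helpers). Returns total
    link hops -> quadratic in chain length."""
    hops = 0
    f = head
    while f:
        g = f
        while g.next:
            g = g.next
            hops += 1
        f = f.next
    return hops

def reframe(head):
    """Simulate a reframe: one linear pass discarding the chain."""
    hops = 0
    f = head
    while f:
        hops += 1
        f = f.next
    return hops

chain = build_chain(1000)
print(fixup_walks(chain), reframe(chain))  # -> 499500 1000
```

For a 1000-frame chain the fix-up walk does ~n²/2 hops versus n for the reframe, which is the intuition behind the NeedsReframe::Yes suggestion above.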
Comment 14 • 1 year ago
The patch in bug 1926512 should help slightly here, but probably only by around 15% or so.
We could make this specific example substantially better with one of the suggestions in comment 13, but I'm not sure how worthwhile that is, given that other simple actions (such as resizing the window) would still be able to hit the incredibly-bad case.
Comment 15 • Reporter • 1 year ago
Profile with latest Nightly: https://share.firefox.dev/4fdc3E8 (1.5 minutes in layout)
So with the patches from bug 1926512, the test now completes for me in a third of the time: 4.5 minutes → 1.5 minutes.
Still very slow, but a substantial improvement from before.
Comment 16 • Reporter • 1 year ago
Profile with latest Nightly: https://share.firefox.dev/3Utp0St (~2.5s)
So we have become 108x faster since comment #0!
Comment 17 • Reporter • 1 year ago
This looks quite good to me now. The only layout-y bits are the two 2.5s long periods near the end.
:jfkthame, is there anything else that is feasible to improve here?
Comment 18 • 1 year ago
At this point it looks like this is just the cost of reflowing a very large amount of content with huge numbers of elements; I don't see any more specific bottlenecks to focus on, just layout performance in general.
Comment 19 • Reporter • 1 year ago
(In reply to Jonathan Kew [:jfkthame] from comment #18)
> At this point it looks like this is just the cost of reflowing a very large amount of content with huge numbers of elements; I don't see any more specific bottlenecks to focus on, just layout performance in general.

No specific bottlenecks → sounds like RESOLVED FIXED to me.
Thanks for fixing this bug!