Closed Bug 1357621 Opened 7 years ago Closed 7 years ago

Amazon.com search results page sync reflows are super expensive in Gecko

Categories

(Core :: Layout, defect)

defect
Not set
normal

Tracking

()

RESOLVED WORKSFORME
Performance Impact high
Tracking Status
platform-rel --- +

People

(Reporter: ehsan.akhgari, Assigned: jfkthame)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

(Whiteboard: [platform-rel-Amazon][platform-rel-AmazonShopping])

STR: Go to <https://www.amazon.com/s/ref=nb_sb_noss_1?url=search-alias%3Daps&field-keywords=laptop>.

Amazon.com has this function called checkLineFit() that triggers tons of sync reflows as can be seen for example in https://perfht.ml/2pyHcws and https://perfht.ml/2pyPrZM.  On my machine this function overall takes about 350-400ms to run in Firefox, but in Chrome it takes something like 80-100ms using their built-in profiler.  This difference seems rather excessive.
platform-rel: --- → ?
Whiteboard: [qf:p1] → [qf:p1][platform-rel-Amazon][platform-rel-AmazonShopping]
Jonathan: please have a look if this relates to text metrics calculation, and if we can optimize. Thx!
Flags: needinfo?(jfkthame)
platform-rel: ? → +
Unless I'm missing something in these profiles, it doesn't seem as though text-related stuff is the main issue. In https://perfht.ml/2pyHcws, we're looking at a total range of 829ms, where filtering for checkLineFit shows 392ms. But filtering that same range for nsTextFrame::ReflowText shows only 46ms; and filtering for gfxTextRun:: shows 19ms.
Flags: needinfo?(jfkthame)
Blocks: 1362625
Depends on: 1348469
Jet, can you please find an owner for this?
Flags: needinfo?(bugs)
See Also: → 1362586
(In reply to Jonathan Kew (:jfkthame) from comment #2)
> Unless I'm missing something in these profiles, it doesn't seem as though
> text-related stuff is the main issue. In https://perfht.ml/2pyHcws, we're
> looking at a total range of 829ms, where filtering for checkLineFit shows
> 392ms. But filtering that same range for nsTextFrame::ReflowText shows only
> 46ms; and filtering for gfxTextRun:: shows 19ms.

Amazon is using a common pattern of flushing Layout to get correct sizes, only to truncate anyway. They're doing this in a tight loop which makes it even worse for perf.

I'd like to find ways to retain more state in the frame tree to avoid the recursive descent whenever possible. Avoiding StyleDisplay() and ReflowDirtyLines()ncalls sounds like good candidates for retention. Can you find other places to save off frame state?
Assignee: nobody → jfkthame
Flags: needinfo?(bugs)
ehsan, do you recall what platform & Firefox-version you were using to record the profiles in comment 0?  And were you logged in to Amazon?

(As I noted in bug 1351830 comment 7, I'm getting somewhat different profiles than you for this bug's Amazon page -- significantly lower ::StyleDisplay measurements than what your profiles from comment 0 show.  I tested in both Firefox 53 32-bit and Firefox 55 64-bit, on the Acer hardware with Win10, and I'm measuring < 2ms spent in ::StyleDisplay over the timeperiod of all three checkLineFit humps combined. Whereas your profiles here show ~20ms during that three-hump timeperiod.  Still not very much, but I'm curious about the difference.)
Flags: needinfo?(ehsan)
(In reply to Daniel Holbert [:dholbert] (AFK May 3-8, 13-21) from comment #5)
> ehsan, do you recall what platform & Firefox-version you were using to
> record the profiles in comment 0?

As the profiles show, OSX.  Probably the latest Nightly at the time, or maybe a few days older than the latest...

> And were you logged in to Amazon?

No.

> (As I noted in bug 1351830 comment 7, I'm getting somewhat different
> profiles than you for this bug's Amazon page -- significantly lower
> ::StyleDisplay measurements than what your profiles from comment 0 show.  I
> tested in both Firefox 53 32-bit and Firefox 55 64-bit, on the Acer hardware
> with Win10, and I'm measuring < 2ms spent in ::StyleDisplay over the
> timeperiod of all three checkLineFit humps combined. Whereas your profiles
> here show ~20ms during that three-hump timeperiod.  Still not very much, but
> I'm curious about the difference.)

Perhaps something has changed in the month that has passed since I filed the bug?  I'm not sure.
Flags: needinfo?(ehsan)
(In reply to Jet Villegas (:jet) from comment #4)
> I'd like to find ways to retain more state in the frame tree to avoid the
> recursive descent whenever possible. Avoiding StyleDisplay() and
> ReflowDirtyLines()ncalls sounds like good candidates for retention. Can you
> find other places to save off frame state?

I've been trying to look at this a bit, but have not come up with any major wins so far. The trouble with saving more state in the frame tree, of course, is knowing when we have to invalidate what we've saved...

We can squeeze a bit more out of StyleDisplay() (being pursued in bug 1351830 and dependencies), but it won't be more than a percent or so.

Avoiding ReflowDirtyLines() seems like it would potentially be big, but it's not clear how we can do that -- after all, if the content (or style) of the frames has changed, we need to reflow in order to get correct frame positions. If we're marking stuff as dirty unnecessarily, that would be a valid bug and we should fix it; but if script on the page is modifying content or styles, then we need to reflow whatever is affected.

In the case of the Amazon checkLineFit() function, I suspect that one thing that makes it expensive for us is their heavy use of floats (and floats within floats, etc). We end up calling nsBlockFrame::ReflowFloat a lot. Sometimes, we call ReflowFloat for the same float several times, even when the float itself hasn't changed (although its context may have done). I'm trying to see if there are cases where we can recognize this and optimize at all... not sure yet if this will go anywhere useful, though.
If there are floats being reflowed multiple times, it's worth seeing if bug 1308876 helps here.
(In reply to David Baron :dbaron: ⌚️UTC-7 from comment #8)
> If there are floats being reflowed multiple times, it's worth seeing if
> bug 1308876 helps here.

Given bug 1308876 comment 50, it seems like it does help a bit here. (tl;dr we got a 4% improvement in our "quantum_pageload_amazon" talos test measurement, when that bug's patches initially landed.)

Adding a dependency.
Depends on: 1308876
No longer depends on: 1348469
Depends on: 1348469
jfkthame, perhaps you could reprofile this in tomorrow's Nightly (dated 2017-07-14), which should have bug 1308876's fix? (or a later Nightly, depending on when you get to this)

Maybe/hopefully ReflowFloat won't show up as frequently as it did when you looked at this before...
Flags: needinfo?(jfkthame)
I have the impression there's considerably less ReflowFloat than previously. But then, I tried going back to a Nightly prior to that landing, and don't see much difference before vs after. And I no longer see any reference to the checkLineFit() function that we started with in comment 0. So it's not clear to me whether (or to what extent) this is a result of bug 1308876; it may be that the page scripts involved have substantially changed their behavior in the meantime, and so new profiles aren't comparable to the originals.
Flags: needinfo?(jfkthame)
(In reply to Jonathan Kew (:jfkthame) from comment #11)
> And I no longer see any reference to
> the checkLineFit() function that we started with in comment 0 [...] it
> may be that the page scripts involved have substantially changed their
> behavior in the meantime, and so new profiles aren't comparable to the
> originals.

Agreed, looks like the site has changed. And it wasn't just a JS-function-renaming-keeping-the-same-underlying-behavior, either.

Here's a profile I took today of pageload of the Amazon URL in comment 0, with a stock Linux Nightly, 0.1ms sample time:
 https://perf-html.io/public/7b53d3e3fa33dd7ab09934e9ef3ef7b965cf0197/calltree/?thread=3

I tried filtering it for "get_clientHeight" (that's the API that was responsible for all of the checkLineFit() layout time in the original profiles here).  I'm seeing only 0.2ms of results.

I *do* see another 80ms delay from get_clientWidth:
https://perf-html.io/public/7b53d3e3fa33dd7ab09934e9ef3ef7b965cf0197/calltree/?hiddenThreads=&search=get_clientWidth&thread=3
...but that mostly goes away if I reprofile after enabling stylo (and restarting Firefox), as shown in this second profile:
https://perf-html.io/public/8ab869c5d66e2b1ec383f3857b6c5e85d783d115/calltree/?hiddenThreads=&search=get_clientWidth&thread=3


I also tried searching for "get_offset" (for the offsetTop / offsetLeft / etc. APIs which also trigger synchronous reflows) and I'm not seeing a lot of time there -- the longest is a single 13ms reflow associated with "https://www.adsensecustomsearchads.com/adsense/search/async-ads.js" in each profile:
https://perf-html.io/public/7b53d3e3fa33dd7ab09934e9ef3ef7b965cf0197/calltree/?hiddenThreads=&range=5.0920_5.1679&search=get_offset&thread=3
https://perf-html.io/public/8ab869c5d66e2b1ec383f3857b6c5e85d783d115/calltree/?hiddenThreads=&range=4.4931_4.6460&search=get_offset&thread=3

13ms is a bit long, but not "super expensive", and it looks like most of the time (10ms of 13ms) is in BuildTextRuns -- maybe worth looking for things we can optimize there [perhaps jfkthame can see something], though my impression is that's at the level of stuff-we-just-have-to-do.

So: we're nowhere near the 350-400ms reflows mentioned in comment 0 anymore.  And the longest thing like that is an 80ms style flush which goes away when stylo is enabled.  So I think we can call this WORKSFORME [not quite FIXED since we're not 100% sure about the cause for the improvement, and it may have been a site change.)

ehsan, please reopen/comment if you're still seeing anything like comment 0.
Status: NEW → RESOLVED
Closed: 7 years ago
Flags: needinfo?(dbaron)
Resolution: --- → WORKSFORME
If we don't see the original issue either due to fixes on our side or on the website, that's great result!
Performance Impact: --- → P1
Whiteboard: [qf:p1][platform-rel-Amazon][platform-rel-AmazonShopping] → [platform-rel-Amazon][platform-rel-AmazonShopping]
You need to log in before you can comment on or make changes to this bug.