Closed Bug 1368852 Opened 7 years ago Closed 7 years ago

twitter's load "new results" feature takes a very long time compared to other browsers when you have a large number of tweets displayed (gecko profiler profile included)

Categories

(Core :: CSS Parsing and Computation, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla57
Performance Impact high
Tracking Status
firefox57 --- fixed

People

(Reporter: asa, Assigned: emilio)

References

Details

Attachments

(1 file)

If you have a large number of tweets displayed and new tweets come in, clicking the  header to load those new tweets can take a very long time. In my case, I had many pages of tweets loaded and it took almost 30 seconds to load the new tweets from the time I clicked on the "2 new results" button at the top of the page. 

Steps to reproduce: 

1. load https://twitter.com/search?f=tweets&vertical=default&q=firefox%20OR%20mozilla%20lang%3Aen&src=typd
2. Page down until you've got about 12 hours worth of tweets loaded
3. scroll to the top and wait for new tweets to come in (never more than a couple minutes) 
4. click the "n new results" button to load the new tweets

Results: wait ten, maybe twenty or even thirty seconds for the new tweets to load. 

Expected results: new tweets load in a few seconds like in other browsers. 

Gecko Profiler profile: https://perfht.ml/2qyNFFf
Summary: twitter's "load more" feature takes a very long time compared to other browsers when you have a large number of tweets displayed → twitter's load "new results" feature takes a very long time compared to other browsers when you have a large number of tweets displayed
Summary: twitter's load "new results" feature takes a very long time compared to other browsers when you have a large number of tweets displayed → twitter's load "new results" feature takes a very long time compared to other browsers when you have a large number of tweets displayed (gecko profiler profile included)
I captured another profile of this hang. It's still happening http://bit.ly/2vmFbqQ
Keywords: regression
Whiteboard: [qf]
Hi Selena - looks like you tagged this with the "regression" keyword -- but are we sure this is a regression? (Do we know that this was better in an old build? That's not clear to me from comment 0.)
Flags: needinfo?(sdeckelmann)
Asa, can you reprofile this bug with Stylo enable to see if it help resolve the problem.
Whiteboard: [qf] → [qf:p1]
I re-profiled this and it might be a little bit faster but it still hangs for many many seconds longer than other browsers. https://perfht.ml/2xYd26A
Unfortunately that profile doesn't have symbols (lots of "0x7ff[...] xul.pdb")  But it does seem to show a 24-second event processing delay:
https://perf-html.io/public/1ff95504826b78ec1d4f7207a6dada7ea12ab571/calltree/?hiddenThreads=&range=2.2243_25.4983~2.2243_25.4965&thread=5&threadOrder=0-2-3-4-5-6-7-1-8

The first 17 seconds don't have any samples in the corresponding content process, which is a bit mystifying.  Setting that aside, though -- the final ~8 seconds are mostly inside of the marker for "PresShell::DoFlushPendingNotifications Style".

This roughly matches the characteristics of the profile in comment 1, too. Based on that profile (which does have symbols), it looks like we were spending tons of time in restyling code. (ProcessPendingRestyles, calling into ElementRestyler::RestyleChildren, deeply nested).  Presumably something similar is going on with current Nightly and stylo.
Here's a profile I captured locally (with stylo force-enabled):
https://perf-html.io/public/8b908184adfcff5bcb54f82cfee9d2e5ece6e309/calltree/?hiddenThreads=&thread=3&threadOrder=0-2-3-1

There's a nearly 4-second hang there (handling the click event). It breaks down as follows:
 (1) 180ms in misc JS/DOM code.

 (2) 1697ms in get_offsetWidth: https://perfht.ml/2wnMI4r
    ...which further breaks down into:
    1a) 593ms of Restyle (mozilla::ServoRestyleManager::DoProcessPendingRestyles)
    1b) 1103ms of reflow (mozilla::PresShell::ProcessReflowCommands)

 (3) 100ms of misc JS/DOM code

 (4) 1569ms of "removeChild()": https://perfht.ml/2wnNdeP
    (which is largely mozilla::ServoRestyleManager::DoProcessPendingRestyles)


FWIW, I think the "removeChild" action here (and maybe all of the action) is from Twitter removing stuff from the *bottom* of the page.  Specifically: if I perform the STR and measure the page height before and after I click the new tweets button, I see a value like this before:
  document.getElementById("page-container").offsetHeight = 137561
...vs. after:
  document.getElementById("page-container").offsetHeight = 61754

So under the hood, twitter is removing a bunch of your loaded tweets, when you click that "View N new tweets" link.  And I think that's where all of our time is going.
Component: General → CSS Parsing and Computation
(In reply to Daniel Holbert [:dholbert] from comment #5)
> The first 17 seconds don't have any samples in the corresponding content
> process, which is a bit mystifying.

(mstange tells me this means the profiler's buffer was too small. Asa is kindly recording one more profile, now with a larger buffer.)
Here's a better profile from Asa: https://perfht.ml/2wo6ddd -- scroll down to "Content process 5 of 5" there. It's on Windows, so the sample times are under-reported by ~50%. The durations shown in the graphical waterfall are correct.

It seems similar to my profile in comment 6, except:
  - his reflow is triggered by getBoundingClientRect, not get_offsetWidth.
  - he's spending *way* longer in removeChild at the end. I'm guessing that's what's responsible for the potentially-unbounded-hang, depending on how much content that Twitter decides to prune (i.e. depending on how much bonus content you've scrolled) There are over 7 seconds of removeChild, shown here:
 * Half of that seems to be in mozilla::ServoRestyleManager::DoProcessPendingRestyles
 * And half seems to be in mozilla::ServoStyleSet::StyleDocument (which is calling into rayon_core::latch::LockLatch::wait, though maybe that's just from synchronization between threads while another thread is doing real restyle work...?)

(Again, recall that you need to multiply the reported "Running time (ms)" values by 2 to make them accurate.)
(In reply to Daniel Holbert [:dholbert] from comment #9)
> There are over 7 seconds of removeChild, shown here:

Sorry, I forgot the link there.  Here's the 7-second chunk of removeChild (you have to scroll down to Content Process 5 of 5): https://perfht.ml/2woySis
emilio, perhaps you can take a look (or suggest another stylo developer to take a look).
Flags: needinfo?(emilio)
(In reply to Daniel Holbert [:dholbert] from comment #2)
> Hi Selena - looks like you tagged this with the "regression" keyword -- but
> are we sure this is a regression?

Asa says he's doesn't think this is a regression, so I'm gonna assume this was just a mistake/misunderstanding. --> removing the keyword & clearing needinfo.
Flags: needinfo?(sdeckelmann)
Keywords: regression
(It's pretty late for me already, so I'll drop here the obvious stuff I'm seeing and I may have time to investigate this tomorrow or when I'm done with bug 1362532).

So it seems we're flushing after pretty much every DOM removal, which is pretty awful.

We need to restyle there because the container has slow selector flags (:first-child at least, though I suspect also :-nth-child or similar, otherwise we shouldn't spend that much time restyling).

The reason why we flush is because there's a runnable (nsHideViewer) that flushes frames (and thus also styles). The reason we post that runnable is because the child removal happens to trigger a removal of a subdocument frame (read: an <iframe>).

So I think the obvious fix is figuring out a way to not post that runnable, or post it so that it runs in the next refresh driver tick. What it really wants is to check if that frame has really gone away from the DOM to call nsFrameLoader::Hide, but it doesn't need to do that immediately I think. It really wants some kind of (hypothetical) "FrameFlushCallBack", I think (plus ensuring that such a flush happens).

I wonder if we could just (ab)use nsIReflowCallback for this. Of course it's somewhat tricky because subdocuments can flush the parent and so on, but I _think_ we handle that in the required places already...
See Also: → 775965, 997255
Note that the above fixes this case, which is nice, but that's still a perf footgun for other (arguably more rare) edge cases.

I confirmed that the STR with this patch only takes a few seconds, and doesn't spin the CPU like crazy.
Assignee: nobody → emilio
(I removed the local patches under the commit so Asa can get a build to test and confirm)
I'm unfamiliar with nsHideViewer, but it looks like tnikkel has touched it before (in e.g. bug 1261230), so I'll punt this review to him.
Attachment #8907312 - Flags: review?(dholbert) → review?(tnikkel)
Tested the try build with emilio's patch and what was a 30 second pause was reduced to a 3 second pause. What were 5 and 10 second pauses (because of less scrollback) now feel like a quick page load at a second or under. This is a dramatic improvement. Thanks!
Comment on attachment 8907312 [details]
Bug 1368852: Avoid flushing frames from nsHideViewer if we know we're not going to get a frame.

https://reviewboard.mozilla.org/r/178982/#review184224
Attachment #8907312 - Flags: review?(tnikkel) → review+
Pushed by ecoal95@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/9b4a1e0081bc
Avoid flushing frames from nsHideViewer if we know we're not going to get a frame. r=tnikkel
Flags: needinfo?(emilio)
https://hg.mozilla.org/mozilla-central/rev/9b4a1e0081bc
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla57
Note: bug 1259311 has similar summary, though its STR are a little different. --> Adding to "See also".

(In that bug, "scroll a bunch of tweets into view" isn't part of the setup. And the STR involve waiting for a "large number" of pending new tweets [~500], and clicking the twitter "home" to accept the new tweets rather than the "View N new tweets" text)
See Also: → 1259311
Performance Impact: --- → P1
Whiteboard: [qf:p1]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: