1368852 - twitter's load "new results" feature takes a very long time compared to other browsers when you have a large number of tweets displayed (gecko profiler profile included)

Reporter

Description

•

7 years ago

If you have a large number of tweets displayed and new tweets come in, clicking the header to load those new tweets can take a very long time. In my case, I had many pages of tweets loaded and it took almost 30 seconds to load the new tweets from the time I clicked on the "2 new results" button at the top of the page. Steps to reproduce: 1. load https://twitter.com/search?f=tweets&vertical=default&q=firefox%20OR%20mozilla%20lang%3Aen&src=typd 2. Page down until you've got about 12 hours worth of tweets loaded 3. scroll to the top and wait for new tweets to come in (never more than a couple minutes) 4. click the "n new results" button to load the new tweets Results: wait ten, maybe twenty or even thirty seconds for the new tweets to load. Expected results: new tweets load in a few seconds like in other browsers. Gecko Profiler profile: https://perfht.ml/2qyNFFf

Asa Dotzler [:asa]

Reporter

Updated

•

7 years ago

Summary: twitter's "load more" feature takes a very long time compared to other browsers when you have a large number of tweets displayed → twitter's load "new results" feature takes a very long time compared to other browsers when you have a large number of tweets displayed

Asa Dotzler [:asa]

Reporter

Updated

•

7 years ago

Summary: twitter's load "new results" feature takes a very long time compared to other browsers when you have a large number of tweets displayed → twitter's load "new results" feature takes a very long time compared to other browsers when you have a large number of tweets displayed (gecko profiler profile included)

Asa Dotzler [:asa]

Reporter

Comment 1

•

7 years ago

I captured another profile of this hang. It's still happening http://bit.ly/2vmFbqQ

Selena Deckelmann :selenamarie :selena

Updated

•

7 years ago

Keywords: regression

Whiteboard: [qf]

Daniel Holbert [:dholbert]

Comment 2

•

7 years ago

Hi Selena - looks like you tagged this with the "regression" keyword -- but are we sure this is a regression? (Do we know that this was better in an old build? That's not clear to me from comment 0.)

Flags: needinfo?(sdeckelmann)

Jean Gong :jgong

Comment 3

•

7 years ago

Asa, can you reprofile this bug with Stylo enable to see if it help resolve the problem.

Jean Gong :jgong

Updated

•

7 years ago

Whiteboard: [qf] → [qf:p1]

Asa Dotzler [:asa]

Reporter

Comment 4

•

7 years ago

I re-profiled this and it might be a little bit faster but it still hangs for many many seconds longer than other browsers. https://perfht.ml/2xYd26A

Daniel Holbert [:dholbert]

Comment 5

•

7 years ago

Unfortunately that profile doesn't have symbols (lots of "0x7ff[...] xul.pdb") But it does seem to show a 24-second event processing delay: https://perf-html.io/public/1ff95504826b78ec1d4f7207a6dada7ea12ab571/calltree/?hiddenThreads=&range=2.2243_25.4983~2.2243_25.4965&thread=5&threadOrder=0-2-3-4-5-6-7-1-8 The first 17 seconds don't have any samples in the corresponding content process, which is a bit mystifying. Setting that aside, though -- the final ~8 seconds are mostly inside of the marker for "PresShell::DoFlushPendingNotifications Style". This roughly matches the characteristics of the profile in comment 1, too. Based on that profile (which does have symbols), it looks like we were spending tons of time in restyling code. (ProcessPendingRestyles, calling into ElementRestyler::RestyleChildren, deeply nested). Presumably something similar is going on with current Nightly and stylo.

Daniel Holbert [:dholbert]

Comment 6

•

7 years ago

Here's a profile I captured locally (with stylo force-enabled): https://perf-html.io/public/8b908184adfcff5bcb54f82cfee9d2e5ece6e309/calltree/?hiddenThreads=&thread=3&threadOrder=0-2-3-1 There's a nearly 4-second hang there (handling the click event). It breaks down as follows: (1) 180ms in misc JS/DOM code. (2) 1697ms in get_offsetWidth: https://perfht.ml/2wnMI4r ...which further breaks down into: 1a) 593ms of Restyle (mozilla::ServoRestyleManager::DoProcessPendingRestyles) 1b) 1103ms of reflow (mozilla::PresShell::ProcessReflowCommands) (3) 100ms of misc JS/DOM code (4) 1569ms of "removeChild()": https://perfht.ml/2wnNdeP (which is largely mozilla::ServoRestyleManager::DoProcessPendingRestyles) FWIW, I think the "removeChild" action here (and maybe all of the action) is from Twitter removing stuff from the *bottom* of the page. Specifically: if I perform the STR and measure the page height before and after I click the new tweets button, I see a value like this before: document.getElementById("page-container").offsetHeight = 137561 ...vs. after: document.getElementById("page-container").offsetHeight = 61754 So under the hood, twitter is removing a bunch of your loaded tweets, when you click that "View N new tweets" link. And I think that's where all of our time is going.

Daniel Holbert [:dholbert]

Updated

•

7 years ago

Component: General → CSS Parsing and Computation

Comment hidden (obsolete)

Daniel Holbert [:dholbert]

Comment 8

•

7 years ago

(In reply to Daniel Holbert [:dholbert] from comment #5) > The first 17 seconds don't have any samples in the corresponding content > process, which is a bit mystifying. (mstange tells me this means the profiler's buffer was too small. Asa is kindly recording one more profile, now with a larger buffer.)

Daniel Holbert [:dholbert]

Comment 9

•

7 years ago

Here's a better profile from Asa: https://perfht.ml/2wo6ddd -- scroll down to "Content process 5 of 5" there. It's on Windows, so the sample times are under-reported by ~50%. The durations shown in the graphical waterfall are correct. It seems similar to my profile in comment 6, except: - his reflow is triggered by getBoundingClientRect, not get_offsetWidth. - he's spending *way* longer in removeChild at the end. I'm guessing that's what's responsible for the potentially-unbounded-hang, depending on how much content that Twitter decides to prune (i.e. depending on how much bonus content you've scrolled) There are over 7 seconds of removeChild, shown here: * Half of that seems to be in mozilla::ServoRestyleManager::DoProcessPendingRestyles * And half seems to be in mozilla::ServoStyleSet::StyleDocument (which is calling into rayon_core::latch::LockLatch::wait, though maybe that's just from synchronization between threads while another thread is doing real restyle work...?) (Again, recall that you need to multiply the reported "Running time (ms)" values by 2 to make them accurate.)

Daniel Holbert [:dholbert]

Comment 10

•

7 years ago

(In reply to Daniel Holbert [:dholbert] from comment #9) > There are over 7 seconds of removeChild, shown here: Sorry, I forgot the link there. Here's the 7-second chunk of removeChild (you have to scroll down to Content Process 5 of 5): https://perfht.ml/2woySis

Daniel Holbert [:dholbert]

Comment 11

•

7 years ago

emilio, perhaps you can take a look (or suggest another stylo developer to take a look).

Flags: needinfo?(emilio)

Daniel Holbert [:dholbert]

Comment 12

•

7 years ago

(In reply to Daniel Holbert [:dholbert] from comment #2) > Hi Selena - looks like you tagged this with the "regression" keyword -- but > are we sure this is a regression? Asa says he's doesn't think this is a regression, so I'm gonna assume this was just a mistake/misunderstanding. --> removing the keyword & clearing needinfo.

Flags: needinfo?(sdeckelmann)

Keywords: regression

Emilio Cobos Álvarez (:emilio)

Assignee

Comment 13

•

7 years ago

(It's pretty late for me already, so I'll drop here the obvious stuff I'm seeing and I may have time to investigate this tomorrow or when I'm done with bug 1362532). So it seems we're flushing after pretty much every DOM removal, which is pretty awful. We need to restyle there because the container has slow selector flags (:first-child at least, though I suspect also :-nth-child or similar, otherwise we shouldn't spend that much time restyling). The reason why we flush is because there's a runnable (nsHideViewer) that flushes frames (and thus also styles). The reason we post that runnable is because the child removal happens to trigger a removal of a subdocument frame (read: an <iframe>). So I think the obvious fix is figuring out a way to not post that runnable, or post it so that it runs in the next refresh driver tick. What it really wants is to check if that frame has really gone away from the DOM to call nsFrameLoader::Hide, but it doesn't need to do that immediately I think. It really wants some kind of (hypothetical) "FrameFlushCallBack", I think (plus ensuring that such a flush happens). I wonder if we could just (ab)use nsIReflowCallback for this. Of course it's somewhat tricky because subdocuments can flush the parent and so on, but I _think_ we handle that in the required places already...

Emilio Cobos Álvarez (:emilio)

Assignee

Updated

•

7 years ago

Comment 15

•

7 years ago

Note that the above fixes this case, which is nice, but that's still a perf footgun for other (arguably more rare) edge cases. I confirmed that the STR with this patch only takes a few seconds, and doesn't spin the CPU like crazy.

Ryan VanderMeulen [:RyanVM]

Updated

•

7 years ago

Assignee: nobody → emilio

Comment hidden (mozreview-request)

Emilio Cobos Álvarez (:emilio)

Assignee

Comment 17

•

7 years ago

(I removed the local patches under the commit so Asa can get a build to test and confirm)

Daniel Holbert [:dholbert]

Comment 18

•

7 years ago

I'm unfamiliar with nsHideViewer, but it looks like tnikkel has touched it before (in e.g. bug 1261230), so I'll punt this review to him.

Daniel Holbert [:dholbert]

Updated

•

7 years ago

Attachment #8907312 - Flags: review?(dholbert) → review?(tnikkel)

Asa Dotzler [:asa]

Reporter

Comment 19

•

7 years ago

Tested the try build with emilio's patch and what was a 30 second pause was reduced to a 3 second pause. What were 5 and 10 second pauses (because of less scrollback) now feel like a quick page load at a second or under. This is a dramatic improvement. Thanks!

Ryan VanderMeulen [:RyanVM]

Comment 20

•

7 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=70ee9be236551dd7465ce32a4fa38e26ca64c43b was the Try push in case anybody else is interested in testing it.

Timothy Nikkel (:tnikkel)

Comment 21

•

7 years ago

mozreview-review

Comment on attachment 8907312 [details] Bug 1368852: Avoid flushing frames from nsHideViewer if we know we're not going to get a frame. https://reviewboard.mozilla.org/r/178982/#review184224

Attachment #8907312 - Flags: review?(tnikkel) → review+

Pulsebot

Comment 22

•

7 years ago

Pushed by ecoal95@gmail.com: https://hg.mozilla.org/integration/autoland/rev/9b4a1e0081bc Avoid flushing frames from nsHideViewer if we know we're not going to get a frame. r=tnikkel

Emilio Cobos Álvarez (:emilio)

Assignee

Updated

•

7 years ago

Flags: needinfo?(emilio)

Sebastian Hengst [:aryx] (needinfo me if it's about an intermittent or backout)

Comment 23

•

7 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/9b4a1e0081bc

Status: NEW → RESOLVED

Closed: 7 years ago

status-firefox57: affected → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla57

Daniel Holbert [:dholbert]

Comment 24

•

7 years ago

Note: bug 1259311 has similar summary, though its STR are a little different. --> Adding to "See also". (In that bug, "scroll a bunch of tweets into view" isn't part of the setup. And the STR involve waiting for a "large number" of pending new tweets [~500], and clicking the twitter "home" to accept the new tweets rather than the "View N new tweets" text)

Updated

•

3 years ago

Performance Impact: --- → P1

Whiteboard: [qf:p1]