Open Bug 1330218 Opened 5 years ago Updated 3 years ago

GSlides seems to do a lot of synchronous reflows

Categories

(Web Compatibility :: Desktop, defect, P5)

Firefox 53
defect

Tracking

(platform-rel -, firefox53 affected)

Tracking Status
platform-rel --- -
firefox53 --- affected

People

(Reporter: ehsan.akhgari, Unassigned)

References

Details

(Keywords: perf, webcompat:site-wait, Whiteboard: [platform-rel-Google][platform-rel-GoogleSuite][platform-rel-GoogleSlides][sitewait][qf-])

Attachments

(2 files)

Please see this profile: <https://clptr.io/2iDTIEs>  I have searched for nsDocument::FlushPendingNotifications to highlight synchronous reflows.  This profile is generated from a large slide deck (which I unfortunately cannot share publicly) with 47 slides.  I have waited for the slides to finish loading, and then kept pressing the down arrow key to scroll down the list of slides.

Some of these are coming from setting scrollTop, but a ton of them are coming from the window.outerWidth getter.  Perhaps these can be avoided?

Dees, do you mind reaching out to the GSuites team about this?  I'd be happy to provide more information if needed.
Flags: needinfo?(dchinniah)
(In reply to :Ehsan Akhgari (in Taipei, laggy response time) from comment #0)
> Please see this profile: <https://clptr.io/2iDTIEs>  I have searched for
> nsDocument::FlushPendingNotifications to highlight synchronous reflows. 
> This profile is generated from a large slide deck (which I unfortunately
> cannot share publicly) with 47 slides.  I have waited for the slides to
> finish loading, and then kept pressing the down arrow key to scroll down the
> list of slides.
> 
> Some of these are coming from setting scrollTop, but a ton of them are
> coming from the window.outerWidth getter.  Perhaps these can be avoided?
> 
> Dees, do you mind reaching out to the GSuites team about this?  I'd be happy
> to provide more information if needed.

A dedicated DL is up-n-running. Go ahead and use...
Flags: needinfo?(dchinniah)
platform-rel: --- → ?
Whiteboard: [platform-rel-Google][platform-rel-GoogleSuite][platform-rel-GoogleSlides]
I have contacted the GSLides team about this.
Whiteboard: [platform-rel-Google][platform-rel-GoogleSuite][platform-rel-GoogleSlides] → [platform-rel-Google][platform-rel-GoogleSuite][platform-rel-GoogleSlides][sitewait]
See Also: → 1330565
Steven/Philipp — any thoughts here?
Flags: needinfo?(ssaviano)
Flags: needinfo?(pweis)
Thanks for the report. I'll take a look at this soon, hopefully something easy to avoid.
I defer to Philipp here.
Flags: needinfo?(ssaviano)
Version: unspecified → Firefox 53
platform-rel: ? → +
Hi Ehsan,

I took a closer at this today, sorry for not getting to it sooner.

We do have a bunch of window.outerWidth calls when scrolling through slides, but I don't see those taking much time. If I instrument them in the JS with performance.now, they're all around 20 microseconds.

Is there a way in your profile to see where those calls are coming from? I see that I can filter for e.g. outerWidth, but then the remaining trace doesn't seem to contain outerWidth anywhere, and I also don't seem to be able to see where the call is coming from in the JS. Is this possible? Same question for setting scrollTop. We have a good amount of those throughout our code, so would help if I could isolate this a bit more.
Flags: needinfo?(pweis) → needinfo?(ehsan)
(In reply to Philipp Weis from comment #6)
> Hi Ehsan,
> 
> I took a closer at this today, sorry for not getting to it sooner.
> 
> We do have a bunch of window.outerWidth calls when scrolling through slides,
> but I don't see those taking much time. If I instrument them in the JS with
> performance.now, they're all around 20 microseconds.

What window.outerWidth and similar getters that return layout information do is first check to see whether there is any out of date layout information to be calculated, and if so they do a full layout before returning the number asked for.  So, if you instrument a getter call that doesn't actually perform any layout calculation it will be fast.  But if the layout calculation needs to be updated, it will be much slower (and the amount by which it will be slower depends on how much work it needs to do which is extremely hard to predict in the general case.

In my experience often the really bad cases are in this form:

  while (...) {
    changeSomethingInDOM(); // For example set a .class or .style attribute, add/remove DOM nodes to/from the tree.
    window.outerWidth;
  }

Browser engines will typically do very little layout computation if at all in response to changeSomethingInDOM(), and try to be lazy, but a loop like this basically forces the browser to do eager layout computation.  In this situation the cost of one of these getters may still not be very high, but it adds up as the loop runs.

Not sure how you performed the measurements, but it's good to keep the above in mind.

> Is there a way in your profile to see where those calls are coming from? I
> see that I can filter for e.g. outerWidth, but then the remaining trace
> doesn't seem to contain outerWidth anywhere, and I also don't seem to be
> able to see where the call is coming from in the JS. Is this possible? Same
> question for setting scrollTop. We have a good amount of those throughout
> our code, so would help if I could isolate this a bit more.

It's possible but a little bit non-intuitive.  If you click "Javascript only" the profiler UI only hows you stack traces with JS frames inside them.  But we unfortunately don't have line number information in the JS backtraces, so we only know what functions called each other.  As a result if you search for "outerWidth" in that view you won't see any matches.  But turning off "Javascript only" will show the JS and C++ stack frames interleaved.  In that mode if you search for outerWidth you can find the correct C++ frame (for example mozilla::dom::WindowBinding::get_outerWidth which is the C++ function that implements this getter).  Once you get to the C++ frame that interests you, you can look up the stack to find the JS call stack (again, interleaved with the C++ stack.)  I'll attach a screenshot which demonstrates what this looks like.
Flags: needinfo?(ehsan)
Thanks Ehsan, this is helpful.

I instrumented all of our calls to window.outerWidth, and they all are around 10-20 microseconds, so it doesn't seem like we have any of those in problematic places. This all sums up to less than 1ms in my measurements.

In your profile, I only see a single call to outerWidth that takes 18ms. I don't see this one in my own traces, so I'm not fully clear where it comes from. But regardless, I don't think those 18ms explain any of the slowness you observed here.

Am I missing something about the other outerWidth calls? Do you have a profile where there's more than one of those? I also don't see anything show up for scrollTop in your profile. Maybe I'm not searching for the right string there?
Flags: needinfo?(ehsan)
(In reply to Philipp Weis from comment #8)
> Thanks Ehsan, this is helpful.
> 
> I instrumented all of our calls to window.outerWidth, and they all are
> around 10-20 microseconds, so it doesn't seem like we have any of those in
> problematic places. This all sums up to less than 1ms in my measurements.

Hmm, is that still with the manual measurements?  I'm not sure why there's this much difference.  The profile in comment 0 has around 250ms of window.outerWidth calls.  Is it possible that you're missing some of the call sites?

> In your profile, I only see a single call to outerWidth that takes 18ms. I
> don't see this one in my own traces, so I'm not fully clear where it comes
> from. But regardless, I don't think those 18ms explain any of the slowness
> you observed here.

Hmm, there should be many more...  Here is a permalink with the saved search I just did: <https://clptr.io/2kqyNGu>

If you can't see them all, can you please paste the URL of the profile?  It will include information such as your search phrase so that I can look at the same thing here.

> Am I missing something about the other outerWidth calls? Do you have a
> profile where there's more than one of those? I also don't see anything show
> up for scrollTop in your profile. Maybe I'm not searching for the right
> string there?

Here is a saved search for scrollTop from the same profile: <https://clptr.io/2kqyGLh>.  Can you please compare it to what you have on your side?  Here it should ~76ms of time being spent under this function.
Flags: needinfo?(ehsan)
needinfo(Philip) just in case he didn't see Ehsan's questions in comment 9.
Flags: needinfo?(pweis)
Apologies about the slow response.

(In reply to :Ehsan Akhgari from comment #9)
> (In reply to Philipp Weis from comment #8)
> > Thanks Ehsan, this is helpful.
> > 
> > I instrumented all of our calls to window.outerWidth, and they all are
> > around 10-20 microseconds, so it doesn't seem like we have any of those in
> > problematic places. This all sums up to less than 1ms in my measurements.
> 
> Hmm, is that still with the manual measurements?  I'm not sure why there's
> this much difference.  The profile in comment 0 has around 250ms of
> window.outerWidth calls.  Is it possible that you're missing some of the
> call sites?
> 
> > In your profile, I only see a single call to outerWidth that takes 18ms. I
> > don't see this one in my own traces, so I'm not fully clear where it comes
> > from. But regardless, I don't think those 18ms explain any of the slowness
> > you observed here.
> 
> Hmm, there should be many more...  Here is a permalink with the saved search
> I just did: <https://clptr.io/2kqyNGu>
> 
> If you can't see them all, can you please paste the URL of the profile?  It
> will include information such as your search phrase so that I can look at
> the same thing here.

I loaded your profile <https://clptr.io/2kqyNGu>, expanded the call tree down until I actually saw the string "outerWidth" appear, and this shows 18ms running time for just the outerWidthC call. See screenshot attached.

I'm not very familiar with this profiler, so maybe I'm missing something, but if I read this correct then all calls to outerWidth in this trace add up to only 18ms. (Not that that's nothing, but don't think it really explains the slowness you see.) Is that correct?

Also curious about number of calls. Is there any information on that here? It seems to me that this would just be from a single call?

> > Am I missing something about the other outerWidth calls? Do you have a
> > profile where there's more than one of those? I also don't see anything show
> > up for scrollTop in your profile. Maybe I'm not searching for the right
> > string there?
> 
> Here is a saved search for scrollTop from the same profile:
> <https://clptr.io/2kqyGLh>.  Can you please compare it to what you have on
> your side?  Here it should ~76ms of time being spent under this function.

If I do the same thing as above and expand the tree out to where scrollTop actually appears, I only see 13ms. Can you clarify which number is correct, and where the discrepancy would come from? Thanks!
Flags: needinfo?(pweis) → needinfo?(ehsan)
(In reply to Philipp Weis from comment #11)
> I loaded your profile <https://clptr.io/2kqyNGu>, expanded the call tree
> down until I actually saw the string "outerWidth" appear, and this shows
> 18ms running time for just the outerWidthC call. See screenshot attached.
> 
> I'm not very familiar with this profiler, so maybe I'm missing something,
> but if I read this correct then all calls to outerWidth in this trace add up
> to only 18ms. (Not that that's nothing, but don't think it really explains
> the slowness you see.) Is that correct?

No.  :-)  I think the UI is confusing.  Let me explain a bit more.

This is a basic sampling profiler which works by pausing a thread at regular intervals (let's say every milliseconds) and walking up the call stack and capturing it.  The histogram you see in the profile UI is a timeline view of all of these captured stacks across the timespan.  When you enter something in the filter text box, the UI filters out all call stacks that don't have any frames with any text matching your filter query.  IOW, when you search for "outerWidth" you'll find all call stacks with "outerWidth" appearing somewhere on one of the frames in the call stack.  Now, the running time column is always computed to sum up the visible ranges.  So, in a filtered view, when you scroll all the way to the top, and look at the running time of the top-most frame (which is named "(root)") that amounts to the total time that we have spent under all call stacks that match the filter string.  Therefore, in the example profile when searching for outerWidth, since the running time of (root) is 253ms, we have spent a total of 253 running code with the outerWidth getter somewhere on the stack.

The case you were describing above is one such place where we have called outerWidth, and in that one place we have spent 18ms.  The 253ms is the sum of all such places.  You can see where all of these are in the timeline by looking at where the UI displays visible parts in the histogram.  That is usually helpful to give you a rough idea about the frequencies of the calls.  For example, here we have several disjoint chunks in the filtered histogram which means that no single outerWidth getter has taken a huge amount of time, but there are many calls to this getter, each taking a bit of time here and there and they all add up to a lot of time.

(BTW if you want to get more used to this UI, I think a nice way would be to write contrived micro-benchmarks and profiling them to get a sense of how they would show up, for example calling elem.outerWidth in a loop, or in a series of setInterval callbacks and seeing what the resulting profile looks like in each case.  In case you wanted to give that a shot, I suggest using Firefox Nightly plus the profiler extension that you can download and install from https://perf-html.io.  Once you have the profiler installed, it runs in the background and pressing Ctrl+Shift+2 at any time brings up a profile of the recent events happening in the browser.)

> Also curious about number of calls. Is there any information on that here?
> It seems to me that this would just be from a single call?

Unfortunately the number of calls are invisible to a sampling profiler, because all it knows it what functions were on the call stack at the exact time it chose to interrupt the thread and walk up the call stack.  Based on the shape of the filtered histogram we can conclude that there are many calls to this getter, but it's unclear how many calls happen in each chunk of the filtered histogram.

> > > Am I missing something about the other outerWidth calls? Do you have a
> > > profile where there's more than one of those? I also don't see anything show
> > > up for scrollTop in your profile. Maybe I'm not searching for the right
> > > string there?
> > 
> > Here is a saved search for scrollTop from the same profile:
> > <https://clptr.io/2kqyGLh>.  Can you please compare it to what you have on
> > your side?  Here it should ~76ms of time being spent under this function.
> 
> If I do the same thing as above and expand the tree out to where scrollTop
> actually appears, I only see 13ms. Can you clarify which number is correct,
> and where the discrepancy would come from? Thanks!

Again, that is probably one scrollTop call site.  If you want to know the total cost, you should filter on it and look at the running time of the (root) node.
Flags: needinfo?(ehsan)
Whiteboard: [platform-rel-Google][platform-rel-GoogleSuite][platform-rel-GoogleSlides][sitewait] → [platform-rel-Google][platform-rel-GoogleSuite][platform-rel-GoogleSlides][sitewait][qf-]
platform-rel: + → -
Keywords: perf
Priority: -- → P5
Product: Tech Evangelism → Web Compatibility

See bug 1547409. Moving webcompat whiteboard tags to keywords.

You need to log in before you can comment on or make changes to this bug.