Closed Bug 1209697 Opened 9 years ago Closed 8 years ago

Firefox rendering is extremely slow when flexbox row is combined with absolutely positioned overlays.

Categories

(Core :: Layout, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla54
Tracking Status
firefox53 --- fixed
firefox54 --- fixed

People

(Reporter: lukebmay, Assigned: emilio)

References

(Blocks 1 open bug)

Details

Attachments

(4 files, 2 obsolete files)

User Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/45.0.2454.85 Chrome/45.0.2454.85 Safari/537.36

Steps to reproduce:

Create around 15 nested flexbox elements each of type `flex-direction: row`, and on each element also include an absolutely positioned overlay. 


Actual results:

A rendering delay of around 5-10 seconds, or sometimes a browser crash occurs.


Static example - Depth 15:

    included file:
        static.html
    jsfiddle:
        http://jsfiddle.net/lukebmay/aghfkwmc/1/

    - FF 41.0 - Crashes
    - FF 44.0a1 (2015-09-29) - Barely works


Dynamic example - knockout.js - Starts at Depth 5:

    included file:
        dynamic.html (I couldn't include 2 files on initial bug, but I can attach this one on a comment.)
    jsfiddle:
        http://jsfiddle.net/lukebmay/adjng0tf/

    - FF 41.0 - major slowdown at depth 15 and up
    - FF 44.0a1 (2015-09-29) - major slowdown at depth 15 and up


Expected results:

Each nested flex element should contain a child and an absolutely positioned overlay with practically unnoticeable rendering delays.

This works fine on chrome.

One interesting thing to note is that the bug does not appear to happen with `flex-direction: column`.
For what it's worth, Chrome dev channel (v47) seems to be about as slow as Firefox Nightly (v44) on the attached "static.html" testcase -- each takes about 5 seconds to load.
Status: UNCONFIRMED → NEW
Component: Untriaged → Layout
Ever confirmed: true
Product: Firefox → Core
Version: 41 Branch → Trunk
MS Edge (loaded via BrowserStack, so on a virtual machine with different perf characteristics) ends up "Not responding" for ~10 seconds when I load the attached "static.html" testcase, too.

So, everyone seems about equally bad on this testcase. What Chrome version are you testing where everything works fine?
Flags: needinfo?(lukebmay)
(Chrome dev channel & Firefox nightly behave the same on the dynamic jsfiddle linked in comment 0, too ( http://jsfiddle.net/lukebmay/adjng0tf/ ).  MS Edge seems much more OK on that one, for some reason, though I have to manually type in a depth and hit enter, since they don't render up/downarrows in the input field.)
My chrome version is:
Version 45.0.2454.101 (64-bit)

Chromium version:
Version 45.0.2454.85 Ubuntu 14.04 (64-bit)
Flags: needinfo?(lukebmay)
Thanks. I can confirm that Ubuntu's current chromium-browser package (version 45.0.2454.85 Ubuntu 15.10 (64-bit)) renders the attached testcase pretty much instantly. So, there must be a change between versions 45 and 47 which increases the complexity of rendering (and which makes Chrom[e|ium] match Firefox & Edge in terms of that complexity).
I haven't looked in detail here, but a few things I noticed while tweaking the testcase:
 - the abspos overlay doesn't seem relevant (things are still slow in Firefox Nightly & Chrome Dev Channel if I remove that)
 - I think part of the problem is the horizontal-flex -> vertical-flex -> horizontal-flex -> vertical-flex nesting. You said " the bug does not appear to happen with `flex-direction: column`" -- but the testcase does actually have that style on one of the nesting-levels.  If I remove it, things are fast.
("the testcase" being the attached static.html file)
Attachment #8667488 - Attachment description: static.html → testcase 1 (static.html)
OK, here's a testcase that's identical to testcase 1 except I removed "overflow:auto".  This makes chrome dev channel quite fast, but Firefox and Edge are still slow. (they take a few seconds to load)
Oh interesting!  Sorry for the mis-information :).  I'm trying to work backwards from our complex design to a more simple one for you to examine.  I'll take a look at playing with the alternating flex-directions.  Thanks!
So if I make both ".child" elements and ".outer" elements "flex-direction: column", then it loads almost instantly in FF 44.  If I change only ".child" to row, then it loads in around 7 sec.  If I change both ".outer" and ".child" to "row", then it takes around 15 sec.
Upon further review, I have discovered that there is a different rendering time on initial load than there is when rendering due to a browser window resize. For these tests I set ALL the flex-directions to "row" (in .app, .outer, .child) and I used Firefox 44.0a1 (2015-09-29) nightly.  

In the first test, I kept the overlay divs in, and also kept the CSS that makes them absolutely positioned in as well.  The initial load (static.html) was 20 seconds.  Then upon browser window resize, a processor would fire up to 100% to try and recalculate the layout, but it would never render (I waited around 6 minutes).

In the second test I removed the overlay divs (and abs-pos CSS).  The initial load was nearly instantaneous.  The resize load again failed to render after minutes of processor usage.  

One interesting thing to note is that the browser didn't crash or go gray.  These results would lead me to believe that the abs-pos combined with "flex-direction: row" does have a big impact on the initial rendering calculation (caused 20 second initial rendering delay, versus instantaneous load without the abs-pos).  The second conclusion that I might be inclined to draw is that the initial-load reflow calculation differs in some way than the resize reflow calculation, and there would appear to be a bug with "flex-direction: row" that is causing extremely long (if not indefinite) reflow calculations when the browser window is resized.

In the third test I performed, I again left out the overlay div (and abs-pos CSS), and removed some of the nested elements so that there were 9 remaining instead of the 15 nested elements in the previous tests.  Again the load was nearly instantaneous, but the browser resize reflow calculation took 35 seconds to complete.  I then reduced the number of nested elements to 3, and the load time was instant and the resize time was very quick (not instantaneous, there was a noticeable sub-second delay).

Hopefully this will help debug the issue.  Thanks again Daniel!
Hello, just wondered if there is likely to be any more progress on this issue or issue 1151040 as both appear to be related to rendering speed with flexbox. 

We have web app client applications that control test instrumentation that we design and manufacture and we rely on flexbox for our application layout. Until now we officially support 3 browsers namely Internet Explorer 11, Chrome and Firefox. There was a time (pre version 34 if i remember) when we had to question our ability to support Firefox because of rendering speed issues, there was a marked improvement around version 36 that enabled us to continue to support it but as our applications have evolved this is starting to become an issue once more. 

We do use a lost of nesting with flexbox and a mixture of horizontal and vertical orientations, we also use some overlays controlled with absolute and relative positioning however performance under Chrome and IE are fine, Firefox is unusable on some of our editor pages but is slow in general.

Many thanks
(In reply to gary.cuthbert from comment #12)
> Hello, just wondered if there is likely to be any more progress on this
> issue or issue 1151040 as both appear to be related to rendering speed with
> flexbox.

There will be progress, though I don't know exactly how soon. I'm hoping to have some dedicated time to devote to this in the next few months.

> Firefox is unusable on some of our editor pages but is slow in general.

I'm sorry to hear that! Two questions: 
 (1) Is Edge similarly-bad? (You mentioned IE is fine, but it implements an older snapshot of flexbox, without some of the more-potentially-expensive pieces like "min-width:auto" as I recall. Edge is more up to date and more useful for comparison.) As noted in some comments above, Edge is similarly slow to Firefox on the testcases that we have here so far.  (I verified that this is still the case today.)

 (2) Would it be possible to provide a link to one of these unusable-in-Firefox editor pages? (and a test account, if needed to view the page) [Feel free to send this to me via email, if you're uncomfortable sharing a link publicly]  That may help me identify where the problem is & what hypothetical-optimizations might help on our end. (And it might give me an idea for a suggested hackaround.)
Flags: needinfo?(gary.cuthbert)
Thanks for the quick response. I have checked it out under Edge and there are a couple of rendering issues I need to look at but on the whole its responsiveness feels similar to IE11. If it is of any use I have uploaded a few videos of the client in action under Firefox, Edge and Chrome. 

The Chrome video covers filter setting and general use of editors:
http://take.ms/TUTDU

The Edge and FireFox videos are in two sections, the first being interaction with the filter editors and the second being general interaction with the editors:

Edge Filters:
http://take.ms/Ya6D0

Edge General:
http://take.ms/v7RAa

Firefox Filters:
http://take.ms/uf8f4

Firefox General:
http://take.ms/tdKNl

The videos demonstrate that Chrome is by far the fastest, Edge isn't great but it is use able, Firefox is arguably too slow.

Our client uses the Angular framework so there will be several changes to the DOM as each editor loads & updates.

I will attempt to isolate some examples of our nested flexbox use and create something in plunker that i can share that demonstrates this behavior.
Hello, I have created a plunk that replicates one of the layouts that is considerably slower under Firefox. The plunk can be accessed here:
https://plnkr.co/edit/2g69OyEtauYbdHbASSBR?p=info

The following videos show similar interaction across Chrome:
http://take.ms/GFIK0

Edge (there are rendering glitches here e.g. incomplete borders but I believe we can resolve these):
http://take.ms/bLXp5

and Firefox:
http://take.ms/Cj5ye


Hope the plunk is of some help. 

One thing I have noticed (while investigating on a windows 10 laptop for Edge compatibility) is that a clean installation of Firefox appears to behave much better, it is still the least responsive but it is useable. 

My local Firefox installation, although it is the latest version, has been around for a while. I have disable all of the add ons (there weren't many), cleared the cache but is till get the behavior a see in the video above.

Many thanks
Thanks for coming up with that. I can confirm that I see a noticeable (~1 second) delay on most interactions on that plunk, when I load it with Firefox Nightly (on my Linux laptop), vs. no noticeable delay on Chrome.
Flags: needinfo?(gary.cuthbert) → needinfo?(dholbert)
See Also: → 1328075
I gave a crack at this. The slowness shown in there, as well as the one reported in bug 1328075 seems an exponential reflow blow-up when computing cross axes and heights.

I wrote this and it makes http://jsfiddle.net/lukebmay/aghfkwmc/1/ load way faster, and that plunk work fine.

I don't know if the patch I'm going to post is the right approach, or you David have a better idea. Happy to clean it up if you're fine with it.
Comment on attachment 8823156 [details] [diff] [review]
patch: Cache flex layout reflow results to avoid exponential blowup.

Gah, this patch has some try failures, probably the caching should be a bit more complex.

Still, do you think we should do something like this? Or do you have any better approach?
Attachment #8823156 - Flags: feedback?(dholbert)
I don't think my patch is correct, but we definitely want to prevent exponential reflows (note that this is not that trivial as I initially thought).

In the test case at http://jsfiddle.net/lukebmay/aghfkwmc/1/, with a bit of spew, it seems that my patch was only masquerading the original issue, which is that nsHTMLScrollFrame may reflow the same frame a bunch of times to try to guess which scrollbars to use (see nsHTMLScrollFrame::TryLayout). This reflows also the children frames on flexbox and grid layouts.

That makes overflow: auto terribly slow, specially when nesting them heavily and when combined with frames that need to reflow their children, like flex and grid, cc'ing mats for that.

I'd need to dig a lot more than what I'm able to right now to fix that I think.
Comment on attachment 8823156 [details] [diff] [review]
patch: Cache flex layout reflow results to avoid exponential blowup.

I don't think this patch is correct given the above comment.
Attachment #8823156 - Attachment is obsolete: true
Comment on attachment 8824429 [details]
Bug 1209697: Part 3, Drive-by ReflowInput style fixes.

https://reviewboard.mozilla.org/r/102940/#review104408

Thanks for these whitespace fixes!

They look good, but I'd rather not land them as part of this bug.  The reason being, I anticipate that the rest of this bug *might* need to get backed out or reworked at some future point.  And if that happens, it'll be confusing if there are unrelated definitely-worth-keeping changesets that also landed here.  So, I'm just going to directly import this "part 3" patch into my inbound tree and push it with "(no bug)" since it's whitespace-only anyway.  Marking it as "r-" in this patch-stack, since we should obsolete/discard the version that's here so it doesn't land redundantly.
Attachment #8824429 - Flags: review?(dholbert) → review-
Comment on attachment 8824428 [details]
Bug 1209697: Part 2, Cache flex measuring reflows to avoid exponential behavior.

https://reviewboard.mozilla.org/r/102938/#review104416

I need to think a bit more about the invariant here to convince myself it's valid, but this seems likely-good.

I do have a few non-functional nits, though:

::: layout/generic/nsFlexContainerFrame.cpp:1775
(Diff revision 1)
> + * Right now we only need to save space for the generation, the height and the
> + * ascent. This can be extended later if needed.

s/save space for/cache/          (shorter & clearer)
s/generation/reflow generation/  ("generation" is too vague)

::: layout/generic/nsFlexContainerFrame.cpp:1778
(Diff revision 1)
> + * A cached result for a measuring reflow.
> + *
> + * Right now we only need to save space for the generation, the height and the
> + * ascent. This can be extended later if needed.
> + *
> + * The assumption here is that a given flex item measuring won't change in the

nit: s/measuring/measurement/

::: layout/generic/nsFlexContainerFrame.cpp:1779
(Diff revision 1)
> + *
> + * Right now we only need to save space for the generation, the height and the
> + * ascent. This can be extended later if needed.
> + *
> + * The assumption here is that a given flex item measuring won't change in the
> + * same reflow. Caching it prevents us from doing exponential reflows.

This bit about preventing exponential reflows is perhaps a bit vague -- please add at the end:
"...in cases of deeply-nested flex containers"

::: layout/generic/nsFlexContainerFrame.cpp:1781
(Diff revision 1)
> + * We store them in the frame property table for simplicity, ideally we could
> + * keep the cached FlexLines around for long enough so this would be useful, but

Grammar nit: this wants to be two sentences.

s/simplicity, ideally/simplicity. Ideally/

::: layout/generic/nsFlexContainerFrame.cpp:1795
(Diff revision 1)
> +
> +  CachedMeasuringReflowResult(uint64_t aGeneration,
> +                              nscoord aHeight,
> +                              nscoord aAscent)
> +    : mGeneration(aGeneration)
> +    , mHeight(aHeight)

observation (no action needed): "Height" should eventually be "BSize" here, probably -- but it makes sense to use height as an incremental step for now (since this code currently works in terms of height, due to not being quite writing-mode-ified yet).

So this is fine.
Comment on attachment 8824427 [details]
Bug 1209697: Part 1, Clear ancestor intrinsic sizes when our block size changes.

https://reviewboard.mozilla.org/r/102936/#review104420

This part seems fine to me. One nit:

::: layout/base/PresShell.cpp:9051
(Diff revision 1)
>    // Don't pass size directly to the reflow state, since a
>    // constrained height implies page/column breaking.
>    LogicalSize reflowSize(wm, size.ISize(wm), NS_UNCONSTRAINEDSIZE);
> +  mPresContext->IncrementReflowGeneration();
>    ReflowInput reflowInput(mPresContext, target, &rcx, reflowSize,
> -                                ReflowInput::CALLER_WILL_INIT);
> +                          ReflowInput::CALLER_WILL_INIT);

Nit: these reflowSize/reflowInput variables are kinda tied together. So, let's not insert this new line directly between them.  Let's insert it ~4 lines further up (probably with a blank line on either side), before reflowSize & its documentation.
Attachment #8824427 - Flags: review?(dholbert) → review+
So I have a concern that this will break a particular scenario.  I haven't yet come up with a testcase that triggers that scenario, but I suspect/believe it's possible to trigger. I'll describe it hand-wavily here:

SETUP: We have a flex container whose size depends on its surroundings (e.g. it has a percent size, or its parent is a flex container or a grid, or something like that). It has an auto-sized flex item, with a paragraph of text (so its height depends on the available width).

 (A) This flex container receives a reflow where it's sized at width=100px. (due to an ancestor performing a "measuring reflow" on its children, or as part of with-or-without-scrollbar reflow trials, or something like that.)
 (B) The flex container performs a measuring reflow on its item and determines a height of 50px (which is based on the height of our paragraph, given our available width of 100px).  This gets cached, per the mechanism that you're adding in this bug.
 (C) Later on inside the same PresShell::DoReflow call, the flex container receives a *second* call to reflow, with a *different* width. (e.g. due to an ancestor's "final" reflow on its children, or due to the next step of a with-or-without-scrollbar trial.)   For this reflow, the flex container's width is different -- now it's only 80px.
 (D) Inside this second reflow, we go to perform a measuring reflow on our item -- but we find that we've already got the result cached. BUT, this cached measurement is not what we'd actually get if we went through with a measuring reflow at this point! If we actually proceeded with layout with the now-skinnier flex container, we'd likely find a taller height for our flex item.

So, we'd end up using an incorrect {min-height:auto | flex-basis | stretched-cross-axis} height on our flex item, in this scenario.  And this could produce either overflow or awkward-extra-space (depending on whether the cached measurement was too big or too small).

===

I *think* we can address this by including the item-reflowinput's AvailableSize() in the cached value, and then (on subsequent passes) disregarding the cached value if the current ReflowInput::AvailableSize() isn't equal to the AvailableSize on the cached struct.

This partially defeats the cache (because if the available size keeps oscillating between two possibilities during an exponential-blowup layout run, then our cached values will never be useful).  But if that turns out to be a problem, we could allow flex items to store cached entries for up to N different availablesize values (for some small "N") or something like that.

What do you think?
Flags: needinfo?(dholbert) → needinfo?(emilio+bugs)
BTW, excellent news - not too surprisingly, this patch stack helps with testcases on various other flexbox-perf-issues bugs. My debug (unoptimized) build, with this patch stack applied, performs noticeably better than an unpatched Nightly (opt) build on the following testcases, at least:
  https://bug1150123.bmoattachments.org/attachment.cgi?id=8586968 (watch for few-second hang on hover)
  https://bug1150123.bmoattachments.org/attachment.cgi?id=8588059 (watch for few-second hang on load)
  https://bug1151040.bmoattachments.org/attachment.cgi?id=8588134 (watch for few-second hang when clicking links)
  bug 1178783's testcase (not linking directly to it because it hangs for minutes [at least] in Nightly. Loads pretty much right away with this patch stack.
This also helps with bug 1231246's testcase (which hangs for a few seconds on load, but not with this patch stack), and with the issue described in bug 1194155 comment 11.

I'm documenting all of this because it will be useful to see whether all of these improvements still happen after we've made the change that I suggest in comment 30 here.
Attachment #8824429 - Attachment is obsolete: true
Disclaimer: I _think_ part 1 is not needed. I tried to build a test case to prove it was, but I wasn't able to. Also, try looks the same with and without that patch (modulo intermittents).

 * With part 1: https://treeherder.mozilla.org/#/jobs?repo=try&revision=baaf3e0f63e04a91af627decda542ac1114d4ccf
 * Without part 1: https://treeherder.mozilla.org/#/jobs?repo=try&revision=cd00aad252708c79217240584f4d214c4c2a3eff

I saw some browser tests failing there, but I'm not totally sure that's my fault. There are some TypeErrors there (!), and I think they're just racy with layout (which probably makes some sense, they're testing the customization UI which uses flex layout extensively).

I've verified that everything seems to be working, but I probably need to dig a bit more into those. Daniel, do you know if that's expected (or has happened before, etc.), or what can I do to debug those efficiently?
Flags: needinfo?(emilio+bugs) → needinfo?(dholbert)
Hi, sorry if I'm hijacking this thread but I wanted to say that this test case

https://cpa-level-it.github.io/flexbox/

is now working correctly with the provided patchs. 
You can drag the split pane and will be smooth, before it was lagging a lot and not really usable.

I'm wondering, how long does it generally take for a patch like this to be included in a new release ? We are developing a web app for a company that only allows Firefox as web browser. We heavily rely on flexbok for the app and without this patch the performance aren't really great...

Thanks for your time !
(In reply to paulus.cyril from comment #36)
> Hi, sorry if I'm hijacking this thread but I wanted to say that this test
> case
> 
> https://cpa-level-it.github.io/flexbox/
> 
> is now working correctly with the provided patchs. 
> You can drag the split pane and will be smooth, before it was lagging a lot
> and not really usable.

Hi! Thanks for the testcase, and no worries! :)

> I'm wondering, how long does it generally take for a patch like this to be
> included in a new release ? We are developing a web app for a company that
> only allows Firefox as web browser. We heavily rely on flexbok for the app
> and without this patch the performance aren't really great...

It depends, this patch is currently awaiting review to ensure that it's the right approach to solve the problem. Once it lands it'll be on nightly the next build, but it'll take some time until it arrives to release.

We could consider uplifting it so it arrives sooner, but I'm not sure what needs to happen so that happens :)
Sorry for the review delay. r=me on the new version of part 1. (MozReview auto-carried that forward from the old version)

> We could consider uplifting it so it arrives sooner, but I'm not sure what needs to happen so that happens :)

I'm in favor of an uplift to Aurora, after it's baked on Nightly for a few days, but we should be very open to backing it out if it causes problems. (There may be unforseen cases where we need to make this caching logic a bit smarter to avoid producing invalid results.)
Flags: needinfo?(dholbert)
Comment on attachment 8824428 [details]
Bug 1209697: Part 2, Cache flex measuring reflows to avoid exponential behavior.

https://reviewboard.mozilla.org/r/102938/#review110072

r=me with editorial nits addressed. Thanks again for taking this!

::: layout/generic/nsFlexContainerFrame.cpp:1785
(Diff revision 2)
> + * We store them in the frame property table for simplicity. Ideally we could
> + * keep the cached FlexLines around for long enough so this would be useful, but

This phrase is a bit misleading/confusing:
"Ideally we could keep the cached FlexLines around for long enough so this would be useful"

"the cached FlexLines" aren't something that currently exists, so talking about "the cached FlexLines" doesn't make sense.

I think you mean to say, "Ideally we'd also cache the FlexLines", or something like that?  (I'm not sure.)  Or maybe this sentence just wants to go away?

::: layout/generic/nsFlexContainerFrame.cpp:1794
(Diff revision 2)
> +
> +  CachedMeasuringReflowResult(const LogicalSize& aSize,
> +                              nscoord aHeight,
> +                              nscoord aAscent)
> +    : mAvailableSize(aSize)

s/aSize/aAvailableSize/, for an abundance of clarity (and to match the member variable name more closely)

Otherwise, it's easy to mistakenly assume that aSize represents the child's size (resulting from the reflow).

::: layout/generic/nsFlexContainerFrame.cpp:1813
(Diff revision 2)
> +  ReflowOutput childDesiredSize(aChildReflowInput);
> +  nsReflowStatus childReflowStatus;

Please move the childDesiredSize / childReflowStatus decls down to after the early-return clause.  (closer to their first usage)

(They're only used if we *actually need to reflow*.)

::: layout/generic/nsFlexContainerFrame.cpp:1842
(Diff revision 2)
> +  auto result = new CachedMeasuringReflowResult(
> +      availableSize,
> +      childDesiredSize.Height(),
> +      childDesiredSize.BlockStartAscent());

Stylistic nit: the indentation here (4 spaces) feels kinda arbitrary.

Let's just make this match the coding style / common practice, and wrap after the "=", so you can rid of the extreme-deindentation of the args, like so:

  auto result =
    new CachedMeasuringReflowResult(availableSize,
                                    childDesiredSize.Height(),
                                    childDesiredSize.BlockStartAscent());

::: layout/generic/nsFlexContainerFrame.cpp:1884
(Diff revision 2)
> -  ReflowOutput childDesiredSize(childRIForMeasuringHeight);
> -  nsReflowStatus childReflowStatus;
> -  const uint32_t flags = NS_FRAME_NO_MOVE_FRAME;
> +  CachedMeasuringReflowResult reflowResult =
> +    MeasureAscentAndHeightForFlexItem(
> +        aFlexItem, aPresContext, childRIForMeasuringHeight);

As above, the extreme-deindentation of args here is unnecessary & nonstandard. Please unwrap the args, like so:

  CachedMeasuringReflowResult reflowResult =
    MeasureAscentAndHeightForFlexItem(aFlexItem, aPresContext,
                                      childRIForMeasuringHeight);
Attachment #8824428 - Flags: review?(dholbert) → review+
Pushed by ecoal95@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/cfe569b9d62d
Part 1, Clear ancestor intrinsic sizes when our block size changes. r=dholbert
https://hg.mozilla.org/integration/autoland/rev/b4730a967e2c
Part 2, Cache flex measuring reflows to avoid exponential behavior. r=dholbert
https://hg.mozilla.org/mozilla-central/rev/cfe569b9d62d
https://hg.mozilla.org/mozilla-central/rev/b4730a967e2c
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla54
Assignee: nobody → emilio+bugs
congrats, this patch shows a talos performance improvement:

== Change summary for alert #4993 (as of February 02 2017 16:18 UTC) ==

Improvements:

  5%  cart summary windows7-32 opt     40.23 -> 38.21
  3%  cart summary linux64 opt         32.47 -> 31.41
  3%  cart summary windows7-32 opt     39.59 -> 38.31
  3%  cart summary linux64 opt         30.57 -> 29.61
  3%  cart summary windows8-64 opt     33.15 -> 32.14
  2%  cart summary linux64 pgo         23.39 -> 22.82
  2%  cart summary linux64 pgo         24.95 -> 24.42

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=4993
https://hg.mozilla.org/integration/mozilla-inbound/rev/be38817f54c5dc82b2dca9ede49cd7118f92d959
Bug 1209697 followup - Add comment to explain change that might not be clear given existing comments here and in nsChangeHint.h.
Depends on: 1336708
This patchset broke whatsapp photo view on Windows, OSX and Linux (according to the following mozregression result on Linux). Disable position: relative could fix this issue. 

> media-content .media {
>     position: relative;

So I guess it's related with this bug.

> Last good revision: bdc513580a45e6392816a8de4f3da0c8da5e72b0
> First bad revision: b4730a967e2c796fba51576da7d9e91c4ae557c8
> Pushlog:
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=bdc513580a45e6392816a8de4f3da0c8da5e72b0&tochange=b4730a967e2c796fba51576da7d9e91c4ae557c8
> Looks like the following bug has the changes which introduced the regression:
https://bugzilla.mozilla.org/show_bug.cgi?id=1209697
(In reply to Eric Tsai from comment #48)
> This patchset broke whatsapp photo view on Windows, OSX and Linux (according
> to the following mozregression result on Linux). Disable position: relative
> could fix this issue. 
> 
> > media-content .media {
> >     position: relative;
> 
> So I guess it's related with this bug.

Yes, it is, thanks for checking. Fix is over bug 1336708 :)
Depends on: 1336887
Depends on: 1338053
Comment on attachment 8824427 [details]
Bug 1209697: Part 1, Clear ancestor intrinsic sizes when our block size changes.

Approval Request Comment
[Feature/Bug causing the regression]: none
[User impact if declined]: exponential-time flexbox layout in some cases
[Is this code covered by automated tests?]: yes
[Has the fix been verified in Nightly?]: yes
[Needs manual test from QE? If yes, steps to reproduce]: no
[List of other uplifts needed for the feature/fix]: this also needs the patches over bug 1338053 and bug 1336708
[Is the change risky?]: no
[Why is the change risky/not risky?]: has been in nightly for a while already
[String changes made/needed]: none
Attachment #8824427 - Flags: approval-mozilla-aurora?
(In reply to Emilio Cobos Álvarez [:emilio] from comment #54)
> Comment on attachment 8824427 [details]
> Bug 1209697: Part 1, Clear ancestor intrinsic sizes when our block size
> changes.
> 
> Approval Request Comment

Please note that the approval request is for both of the patches in this bug, along with the aforementioned bugs in the request.
Comment on attachment 8824427 [details]
Bug 1209697: Part 1, Clear ancestor intrinsic sizes when our block size changes.

Fix a flexbox issue. Let's take this in Aurora53. Aurora53+.
Attachment #8824427 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment on attachment 8824427 [details]
Bug 1209697: Part 1, Clear ancestor intrinsic sizes when our block size changes.

I still have questions and might need your help to understand it.
1. For bug 1338053, the issue didn't happen in ff53. 
2. For bug 1336708, it seems that the issue didn't happen in ff53, either.
3. For this bug, the issue seems to be there for while.

All the patches seem big for me. I'm not sure if we will have regressions once all the patches are uplift to aurora53. 

Can we let the fix ride the train and won't fix.
Flags: needinfo?(emilio+bugs)
Attachment #8824427 - Flags: approval-mozilla-aurora+ → approval-mozilla-aurora?
Right, so this bug fixes a long trail of flexbox performance issues, present in firefox for a long time (see all the attached test cases in this bug and in the duplicates).

These patches introduced two regressions, that are fixed now and have been in nightly for a while already. The patches for that are the ones that bug 1338053 and bug 1336708 fix.

It's fine letting this ride the trains, but this bug is causing some negative experiences (see the comments in this bug, and also other things like bug 1328075), so my idea (and :dholbert seemed to agree with this) is that uplifting this to aurora is beneficial.
Flags: needinfo?(emilio+bugs)
(In reply to Gerry Chang [:gchang] from comment #57)
> All the patches seem big for me.

FWIW, the patches in the other 2 bugs are basically small tweaks *to the new code* from this bug's patches. So really, the total code-change in all-the-patches-combined is about the same as the code-change in just this bug here.

> I'm not sure if we will have regressions
> once all the patches are uplift to aurora53. 

Nor am I - but we can aggressively back this out (of aurora/beta) if we notice any regressions that we don't see an immediate fix for.  This is purely an optimization, so it's not expected to affect behavior (aside from making it faster) -- and if it does break something, we can back it out without affecting behavior (aside from letting things go back to being slow as they are now).

> Can we let the fix ride the train and won't fix.

We could, but I think it's worth trying to get this to users sooner to improve their experience.
Comment on attachment 8824427 [details]
Bug 1209697: Part 1, Clear ancestor intrinsic sizes when our block size changes.

For aurora uplift along with work from bug 1338053 and bug 1336708.
Attachment #8824427 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Noting I'm ok with this going to aurora as it seems like a worthwhile perf improvement as well as fixing some broken icon positioning.
(In reply to Liz Henry (:lizzard) (needinfo? me) from comment #61)
> Noting I'm ok with this going to aurora as it seems like a worthwhile perf
> improvement

Hooray!

> as well as fixing some broken icon positioning.

Note: this doesn't actually fix icon positioning - rather, it caused some breakage with icon positioning, and we fixed that in bug 1336708. [Hence the need to uplift them together.]  So, the only expected effect of the combined uplift is a perf improvement (quite a significant one, for some use cases).
Setting qe-verify- based on Emilio's assessment on manual testing needs (see Comment 54) and the fact that this fix has automated coverage.
Flags: qe-verify-
Depends on: 1362758
Depends on: 1449326
Depends on: 1495169
Duplicate of this bug: 1231246
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: