937519 - UX (Australis) branch Windows 7/8 Tab Animation Regression Test (TART) regression on 2013-11-07

Reporter

Description

•

12 years ago

Attached image TART UX-Non-PGO vs. Firefox-Non-PGO on Windows 7 (is) — Details

There was a major UX branch regression on the TART talos test starting 2013-11-07 for Windows 7 and Windows 8 only: * Windows 7 Non-PGO: http://bit.ly/181tCT0 (37% regression) * Windows 7 PGO: http://bit.ly/17kSXU8 (38% regression) * Windows 8 Non-PGO: http://bit.ly/19fZnUf (24% regression) * Windows 8 PGO: http://bit.ly/1fy7Jfo (22% regression) Windows 7 regressed to be much worse than m-c. Windows 8 regressed to be the same as m-c instead of faster. UX Regression Range: 9f6d6fa60cf0 to 88104a2deda5 http://hg.mozilla.org/projects/ux/pushloghtml?changeset=88104a2deda5 My first guess would be bug 934860 because it's specifically for the same platforms and could affect animations. I'm going to try bisect now (starting with a backout of bug 934860).

Matthew N. [:MattN]

Reporter

Comment 1

•

12 years ago

Try push with a backout of bug 934860: https://tbpl.mozilla.org/?tree=Try&rev=0cfcf8310549 comparison with UX: http://compare-talos.mattn.ca/?oldRevs=650572c94814&newRev=0cfcf8310549&submit=true comparison with m-c: http://compare-talos.mattn.ca/?oldRevs=80d56658be91&newRev=0cfcf8310549&submit=true We'll see if this fixes the regression.

Matthew N. [:MattN]

Reporter

Comment 2

•

12 years ago

That backout failed to build locally so here's another try run on top of 88104a2deda5 (instead of UX tip) just in case since I don't have time to wait for the builds to finish as I'm going to bed: https://tbpl.mozilla.org/?tree=Try&showall=1&rev=9298f6cdc333 Someone else can feel free to bisect further while I'm sleeping if the backout doesn't solve the problem.

Mike Conley (:mconley) (:⚙️)

Comment 3

•

12 years ago

MattN's build likely failed because he needed a clobber - those first try pushes worked fine, and seem to confirm that bug 934860 is the cause of this TART regression. I'll spin out some comparison profiles today to see where we're spending the time.

Blocks: 934860

Mike Conley (:mconley) (:⚙️)

Comment 4

•

12 years ago

Here's a comparison profile for Win7 on the icon-open-DPI1 subtest, between UX and UX with bug 934860 backed out. The "before" (all samples weighted -1) is UX with bug 934860 backed out. "after" is UX with bug 934860 (df30212c9a7c). http://tests.themasta.com/cleopatra/

Mike Conley (:mconley) (:⚙️)

Comment 5

•

12 years ago

Ugh, sorry, better link: http://tests.themasta.com/cleopatra/?report=8982102221e510100b7fd92eac073b11753a98b3

Matt Woodrow (:mattwoodrow)

Assignee

Comment 6

•

12 years ago

Can you upload the individual profiles too please, I don't find the comparison very easy to interpret.

Emanuel Hoogeveen [:ehoogeveen]

Comment 7

•

12 years ago

Did this improve at all after bug 930451 landed? i.e. after https://hg.mozilla.org/mozilla-central/rev/1574b1da1773 (in November 9th Nightly).

Mike Conley (:mconley) (:⚙️)

Comment 8

•

12 years ago

(In reply to Matt Woodrow (:mattwoodrow) from comment #6) > Can you upload the individual profiles too please, I don't find the > comparison very easy to interpret. Hrm, looking more closely at the profiles I've extracted, I'm not confident I've done it properly. I'll generate some new profiles and post them shortly.

Mike Conley (:mconley) (:⚙️)

Comment 9

•

12 years ago

Ok, the comparison profile I posted actually makes it look like the after profile (normal UX) was faster than the before (UX without bug 934860) profile. Having examined the raw data, I think this is because the after profile (normal UX) is drawing fewer frames than the before profile (UX without bug 934860) because computing those frames is taking longer. Because those frames are skipped, we skip drawing them. So comparison profiles are probably not going to be useful here - the overall time within a subtest (like icon-open-DPI1) is equal (or even shorter!) in the bad case... and the comparison profiles won't help us figure out *why* those frames are taking longer to compute. So, full profiles it is. Coming right up.

Mike Conley (:mconley) (:⚙️)

Comment 11

•

12 years ago

UX + backout patch: http://tests.themasta.com/cleopatra/?report=3f308853717ad5af857040d39c05009e36c89c03 UX proper: http://tests.themasta.com/cleopatra/?report=15fa6710c5063230e87ff65893c2f2ff67273e21

Mike Conley (:mconley) (:⚙️)

Comment 12

•

12 years ago

One thing that jumps out for me is that nsCSSRendering::PaintGradient takes 12x longer with UX proper than with UX + backout patch.

Matthew N. [:MattN]

Reporter

Updated

•

12 years ago

Assignee: MattN+bmo → matt.woodrow

Matt Woodrow (:mattwoodrow)

Assignee

Comment 13

•

12 years ago

(In reply to Mike Conley (:mconley) from comment #12) > One thing that jumps out for me is that nsCSSRendering::PaintGradient takes > 12x longer with UX proper than with UX + backout patch. Where do you see this? I only see 19 samples vs 27. The extra time seems to be split evenly between CreateBrushForPattern and ComputeLinearGradientLine. Both of these could probably be cached, but it's fairly insignificant. I'm looking into us spending a huge chunk of our time rendering SVG images.

Matt Woodrow (:mattwoodrow)

Assignee

Comment 14

•

12 years ago

I'm doing the work for SVG caching in bug 923341. It looks like a makes up a decent amount of our lost time, but not all of it. https://tbpl.mozilla.org/?tree=Try&rev=2bc0b4ff7d81

Depends on: 923341

Matt Woodrow (:mattwoodrow)

Assignee

Comment 15

•

12 years ago

Other things that we regress on: Flushing D2D after painting - 35 samples. CreateOffscreenSurface inside CreateSamplingRestrictedDrawable - 17 samples.

Mike Conley (:mconley) (:⚙️)

Comment 16

•

12 years ago

jrmuizel asked me to log the calls to gfxAlphaBoxBlur::Paint and nsCSSRendering::PaintGradient over the course of a single TART run for both UX proper and UX with the backout patch. UX with backout: gfxAlphaBoxBlur::Paint: 994 calls nsCSSRendering::PaintGradient: 1444 calls UX proper: gfxAlphaBoxBlur::Paint: 1299 calls (+305) nsCSSRendering::PaintGradient: 2128 calls (+684)

Mike Conley (:mconley) (:⚙️)

Comment 17

•

12 years ago

mattwoodrow: Here's the profile from Win7 with your caching patches that you asked me to kick off last night: http://tests.themasta.com/cleopatra/?report=d7f142b07c021f6e2ded89aad9ad71fcd154ccc1 What do you think of gavin's proposal[1] to back out 934860 while we work on this? [1]: https://mail.mozilla.org/pipermail/firefox-dev/2013-November/001123.html

Flags: needinfo?(matt.woodrow)

Matt Woodrow (:mattwoodrow)

Assignee

Updated

•

12 years ago

Depends on: 938395

Matt Woodrow (:mattwoodrow)

Assignee

Comment 18

•

12 years ago

Some more things: We're spending a lot more time inside SSE2 code doing blurring for box shadows on win7. From the code, it looks like we always allocate a surface for the full box shadow, and blur all of it, even if only part of it will get drawn back. Trying to clip the blurred surface size to what we actually need doesn't help much since we still need to pad the surface with space for the blur/spread radius (both of which are large in the australis use case). I tried skipping this padding step and it's a significant win, but obviously renders wrong. I'm not sure if there's a way to do this correctly. Caching the blurs would be a big win here too. The other problem we have is the blurred element is using opacity:0.85. This forces a css stacking context, and goes through some fairly expensive paths since we have to ensure that the opacity is applied to everything drawn by the element and all of it's descendants (even though there are none in this case). This is adding quite a bit of overhead even without my patches, though it is exacerbated by going through that code multiple times. I believe we can move the opacity onto the color of the box blur without changing the visible rendering, see https://hg.mozilla.org/try/rev/ae927eee2be6. I think mconley is going to confirm that this doesn't regress anything visually. With both of the above fixes applied, we're actually faster than we were before my patches landed. I'll still waiting on results without the incorrect blur clipping patch applied, but I expect we should be approximately even. For reference: With all of my patches backed out: https://tbpl.mozilla.org/?tree=Try&rev=f83d63c3282e (has SPS profiler enabled) win8 profile: http://people.mozilla.org/~bgirard/cleopatra/#report=4e5aea5120655cbeacae95c32692fe5d2f44d2fa win7 profile: http://people.mozilla.org/~bgirard/cleopatra/#report=5aa3d5fdf549c6b098008c27d6f8f6022b47c49a With all my patches so far (including the broken one): https://tbpl.mozilla.org/?tree=Try&rev=ae927eee2be6 https://tbpl.mozilla.org/?tree=Try&rev=993edcdb2e90 (has SPS profiler enabled) win 8 profile: http://people.mozilla.org/~bgirard/cleopatra/#report=9be468954f7a1bdbe9386c5e0e47b262f6f23628 win 7 profile: http://people.mozilla.org/~bgirard/cleopatra/#report=1be66b5a54fe3cc065c7750bc535c7fc3bd65381 With all patches except the broken one: https://tbpl.mozilla.org/?tree=Try&rev=367aa58b8507 https://tbpl.mozilla.org/?tree=Try&rev=d1f6151e3726 (has SPS profiler enabled) Will post profiles for this last one once they complete, unless someone beats me to it.

Flags: needinfo?(matt.woodrow)

Mike Conley (:mconley) (:⚙️)

Comment 19

•

12 years ago

(In reply to Matt Woodrow (:mattwoodrow) from comment #18) > Will post profiles for this last one once they complete, unless someone > beats me to it. Here it is: http://tests.themasta.com/cleopatra/?report=5f5639c94d430320bcc5291d32cf4840bd589cab

Mike Conley (:mconley) (:⚙️)

Comment 20

•

12 years ago

Bug 934860 got backed out, so this regression is gone.

Status: ASSIGNED → RESOLVED

Closed: 12 years ago

Resolution: --- → WORKSFORME

Emanuel Hoogeveen [:ehoogeveen]

Comment 21

•

12 years ago

Technically it was only hidden behind a (hidden) pref: layout.paint_rects_separately. From what I've seen, it will likely be re-enabled after some other optimizations for D2D clipping land (e.g. bug 938395), and will hopefully not reintroduce this regression at that time.