1026742 - 43.8% robocheck2 android regression on inbound (fx33) from revision 32c13ea4c9e4

Reporter

Description

•

10 years ago

here is a graph that outlines the regression:
http://graphs.mozilla.org/graph.html#tests=[[201,63,20],[201,63,29]]&sel=1400448109818,1403040109818&displayrange=30&datatype=running

I did some retriggers:
https://tbpl.mozilla.org/?tree=Mozilla-Inbound&fromchange=f0d67b1ccff9&tochange=23934a59230e&jobname=Android%202.2%20Tegra%20mozilla-inbound%20talos%20remote-trobocheck2

we went from ~8 to ~10, this seems to be a solid regression.

It looks to be from this revision:
http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=9d16dee917ed&tochange=32c13ea4c9e4

Joel Maher ( :jmaher ) (UTC -8)

Reporter

Comment 1

•

10 years ago

android 4.0 has a 43.8% regression
android 2.2 has a 26.1% regression

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Updated

•

10 years ago

Summary: 43.8% robocop android regression on inbound (fx33) from revision 32c13ea4c9e4 → 43.8% robocheck2 android regression on inbound (fx33) from revision 32c13ea4c9e4

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 2

•

10 years ago

I just uplifted bug 1023882 to aurora so this regression will likely show up there as well. I also pushed a bunch of try pushes to bisect which patch in the set was responsible for this. I'll update with results here.

status-firefox31: --- → unaffected

status-firefox32: --- → affected

status-firefox33: --- → affected

Version: unspecified → 33 Branch

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 3

•

10 years ago

Based on the try pushes I did, the regression appears to have come from the last patch in the set, part 8. Pretty surprising to me, I'll look into it some more.

https://tbpl.mozilla.org/?tree=Try&rev=0cb92f2a5b29 (based off cset 8508e0cc42b8)
https://tbpl.mozilla.org/?tree=Try&rev=930957239032 (based off cset 32c13ea4c9e4)

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 4

•

10 years ago

The only possible way part 8 [1] could have a functional difference is if we reach the bottom of ClientTiledThebesLayer::Render with updatedLowPrecision == false && !mPaintData.mPaintFinished. In this case, we now call EndPaint() where before we would not have run that code. I pushed another try push [2] to see if changing this back fixes the robocheck2 regression.

If this is in fact the case, then I think the new behaviour is more correct, even if it regresses the talos numbers. My reasoning is that if we reach the end of that function, then we have drawn neither new high-precision nor new low-precision content. Setting the repeat transaction flag will presumably just do the same thing over and over (and cause infinite recursion), which is why we're not setting a repeat transaction flag there. So it makes sense that we should mark the paint as finished.

The difference in talos numbers probably comes from the mLastScrollOffset value not getting updated, and causing the next paint to behave differently (and incorrectly, I would think).

Chris, thoughts? I'm inclined to mark this bug WONTFIX because I think the new behaviour is more correct than it used to be, and should actually behave more consistently in user-facing scenarios.

[1] https://hg.mozilla.org/try/rev/32c13ea4c9e4
[2] https://tbpl.mozilla.org/?tree=Try&rev=7c34238ba939

Flags: needinfo?(chrislord.net)

Chris Lord [:cwiiis]

Comment 5

•

10 years ago

(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #4)
> The only possible way part 8 [1] could have a functional difference is if we
> reach the bottom of ClientTiledThebesLayer::Render with updatedLowPrecision
> == false && !mPaintData.mPaintFinished. In this case, we now call EndPaint()
> where before we would not have run that code. I pushed another try push [2]
> to see if changing this back fixes the robocheck2 regression.
> 
> If this is in fact the case, then I think the new behaviour is more correct,
> even if it regresses the talos numbers. My reasoning is that if we reach the
> end of that function, then we have drawn neither new high-precision nor new
> low-precision content. Setting the repeat transaction flag will presumably
> just do the same thing over and over (and cause infinite recursion), which
> is why we're not setting a repeat transaction flag there. So it makes sense
> that we should mark the paint as finished.
> 
> The difference in talos numbers probably comes from the mLastScrollOffset
> value not getting updated, and causing the next paint to behave differently
> (and incorrectly, I would think).
> 
> Chris, thoughts? I'm inclined to mark this bug WONTFIX because I think the
> new behaviour is more correct than it used to be, and should actually behave
> more consistently in user-facing scenarios.
> 
> [1] https://hg.mozilla.org/try/rev/32c13ea4c9e4
> [2] https://tbpl.mozilla.org/?tree=Try&rev=7c34238ba939

This analysis sounds reasonable, but if it was correct, if anything I'd have expected tcheck to have gotten better and tscroll to get worse (i.e. I think the result of what you're saying would have been drawing either skipped or done in low precision, which would likely negatively affect tcheck numbers).

That said, I'm often finding my expectations are challenged wrt tiling/progressive rendering, so let's wait on the numbers (and make sure to get lots of repeat tests too) - it might also be worth enabling and looking at the logging output with and without this part 8 to see the metrics and paths being chosen.

Flags: needinfo?(chrislord.net)

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 6

•

10 years ago

Btw the try push [2] in comment 4 does seem to fix the regression, so it is the case that we are reaching the end of the function with updatedLowPrecision == false && !mPaintData.mPaintFinished. Previously we would exit the function at that point without scheduling a repeat transaction or finishing the paint. Now we finish the paint, and that difference is the cause of the talos regression.

Chris Lord [:cwiiis]

Comment 7

•

10 years ago

(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #6)
> Btw the try push [2] in comment 4 does seem to fix the regression, so it is
> the case that we are reaching the end of the function with
> updatedLowPrecision == false && !mPaintData.mPaintFinished. Previously we
> would exit the function at that point without scheduling a repeat
> transaction or finishing the paint. Now we finish the paint, and that
> difference is the cause of the talos regression.

So the only difference is that the last scroll offset is being updated?

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 8

•

10 years ago

Well, the EndPaint function does this:

  mPaintData.mLastScrollOffset = mPaintData.mScrollOffset;
  mPaintData.mPaintFinished = true;
  mPaintData.mFirstPaint = false;

So technically it also sets the mPaintFinished flag to true when it wouldn't have otherwise, and sets the mFirstPaint to false when it might not have otherwise. I don't think mFirstPaint would be the cause of this regression because this would have to happen on the very first paint for it to make any difference at all.

I suppose setting mPaintFinished to true might cause it to bounce out of [1] on the next paint? I'm not really sure why that check is there, to be honest. It seems like if we get to that point we should just be painting (or trying to). That might also cause more checkerboarding; I'll do a try push with that removed to see if it helps.

[1] http://mxr.mozilla.org/mozilla-central/source/gfx/layers/client/ClientTiledThebesLayer.cpp?rev=fc7250e9b202#353

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 9

•

10 years ago

Actually, BeginPaint() always resets mPaintFinished back to false, so there's no way that check will ever trigger.

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Updated

•

10 years ago

Assignee: nobody → bugmail.mozilla

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Updated

•

10 years ago

Depends on: 1029278

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 10

•

10 years ago

So based on some try pushes ([1], [2], [3], [4]) it looks like a third or so of the regression comes from setting the mLastScrollOffset, and the remainder comes from setting mPaintFinished = true.

[1] https://tbpl.mozilla.org/?tree=Try&rev=7c34238ba939
[2] https://tbpl.mozilla.org/?tree=Try&rev=de06cfa07135
[3] https://tbpl.mozilla.org/?tree=Try&rev=dc5155ffcfa5
[4] https://tbpl.mozilla.org/?tree=Try&rev=cbad9abf5b66

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 11

•

10 years ago

So based on the analysis above, I think this regression is acceptable because the code is more correct now.

Status: NEW → RESOLVED

Closed: 10 years ago

status-firefox32: affected → wontfix

status-firefox33: affected → wontfix

Resolution: --- → WONTFIX

Bugzilla

Quick Search

43.8% robocheck2 android regression on inbound (fx33) from revision 32c13ea4c9e4

Categories

(Core :: Graphics: Layers, defect)

Tracking

()

People

(Reporter: jmaher, Assigned: kats)

References

Details

(Keywords: perf, regression, Whiteboard: [talos_regression])

Crash Data

Security

(public)

User Story

Description

Comment 1

Updated

Comment 2

Comment 3

Comment 4

Comment 5

Comment 6

Comment 7

Comment 8

Comment 9

Updated

Updated

Comment 10

Comment 11