ProgressiveUpdate returning incorrect value

RESOLVED FIXED in Firefox 20

Status

()

defect
RESOLVED FIXED
7 years ago
6 years ago

People

(Reporter: nrc, Assigned: cwiiis)

Tracking

({regression})

21 Branch
mozilla22
ARM
Android
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox20+ fixed, firefox21+ fixed, firefox22+ fixed)

Details

Attachments

(1 attachment, 1 obsolete attachment)

See http://mxr.mozilla.org/mozilla-central/source/gfx/layers/basic/BasicTiledThebesLayer.cpp#438

It looks like we should not be returning false there. The code does not match the comment.
Assignee: nobody → ncameron
It does also look wrong to me. Can you confirm Chris?
Posted patch patch (obsolete) — Splinter Review
Attachment #715254 - Flags: review?(chrislord.net)
Comment on attachment 715254 [details] [diff] [review]
patch

Review of attachment 715254 [details] [diff] [review]:
-----------------------------------------------------------------

Nice catch, but I don't think the fix is right.

::: gfx/layers/basic/BasicTiledThebesLayer.cpp
@@ +435,5 @@
>                                              aScrollOffset,
>                                              aResolution,
>                                              repeat);
>  
> +    isBufferChanged |= repeat;

This doesn't look right either though - If there's no repeat, but the buffer is updated once, this will return false? Seems like you could just put isBufferChanged = true after the regionToPaint.IsEmpty() check.
Attachment #715254 - Flags: review?(chrislord.net) → review-
Posted patch patchSplinter Review
Attachment #715254 - Attachment is obsolete: true
Attachment #715610 - Flags: review?(chrislord.net)
(In reply to Chris Lord [:cwiiis] from comment #4)
> Comment on attachment 715254 [details] [diff] [review]
> patch
> 
> Review of attachment 715254 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Nice catch, but I don't think the fix is right.
> 
> ::: gfx/layers/basic/BasicTiledThebesLayer.cpp
> @@ +435,5 @@
> >                                              aScrollOffset,
> >                                              aResolution,
> >                                              repeat);
> >  
> > +    isBufferChanged |= repeat;
> 
> This doesn't look right either though - If there's no repeat, but the buffer
> is updated once, this will return false? Seems like you could just put
> isBufferChanged = true after the regionToPaint.IsEmpty() check.

Ah, I misunderstood when repeat would be true. And you are right, your suggestion fixes a bug I have been chasing for a week (on and off).

I'm not sure why this is more obvious on the refactoring than trunk, but this is a fairly frequent and annoying bug, we should probably push this patch as far as we can.
Comment on attachment 715610 [details] [diff] [review]
patch

This is good to go.
Attachment #715610 - Flags: review?(chrislord.net) → review+
tracking-fennec: --- → ?
ftr, I think we should uplift this to beta and aurora - low risk and high gain.
Comment on attachment 715610 [details] [diff] [review]
patch

[Approval Request Comment]
Bug caused by (feature/regressing bug #): Progressive Update of tiled layers
User impact if declined: rendering glitches, flickering
Testing completed (on m-c, etc.): testing on m-c and graphics branch
Risk to taking this patch (and alternatives if risky): low, this is a localised and small bug fix
String or UUID changes made by this patch: none
Attachment #715610 - Flags: approval-mozilla-beta?
Attachment #715610 - Flags: approval-mozilla-aurora?
Tracking since it looks like this is a regression in FF20 caused by bug 783368
Comment on attachment 715610 [details] [diff] [review]
patch

Approving low risk uplift, early in FF20 beta cycle.
Attachment #715610 - Flags: approval-mozilla-beta?
Attachment #715610 - Flags: approval-mozilla-beta+
Attachment #715610 - Flags: approval-mozilla-aurora?
Attachment #715610 - Flags: approval-mozilla-aurora+
https://hg.mozilla.org/mozilla-central/rev/9f044f9e981a
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla22
We seem to be having Talos problems:



Regression: Mozilla-Beta - Robocop Checkerboarding Real User Benchmark - Android 2.2 (Native) - 127% increase
-------------------------------------------------------------------------------------------------------------
    Previous: avg 4.743 stddev 0.236 of 30 runs up to revision 58ba8dc9013a
    New     : avg 10.752 stddev 0.245 of 5 runs since revision 99c93cc91750
    Change  : +6.008 (127% / z=25.428)
    Graph   : http://mzl.la/12Nk5wC

Changeset range: http://hg.mozilla.org/releases/mozilla-beta/pushloghtml?fromchange=58ba8dc9013a&tochange=99c93cc91750

Changesets:
  * http://hg.mozilla.org/releases/mozilla-beta/rev/99c93cc91750
    : Nicholas Cameron <ncameron@mozilla.com> - Bug 842389 - return the right thing from ProgressiveUpdate; r=BenWa,cwiiis a=lsblakk
    : http://bugzilla.mozilla.org/show_bug.cgi?id=842389

Bugs:
  * http://bugzilla.mozilla.org/show_bug.cgi?id=842389 - ProgressiveUpdate returning incorrect value





Regression: Mozilla-Beta - Robocop Checkerboarding Benchmark - Android 2.2 (Native) - 2.31e+03% increase
--------------------------------------------------------------------------------------------------------
    Previous: avg 0.024 stddev 0.016 of 30 runs up to revision 58ba8dc9013a
    New     : avg 0.591 stddev 0.051 of 5 runs since revision 99c93cc91750
    Change  : +0.566 (2.31e+03% / z=34.503)
    Graph   : http://mzl.la/WQXOZO

Changeset range: http://hg.mozilla.org/releases/mozilla-beta/pushloghtml?fromchange=58ba8dc9013a&tochange=99c93cc91750

Changesets:
  * http://hg.mozilla.org/releases/mozilla-beta/rev/99c93cc91750
    : Nicholas Cameron <ncameron@mozilla.com> - Bug 842389 - return the right thing from ProgressiveUpdate; r=BenWa,cwiiis a=lsblakk
    : http://bugzilla.mozilla.org/show_bug.cgi?id=842389

Bugs:
  * http://bugzilla.mozilla.org/show_bug.cgi?id=842389 - ProgressiveUpdate returning incorrect value





Regression: Mozilla-Beta - Tp5 No Network Row Major MozAfterPaint (Main RSS) - MacOSX 10.6 (rev4) - 2.88% increase
------------------------------------------------------------------------------------------------------------------
    Previous: avg 197677933.333 stddev 2287798.806 of 30 runs up to revision 58ba8dc9013a
    New     : avg 203362800.000 stddev 416356.458 of 5 runs since revision 99c93cc91750
    Change  : +5684866.667 (2.88% / z=2.485)
    Graph   : http://mzl.la/WRfepf

Changeset range: http://hg.mozilla.org/releases/mozilla-beta/pushloghtml?fromchange=58ba8dc9013a&tochange=99c93cc91750

Changesets:
  * http://hg.mozilla.org/releases/mozilla-beta/rev/99c93cc91750
    : Nicholas Cameron <ncameron@mozilla.com> - Bug 842389 - return the right thing from ProgressiveUpdate; r=BenWa,cwiiis a=lsblakk
    : http://bugzilla.mozilla.org/show_bug.cgi?id=842389

Bugs:
  * http://bugzilla.mozilla.org/show_bug.cgi?id=842389 - ProgressiveUpdate returning incorrect value
(In reply to Nick Cameron [:nrc] from comment #16)
> We seem to be having Talos problems:
> Regression: Mozilla-Beta - Robocop Checkerboarding Real User Benchmark -
> Android 2.2 (Native) - 127% increase
>     Previous: avg 4.743 stddev 0.236 of 30 runs up to revision 58ba8dc9013a
>     New     : avg 10.752 stddev 0.245 of 5 runs since revision 99c93cc91750
>     Change  : +6.008 (127% / z=25.428)
>     Graph   : http://mzl.la/12Nk5wC

> Regression: Mozilla-Beta - Robocop Checkerboarding Benchmark - Android 2.2
> (Native) - 2.31e+03% increase
>     Previous: avg 0.024 stddev 0.016 of 30 runs up to revision 58ba8dc9013a
>     New     : avg 0.591 stddev 0.051 of 5 runs since revision 99c93cc91750
>     Change  : +0.566 (2.31e+03% / z=34.503)
>     Graph   : http://mzl.la/WQXOZO
> Regression: Mozilla-Beta - Tp5 No Network Row Major MozAfterPaint (Main RSS)
> - MacOSX 10.6 (rev4) - 2.88% increase
>     Previous: avg 197677933.333 stddev 2287798.806 of 30 runs up to revision
> 58ba8dc9013a
>     New     : avg 203362800.000 stddev 416356.458 of 5 runs since revision
> 99c93cc91750
>     Change  : +5684866.667 (2.88% / z=2.485)
>     Graph   : http://mzl.la/WRfepf

I would expect drops, we're going to end up doing more work than before (correctly so) - none of these drops look hugely significant though tbh, 127% isn't much when the initial value is so small.
I think we should tolerate this regression since correctness wins over perf. I will have many more ideas for improving the performance so I rather spend efforts on that instead.
Depends on: 843997
This will need to be backed out of mozilla-beta (FF20) due to breaking sites (see bug 843997)
(In reply to Benoit Girard (:BenWa) from comment #20)
> Backout:
> https://hg.mozilla.org/releases/mozilla-beta/rev/bb28f76aa704

Please backout on aurora as well.
https://hg.mozilla.org/mozilla-central/rev/f2ec16a9feea
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Sorry for the bustage here. It would be good to know why this breaks for certain websites - the few I tested with were fine. It is a little worrying that we can render essentially nothing for a whole bunch of sites and still pass all our tests.

Cwiiis or BenWa - do you have cycles to look at this or should I (I am a little snowed under with layers refactoring, but could find time for this)?
(In reply to Nick Cameron [:nrc] from comment #24)
> Sorry for the bustage here. It would be good to know why this breaks for
> certain websites - the few I tested with were fine. It is a little worrying
> that we can render essentially nothing for a whole bunch of sites and still
> pass all our tests.
> 
> Cwiiis or BenWa - do you have cycles to look at this or should I (I am a
> little snowed under with layers refactoring, but could find time for this)?

I have no idea why this would break - did it break on all versions, or just beta and/or aurora? It doesn't make a whole lot of sense, unless it triggers a path that is otherwise not used (thinking of the broken single-tile path)

I can look at this next week, though I may not get to it until mid-week, depending on how bug 716403 goes. Would that be ok?
I tried to investigate this on the graphics branch where the patch is still applied, but I could not reproduce, which is a bit odd. I'll try with trunk once I get a build going.
(In reply to Chris Lord [:cwiiis] from comment #25)
> (In reply to Nick Cameron [:nrc] from comment #24)
> > Sorry for the bustage here. It would be good to know why this breaks for
> > certain websites - the few I tested with were fine. It is a little worrying
> > that we can render essentially nothing for a whole bunch of sites and still
> > pass all our tests.
> > 
> > Cwiiis or BenWa - do you have cycles to look at this or should I (I am a
> > little snowed under with layers refactoring, but could find time for this)?
> 
> I have no idea why this would break - did it break on all versions, or just
> beta and/or aurora? It doesn't make a whole lot of sense, unless it triggers
> a path that is otherwise not used (thinking of the broken single-tile path)
> 
> I can look at this next week, though I may not get to it until mid-week,
> depending on how bug 716403 goes. Would that be ok?

That'd be great, thanks! I'm vaguely looking into it, so let me know when you start. I'll post anything I find here.
Well I couldn't recreate the bug, all the mobile sites rendered fine for me with the patch. Over to you :-) Let me know if I can help out some how.
Blocks: 828995
Whiteboard: [no-nag]
Whiteboard: [no-nag]
Comment on attachment 715610 [details] [diff] [review]
patch

Marking this patch obsolete so we can continue to monitor this patch for tracking without sending "unlanded" email reminders daily.
Attachment #715610 - Attachment is obsolete: true
Assignee: ncameron → chrislord.net
(In reply to Chris Lord [:cwiiis] from comment #25)

> I can look at this next week, though I may not get to it until mid-week,
> depending on how bug 716403 goes. Would that be ok?

Hey Chris - since this bug is a tracked regression and bug 716403 is not currently tracking for a particular release (that I can see) do you have time to try another fix here that we could get landed to trunk, verified and hopefully uplifted before Tues March 12th when we go to build with FF20 beta 4?
(In reply to Chris Lord [:cwiiis] from comment #25)
> I have no idea why this would break - did it break on all versions, or just
> beta and/or aurora?

I could still do with an answer to this - seeing as Nick couldn't reproduce, did this break in all versions or just non-Nightly? Do we have some solid STR?
Flags: needinfo?
Comment on attachment 715610 [details] [diff] [review]
patch

clearing the approvals since this got backed out
Attachment #715610 - Flags: approval-mozilla-beta+
Attachment #715610 - Flags: approval-mozilla-aurora+
Flags: needinfo?
Comment on attachment 715610 [details] [diff] [review]
patch

Removing the obsolete flag seeing as the approval has been removed - I still see nothing wrong with this patch and suspect it's a bad interaction with old code.

I'm going to test this on Nightly and if I can't find anything, I'm pushing it to inbound - if there are issues, we need STR, otherwise we can let this ride the trains.
Attachment #715610 - Attachment is obsolete: false
I also can't reproduce on Nightly, will push when the tree is open.
https://hg.mozilla.org/mozilla-central/rev/9c58cb6c3b86
Status: REOPENED → RESOLVED
Closed: 7 years ago7 years ago
Resolution: --- → FIXED
Comment on attachment 715610 [details] [diff] [review]
patch

[Triage Comment]
We need to try uplifting this again and see if the issues from previous attempt reproduce.  As Benoit stated in comment 18, if there is just a perf regression we should tolerate it for correctness and to not ship this regression in code.
Attachment #715610 - Flags: approval-mozilla-beta+
Attachment #715610 - Flags: approval-mozilla-aurora+
Depends on: 850396
(In reply to Lukas Blakk [:lsblakk] from comment #37)
> Comment on attachment 715610 [details] [diff] [review]
> patch
> 
> [Triage Comment]
> We need to try uplifting this again and see if the issues from previous
> attempt reproduce.  As Benoit stated in comment 18, if there is just a perf
> regression we should tolerate it for correctness and to not ship this
> regression in code.

I think these issues were caused by bug 850396, which I've marked as blocking this - it's a pretty harmless change - it will be a performance regression, possibly, but using that path will very often result in rendering errors anyway.
hmm, we really don't want to have this pushed without the patch on bug 850396.. Is that also being uplifted? Because if not, this *will* cause problems.
tracking-fennec: ? → ---
You need to log in before you can comment on or make changes to this bug.