Closed Bug 1187451 Opened 9 years ago Closed 9 years ago

TPS score doesn't always measure painting due to MozAfterPaint sequence mismatch

Categories

(Testing :: Talos, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: BenWa, Assigned: BenWa)

References

Details

Attachments

(1 file, 5 obsolete files)

No description provided.
Best way to see this in action is to run TPS with profiling enabled: http://people.mozilla.org/~bgirard/cleopatra/#report=988232cbe655a2074677902813f62faaf7cb0d7d&filter=[{"type"%3A"RangeSampleFilter","start"%3A44429,"end"%3A46908},{"type"%3A"RangeSampleFilter","start"%3A46490,"end"%3A46505}] at ~t=46500ms we're loading reddit.html but we never paint. We also get no VSync this interval.
Summary: TPS doesn → TPS doesn't always painting if VSync doesn't occur while activating the tab
http://people.mozilla.org/~bgirard/cleopatra/#report=3b414ebdc1e5791755ffc5f450440bf4cfb49d44 Here it's chinaz and globo. I'm not entirely sure if it's caused by Vsync because 'chinaz.com' had a Vsync marker while it was loading.
Blocks: 1187081
Had a look at this with mattwoodrow. The problem is that we listen to MozAfterPaint but we have no guarantee that the MozAfterPaint is for the current tab. In some cases we listen to a MozAfterPaint which was for a previous paint that is still pending in the event loop. What we're going to do is switch to about:blank, let the paint settle down then we're going to use MozAfterPaint. This will have the side benefit of having a more fair tab switch time. In theory a test can have a noisy disadvantage cause by tearing down it's previous tab' layer tree which may be more or less costly then others. This will remove a tiny source of noise at the same time. http://mxr.mozilla.org/mozilla-central/source/testing/mochitest/tests/SimpleTest/paint_listener.js#55
I should say an alternative fix would be to have some sequence numbers and/or a more useful event handler. It wouldn't solve the noise problem of tearing down the previous page and would be more difficult to implement (require platform changes) but would be more versatile. Right now I'm not planning on doing this.
Summary: TPS doesn't always painting if VSync doesn't occur while activating the tab → TPS score doesn't always measure painting due to MozAfterPaint sequence mismatch
Attached patch WIP (obsolete) — Splinter Review
This should be better but it still doesn't work. Maybe I'm still getting a bad MozAfterPaint event.
I've replaced MozAfterPaint to use rAF and the results look a lot better now: http://people.mozilla.org/~bgirard/cleopatra/#report=2e583c8b3015dab85d76915a49ee695545c6b153
Attached patch patch (obsolete) — Splinter Review
Please forward the review if you're not the best person to look at this.
Assignee: nobody → bgirard
Attachment #8638799 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #8639452 - Flags: review?(blassey.bugs)
Attached patch patch (obsolete) — Splinter Review
Attachment #8639452 - Attachment is obsolete: true
Attachment #8639452 - Flags: review?(blassey.bugs)
Attachment #8639453 - Flags: review?(blassey.bugs)
Blocks: 1187442
Blocks: 1188543
Attachment #8639453 - Flags: review?(blassey.bugs) → review+
url: https://hg.mozilla.org/build/talos/rev/da09edf6d9841937f86123423c346042beb13ad3 changeset: da09edf6d9841937f86123423c346042beb13ad3 user: Benoit Girard <b56girard@gmail.com> date: Mon Jul 27 15:12:32 2015 -0400 description: Bug 1187451 - TPS: Use rAF and waitForAllPaintsFlush instead of MozAfterPaint. r=blassey
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
backed out talos change: http://hg.mozilla.org/build/talos/rev/24ecc5140689 this is because tps didn't run on try and timed out- it appears to be this change: http://hg.mozilla.org/build/talos/rev/da09edf6d984 but backing it out has a lot of merge conflicts so we backed all of them out.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Attached file patch v2 (obsolete) —
Attachment #8639453 - Attachment is obsolete: true
What's the deal with all the missing logs? Can we get that fixed?
Flags: needinfo?(jmaher)
Attached patch patch v3 (obsolete) — Splinter Review
I removed waitForAllPaintsFlushed because it's not clear that it was helping and it was causing complexity.
Attachment #8641208 - Attachment is obsolete: true
url: https://hg.mozilla.org/build/talos/rev/2d45216ff5eb9aa4a2f527d6dadb6aab083a0311 changeset: 2d45216ff5eb9aa4a2f527d6dadb6aab083a0311 user: Benoit Girard <b56girard@gmail.com> date: Mon Jul 27 15:12:32 2015 -0400 description: Bug 1187451 - TPS: Use rAF instead of MozAfterPaint. r=blassey
I was meaning to push to my local repo instead to do try runs. Just noticed the mistake. I already have re-triggers using this new repo running but I expect things to be better now: https://treeherder.mozilla.org/#/jobs?repo=try&revision=51a2309b4ba8
Attached patch patch v4Splinter Review
I've tweaked how I handle the initial tab. It was getting too finicky to exclude it from the result set. This should be less finicky.
Attachment #8641328 - Attachment is obsolete: true
Alright the latest run is green, look like v4 did the trick. I'll land the delta tomorrow since I pushed to the master talos repo by accident (pushed to default instead of try).
there is a raw log which includes the stdout/stderr from the process. In this case we launched the browser and nothing happened. Do let me know if there is something that should have been there.
Flags: needinfo?(jmaher)
url: https://hg.mozilla.org/build/talos/rev/c2dd2a3fe8f5d7eee2731e4f9e1682e43b806f16 changeset: c2dd2a3fe8f5d7eee2731e4f9e1682e43b806f16 user: Benoit Girard <b56girard@gmail.com> date: Fri Jul 31 14:05:13 2015 -0400 description: Bug 1187451 - Backout patch v3 for bustage url: https://hg.mozilla.org/build/talos/rev/d44548b8feb996af0595c9c369b96b0ab939b8ae changeset: d44548b8feb996af0595c9c369b96b0ab939b8ae user: Benoit Girard <b56girard@gmail.com> date: Mon Jul 27 15:12:32 2015 -0400 description: Bug 1187451 - TPS: Use rAF instead of MozAfterPaint. r=blassey
this is deployed and we have some results of what has changed: https://treeherder.allizom.org/perf.html#/compare?originalProject=mozilla-inbound&originalRevision=b2db9b75841a&newProject=mozilla-inbound&newRevision=768312f4d281&hideMinorChanges=1 I am starting to think that all OSX 10.10 results are rubbish given the fact that the majority of comparisons I see show significant changes. As you can see there are a lot of random improvement/regressions there, in this case tps is the area of focus and you can see the change. My understanding is this is expected and now we are measuring something a lot more realistic. :BenWa, can you confirm that?
Flags: needinfo?(bgirard)
If I'm reading this correctly the stddev for 10.10 went from 11% to 6% so things got better with this patch? As far as I can tell what we were measuring before was not correct since we would score a page without painting it somewhat randomly. From what I was seeing with profiling we are still off by quite a bit with this patch. I've already filed bug 1188635 to get things under control. Getting rid of the GC/CC will help a bit.
Flags: needinfo?(bgirard) → needinfo?(jmaher)
yes, the stddev went down. I just wanted to know if I should ignore these TPS *regressions*. It sounds like we have another round of it coming up soon.
Flags: needinfo?(jmaher)
As far as I can tell these regressions are expected because we're more accurately measuring the thing that we want. In some cases we got away with randomly not measuring a subtest run properly which would explain why our previous test scores were lower and had a higher stddev. Even if the numbers are worse, if we're more confident we're always measuring page paint and the stddev is going down I believe it's a sign that TPS is converging towards being a more useful test. Also a lower stddev means better signal to noise.
+1 to benwa. It's expected because we now measure a different thing which is not necessarily comparable to the previous thing it measured. And the new thing is better, so it just resets our reference point.
Patches have landed and it sounds like we're good with the numbers. Resolving.
Status: REOPENED → RESOLVED
Closed: 9 years ago9 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: