Closed
Bug 1261152
Opened 8 years ago
Closed 8 years ago
Ensure tps is treating e10s fairly
Categories
(Testing :: Talos, defect)
Testing
Talos
Tracking
(e10s+, firefox48 fixed)
RESOLVED
FIXED
mozilla48
People
(Reporter: mconley, Assigned: mconley)
References
(Blocks 1 open bug)
Details
Attachments
(2 files)
The tps test is supposed to measure how long it takes to switch tabs in the browser, and present the content of those tabs to the user. It loads up a bunch of sites from the tp5 set, waits for them all to finish loading, and then switches to each of those tabs, one by one. Here's a profile I recently gathered from an e10s run on Windows: http://people.mozilla.org/~bgirard/cleopatra/?zippedProfile=http://mozilla-releng-blobs.s3.amazonaws.com/blobs/Try/sha512/c44bd51496971b55c413c5c923bf8b6c2c8f2fdcd915a20851d21e9fb819560a995dfc2e9a0b0cbc6f608655a62dfbd4a06199666683507c7974d2d0678c0c42#report=29db309b072a460b01461c67a659fad5cee2d0fd&filter=%5B%7B%22type%22%3A%22RangeSampleFilter%22,%22start%22%3A80290,%22end%22%3A80807%7D,%7B%22type%22%3A%22RangeSampleFilter%22,%22start%22%3A80513,%22end%22%3A80570%7D%5D&selection=0,4700,6,7,4701,4702,4,5,6,7,4701,8,9,10,11,12,13,14,15,16,17,4861,4862,4863,4864,4865,287,288,289,290,291,292,293,294,295,296 One thing that strikes me is that it looks like content was done painting at Rasterize #45, and that the next composite is Composite #516 - and that's when the web content should have been displayed to the user. Instead, we wait a while - we wait for another vsync before we mark the test as done. I think this test is slightly flawed here - we shouldn't be waiting for that extra vsync.
Assignee | ||
Updated•8 years ago
|
Assignee: nobody → mconley
Updated•8 years ago
|
Blocks: e10s-tests
tracking-e10s:
--- → +
Assignee | ||
Comment 1•8 years ago
|
||
I have more details. Here's a snippet of a tps run on a Linux 64 machine: http://people.mozilla.org/~bgirard/cleopatra/?zippedProfile=http://mozilla-releng-blobs.s3.amazonaws.com/blobs/Try/sha512/e4e9f40b66605cb4cff3af7bddb8f8574e5b9215357a696a69c6ac1f4c6c0179b57be480fd395e1588ad577bcdfb7fd609a927342bc76cdc6fabe5958a9b3454#report=72c5d0ad920d1f09457659de33518be2f2d8fd4d&filter=%5B%7B%22type%22%3A%22RangeSampleFilter%22,%22start%22%3A93860,%22end%22%3A102630%7D,%7B%22type%22%3A%22RangeSampleFilter%22,%22start%22%3A94760,%22end%22%3A96513%7D,%7B%22type%22%3A%22RangeSampleFilter%22,%22start%22%3A94799,%22end%22%3A94945%7D,%7B%22type%22%3A%22RangeSampleFilter%22,%22start%22%3A94828,%22end%22%3A94901%7D%5D At 94830, we start the tab switch, and the paint is kicked off in the content process. That paint completes around 94858. A Vsync tick causes a second, shorter paint to occur (I'm not 100% sure this paint is necessary, but there it is) - that paint goes from 94859 to 94862. The parent, meanwhile, has heard that the layers are ready from (presumably) that first paint that occurred. At around 94862, the parent hears both a MozAfterPaint in the parent (since a composite occurred not long ago), as well as the message that the layers are ready. At that point, the async tab switcher flips the tabpanel deck to reveal the newly painted browser content. We then wait 12ms for the next composite, which _should_, I believe, show the web content. The composite occurs, and at 94876, the TabChild dispatches its own MozAfterPaint since the tab has finally been presented to the user. Here's the kicker - then we wait a whole 16ms before stopping the tps stopwatch. I suspect tps needs to be modified to pay attention to the MozAfterPaint events that are being dispatched by the TabChild, instead of the requestAnimationFrame stuff it's currently doing.
Assignee | ||
Comment 2•8 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=056d696dd67c
Assignee | ||
Comment 3•8 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=e8211e5d7d52
Assignee | ||
Comment 4•8 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=b5615c73476c
Assignee | ||
Comment 5•8 years ago
|
||
Review commit: https://reviewboard.mozilla.org/r/46803/diff/#index_header See other reviews: https://reviewboard.mozilla.org/r/46803/
Attachment #8741868 -
Flags: review?(jmaher)
Attachment #8741869 -
Flags: review?(jmaher)
Assignee | ||
Comment 6•8 years ago
|
||
Review commit: https://reviewboard.mozilla.org/r/46805/diff/#index_header See other reviews: https://reviewboard.mozilla.org/r/46805/
Assignee | ||
Comment 7•8 years ago
|
||
Hey Joel - I think this is going to address our tps "regression". It turns out this isn't really a regression - we're just measuring wrong. e10s actually appears to perform better than non-e10s in this test with the first patch: https://treeherder.mozilla.org/perf.html#/e10s?repo=try&revision=fba2a2487c10 The second patch is hopefully going to address the OS X regression here, since it keeps the content process alive between cycles of the test. I think the reason OS X is regressing is because of the asian character sets that are being cached in the single-process case, but in the multi-process case, because the content process shuts down between cycles, no cache exists. Try build with both patches is here: https://treeherder.mozilla.org/#/jobs?repo=try&revision=09910a98365f9982fc9331b4b44fb72eb716fddd
Depends on: 1264409
Updated•8 years ago
|
Attachment #8741869 -
Flags: review?(jmaher)
Comment 8•8 years ago
|
||
Comment on attachment 8741869 [details] MozReview Request: Bug 1261152 - Keep the content process alive between tps runs. r?jmaher https://reviewboard.mozilla.org/r/46805/#review43401 one issue, It appears we still look for data.times... in the code that receives the data. ::: testing/talos/talos/tests/tabswitch/bootstrap.js:437 (Diff revision 1) > if (pref) > Services.prefs.setBoolPref("browser.tabs.warnOnCloseOtherTabs", pref); > - Services.obs.notifyObservers(win, 'tabswitch-test-results', JSON.stringify({'times': times, 'urls': testURLs})); > + > + remotePage.sendAsyncMessage("tabswitch-test-results", { > + times, > + urls: testURLs, previously you had a structure of: \{'times': value, 'urls': value\} now you have a similar structure, just there is nothing labeled 'times'
Comment 9•8 years ago
|
||
Comment on attachment 8741868 [details] MozReview Request: Bug 1261152 - Make TPS more accurate using MozAfterPaint with layer transaction ids. r?jmaher https://reviewboard.mozilla.org/r/46803/#review43405 ::: testing/talos/talos/tests/tabswitch/bootstrap.js:407 (Diff revision 1) > + let times = []; > + > + for (let tab of tabs) { > + yield forceGC(win, tab.linkedBrowser); > + let time = yield switchToTab(tab); > + dump(`${tab.linkedBrowser.currentURI.spec}: ${time}ms\n`); is this dump seen in automation? I would think it is not visible.
Attachment #8741868 -
Flags: review?(jmaher) → review+
Assignee | ||
Comment 10•8 years ago
|
||
(In reply to Joel Maher (:jmaher) from comment #8) > Comment on attachment 8741869 [details] > MozReview Request: Bug 1261152 - Keep the content process alive between tps > runs. r?jmaher > > https://reviewboard.mozilla.org/r/46805/#review43401 > > one issue, It appears we still look for data.times... in the code that > receives the data. > > ::: testing/talos/talos/tests/tabswitch/bootstrap.js:437 > (Diff revision 1) > > if (pref) > > Services.prefs.setBoolPref("browser.tabs.warnOnCloseOtherTabs", pref); > > - Services.obs.notifyObservers(win, 'tabswitch-test-results', JSON.stringify({'times': times, 'urls': testURLs})); > > + > > + remotePage.sendAsyncMessage("tabswitch-test-results", { > > + times, > > + urls: testURLs, > > previously you had a structure of: > \{'times': value, > 'urls': value\} > > now you have a similar structure, just there is nothing labeled 'times' Ah, this some ES6 goodness. Specifically, this is a shorthand property name. See https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Object_initializer#New_notations_in_ECMAScript_2015 Essentially, given some local variable named times and testURLs: { times: times, urls: testURLs } is equivalent to { times, urls: testURLs }
Comment 11•8 years ago
|
||
nice, then ship it!
Assignee | ||
Comment 12•8 years ago
|
||
(In reply to Joel Maher (:jmaher) from comment #9) > Comment on attachment 8741868 [details] > MozReview Request: Bug 1261152 - Make TPS more accurate using MozAfterPaint > with layer transaction ids. r?jmaher > > https://reviewboard.mozilla.org/r/46803/#review43405 > > ::: testing/talos/talos/tests/tabswitch/bootstrap.js:407 > (Diff revision 1) > > + let times = []; > > + > > + for (let tab of tabs) { > > + yield forceGC(win, tab.linkedBrowser); > > + let time = yield switchToTab(tab); > > + dump(`${tab.linkedBrowser.currentURI.spec}: ${time}ms\n`); > > is this dump seen in automation? I would think it is not visible. It is, yes - you can see these in the test log.
Updated•8 years ago
|
Attachment #8741869 -
Flags: review+
Comment 13•8 years ago
|
||
Comment on attachment 8741869 [details] MozReview Request: Bug 1261152 - Keep the content process alive between tps runs. r?jmaher https://reviewboard.mozilla.org/r/46805/#review43409
Comment 14•8 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/d7d96b820538 https://hg.mozilla.org/integration/mozilla-inbound/rev/44cd41f66946
Assignee | ||
Comment 15•8 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/dcd51036ec228ea2ff054c3e9050b4e87e604b24 Bug 1261152 - Follow-up: bump version, repack and sign tps add-on. r=jmaher
Assignee | ||
Comment 16•8 years ago
|
||
So this seems to have put a pretty dramatic bullet through the tps regression. non-e10s has gone up. e10s has gone down. In case people are worried that I've somehow unfairly shifted the test in e10s's favour, here are some profiles I took from my local machine from a single run: non-e10s: https://cleopatra.io/#report=c77d23d2314b897b2eed8c85b1337d47103711b8&filter=%5B%7B%22type%22%3A%22RangeSampleFilter%22,%22start%22%3A48038,%22end%22%3A75017%7D%5D e10s: https://cleopatra.io/#report=c835c4853b5832ce6162415ff5e95ba1166ec479&filter=%5B%7B%22type%22%3A%22RangeSampleFilter%22,%22start%22%3A46761,%22end%22%3A90595%7D%5D Here's a little snippet of a single tab switch to look at, which I'll discuss next: non-e10s: https://cleopatra.io/#report=c77d23d2314b897b2eed8c85b1337d47103711b8&filter=%5B%7B%22type%22%3A%22RangeSampleFilter%22,%22start%22%3A48038,%22end%22%3A75017%7D,%7B%22type%22%3A%22RangeSampleFilter%22,%22start%22%3A48038,%22end%22%3A48734%7D,%7B%22type%22%3A%22RangeSampleFilter%22,%22start%22%3A48110,%22end%22%3A48241%7D%5D e10s: https://cleopatra.io/#report=c835c4853b5832ce6162415ff5e95ba1166ec479&filter=%5B%7B%22type%22%3A%22RangeSampleFilter%22,%22start%22%3A46761,%22end%22%3A90595%7D,%7B%22type%22%3A%22RangeSampleFilter%22,%22start%22%3A46863,%22end%22%3A47980%7D,%7B%22type%22%3A%22RangeSampleFilter%22,%22start%22%3A47014,%22end%22%3A47101%7D%5D Notice in the non-e10s snippet, the big paint goes from time 48128 all the way until around 48190, taking approximately ~62ms. The next composite finishes at 48213, and the MozAfterPaint event for it is serviced at 48218, which ends the test with a time of about ~106ms. In the e10s snippet, the paint gets to start a _lot_ sooner, at 47019 in the content process. That paint lasts until around 47075, lasting about ~55ms. This is likely faster than non-e10s because we don't have to worry about painting any of the browser UI. I believe the composite with that web content finishes around 47078, but it's 15ms later that the MozAfterPaint event finally has a chance to be serviced, and we can stop the stopwatch for a total time of ~77ms. There's room for improvement in terms of the accuracy of the measurement in both cases. Specifically, if we were to fix bug 1264798, we wouldn't need to care about when the MozAfterPaint event was serviced - we'd be able to get the time that the pixels were presented to the user with a very high-degree of precision.
Comment 17•8 years ago
|
||
In my netbook, when I open heavy javascript page on a tab and before this tab finished loading / rendering / completing javascript execution, switch to another tabs make spinner on middle page appear for quite a while. This make switching tab experience worse than non e10s. Is this test include heavy javascript page? Or Is this test account for switching tabs before another tab completely loaded or rendered? Or Is it tested on another talos test? Thank you.
Assignee | ||
Comment 18•8 years ago
|
||
(In reply to Katsuki from comment #17) > In my netbook, when I open heavy javascript page on a tab and before this > tab finished loading / rendering / completing javascript execution, switch > to another tabs make spinner on middle page appear for quite a while. This > make switching tab experience worse than non e10s. Is this test include > heavy javascript page? This test opens a series of pages from the tp5 set, some with reasonably complex JS running on them, yes. > Or Is this test account for switching tabs before > another tab completely loaded or rendered? This is different. We wait for all of the pages to be loaded before we start switching through them. Do you have an example site that I could test?
Flags: needinfo?(Shiratori1991)
Comment 19•8 years ago
|
||
If I open bugzilla and google on different tab, switching tab between these two tabs seem smooth, google and bugzilla page displayed properly. (I try to quickly switch tab on about 500ms interval). Then I open this page : https://reviewable.io/reviews/servo/servo/10438 on another tab. Before this page completely rendered or before completely loaded or before review being loaded, I try to switch tab between google and bugzilla on about 500 ms interval, but what I see is just the spinner on middle page. Even switching tab on 2 seconds interval just display the spinner.
Flags: needinfo?(Shiratori1991)
Assignee | ||
Comment 20•8 years ago
|
||
(In reply to Katsuki from comment #19) > If I open bugzilla and google on different tab, switching tab between these > two tabs seem smooth, google and bugzilla page displayed properly. (I try to > quickly switch tab on about 500ms interval). > > Then I open this page : https://reviewable.io/reviews/servo/servo/10438 on > another tab. Before this page completely rendered or before completely > loaded or before review being loaded, I try to switch tab between google and > bugzilla on about 500 ms interval, but what I see is just the spinner on > middle page. Even switching tab on 2 seconds interval just display the > spinner. Interesting. I am able to recreate the spinner that you're seeing, but when I use the same steps in single-process mode, the entire browser UI locks up instead. Are you not seeing this?
Flags: needinfo?(Shiratori1991)
Comment 21•8 years ago
|
||
Yes I am seeing it on single process, firefox hangs for couple of seconds. Seems like other bug? Is it about jank or responsiveness bug?
Flags: needinfo?(Shiratori1991)
Assignee | ||
Comment 22•8 years ago
|
||
(In reply to Katsuki from comment #21) > Yes I am seeing it on single process, firefox hangs for couple of seconds. > Seems like other bug? Is it about jank or responsiveness bug? It's essentially the same bug, just manifesting differently. In the non-e10s case, the main process is being bogged down by some heavy JS that reviewable.io is doing. In the e10s case, the content process is getting bogged down by that JS. The fact that the e10s-case allows the browser to still be somewhat responsive is a bonus, though it's not amazing that you can't see any tab content until the content process frees up. The transition to multiple content processes will ease this up, most likely. You might want to follow the e10s-multi metabug at bug 1207306. Otherwise, the issue you've reported is unrelated to the tps test.
Comment 23•8 years ago
|
||
OK, thank you.
Comment 24•8 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/d7d96b820538 https://hg.mozilla.org/mozilla-central/rev/44cd41f66946 https://hg.mozilla.org/mozilla-central/rev/dcd51036ec22
Status: NEW → RESOLVED
Closed: 8 years ago
status-firefox48:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla48
Comment 25•8 years ago
|
||
lots of improvements/regressions as expected: https://treeherder.mozilla.org/perf.html#/alerts?id=886
Assignee | ||
Comment 26•8 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=c8c69ccd2bdb
You need to log in
before you can comment on or make changes to this bug.
Description
•