Closed Bug 1317312 Opened 8 years ago Closed 8 years ago

[e10s-multi] Talos regressions with multiple content processes

Categories

(Core :: DOM: Content Processes, defect)

defect
Not set
normal

Tracking


RESOLVED FIXED
mozilla53
Tracking Status
firefox52 --- fixed
firefox53 --- fixed

People

(Reporter: gkrizsanits, Assigned: gkrizsanits, NeedInfo)

References

Details

(Whiteboard: [e10s-multi:?])

Attachments

(2 files)

From Bug 1303113 Comment 11:

(In reply to Joel Maher ( :jmaher) from comment #11)
> as a note this had a large impact on talos:
> == Change summary for alert #4055 (as of November 09 2016 13:46 UTC) ==
>
> Regressions:
>
>  270%  tabpaint summary osx-10-10 opt e10s      58.11 -> 214.79
>  263%  tabpaint summary windowsxp pgo e10s      50.45 -> 183.01
>  223%  tabpaint summary windows7-32 pgo e10s    49.69 -> 160.26
>  217%  tabpaint summary linux64 pgo e10s        59.5 -> 188.54
>  214%  tabpaint summary linux64 opt e10s        68.86 -> 216.27
>  205%  tabpaint summary windows8-64 opt e10s    59.03 -> 180.29
>  196%  tabpaint summary windowsxp opt e10s      73.17 -> 216.68
>  177%  tabpaint summary windows7-32 opt e10s    71.44 -> 197.61
>   67%  tps summary windows7-32 pgo e10s         33.25 -> 55.64
>   58%  tps summary windows7-32 opt e10s         40.02 -> 63.24
>   42%  tps summary osx-10-10 opt e10s           39.78 -> 56.29
>   27%  tps summary windowsxp opt e10s           39.5 -> 50.02
>   25%  tps summary windowsxp pgo e10s           33.74 -> 42.19
>   16%  damp summary windows7-32 pgo e10s        222.94 -> 259.04
>   14%  damp summary windows8-64 opt e10s        267.11 -> 303.89
>   13%  damp summary osx-10-10 opt e10s          302.7 -> 342.71
>   11%  damp summary linux64 opt e10s            293.99 -> 326.18
>   10%  damp summary linux64 pgo e10s            245.15 -> 270.28
>    9%  tps summary linux64 opt e10s             41.7 -> 45.47
>    8%  damp summary windows7-32 opt e10s        300.24 -> 325.07
>    8%  tps summary linux64 pgo e10s             36.23 -> 39.22
>
> Improvements:
>
>    3%  tps summary windows8-64 opt e10s         37.1 -> 35.86
>
> For up to date results, see:
> https://treeherder.mozilla.org/perf.html#/alerts?id=4055
>
> please address these talos issues prior to landing this again.
Content process startup optimization will fix some of this regression; we should decide what the acceptable regression is for release (process startup and script execution won't happen without a cost). We also have to decide whether we want to force a single content process for these tests for now, or block turning on two content processes on Nightly.
Depends on: 1317304
Flags: needinfo?(mrbkap)
Whiteboard: [e10s-multi:?]
let me know if I can help answer questions or debug stuff.
Flags: needinfo?(jdemooij)
Jan, I will try to delay some of these scripts, but realistically we will need something like bug 876173 for this as well. Are you working on something related these days?
Eh, sorry for the spam, this was the wrong bug :(
Flags: needinfo?(jdemooij)
(In reply to Joel Maher ( :jmaher) from comment #2)
> let me know if I can help answer questions or debug stuff.

Thanks! The tabpaint regression is hardly surprising, but I'm not sure about the damp one. Opening a new tab or window is expected to be slower; is that something that would explain the damp regression, or do you think there is something else going on?
Mike, is it expected that tps is regressed by multiple content processes? I would expect it not to open tabs during the run, and while the extra process startup might cause some delays at the start of the test, that should not matter that much either... do you have any idea why this test is so sensitive to an extra content process?
Flags: needinfo?(mconley)
(In reply to Gabor Krizsanits [:krizsa :gabor] from comment #1)
> We also have to decide whether we want to force a single content process
> for these tests for now, or block turning on two content processes on
> Nightly.

:jimm thinks that if we decide to turn it on on Nightly with this regression, we should not hide it by forcing a single content process.
Regarding damp, looking at the subtests:

linux64:
https://treeherder.mozilla.org/perf.html#/comparesubtest?originalProject=mozilla-inbound&originalRevision=fb8d5979d0f1e4f17416e5d027a46e7a2068e1c5&newProject=mozilla-inbound&newRevision=085035586d2b4d47ed03dee2ea727df6f2de6679&originalSignature=edaec66500db21d37602c99daa61ac983f21a6ac&newSignature=edaec66500db21d37602c99daa61ac983f21a6ac&framework=1

osx:
https://treeherder.mozilla.org/perf.html#/comparesubtest?originalProject=mozilla-inbound&originalRevision=fb8d5979d0f1e4f17416e5d027a46e7a2068e1c5&newProject=mozilla-inbound&newRevision=085035586d2b4d47ed03dee2ea727df6f2de6679&originalSignature=9663a154caf9b713fb951c8db24518c7e0ceddce&newSignature=9663a154caf9b713fb951c8db24518c7e0ceddce&framework=1

You can see similar patterns where we have some improvements and many regressions. Specifically, in the open/reload measurements for the different tools, the close measurements have little to no regressions. The wins that we see are typically in the reload measurements, so that leaves the smoking gun pointing at the open measurements for damp. Let me include bgrins on here, as he might have some valuable insight into the damp test and specifically the open measurements.
(In reply to Joel Maher ( :jmaher) from comment #8)
> You can see similar patterns where we have some improvements and many
> regressions. Specifically, in the open/reload measurements for the different
> tools, the close measurements have little to no regressions. The wins that
> we see are typically in the reload measurements, so that leaves the smoking
> gun pointing at the open measurements for damp.

FYI, there's a possibility for some noise in between DAMP subtests right now, so if RDP requests are still ongoing after opening the toolbox then they might bleed into the reload time. I'm hoping that Bug 1291815 will prevent that from happening.

Generally, open time is how long it takes from the start of opening the toolbox until the toolbox and the specific tool are fully loaded. Most of the work is in the parent process (loading a bunch of modules and building UI), but it's also waiting for the RDP (which is running in the content process) to return whatever is needed for the tool. In some tools there can be quite a bit of traffic on a complicated page (for example, the inspector needs a lot of info about the DOM even at startup).
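For reference, the "open" measurement described above boils down to timing the whole toolbox-open sequence. The sketch below only illustrates that idea and is not the actual DAMP code; openToolboxForTab is a hypothetical stand-in for whatever devtools call the test really uses.

// Illustrative sketch only (not DAMP itself): measure from the moment we
// request the toolbox until the toolbox UI and the selected tool are ready.
// `openToolboxForTab` is a hypothetical helper that resolves once the tool
// has also received the RDP data it needs from the content process.
async function measureToolboxOpen(tab, toolId) {
  let start = performance.now();
  await openToolboxForTab(tab, toolId);
  return performance.now() - start;
}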
See Also: → 1291815
(In reply to Gabor Krizsanits [:krizsa :gabor] from comment #6)
> Mike, is it expected that tps is regressed by multiple content processes? I
> would expect it not to open tabs during the run, and while the extra process
> startup might cause some delays at the start of the test, that should not
> matter that much either... do you have any idea why this test is so
> sensitive to an extra content process?

Nothing comes to mind, I'm afraid. By the time we start measuring tab switch times, the tp5 pages should have finished loading (and presumably, both content processes are fully started up). I think we'll need to profile it to get a sense of what's going wrong here.
Flags: needinfo?(mconley)
ni?ing myself to get some profiles to gabor / mrbkap next week.
Flags: needinfo?(mconley)
(In reply to Mike Conley (:mconley) from comment #11)
> ni?ing myself to get some profiles to gabor / mrbkap next week.

tps e10s-multi:
https://cleopatra.io/#report=a9685a46f48e7aedd4a06cbfd7d7e39fa61f12c4&selection=0,8

tps e10s-single:
https://cleopatra.io/#report=4c2b79c10b4d6240c6890fc4e3e18cc49863d19d&selection=0,8

It seems to me that in the multi-process case we run the first batch of tests in one process and the second batch in another, plus I'm not sure we wait for the process to fully initialize (loading all the process scripts, frame scripts, and so on). I'm not sure why we can't see what is in XRE_InitChildProcess, but probably we don't start the measurements early enough, which is OK as long as we start the test measurement after the process has warmed up. Anyway, what I assumed about this test, that it loads some of the tabs in one process and some in the other, does not seem to happen, and the regression seems to be an indicator that this test should be adjusted for e10s-multi.
tabpaint e10s-single:
https://cleopatra.io/#report=8f75b775d7210af02374f8e06df648f7936d7bac&selection=0,8

tabpaint e10s-multi:
https://cleopatra.io/#report=ab2845f370e9282e28ee2d4a2fe6258465c50f1f

This test isn't exactly fair either. The single content process version seems to run the test in the original content process, which is already mostly warmed up, while the e10s-multi version does not (so we are basically cheating in favor of e10s-single). On the plus side, the e10s-multi case gives a nice view of the time it takes for some scripts to run at startup (I don't see the process scripts, though).
(In reply to Gabor Krizsanits [:krizsa :gabor] from comment #12)
> It seems to me that in the multi-process case we run the first batch of
> tests in one process and the second batch in another, plus I'm not sure we
> wait for the process to fully initialize

http://searchfox.org/mozilla-central/source/testing/talos/talos/tests/tabswitch/bootstrap.js#204

This should guarantee that the process that owns the tab we're switching to is initialized, but it does not seem to guarantee that all the other processes have finished their initialization, so that could be a source of the regression, but I'm not sure. We should probably do this when we open the batch of tabs, with an array of promises (see the sketch below).

I still don't get what's up with the profiler; the multi-process case is weird, and I cannot interpret it. Here is one with processCount set to 5:
https://cleopatra.io/#report=2f611a922c19ca2e72566711a2e2a9040d2c6649
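The "array of promises" idea would look roughly like the sketch below. whenTabProcessReady is a hypothetical helper (not part of the existing bootstrap.js) that resolves once the content process hosting a given browser has finished loading its process and frame scripts.

// Sketch only: wait for every content process backing the freshly opened
// tabs, not just for the process that owns the tab we are about to switch to.
// `whenTabProcessReady` is a hypothetical helper, not existing test code.
function waitForAllTabProcesses(tabs) {
  return Promise.all(tabs.map(tab => whenTabProcessReady(tab.linkedBrowser)));
}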
Here's some interesting data. With bug 1322758 fixed, I was able to get some profiles from try on OS X. I thought this was kind of interesting - here's a single tab switch to an Amazon page about a Kindle (part of the tp5 set), in both e10s-multi and e10s-single:

e10s-multi:
https://cleopatra.io/#report=af72e1a87b2296fcfe11862c922b770868902d58&filter=%5B%7B%22type%22%3A%22RangeSampleFilter%22,%22start%22%3A134258,%22end%22%3A162499%7D,%7B%22type%22%3A%22RangeSampleFilter%22,%22start%22%3A153384,%22end%22%3A162499%7D,%7B%22type%22%3A%22RangeSampleFilter%22,%22start%22%3A160241,%22end%22%3A162499%7D,%7B%22type%22%3A%22RangeSampleFilter%22,%22start%22%3A160412,%22end%22%3A160540%7D,%7B%22type%22%3A%22RangeSampleFilter%22,%22start%22%3A160421,%22end%22%3A160527%7D%5D&selection=0,5509,5510,5511,5512,5101,5513,8,9,10,11,12,13,14,15,16,17,20,21,22,23,24,25,26,139,140,141,3731,5562,5652,5653,5654,5655,32,33,34,35,36,37,39,54,55,56,90,91,92,93,94,5820,5821,96,97,98,99,1757

e10s-single:
https://cleopatra.io/#report=9e0b12ae6cc1cccca154f0622c1f5eab7ddfdb01&filter=%5B%7B%22type%22%3A%22RangeSampleFilter%22,%22start%22%3A203410,%22end%22%3A214004%7D,%7B%22type%22%3A%22RangeSampleFilter%22,%22start%22%3A211142,%22end%22%3A214004%7D,%7B%22type%22%3A%22RangeSampleFilter%22,%22start%22%3A211222,%22end%22%3A211386%7D,%7B%22type%22%3A%22RangeSampleFilter%22,%22start%22%3A211266,%22end%22%3A211322%7D%5D&selection=0,5117,5118,5119,5120,4712,5121,173,174,175,176,177,178,179,180,181,182,183,184,2,185,186,187,188,379,380,381,3606,5144,5147,5148,5149,5150,269,270,271,272,273,274,276,284,285,296,297,298,299,300,301,5151,5152,303,304,305,306,3451

There is a huge difference in the amount of time we're taking to paint here. 79 samples are being spent in nsPresShell::Paint in e10s-multi. Only 34 samples are being spent in nsPresShell::Paint in e10s-single. That's a pretty substantial increase.

Digging in there, it looks like the biggest difference has to do with nsDisplayText::Paint. There are 46 samples in that frame in e10s-multi, and only 5 in e10s-single.

We might want to get someone from the Graphics team to weigh in on this. Hey jrmuizel - anything illuminating about this?
Flags: needinfo?(mconley) → needinfo?(jmuizelaar)
Blocks: 1303113
Filtering out the glyph caching difference between the single and multi content process cases in the tps test seems to be doing the trick:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=09f2e414b80dc9184563a3d4527ea96663fde46d
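The idea is to do one untimed pass over the tabs before the measured switches, so per-process paint caches (like the glyph cache for text) are already populated in every content process. A minimal sketch of that idea follows; switchToTab is a hypothetical helper that resolves after the switch has painted, and the real change lives in testing/talos/talos/tests/tabswitch/bootstrap.js.

// Sketch only: warm up paint-related caches in all content processes
// before any timed measurements start. `switchToTab` is hypothetical.
async function warmUpPaintCaches(tabs) {
  for (let tab of tabs) {
    await switchToTab(tab);
  }
}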
We don't measure the content process startup time for the single content process case in this test, so we should not measure it for the multi case either. The easiest way to do that is to force a single content process.
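For reference, forcing a single content process comes down to pinning the dom.ipc.processCount pref for the duration of the test. The snippet below is a minimal sketch of that idea, not the actual tabpaint patch; the pref name is real, while the save/restore scaffolding is illustrative.

// Sketch only: pin the content process count to 1 while the test runs,
// then restore the previous value afterwards.
Components.utils.import("resource://gre/modules/Services.jsm");

const PROCESS_COUNT_PREF = "dom.ipc.processCount";
let savedProcessCount = Services.prefs.getIntPref(PROCESS_COUNT_PREF);

function forceSingleContentProcess() {
  Services.prefs.setIntPref(PROCESS_COUNT_PREF, 1);
}

function restoreContentProcessCount() {
  Services.prefs.setIntPref(PROCESS_COUNT_PREF, savedProcessCount);
}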
Attachment #8822647 - Flags: review?(mconley)
Thanks again Mike for tracking this down :)
Attachment #8822648 - Flags: review?(mconley)
Comment on attachment 8822647 [details] [diff] [review]
part1: Use single content process in tabpaint

Review of attachment 8822647 [details] [diff] [review]:
-----------------------------------------------------------------

Reasoning is sound and the patch looks fine to me. You'll need to bump the revision of the tabpaint test's install.rdf [1], and also package / re-sign [2] the add-on (I know, it kinda blows).

[1]: http://searchfox.org/mozilla-central/rev/51aa673e28802fe6ea89f4793721fc55334a6ac8/testing/talos/talos/tests/tabpaint/install.rdf#7
[2]: https://wiki.mozilla.org/EngineeringProductivity/HowTo/SignExtensions#Signing_an_Addon
Attachment #8822647 - Flags: review?(mconley) → review+
Comment on attachment 8822648 [details] [diff] [review]
part2: Tps should filter out glyph cache init

Review of attachment 8822648 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good here too (modulo typo). Same need as above, re: bumping the version [1] and signing.

[1]: http://searchfox.org/mozilla-central/rev/51aa673e28802fe6ea89f4793721fc55334a6ac8/testing/talos/talos/tests/tabswitch/install.rdf#7

::: testing/talos/talos/tests/tabswitch/bootstrap.js
@@ +405,5 @@
> +  // Let's do an initial run to warm up any paint related caches
> +  // (like glyph caches for text). In the next loop we will start with
> +  // a GC before each switch so we don't need here.
> +  // Note: in case of multiple content processes, closing all the tabs
> +  // would close the related conent processes, and even if we kept them

typo: "conent" -> "content".
Attachment #8822648 - Flags: review?(mconley) → review+
Pushed by gkrizsanits@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/d0d55af22bcb
Tps should filter out glyph cache overhead. r=mconley
https://hg.mozilla.org/integration/mozilla-inbound/rev/851c282874b1
Tabpaint test should not measure content process startup. r=mconley
(In reply to Mike Conley (:mconley) - PTO until Jan 3 from comment #20)
> Reasoning is sound and the patch looks fine to me. You'll need to bump the
> revision of the tabpaint test's install.rdf [1], and also package /
> re-sign [2] the add-on (I know, it kinda blows).

I had no idea... and thanks for the links, they saved me a lot of time :)
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla53
We have improvements in the tps test after this landed:

== Change summary for alert #4695 (as of January 03 2017 13:02 UTC) ==

Improvements:

  38%  tps summary windows8-64 opt e10s    40.39 -> 24.85
  35%  tps summary windows7-32 opt e10s    42.37 -> 27.56
  19%  tps summary osx-10-10 opt           70.13 -> 56.96
  11%  tps summary linux64 opt             59.15 -> 52.9
  10%  tps summary windows7-32 opt         73.54 -> 66
  10%  tps summary windows8-64 opt         68.5 -> 61.48

For up to date results, see:
https://treeherder.mozilla.org/perf.html#/alerts?id=4695

I assume these are expected and probably a test-specific change.
Comment on attachment 8822648 [details] [diff] [review]
part2: Tps should filter out glyph cache init

Approval Request Comment
[Feature/Bug causing the regression]: Bug 1323319
[User impact if declined]: None - but we'll show a talos regression on the tps test.
[Is this code covered by automated tests?]: Yes.
[Has the fix been verified in Nightly?]: Yes.
[Needs manual test from QE? If yes, steps to reproduce]: No.
[List of other uplifts needed for the feature/fix]: None.
[Is the change risky?]: No.
[Why is the change risky/not risky?]: It's updating a talos test to more correctly measure what it's supposed to measure, and avoiding alarms going off if and when bug 1323319 uplifts.
[String changes made/needed]: None.
Attachment #8822648 - Flags: approval-mozilla-aurora?
Comment on attachment 8822648 [details] [diff] [review]
part2: Tps should filter out glyph cache init

talos test change, aurora52+, needed for bug 1323319
Attachment #8822648 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Assignee: nobody → gkrizsanits
To confirm, are we fine with the damp regressions? I see these changes after we landed this change:

== Change summary for alert #4876 (as of January 23 2017 13:16 UTC) ==

Regressions:

  15%  damp summary osx-10-10 opt      312.58 -> 357.95
  13%  damp summary windows7-32 opt    310.46 -> 349.73
  10%  damp summary linux64 pgo        252.03 -> 277.76
   6%  damp summary windows8-64 pgo    248.59 -> 263.63

Improvements:

  16%  tps summary osx-10-10 opt       30.11 -> 25.42

For up to date results, see:
https://treeherder.mozilla.org/perf.html#/alerts?id=4876
Flags: needinfo?(mrbkap)