1317312 - [e10s-multi] Talos regressions with multiple content processes

Assignee

Description

•

8 years ago

From Bug 1303113 Comment 11: (In reply to Joel Maher ( :jmaher) from comment #11) > as a note this had a large impact on talos: > == Change summary for alert #4055 (as of November 09 2016 13:46 UTC) == > > Regressions: > > 270% tabpaint summary osx-10-10 opt e10s 58.11 -> 214.79 > 263% tabpaint summary windowsxp pgo e10s 50.45 -> 183.01 > 223% tabpaint summary windows7-32 pgo e10s 49.69 -> 160.26 > 217% tabpaint summary linux64 pgo e10s 59.5 -> 188.54 > 214% tabpaint summary linux64 opt e10s 68.86 -> 216.27 > 205% tabpaint summary windows8-64 opt e10s 59.03 -> 180.29 > 196% tabpaint summary windowsxp opt e10s 73.17 -> 216.68 > 177% tabpaint summary windows7-32 opt e10s 71.44 -> 197.61 > 67% tps summary windows7-32 pgo e10s 33.25 -> 55.64 > 58% tps summary windows7-32 opt e10s 40.02 -> 63.24 > 42% tps summary osx-10-10 opt e10s 39.78 -> 56.29 > 27% tps summary windowsxp opt e10s 39.5 -> 50.02 > 25% tps summary windowsxp pgo e10s 33.74 -> 42.19 > 16% damp summary windows7-32 pgo e10s 222.94 -> 259.04 > 14% damp summary windows8-64 opt e10s 267.11 -> 303.89 > 13% damp summary osx-10-10 opt e10s 302.7 -> 342.71 > 11% damp summary linux64 opt e10s 293.99 -> 326.18 > 10% damp summary linux64 pgo e10s 245.15 -> 270.28 > 9% tps summary linux64 opt e10s 41.7 -> 45.47 > 8% damp summary windows7-32 opt e10s 300.24 -> 325.07 > 8% tps summary linux64 pgo e10s 36.23 -> 39.22 > > Improvements: > > 3% tps summary windows8-64 opt e10s 37.1 -> 35.86 > > For up to date results, see: > https://treeherder.mozilla.org/perf.html#/alerts?id=4055 > > > please address these talos issues prior to landing this again.

Gabor Krizsanits (INACTIVE)

Assignee

Comment 1

•

8 years ago

Content process startup optimization will fix some of this regression, we should decide what is the acceptable regression for release (process startup and script execution won't happen without a cost). We also have to decide if we want to force single content process for these tests for now or we want to block turning on 2cp's on nightly.

Depends on: 1317304

Flags: needinfo?(mrbkap)

Whiteboard: [e10s-multi:?]

Joel Maher ( :jmaher ) (UTC -8)

Comment 2

•

8 years ago

let me know if I can help answer questions or debug stuff.

Gabor Krizsanits (INACTIVE)

Assignee

Updated

•

8 years ago

Flags: needinfo?(jdemooij)

Gabor Krizsanits (INACTIVE)

Assignee

Comment 3

•

8 years ago

Jan, I will try to delay some of these scripts, but realistically we will need something like bug 876173 for this as well. Are you working on something related these days?

Gabor Krizsanits (INACTIVE)

Assignee

Comment 4

•

8 years ago

Eh sorry for the spam this was the wrong bug :(

Flags: needinfo?(jdemooij)

Gabor Krizsanits (INACTIVE)

Assignee

Comment 5

•

8 years ago

(In reply to Joel Maher ( :jmaher) from comment #2) > let me know if I can help answer questions or debug stuff. Thanks! So tabpaint is hardly surprising but I'm not sure about the damp regression. Opening new tab or window is expected to start slower, is this something that would explain the damp regression or do you think there is something else?

Gabor Krizsanits (INACTIVE)

Assignee

Comment 6

•

8 years ago

Mike, is it expected that tps is regressed by multiple content processes? I would expect it not to open tabs during the run but the extra process startup might cause some delays at the start of the test... which should not matter that much either... do you have any ideas why this test is so sensitive about an extra content process?

Flags: needinfo?(mconley)

Gabor Krizsanits (INACTIVE)

Assignee

Comment 7

•

8 years ago

(In reply to Gabor Krizsanits [:krizsa :gabor] from comment #1) > We also have to decide if we want to force single content process for these > tests for now or we want to block turning on 2cp's on nightly. :jimm thinks if we decide to turn it on on nightly with this regression, we should not hide it by forcing single content process.

Joel Maher ( :jmaher ) (UTC -8)

Comment 8

•

8 years ago

regarding damp, looking at the subtests: linux64: https://treeherder.mozilla.org/perf.html#/comparesubtest?originalProject=mozilla-inbound&originalRevision=fb8d5979d0f1e4f17416e5d027a46e7a2068e1c5&newProject=mozilla-inbound&newRevision=085035586d2b4d47ed03dee2ea727df6f2de6679&originalSignature=edaec66500db21d37602c99daa61ac983f21a6ac&newSignature=edaec66500db21d37602c99daa61ac983f21a6ac&framework=1 osx: https://treeherder.mozilla.org/perf.html#/comparesubtest?originalProject=mozilla-inbound&originalRevision=fb8d5979d0f1e4f17416e5d027a46e7a2068e1c5&newProject=mozilla-inbound&newRevision=085035586d2b4d47ed03dee2ea727df6f2de6679&originalSignature=9663a154caf9b713fb951c8db24518c7e0ceddce&newSignature=9663a154caf9b713fb951c8db24518c7e0ceddce&framework=1 you can see similar patterns where we have some improvements and many regressions. specifically in the open/reload measurements for the different tools, the close measurements have little to no regressions. The wins that we see are in the reload typically, so that leaves the smoking gun pointing at the open measurements for damp- let me include bgrins on here as he might have some valuable insight into the damp test and specifically the open measurements.

Brian Grinstead [:bgrins]

Comment 9

•

8 years ago

(In reply to Joel Maher ( :jmaher) from comment #8) > regarding damp, looking at the subtests: > linux64: > https://treeherder.mozilla.org/perf.html#/ > comparesubtest?originalProject=mozilla- > inbound&originalRevision=fb8d5979d0f1e4f17416e5d027a46e7a2068e1c5&newProject= > mozilla- > inbound&newRevision=085035586d2b4d47ed03dee2ea727df6f2de6679&originalSignatur > e=edaec66500db21d37602c99daa61ac983f21a6ac&newSignature=edaec66500db21d37602c > 99daa61ac983f21a6ac&framework=1 > > osx: > https://treeherder.mozilla.org/perf.html#/ > comparesubtest?originalProject=mozilla- > inbound&originalRevision=fb8d5979d0f1e4f17416e5d027a46e7a2068e1c5&newProject= > mozilla- > inbound&newRevision=085035586d2b4d47ed03dee2ea727df6f2de6679&originalSignatur > e=9663a154caf9b713fb951c8db24518c7e0ceddce&newSignature=9663a154caf9b713fb951 > c8db24518c7e0ceddce&framework=1 > > you can see similar patterns where we have some improvements and many > regressions. specifically in the open/reload measurements for the different > tools, the close measurements have little to no regressions. The wins that > we see are in the reload typically, so that leaves the smoking gun pointing > at the open measurements for damp- let me include bgrins on here as he might > have some valuable insight into the damp test and specifically the open > measurements. FYI there's a possibility for some noise in between DAMP subtests right now, so if RDP requests are still ongoing after opening the toolbox then it might bleed into the reload time. I'm hoping that Bug 1291815 will prevent that from happening. Generally, open time is how long it takes from the start of opening the toolbox until the toolbox and specific tool are fully loaded. Most of the work is in the parent process (loading a bunch of modules and building UI), but it's also waiting for the RDP (which is running in the content process) to return whatever is needed for the tool. In some tools there can be quite a bit of traffic on a complicated page (like the inspector needs a lot of info about the DOM even at startup).

Comment 10

•

8 years ago

(In reply to Gabor Krizsanits [:krizsa :gabor] from comment #6) > Mike, is it expected that tps is regressed by multiple content processes? I > would expect it not to open tabs during the run but the extra process > startup might cause some delays at the start of the test... which should not > matter that much either... do you have any ideas why this test is so > sensitive about an extra content process? Nothing comes to mind, I'm afraid. By the time that we start measuring tab switch times, the tp5 pages should have finished loading (and presumably, the content processes are both fully started-up). I think we'll need to profile it to get a sense of what's going wrong here.

Flags: needinfo?(mconley)

Mike Conley (:mconley) (:⚙️)

Comment 11

•

8 years ago

ni?ing myself to get some profiles to gabor / mrbkap next week.

Flags: needinfo?(mconley)

Gabor Krizsanits (INACTIVE)

Assignee

Comment 12

•

8 years ago

(In reply to Mike Conley (:mconley) from comment #11) > ni?ing myself to get some profiles to gabor / mrbkap next week. tps e10s-multi https://cleopatra.io/#report=a9685a46f48e7aedd4a06cbfd7d7e39fa61f12c4&selection=0,8 tps e10s-single https://cleopatra.io/#report=4c2b79c10b4d6240c6890fc4e3e18cc49863d19d&selection=0,8 It seems to me that in the multi process case we run the first batch of test in one process and the second in another, plus I'm not sure we wait for the process to fully initialize (loading all the process scripts frame scripts and what not). I'm not sure why we can't see what is in the XRE_InitChildProcess but probably we don't start the measurements early enough, which is OK as long as we start the test measurement after the process warmed up. Anyway, what I have assumed from this test that it loads some of the tabs in one process and some in the other does not seem to happen and the regression seem to be an indicator that this test should be adjusted for e10s-multi.

Gabor Krizsanits (INACTIVE)

Assignee

Comment 13

•

8 years ago

tabpaint e10s-single https://cleopatra.io/#report=8f75b775d7210af02374f8e06df648f7936d7bac&selection=0,8 tabpaint e10s-multi https://cleopatra.io/#report=ab2845f370e9282e28ee2d4a2fe6258465c50f1f This test isn't exactly fair either. The single content process version seem to run the test in the original content process that is already warmed up mostly, while the e10s-multi version does not (so we basically cheating for e10s-single). On the plus side the e10s-multi case gives a nice view about the time it takes about some scripts to run at startup (I don't see the process scripts though).

Gabor Krizsanits (INACTIVE)

Assignee

Comment 14

•

8 years ago

(In reply to Gabor Krizsanits [:krizsa :gabor] from comment #12) > It seems to me that in the multi process case we run the first batch of test > in one process and the second in another, plus I'm not sure we wait for the > process to fully initialize http://searchfox.org/mozilla-central/source/testing/talos/talos/tests/tabswitch/bootstrap.js#204 This should guarantee that the process that owns the tab we're switching to is initialized, but does not seem to guarantee that all the other processes finished their initialization, so that can be a source of regression but I'm not sure. We should probably do this when we open the batch of tabs with an array of promises. I still don't get what's up with the profiler, the multi-process case is weird, I cannot interpret it. Here is one with processCount set to 5: https://cleopatra.io/#report=2f611a922c19ca2e72566711a2e2a9040d2c6649

Mike Conley (:mconley) (:⚙️)

Comment 15

•

8 years ago

Flags: needinfo?(mconley) → needinfo?(jmuizelaar)

Mike Conley (:mconley) (:⚙️)

Updated

•

8 years ago

Depends on: 1324033

Gabor Krizsanits (INACTIVE)

Assignee

Updated

•

8 years ago

Blocks: 1303113

Gabor Krizsanits (INACTIVE)

Assignee

Comment 16

•

8 years ago

Filtering out the glyph caching difference between single and multi cp cases from the tps test seems to be doing the trick: https://treeherder.mozilla.org/#/jobs?repo=try&revision=09f2e414b80dc9184563a3d4527ea96663fde46d

Gabor Krizsanits (INACTIVE)

Assignee

Comment 17

•

8 years ago

Attached patch part1: Use single content process in tabpaint — Details — Splinter Review

We don't measure the content process startup time for the single content process case in this test, so we should not measure it for the multi case either. The easiest way to do that is just forcing single cp.

Attachment #8822647 - Flags: review?(mconley)

Gabor Krizsanits (INACTIVE)

Assignee

Comment 18

•

8 years ago

Attached patch part2: Tps should filter out glyph cache init — Details — Splinter Review

Thanks again Mike for tracking this down :)

Attachment #8822648 - Flags: review?(mconley)

Mike Conley (:mconley) (:⚙️)

Comment 19

•

8 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=8b5356ff3472

Mike Conley (:mconley) (:⚙️)

Comment 20

•

8 years ago

Comment on attachment 8822647 [details] [diff] [review] part1: Use single content process in tabpaint Review of attachment 8822647 [details] [diff] [review]: ----------------------------------------------------------------- Reasoning is sound and the patch looks fine to me. You'll need to bump the revision of the tabpaint test's install.rdf[1], and also package / re-sign[2] the add-on (I know, it kinda blows). [1]: http://searchfox.org/mozilla-central/rev/51aa673e28802fe6ea89f4793721fc55334a6ac8/testing/talos/talos/tests/tabpaint/install.rdf#7 [2]: https://wiki.mozilla.org/EngineeringProductivity/HowTo/SignExtensions#Signing_an_Addon

Attachment #8822647 - Flags: review?(mconley) → review+

Mike Conley (:mconley) (:⚙️)

Comment 21

•

8 years ago

Comment on attachment 8822648 [details] [diff] [review] part2: Tps should filter out glyph cache init Review of attachment 8822648 [details] [diff] [review]: ----------------------------------------------------------------- Looks good here too (modulo typo). Same need as above, re: bumping the version[1] and signing. [1]: http://searchfox.org/mozilla-central/rev/51aa673e28802fe6ea89f4793721fc55334a6ac8/testing/talos/talos/tests/tabswitch/install.rdf#7 ::: testing/talos/talos/tests/tabswitch/bootstrap.js @@ +405,5 @@ > + // Let's do an initial run to warm up any paint related caches > + // (like glyph caches for text). In the next loop we will start with > + // a GC before each switch so we don't need here. > + // Note: in case of multiple content processes, closing all the tabs > + // would close the related conent processes, and even if we kept them typo: "conent" -> "content".

Attachment #8822648 - Flags: review?(mconley) → review+

Mike Conley (:mconley) (:⚙️)

Updated

•

8 years ago

Blocks: 1323319

Pulsebot

Comment 22

•

8 years ago

Pushed by gkrizsanits@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/d0d55af22bcb Tps should filter out glyph cache overhead. r=mconley https://hg.mozilla.org/integration/mozilla-inbound/rev/851c282874b1 Tabpaint test should not measure content process startup. r=mconley

Gabor Krizsanits (INACTIVE)

Assignee

Comment 23

•

8 years ago

(In reply to Mike Conley (:mconley) - PTO until Jan 3 from comment #20) > Reasoning is sound and the patch looks fine to me. You'll need to bump the > revision of the tabpaint test's install.rdf[1], and also package / > re-sign[2] the add-on (I know, it kinda blows). I had no idea... and thanks for the links it saved me a lot of time :)

Phil Ringnalda (:philor)

Comment 24

•

8 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/d0d55af22bcb https://hg.mozilla.org/mozilla-central/rev/851c282874b1

Status: NEW → RESOLVED

Closed: 8 years ago

status-firefox53: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla53

Joel Maher ( :jmaher ) (UTC -8)

Comment 25

•

8 years ago

we have improvements in the tps test after this landed: == Change summary for alert #4695 (as of January 03 2017 13:02 UTC) == Improvements: 38% tps summary windows8-64 opt e10s 40.39 -> 24.85 35% tps summary windows7-32 opt e10s 42.37 -> 27.56 19% tps summary osx-10-10 opt 70.13 -> 56.96 11% tps summary linux64 opt 59.15 -> 52.9 10% tps summary windows7-32 opt 73.54 -> 66 10% tps summary windows8-64 opt 68.5 -> 61.48 For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=4695 I assume these are expected and probably a test specific change.

Mike Conley (:mconley) (:⚙️)

Comment 26

•

8 years ago

Comment on attachment 8822648 [details] [diff] [review] part2: Tps should filter out glyph cache init Approval Request Comment [Feature/Bug causing the regression]: Bug 1323319 [User impact if declined]: None - but we'll show a talos regression on the tps test. [Is this code covered by automated tests?]: Yes. [Has the fix been verified in Nightly?]: Yes. [Needs manual test from QE? If yes, steps to reproduce]: No. [List of other uplifts needed for the feature/fix]: None. [Is the change risky?]: No. [Why is the change risky/not risky?]: It's updating a talos test to more correctly measure what it's supposed to measure, and avoiding alarms going off if and when bug 1323319 uplifts. [String changes made/needed]: None.

Attachment #8822648 - Flags: approval-mozilla-aurora?

Julien Cristau [:jcristau]

Comment 27

•

8 years ago

Comment on attachment 8822648 [details] [diff] [review] part2: Tps should filter out glyph cache init talos test change, aurora52+, needed for bug 1323319

Attachment #8822648 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+

Ryan VanderMeulen [:RyanVM]

Updated

•

8 years ago

Assignee: nobody → gkrizsanits

Ryan VanderMeulen [:RyanVM]

Comment 28

•

8 years ago

bugherder uplift

https://hg.mozilla.org/releases/mozilla-aurora/rev/773f4844d11b https://hg.mozilla.org/releases/mozilla-aurora/rev/d37e0d774b83

status-firefox52: --- → fixed

Flags: in-testsuite+

Joel Maher ( :jmaher ) (UTC -8)

Comment 29

•

8 years ago

to confirm, we are fine with the damp regressions? I see these changes after we landed this change: == Change summary for alert #4876 (as of January 23 2017 13:16 UTC) == Regressions: 15% damp summary osx-10-10 opt 312.58 -> 357.95 13% damp summary windows7-32 opt 310.46 -> 349.73 10% damp summary linux64 pgo 252.03 -> 277.76 6% damp summary windows8-64 pgo 248.59 -> 263.63 Improvements: 16% tps summary osx-10-10 opt 30.11 -> 25.42 For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=4876

Blake Kaplan (:mrbkap) (inactive)

Updated

•

8 years ago

Flags: needinfo?(mrbkap)

part1: Use single content process in tabpaint 8 years ago Gabor Krizsanits (INACTIVE) 1.75 KB, patch	mconley : review+	Details \| Diff \| Splinter Review
part2: Tps should filter out glyph cache init 8 years ago Gabor Krizsanits (INACTIVE) 2.48 KB, patch	mconley : review+ jcristau : approval-mozilla-aurora+	Details \| Diff \| Splinter Review