Closed
Bug 1184277
Opened 10 years ago
Closed 10 years ago
Have talos run GC / CC in the content process when the parent does it
Categories
(Core :: DOM: Content Processes, defect)
Core
DOM: Content Processes
Tracking
()
RESOLVED
FIXED
People
(Reporter: mconley, Assigned: mconley)
References
(Blocks 1 open bug)
Details
Attachments
(3 files)
Bug 1174776 tracks a 8.5%-30% tp5o talos regression. We've got profiles for some of the worst offenders now:
# xinhuanet.com
non-e10s: http://people.mozilla.org/~bgirard/cleopatra/?zippedProfile=http://mozilla-releng-blobs.s3.amazonaws.com/blobs/Try-Non-PGO/sha512/3f1cf539d64869a3159e7cb1934e3b47112b5cfea8cdb5b8a2208f2d44c46ec326601ef87c617547dccd865533b2bc9497031b8bf7a91e797c3d2ce43036430f&pathInZip=profile_tp5o/xinhuanet.com/cycle_0.sps#report=de3f6a10f8926d6a876c537d1dd092d8e548e35d&filter=%5B%7B%22type%22%3A%22RangeSampleFilter%22,%22start%22%3A824068,%22end%22%3A825106%7D%5D&selection=0,1
e10s: http://people.mozilla.org/~bgirard/cleopatra/?zippedProfile=http://mozilla-releng-blobs.s3.amazonaws.com/blobs/Try-Non-PGO/sha512/6a7aa316e29e3130cc3de4fa966904b6739babcaa243a5e84c5331c1ec7c435f8c94f5631d5f31174f86e006f44dd4464c0c22bc4ac295db8c61c48859328eea&pathInZip=profile_tp5o/xinhuanet.com/cycle_0.sps
# youtube.com
non-e10s: http://people.mozilla.org/~bgirard/cleopatra/?zippedProfile=http://mozilla-releng-blobs.s3.amazonaws.com/blobs/Try-Non-PGO/sha512/3f1cf539d64869a3159e7cb1934e3b47112b5cfea8cdb5b8a2208f2d44c46ec326601ef87c617547dccd865533b2bc9497031b8bf7a91e797c3d2ce43036430f&pathInZip=profile_tp5o/youtube.com/cycle_0.sps#report=0e810e3b58dac0fe5f297991a08ffc622f4b1c5c&filter=%5B%7B%22type%22%3A%22RangeSampleFilter%22,%22start%22%3A918418,%22end%22%3A918849%7D%5D&selection=0,1,593,188,188,678,679,681,682,683,680,681,682,683,684,685,686,687,688,689,729,681,682,683,684,685,686,687,682,683,684,685,686,687,738,739,740,741,682,683,680,681,738,739,740,741,682,743,744,682,683,684,685,686,687,758,759,760,687,730,698,699,700,701,702,703,704
e10s: http://people.mozilla.org/~bgirard/cleopatra/?zippedProfile=http://mozilla-releng-blobs.s3.amazonaws.com/blobs/Try-Non-PGO/sha512/6a7aa316e29e3130cc3de4fa966904b6739babcaa243a5e84c5331c1ec7c435f8c94f5631d5f31174f86e006f44dd4464c0c22bc4ac295db8c61c48859328eea&pathInZip=profile_tp5o/youtube.com/cycle_0.sps
One thing I noticed is that we spend much, much more time cycle collecting in the content process with e10s than in the main process with non-e10s.
Samples spend in nsCycleCollector::Collect:
xinhuanet.com:
non-e10s: 0
e10s: 466
youtube.com:
non-e10s: 11
e10s: 659
We should find out why.
Comment 1•10 years ago
|
||
The current theory is:
http://hg.mozilla.org/build/talos/file/8a61f8d21fc0/talos/pageloader/chrome/pageloader.js#l120
Here we force GC/CC. In e10s mode this doesn't clean up the content process garbage which would explain why we're seeing a lot more GC/CC during the page load themselves.
We should try having e10s CC behavior match the non e10s behavior and the scores deviation should converge.
Comment 2•10 years ago
|
||
Comment 3•10 years ago
|
||
Yeah that sounds right to me. Mochitest e10s had a similar issue.
Comment 4•10 years ago
|
||
Here's a try push that should be GC/CCing on the child:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=1d8771c91995
Comment 5•10 years ago
|
||
https://people.mozilla.org/~bgirard/cleopatra/?zippedProfile=http://mozilla-releng-blobs.s3.amazonaws.com/blobs/Try/sha512/668bbef25bc4815042eb84b15d3850eda7a6db200988b8243e528ac81374f9c8984a7c6e8e6de33051c9edc8cee96095c9f416d13266a2dce2213c752eade5c2&pathInZip=profile_tp5o/youtube.com/cycle_0.sps
Looks much better:
https://treeherder.mozilla.org/perf.html#/comparesubtest?originalProject=try&originalRevision=3a36fc31a1a1&newProject=try&newRevision=01ee797fc2c9&originalSignature=bd72d04511c657c5c5040f1633fe73642fcdcb3b&newSignature=bd72d04511c657c5c5040f1633fe73642fcdcb3b
->
https://treeherder.mozilla.org/perf.html#/comparesubtest?originalProject=try&originalRevision=3a36fc31a1a1&newProject=try&newRevision=1d8771c91995&originalSignature=bd72d04511c657c5c5040f1633fe73642fcdcb3b&newSignature=bd72d04511c657c5c5040f1633fe73642fcdcb3b
Assignee | ||
Updated•10 years ago
|
Assignee: nobody → bgirard
Assignee | ||
Comment 6•10 years ago
|
||
Yes, a nice improvement there. Is this patch ready for review? If so, I'd suggest jmaher.
Comment 7•10 years ago
|
||
It's not. I hacked the GC/CC platform messages to be sync message to test this:
https://hg.mozilla.org/try/rev/1d8771c91995
I'm not sure if we should try to make the GC/CC wait on the child by sending and waiting asynchronously on a response or if we should (not ideal) make the platform messages sync?
I'm not sure what the best way of waiting on the GC/CC is here. I can add callbacks and add a js message but it doesn't feel ideal.
Comment 8•10 years ago
|
||
(In reply to Benoit Girard (:BenWa) from comment #7)
> I'm not sure what the best way of waiting on the GC/CC is here. I can add
> callbacks and add a js message but it doesn't feel ideal.
In my still-unlanded patches for bug 1096614 and the bugs it depends on (in particular, bug 1149297) I made a thing using a promise with AsyncShutdown to wait for messages from all of the children. The async shutdown thing handles timeouts. The messages are sent async using the message manager. It worked at least in local testing, but I never got to the point where I actually tried it on a full mochitest run.
Updated•10 years ago
|
Assignee | ||
Comment 9•10 years ago
|
||
What we probably want to do is have pageloader send down a message to the child when it wants to do a GC / CC, and then go back to the event loop until a message comes back from the child saying "Done".
Comment 10•10 years ago
|
||
Any chance someone more capable then I in front-end code want to pick this up and steal credit? I'll get a head start on the next regression cause.
Assignee | ||
Updated•10 years ago
|
Summary: Investigate aggressive cycle collection in content process during tp5o test → Have talos run GC / CC in the content process when the parent doe sit
Assignee | ||
Updated•10 years ago
|
Summary: Have talos run GC / CC in the content process when the parent doe sit → Have talos run GC / CC in the content process when the parent does it
Assignee | ||
Comment 12•10 years ago
|
||
Bug 1184277 - Make the nextPage callback of Pageloader use Task.async so we can do async stuff between pageloads. r?jmaher
There are various things we'd like to do between pageloads, like asynchronously dump
profiles, or send messages to content (like "do a GC / CC"). This adds the infrastructure
to do these async activities between page loads.
Attachment #8636242 -
Flags: review?(jmaher)
Assignee | ||
Comment 13•10 years ago
|
||
Bug 1184277 - Have the content process force GC / CC between pageload tests. r?jmaher
Attachment #8636243 -
Flags: review?(jmaher)
Comment 14•10 years ago
|
||
Comment on attachment 8636242 [details]
MozReview Request: Bug 1184277 - Make the nextPage callback of Pageloader use Task.async so we can do async stuff between pageloads. r?jmaher
https://reviewboard.mozilla.org/r/13619/#review12291
this is simple and straightforward. Will this work on Android? I would like to ensure we compare talos numbers before/after this change (async + gc/cc) to ensure that we understand any shifts we see.
Attachment #8636242 -
Flags: review?(jmaher) → review+
Comment 15•10 years ago
|
||
Comment on attachment 8636243 [details]
MozReview Request: Bug 1184277 - Have the content process force GC / CC between pageload tests. r?jmaher
https://reviewboard.mozilla.org/r/13621/#review12293
overall this is straightforward, thanks! I could see a followup bug to move all the code we load in the child process to be in talos-content.js- that way we could have pageload/cc/etc. in the same file and setup once/cleanup once.
::: talos/pageloader/chrome/pageloader.js:456
(Diff revision 1)
> + report.recordCCTime(contentCCTime);
while we don't really use the CC time in the report won't the double entry here confuse us? We report on line 452 already and now we are reporting again on line 456.
Attachment #8636243 -
Flags: review?(jmaher) → review+
Assignee | ||
Comment 16•10 years ago
|
||
url: https://hg.mozilla.org/build/talos/rev/124e5287b36186a45c99f0ee5745a89a8fc2d69a
changeset: 124e5287b36186a45c99f0ee5745a89a8fc2d69a
user: Mike Conley <mconley@mozilla.com>
date: Fri Jul 17 17:37:01 2015 -0400
description:
Bug 1184277 - Make the nextPage callback of Pageloader use Task.async so we can do async stuff between pageloads. r=jmaher.
There are various things we'd like to do between pageloads, like asynchronously dump
profiles, or send messages to content (like "do a GC / CC"). This adds the infrastructure
to do these async activities between page loads.
url: https://hg.mozilla.org/build/talos/rev/e97460aa654e5b4064142b747516a4e4849059b7
changeset: e97460aa654e5b4064142b747516a4e4849059b7
user: Mike Conley <mconley@mozilla.com>
date: Mon Jul 20 12:26:19 2015 -0400
description:
Bug 1184277 - Have the content process force GC / CC between pageload tests. r=jmaher
Assignee | ||
Comment 17•10 years ago
|
||
For the record, I removed the content process GC / CC timing stuff before I landed.
Assignee | ||
Comment 18•10 years ago
|
||
url: https://hg.mozilla.org/build/talos/rev/deca965654a39a38c9084804adebd78c1e23efb7
changeset: deca965654a39a38c9084804adebd78c1e23efb7
user: Mike Conley <mconley@mozilla.com>
date: Tue Jul 21 15:13:43 2015 -0400
description:
Bug 1184277 - Follow up: add talos-content.js to the main thread IO whitelist. r=jmaher.
Assignee | ||
Updated•10 years ago
|
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•