Closed
Bug 821371
Opened 12 years ago
Closed 12 years ago
Add telemetry for prep work done for cycle collection
Categories
(Core :: XPCOM, defect)
Core
XPCOM
Tracking
()
RESOLVED
FIXED
mozilla20
People
(Reporter: mccr8, Assigned: mccr8)
References
Details
(Whiteboard: [Snappy])
Attachments
(2 files, 1 obsolete file)
4.64 KB,
patch
|
smaug
:
review+
|
Details | Diff | Splinter Review |
9.39 KB,
patch
|
smaug
:
review+
|
Details | Diff | Splinter Review |
Olli thinks this is causing long CC pauses with small graphs when we CC right after a GC, which is certainly believable. He's fixing it in another bug by trying harder to not run a CC right after a GC, but we should display in the error console when this happens, and add telemetry for it.
Assignee | ||
Comment 1•12 years ago
|
||
I also noticed that the CYCLE_COLLECTOR time does not include any synchronous work we do in nsJSEnvironment, such as finishing a GC or synchronous forget skippables. We should probably start recording that, too.
Assignee | ||
Comment 2•12 years ago
|
||
We should also include the time spent finishing off the current GC. If either of these things are still a problem after Olli's patch, we can try complicating things more to split out work into a separate slice.
Assignee | ||
Updated•12 years ago
|
Whiteboard: [Snappy]
Assignee | ||
Comment 3•12 years ago
|
||
I have a patch to record various related data. There are two things other than the actual CC that can occur during a CC pause: 1) Finishing the current incremental GC, if the CC has been locked out for too long. 2) Running ForgetSkippable I'm wondering what would make sense to record for telemetry. I could record a simple boolean for (1) and (2), if we had to do that kind of work at all, or the time spent in them. I'm planning on adding telemetry for the full CC pause time, so that plus the boolean may hint at what is causing problems. I could also just record the time spent in (1) or (2). For this, I'm inclined to only record the time for FS, as finishing the current IGC isn't a huge problem at the moment. Maybe I could record a boolean for (1) and the time for (2)? Or maybe adding 50 vs 2 buckets isn't a big deal and I should just always record the time? Any thoughts, Olli? (Side note: I have a patch to change how (1) is done so that if we are locked out and we CCNow, then we'll always finish the IGC inside of CCNow, so it will be recorded.)
Assignee | ||
Updated•12 years ago
|
Summary: Record number of synchronous forgetSkippables done before a CC → Add telemetry for prep work done for cycle collection
Assignee | ||
Comment 4•12 years ago
|
||
This is a little ugly, but I think necessary to detect possible regressions that make IGC lock out CC too much.
Assignee | ||
Comment 5•12 years ago
|
||
This adds three new measurements: - time spent finishing off IGCs when the CC was locked out for too long - time spent doing forget skippable synchornously with the CC - the entire time spent doing work in CycleCollectNow For these three measurements, we make sure they are in telemetry, the error console, and the JSON report.
Comment 6•12 years ago
|
||
(In reply to Andrew McCreight [:mccr8] from comment #3) > I'm wondering what would make sense to record for telemetry. I could record > a simple boolean for (1) and (2), if we had to do that kind of work at all, Boolean sounds good > or the time spent in them. I'm planning on adding telemetry for the full CC > pause time, so that plus the boolean may hint at what is causing problems. Sound good. > I > could also just record the time spent in (1) or (2). For this, Hmm, perhaps not too useful.
Assignee | ||
Comment 7•12 years ago
|
||
So, you think I should do just the boolean for the two prep phases, plus the full pause time? That sounds okay to me.
Comment 8•12 years ago
|
||
yeah, I think that is enough.
Assignee | ||
Comment 9•12 years ago
|
||
My attempt to pretty up the formatting strings in the source by splitting them into multiple lines appears to not work on MSVC, so I will have to fix that.
Assignee | ||
Comment 10•12 years ago
|
||
Attachment #696365 -
Flags: review?(bugs)
Assignee | ||
Updated•12 years ago
|
Attachment #695816 -
Attachment is obsolete: true
Assignee | ||
Updated•12 years ago
|
Attachment #695815 -
Flags: review?(bugs)
Comment 11•12 years ago
|
||
Comment on attachment 695815 [details] [diff] [review] part 1: don't finish current GC when CCTimerFired calls into CycleCollectNow This is indeed a bit ugly. I thought first FinishAnyIncrementalGC was there for some optimization, but it is there to not change the current behavior. Other option would be to use some global variable which is set in CCTimerFired and also in CycleCollectNow if not already set, and then used in CycleCollectNow after CC. Not sure that would be any less-ugly. But, to make this less error prone, could you call FinishAnyIncrementalGC() in FireForgetSkippable() (and in CycleCollectNow). That would change the behavior if there are just a few new suspected objects, but I believe the change would be ok (gives more time for iGC).
Attachment #695815 -
Flags: review?(bugs) → review+
Updated•12 years ago
|
Attachment #696365 -
Flags: review?(bugs) → review+
Assignee | ||
Comment 12•12 years ago
|
||
Good suggestion. Landed with review comments addressed. https://hg.mozilla.org/integration/mozilla-inbound/rev/faf5e1fdde88 https://hg.mozilla.org/integration/mozilla-inbound/rev/8c4236dde5e3
Assignee | ||
Comment 13•12 years ago
|
||
Followup to remove the !sCCLockedOut assertions, because with FinishAnyIncrementalGC moved to FireForgetSkippable() we don't always finish the current IGC by the time we finish the timer, in the case where we decide we don't need to run the CC anyways. Oops, my mistake. https://hg.mozilla.org/integration/mozilla-inbound/rev/1e90b2848512
Comment 14•12 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/faf5e1fdde88 https://hg.mozilla.org/mozilla-central/rev/8c4236dde5e3 https://hg.mozilla.org/mozilla-central/rev/1e90b2848512
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla20
Comment 15•12 years ago
|
||
There are now crashes on windows.
Comment 16•12 years ago
|
||
https://crash-stats.mozilla.com/report/index/bp-24ff3faf-9268-4847-9eff-aeb6c2130102
Comment 17•12 years ago
|
||
Should %llums be %lums now? ccNowDuration is 32bit
Comment 18•12 years ago
|
||
Tryserver for that https://tbpl.mozilla.org/?tree=Try&rev=01740f1c4a44
Assignee | ||
Comment 20•12 years ago
|
||
Yikes, thanks for fixing that.
You need to log in
before you can comment on or make changes to this bug.
Description
•