Closed Bug 821371 Opened 13 years ago Closed 12 years ago

Add telemetry for prep work done for cycle collection

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla20

People

(Reporter: mccr8, Assigned: mccr8)

References

Details

(Whiteboard: [Snappy])

Attachments

(2 files, 1 obsolete file)

part 1: don't finish current GC when CCTimerFired calls into CycleCollectNow 12 years ago Andrew McCreight [:mccr8] 4.64 KB, patch	smaug : review+	Details \| Diff \| Splinter Review
part 2: add telemetry etc for time taken for prep work before CC 12 years ago Andrew McCreight [:mccr8] 9.56 KB, patch		Details \| Diff \| Splinter Review
part 2: add telemetry etc for time taken for prep work before CC 12 years ago Andrew McCreight [:mccr8] 9.39 KB, patch	smaug : review+	Details \| Diff \| Splinter Review

Andrew McCreight [:mccr8]

Assignee

Description

•

13 years ago

Olli thinks this is causing long CC pauses with small graphs when we CC right after a GC, which is certainly believable. He's fixing it in another bug by trying harder to not run a CC right after a GC, but we should display in the error console when this happens, and add telemetry for it.

Andrew McCreight [:mccr8]

Assignee

Comment 1

•

13 years ago

I also noticed that the CYCLE_COLLECTOR time does not include any synchronous work we do in nsJSEnvironment, such as finishing a GC or synchronous forget skippables. We should probably start recording that, too.

Andrew McCreight [:mccr8]

Assignee

Comment 2

•

13 years ago

We should also include the time spent finishing off the current GC. If either of these things are still a problem after Olli's patch, we can try complicating things more to split out work into a separate slice.

Andrew McCreight [:mccr8]

Assignee

Updated

•

12 years ago

Whiteboard: [Snappy]

Andrew McCreight [:mccr8]

Assignee

Comment 3

•

12 years ago

I have a patch to record various related data. There are two things other than the actual CC that can occur during a CC pause: 1) Finishing the current incremental GC, if the CC has been locked out for too long. 2) Running ForgetSkippable I'm wondering what would make sense to record for telemetry. I could record a simple boolean for (1) and (2), if we had to do that kind of work at all, or the time spent in them. I'm planning on adding telemetry for the full CC pause time, so that plus the boolean may hint at what is causing problems. I could also just record the time spent in (1) or (2). For this, I'm inclined to only record the time for FS, as finishing the current IGC isn't a huge problem at the moment. Maybe I could record a boolean for (1) and the time for (2)? Or maybe adding 50 vs 2 buckets isn't a big deal and I should just always record the time? Any thoughts, Olli? (Side note: I have a patch to change how (1) is done so that if we are locked out and we CCNow, then we'll always finish the IGC inside of CCNow, so it will be recorded.)

Andrew McCreight [:mccr8]

Assignee

Updated

•

12 years ago

Summary: Record number of synchronous forgetSkippables done before a CC → Add telemetry for prep work done for cycle collection

Andrew McCreight [:mccr8]

Assignee

Comment 4

•

12 years ago

Attached patch part 1: don't finish current GC when CCTimerFired calls into CycleCollectNow — Details — Splinter Review

This is a little ugly, but I think necessary to detect possible regressions that make IGC lock out CC too much.

Andrew McCreight [:mccr8]

Assignee

Comment 5

•

12 years ago

Attached patch part 2: add telemetry etc for time taken for prep work before CC (obsolete) — Details — Splinter Review

This adds three new measurements: - time spent finishing off IGCs when the CC was locked out for too long - time spent doing forget skippable synchornously with the CC - the entire time spent doing work in CycleCollectNow For these three measurements, we make sure they are in telemetry, the error console, and the JSON report.

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 6

•

12 years ago

(In reply to Andrew McCreight [:mccr8] from comment #3) > I'm wondering what would make sense to record for telemetry. I could record > a simple boolean for (1) and (2), if we had to do that kind of work at all, Boolean sounds good > or the time spent in them. I'm planning on adding telemetry for the full CC > pause time, so that plus the boolean may hint at what is causing problems. Sound good. > I > could also just record the time spent in (1) or (2). For this, Hmm, perhaps not too useful.

Andrew McCreight [:mccr8]

Assignee

Comment 7

•

12 years ago

So, you think I should do just the boolean for the two prep phases, plus the full pause time? That sounds okay to me.

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 8

•

12 years ago

yeah, I think that is enough.

Andrew McCreight [:mccr8]

Assignee

Comment 9

•

12 years ago

My attempt to pretty up the formatting strings in the source by splitting them into multiple lines appears to not work on MSVC, so I will have to fix that.

Andrew McCreight [:mccr8]

Assignee

Comment 10

•

12 years ago

Attached patch part 2: add telemetry etc for time taken for prep work before CC — Details — Splinter Review

Attachment #696365 - Flags: review?(bugs)

Andrew McCreight [:mccr8]

Assignee

Updated

•

12 years ago

Attachment #695816 - Attachment is obsolete: true

Andrew McCreight [:mccr8]

Assignee

Updated

•

12 years ago

Attachment #695815 - Flags: review?(bugs)

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 11

•

12 years ago

Comment on attachment 695815 [details] [diff] [review] part 1: don't finish current GC when CCTimerFired calls into CycleCollectNow This is indeed a bit ugly. I thought first FinishAnyIncrementalGC was there for some optimization, but it is there to not change the current behavior. Other option would be to use some global variable which is set in CCTimerFired and also in CycleCollectNow if not already set, and then used in CycleCollectNow after CC. Not sure that would be any less-ugly. But, to make this less error prone, could you call FinishAnyIncrementalGC() in FireForgetSkippable() (and in CycleCollectNow). That would change the behavior if there are just a few new suspected objects, but I believe the change would be ok (gives more time for iGC).

Attachment #695815 - Flags: review?(bugs) → review+

Olli Pettay [:smaug][bugs@pettay.fi]

Updated

•

12 years ago

Attachment #696365 - Flags: review?(bugs) → review+

Andrew McCreight [:mccr8]

Assignee

Comment 12

•

12 years ago

Good suggestion. Landed with review comments addressed. https://hg.mozilla.org/integration/mozilla-inbound/rev/faf5e1fdde88 https://hg.mozilla.org/integration/mozilla-inbound/rev/8c4236dde5e3

Andrew McCreight [:mccr8]

Assignee

Comment 13

•

12 years ago

Followup to remove the !sCCLockedOut assertions, because with FinishAnyIncrementalGC moved to FireForgetSkippable() we don't always finish the current IGC by the time we finish the timer, in the case where we decide we don't need to run the CC anyways. Oops, my mistake. https://hg.mozilla.org/integration/mozilla-inbound/rev/1e90b2848512

Ryan VanderMeulen [:RyanVM]

Comment 14

•

12 years ago

https://hg.mozilla.org/mozilla-central/rev/faf5e1fdde88 https://hg.mozilla.org/mozilla-central/rev/8c4236dde5e3 https://hg.mozilla.org/mozilla-central/rev/1e90b2848512

Status: NEW → RESOLVED

Closed: 12 years ago

Resolution: --- → FIXED

Target Milestone: --- → mozilla20

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 15

•

12 years ago

There are now crashes on windows.

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 16

•

12 years ago

https://crash-stats.mozilla.com/report/index/bp-24ff3faf-9268-4847-9eff-aeb6c2130102

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 17

•

12 years ago

Should %llums be %lums now? ccNowDuration is 32bit

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 18

•

12 years ago

Tryserver for that https://tbpl.mozilla.org/?tree=Try&rev=01740f1c4a44

Matthias Versen [:Matti]

Updated

•

12 years ago

Depends on: 826003

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 19

•

12 years ago

https://hg.mozilla.org/mozilla-central/rev/7a7bf00cd969

Boris Zbarsky [:bzbarsky]

Updated

•

12 years ago

Depends on: 826153

Andrew McCreight [:mccr8]

Assignee

Comment 20

•

12 years ago

Yikes, thanks for fixing that.

You need to log in before you can comment on or make changes to this bug.