Closed Bug 1364503 Opened 7 years ago Closed 7 years ago

GC_MAX_PAUSE_MS and CYCLE_COLLECTOR_MAX_PAUSE need to become "release criteria" quality

Categories

(Core :: JavaScript: GC, defect, P1)

defect

Tracking

()

RESOLVED FIXED
mozilla55
Tracking Status
firefox55 --- fixed

People

(Reporter: chutten, Assigned: chutten)

References

(Blocks 1 open bug)

Details

(Whiteboard: [measurement:client])

Attachments

(1 file)

Some quantum release criteria are to have < 2% of users experiencing maximum GC/CC pause durations of 150ms in chrome and 2500ms in content. :bsmedberg's proposal is that this is to be measured using GC_MAX_PAUSE_MS and CYCLE_COLLECTOR_MAX_PAUSE.

To do that these two probes will need to be made opt-out, and GC_MAX_PAUSE_MS will need to extend its high bucket beyond its current 1000ms. (Due to probe immutability, this will necessitate a rename)
The current measurement also makes it a little hard to see what is going on with the bad CCs, because the bulk of CCs are very short. This means that we don't get telemetry notifications for regressions in the 95th percentile, for instance (bug 1305142). Also, maybe a per-session measurement would be more useful. Really, I feel like a "bad CC" telemetry measure would be a useful supplement for the current telemetry, not a substitution for it.
What do you mean by "per-session measurement"?
What I was trying to say is that it would be nice if this telemetry measure can match more what we're trying to have as the criteria. If a user has one 10 second hang over the course of 10 hours (the CC runs every 10 seconds), then only a tiny % of their CCs are "bad" so it is hard to even see that bad CC, as it will be swamped with lots of useless data about how most of the time everything is fine.
Ah, that's why release criteria are very rarely raw data like what alerts.tmo looks at. There are and will be scheduled jobs to normalize instances of erroneous behaviour as a rate over time (or over url loads, or over active ticks, or...)

The raw data doesn't have that capability, sadly.
Comment on attachment 8870514 [details]
bug 1364503 - Allow GC/CC MAX_PAUSE collection in opt-out

https://reviewboard.mozilla.org/r/141158/#review146082

You should get review on the GC changes from jonco or sfink. I don't pay too much attention to that. The rest of it looks fine to me.
Attachment #8870514 - Flags: review?(continuation) → review+
Attachment #8870514 - Flags: feedback?(sphink)
Attachment #8870514 - Flags: feedback?(jcoppeard)
Hm. My immediate reaction is that I'd like to keep the old one too, at least for a release or two. We watch that metric pretty closely, and it would be nice to avoid a discontinuity. (I have no issue with the new one, and will switch over to it for future data when I'm not comparing with the past.)
Comment on attachment 8870514 [details]
bug 1364503 - Allow GC/CC MAX_PAUSE collection in opt-out

https://reviewboard.mozilla.org/r/141158/#review146234

data-r=me any which way makes sense for this
Attachment #8870514 - Flags: review?(benjamin) → review+
Comment on attachment 8870514 [details]
bug 1364503 - Allow GC/CC MAX_PAUSE collection in opt-out

This looks fine but as Steve said please leave the original version in for a couple of releases.
Attachment #8870514 - Flags: feedback?(sphink)
Attachment #8870514 - Flags: feedback?(jcoppeard)
Attachment #8870514 - Flags: feedback+
Priority: P2 → P1
Pushed by chutten@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/9b65b087ee20
Allow GC/CC MAX_PAUSE collection in opt-out r=bsmedberg,mccr8
Blocks: 1367840
https://hg.mozilla.org/mozilla-central/rev/9b65b087ee20
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: