Closed Bug 1110818 Opened 9 years ago Closed 9 years ago

Measure subprocess crash rates in telemetry

Categories

(Toolkit :: Telemetry, defect)

x86
Windows 7
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla37
Tracking Status
firefox35 --- fixed
firefox36 --- fixed
firefox37 --- fixed

People

(Reporter: benjamin, Assigned: benjamin)

References

Details

Attachments

(1 file)

It's easy to measure subprocess crash rates in telemetry (plugin/content/gmplugin). This will make it easier to collect data from the Flash protected-mode experiment.
Attached patch crash-telemetrySplinter Review
This patch is on top of bug 1108035 (minor overlapping of #include changes, not anything in the code).
Attachment #8535626 - Flags: review?(georg.fritzsche)
Attachment #8535626 - Attachment is patch: true
Comment on attachment 8535626 [details] [diff] [review]
crash-telemetry

Review of attachment 8535626 [details] [diff] [review]:
-----------------------------------------------------------------

If we want crash rates, do we have a metric to correlate the crashes against? I.e. a process creation counters for content, plugin, gmplugin.

Keyed histograms seem like overkill here (for memory and payload footprint) - can't we just use an enumerated histogram here?

::: dom/ipc/CrashReporterParent.cpp
@@ +172,5 @@
>  
>      int32_t processType;
>      int32_t crashType = nsICrashService::CRASH_TYPE_CRASH;
>  
> +    const char* telemetryKey = nullptr;

Plain pointers are not great because ownership etc. is really unclear.
|nsDependentCString key; key.AssignLiteral(...);| instead?
Attachment #8535626 - Flags: review?(georg.fritzsche)
> If we want crash rates, do we have a metric to correlate the crashes
> against? I.e. a process creation counters for content, plugin, gmplugin.

I'm not sure I understand the question. For the current experiment, it really doesn't matter, since we can just compare the test group and the control group.

In general, we should be able to chart these metrics against session length and against active ticks to see which is the most interesting denominator.

> Keyed histograms seem like overkill here (for memory and payload footprint)
> - can't we just use an enumerated histogram here?

I don't want to, because in a future patch I plan on including the plugin tag name into the key so that we can measure crash rates separately per plugin type.

> > +    const char* telemetryKey = nullptr;
> 
> Plain pointers are not great because ownership etc. is really unclear.
> |nsDependentCString key; key.AssignLiteral(...);| instead?

Since all of these are literals, why is the ownership unclear? I can change it, but if there's a way to just make it readable, the code should be safe in either case.
Flags: needinfo?(georg.fritzsche)
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #3)
> > If we want crash rates, do we have a metric to correlate the crashes
> > against? I.e. a process creation counters for content, plugin, gmplugin.
> 
> I'm not sure I understand the question. For the current experiment, it
> really doesn't matter, since we can just compare the test group and the
> control group.
> 
> In general, we should be able to chart these metrics against session length
> and against active ticks to see which is the most interesting denominator.

Ok. If we don't have a process instantiation count probe we should file a separate bug for this then - e.g. 50% vs. 100% plugin crashes would be significant, right?

> > Keyed histograms seem like overkill here (for memory and payload footprint)
> > - can't we just use an enumerated histogram here?
> 
> I don't want to, because in a future patch I plan on including the plugin
> tag name into the key so that we can measure crash rates separately per
> plugin type.

Ah, ok then. (Side-note: All potential plugin tag names? Doesn't this run into the same privacy concerns we had on an earlier bug on finding out what plugins are being used in the wild?)

> > > +    const char* telemetryKey = nullptr;
> > 
> > Plain pointers are not great because ownership etc. is really unclear.
> > |nsDependentCString key; key.AssignLiteral(...);| instead?
> 
> Since all of these are literals, why is the ownership unclear? I can change
> it, but if there's a way to just make it readable, the code should be safe
> in either case.

With this i had to read both the definition and the code that assigns to figure out the ownership/semantics.
With nsDependentCString it would be clear to me (and we use that type further down anyway).

I think we should avoid plain pointers where reasonable.
Flags: needinfo?(georg.fritzsche)
Filed bug 1110996 for the launch metric.
Comment on attachment 8535626 [details] [diff] [review]
crash-telemetry

Approval Request Comment
[Feature/regressing bug #]: New code for measuring plugin crash rates
[Describe test coverage new/current, TBPL]: Manual testing
[Risks and why]: Low risk; slight memory usage increase when telemetry is enabled.
[String/UUID change made/needed]: None. New items were added to Histograms.json changed, but I don't think that can harm addons.
Attachment #8535626 - Flags: approval-mozilla-beta?
Attachment #8535626 - Flags: approval-mozilla-aurora?
https://hg.mozilla.org/mozilla-central/rev/185c34ef0199
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla37
Comment on attachment 8535626 [details] [diff] [review]
crash-telemetry

got a=lsblakk over email
Attachment #8535626 - Flags: approval-mozilla-beta?
Attachment #8535626 - Flags: approval-mozilla-beta+
Attachment #8535626 - Flags: approval-mozilla-aurora?
Attachment #8535626 - Flags: approval-mozilla-aurora+
You need to log in before you can comment on or make changes to this bug.