Open Bug 1759012 Opened 3 years ago Updated 2 years ago

Record more telemetry around content process startup failures & crashes

Categories

(Core :: IPC, enhancement)

People

(Reporter: nika, Unassigned)

References

(Blocks 1 open bug)

Details

This could include various probes:


How often an attempt to start a content process (e.g. through GetNewOrUsedContentProcess returning a new process) does not result in a live process being returned, or the subsequent call to WaitForLaunch{Sync,Async} returns an error (i.e. the process failed to start). We will probably want to exclude process launches during shutdown from this probe, as they have no user-visible impact and we occasionally fail them intentionally during shutdown.

We'd probably want to report this as a histogram with one bucket for successful and one bucket for failed process launches, so that the number of failures can be read against the total number of launch attempts.

It may be a good idea to extend the window for this beyond WaitForLaunch{Sync,Async} and to wait for the "Build IDs match" message instead, so that we also catch crashes caused by build ID mismatches.
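As a rough illustration of the first probe, here is a minimal sketch in Python (not Gecko code). The names LaunchOutcome, LaunchTelemetry, record and failure_rate are hypothetical and exist only for this example; the real probe would be defined through the telemetry system and recorded from the C++ launch path.

```python
from collections import Counter
from enum import Enum


class LaunchOutcome(Enum):
    """Hypothetical bucket labels for the launch probe."""
    SUCCESS = "success"
    FAILURE = "failure"


class LaunchTelemetry:
    """Hypothetical two-bucket histogram for content process launches."""

    def __init__(self):
        self.buckets = Counter()

    def record(self, launched_ok, in_shutdown):
        """Record one launch attempt and return the bucket used.

        Launches attempted during shutdown are skipped (returns None),
        since they have no user-visible impact and may fail on purpose.
        """
        if in_shutdown:
            return None
        outcome = LaunchOutcome.SUCCESS if launched_ok else LaunchOutcome.FAILURE
        self.buckets[outcome] += 1
        return outcome

    def failure_rate(self):
        """Fraction of recorded launches that failed, or None if none recorded."""
        total = sum(self.buckets.values())
        if total == 0:
            return None
        return self.buckets[LaunchOutcome.FAILURE] / total
```

Recording both buckets is what makes a rate computable: a failure-only counter would give no denominator. If the window is extended to the "Build IDs match" message, launched_ok would only be set once that message has arrived.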


How often a process which was successfully started shuts down with an error, and what the error is. This would probably be recorded as a histogram during ContentParent::ActorDestroy, covering every successful and unsuccessful shutdown, with unsuccessful ones categorized by cause where we know it. It might be good to have buckets for situations like "failed due to shutdown hang", "failed due to KillHard()", "failed after SendShutdown()", "failed with crash report" and "failed without crash report".
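The bucketing for the second probe could be sketched as follows, again in Python rather than Gecko code. ShutdownBucket, ShutdownInfo and classify_shutdown are hypothetical names, the boolean fields stand in for whatever state ContentParent actually tracks, and the order in which causes are checked is an assumption, not a decision made in this bug.

```python
from dataclasses import dataclass
from enum import Enum


class ShutdownBucket(Enum):
    """Hypothetical buckets, taken from the situations listed above."""
    NORMAL = "normal shutdown"
    SHUTDOWN_HANG = "failed due to shutdown hang"
    KILL_HARD = "failed due to KillHard()"
    AFTER_SEND_SHUTDOWN = "failed after SendShutdown()"
    WITH_CRASH_REPORT = "failed with crash report"
    WITHOUT_CRASH_REPORT = "failed without crash report"


@dataclass
class ShutdownInfo:
    """Hypothetical summary of what is known when the actor is destroyed."""
    abnormal: bool
    shutdown_hang: bool = False
    kill_hard_called: bool = False
    send_shutdown_called: bool = False
    has_crash_report: bool = False


def classify_shutdown(info):
    """Pick exactly one bucket per shutdown.

    Assumed priority: the most specific known cause wins, and the
    crash-report buckets are the fallback when no cause is known.
    """
    if not info.abnormal:
        return ShutdownBucket.NORMAL
    if info.shutdown_hang:
        return ShutdownBucket.SHUTDOWN_HANG
    if info.kill_hard_called:
        return ShutdownBucket.KILL_HARD
    if info.send_shutdown_called:
        return ShutdownBucket.AFTER_SEND_SHUTDOWN
    if info.has_crash_report:
        return ShutdownBucket.WITH_CRASH_REPORT
    return ShutdownBucket.WITHOUT_CRASH_REPORT
```

Keeping the buckets mutually exclusive means the counts sum to the total number of shutdowns, so each failure category can be read as a share of all content process shutdowns.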


There may be other probes we should collect to learn more about process launch errors (e.g. perhaps the first probe should also record why the process launch failed), but this should at least be a good start.

See Also: → 1618904
Blocks: 1795821