Add more telemetry/fields to the already existing tab-kills telemetries
Categories
(Fenix :: Experimentation and Telemetry, enhancement, P2)
Tracking
(firefox130 fixed)
Tracking | Status | |
---|---|---|
firefox130 | --- | fixed |
People
(Reporter: kaya, Assigned: kaya)
References
(Blocks 1 open bug)
Details
(Whiteboard: [geckoview:m115] [geckoview:m116] [geckoview:m117] [fxdroid] [foundation] [group1])
Attachments
(1 file, 1 obsolete file)
The existing tab-kill telemetry still leaves gaps when we try to explain the ups and downs in the current trends. To get more explanatory data, we need to know whether more tabs are open at the time of process termination or whether the number of process terminations has increased.
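As a minimal sketch of the distinction described above, the total number of tabs killed can be split into two factors: how many terminations occurred, and how many tabs each termination took down. The class and method names below are hypothetical and are not part of the existing telemetry.

```java
import java.util.List;

// Hypothetical helper: each list element is the number of tabs alive
// in one process at the moment that process was terminated.
final class TabKillBreakdown {
    /** Number of process terminations in the period. */
    static int terminations(List<Integer> tabsPerTermination) {
        return tabsPerTermination.size();
    }

    /** Total tabs lost across all terminations in the period. */
    static int tabsKilled(List<Integer> tabsPerTermination) {
        int total = 0;
        for (int tabs : tabsPerTermination) {
            total += tabs;
        }
        return total;
    }

    /** Average tabs lost per termination; 0.0 when there were none. */
    static double averageTabsPerTermination(List<Integer> tabsPerTermination) {
        if (tabsPerTermination.isEmpty()) {
            return 0.0;
        }
        return (double) tabsKilled(tabsPerTermination) / tabsPerTermination.size();
    }
}
```

Two periods with the same tabs-killed total can look very different once split this way: three terminations of two tabs each versus one termination of six tabs.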
Comment 1 (Assignee) • 11 months ago
I've implemented a POC for recording historical exit reasons for our processes (crashes, ANRs, etc.). It integrates a new API (getHistoricalProcessExitReasons, available from API level 30). The exit reasons include some useful ones, such as REASON_ANR, REASON_CRASH, REASON_CRASH_NATIVE, REASON_EXCESSIVE_RESOURCE_USAGE, and REASON_LOW_MEMORY. With this, we can distinguish why our processes exited or were killed in the previous session, and we can even get the available traces.
The action plan is to traverse all of our processes (main, content, gpu, utility, extension, etc.), get their historical exit reasons, and, if the reason is one of the above, log the data to our reporting tools: for example, to Glean for counting the number of pings for content/extension process exits due to low memory or excessive resource usage, and to Socorro/Sentry for inspecting the ANR/crash traces. We already have the crash data, but the ANR data might be useful because the ANR traces currently logged to Sentry are not useful.
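The routing described in the action plan could be sketched roughly as follows. This is plain Java, not Android code: the Reason enum only mirrors the names of the exit reasons listed above, and the Sink enum and route method are hypothetical names introduced for illustration.

```java
import java.util.EnumSet;
import java.util.Set;

final class ExitReasonRouter {
    /** Mirrors the names of the exit reasons of interest; not the Android constants themselves. */
    enum Reason { ANR, CRASH, CRASH_NATIVE, EXCESSIVE_RESOURCE_USAGE, LOW_MEMORY, OTHER }

    /** Hypothetical reporting destinations. */
    enum Sink { GLEAN_COUNTER, TRACE_REPORT, IGNORE }

    // Resource-related kills are only counted (e.g. in Glean).
    private static final Set<Reason> RESOURCE_KILLS =
            EnumSet.of(Reason.LOW_MEMORY, Reason.EXCESSIVE_RESOURCE_USAGE);

    // ANRs and crashes carry traces worth sending to a crash reporter.
    private static final Set<Reason> TRACE_REASONS =
            EnumSet.of(Reason.ANR, Reason.CRASH, Reason.CRASH_NATIVE);

    /** Decides where a historical exit record should be reported. */
    static Sink route(Reason reason) {
        if (RESOURCE_KILLS.contains(reason)) {
            return Sink.GLEAN_COUNTER;
        }
        if (TRACE_REASONS.contains(reason)) {
            return Sink.TRACE_REPORT;
        }
        return Sink.IGNORE;
    }
}
```

In a real implementation, the caller would presumably map each record returned by the platform API to a Reason and also attach the process type (main, content, gpu, utility, extension) before reporting.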
Another benefit of this historical exit info API is that it can report the PSS/RSS and process importance (which the OS uses to prioritise which process to kill next) of each process at the time of its death. That would also be useful for our analysis, so I will record those values too. I'll also check for the uptime of the processes, but it looks like it is not available there; I may have to measure the tab's uptime with an internal clock and record it separately.
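Since uptime does not appear to be provided by the exit info, one possible approach is to remember each process's start time and subtract it from a time of death supplied by the caller. The sketch below assumes millisecond timestamps from a single clock; ProcessUptimeTracker and its methods are hypothetical names, not existing Fenix or GeckoView code.

```java
import java.util.HashMap;
import java.util.Map;

final class ProcessUptimeTracker {
    // Start time in milliseconds, keyed by process id.
    private final Map<Integer, Long> startMillisByPid = new HashMap<>();

    /** Records when a process was started. */
    void onProcessStarted(int pid, long startMillis) {
        startMillisByPid.put(pid, startMillis);
    }

    /**
     * Returns the uptime in milliseconds for the given process, or -1 when
     * the start time is unknown or the exit time precedes the start time.
     */
    long uptimeMillis(int pid, long exitMillis) {
        Long start = startMillisByPid.get(pid);
        if (start == null || exitMillis < start) {
            return -1L;
        }
        return exitMillis - start;
    }
}
```

Because exit reasons are read in a later session, the start times would have to be persisted across restarts, and pid reuse would need handling; both are left out of this sketch.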
One last piece of information that I plan to record along with the process exit reasons, specifically for content processes, is whether the foreground (last visible) tab was present in the killed process. That information may not be available to me at that layer, so I may find it somewhere near here and log it there instead.
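The foreground-tab check could look something like the sketch below, assuming some layer can provide a mapping from tab ids to the pid of the content process hosting each tab. The mapping, the tab id type, and the ForegroundTabCheck class are all assumptions made for illustration.

```java
import java.util.Map;

final class ForegroundTabCheck {
    /**
     * Returns true when the foreground (last visible) tab was hosted by
     * the process that was killed. Unknown or missing tabs yield false.
     */
    static boolean foregroundTabWasInProcess(
            Map<String, Integer> pidByTabId, String foregroundTabId, int killedPid) {
        if (foregroundTabId == null) {
            return false;
        }
        Integer pid = pidByTabId.get(foregroundTabId);
        return pid != null && pid == killedPid;
    }
}
```

The resulting boolean could then be attached as an extra field to the exit-reason record for content processes.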
Comment 2 (Assignee) • 7 months ago
Comment 3 (Assignee) • 5 months ago
Comment 5 • 4 months ago (bugherder)