Closed Bug 1836795 Opened 1 year ago Closed 4 months ago

Add more telemetry/fields to the already existing tab-kills telemetries

Categories

(Fenix :: Experimentation and Telemetry, enhancement, P2)

All
Android
enhancement

Tracking

(firefox130 fixed)

RESOLVED FIXED
130 Branch
Tracking Status
firefox130 --- fixed

People

(Reporter: kaya, Assigned: kaya)

References

(Blocks 1 open bug)

Details

(Whiteboard: [geckoview:m115] [geckoview:m116] [geckoview:m117] [fxdroid] [foundation] [group1])

Attachments

(1 file, 1 obsolete file)

The existing tab-kill telemetry still leaves gaps when it comes to explaining the current ups and downs in the trends. To get more explanatory data on tab-kills, we need to know whether more tabs are open at the time of process termination or whether the number of process terminations has increased.

Assignee: nobody → kkaya
Severity: -- → N/A
Priority: -- → P1
Whiteboard: [geckoview:m115] [fxdroid] [foundation]
Whiteboard: [geckoview:m115] [fxdroid] [foundation] → [geckoview:m115] [geckoview:m116] [fxdroid] [foundation]
Whiteboard: [geckoview:m115] [geckoview:m116] [fxdroid] [foundation] → [geckoview:m115] [geckoview:m116] [geckoview:m117] [fxdroid] [foundation]
Priority: P1 → P2
See Also: → 1826718

I've implemented a POC for recording historical exit reasons for our processes (crashes, ANRs, etc.). It integrates a new API, getHistoricalProcessExitReasons, which is available from API level 30. The exit reasons include some useful ones such as REASON_ANR, REASON_CRASH, REASON_CRASH_NATIVE, REASON_EXCESSIVE_RESOURCE_USAGE and REASON_LOW_MEMORY. With this, we can distinguish why our processes exited or were killed in the previous session, and we can even get the available traces.

The action plan is to traverse all of our processes (main, content, gpu, utility, extension, etc.), get their historical exit reasons and, if the reason is one of the above, log the traces to our reporting tools: to Glean for counting the number of pings for content/extension process exits due to low memory or excessive resource usage, and to Socorro/Sentry for checking out the ANR/crash traces. We already have the crash data, but the ANR data might be useful because the ANR traces currently logged to Sentry are not useful.
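A minimal sketch of the filter-and-count step in this plan, not the patch itself. It uses plain Java stand-ins so it runs off-device: `Reason`, `ProcessType`, `ExitRecord` and `tally` are hypothetical names, and the enum values only mirror a subset of the `ApplicationExitInfo.REASON_*` constants. On a device the records would come from `ActivityManager.getHistoricalProcessExitReasons(packageName, pid, maxNum)`.

```java
import java.util.EnumSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

final class ExitReasonTally {
    // Hypothetical stand-in for a subset of ApplicationExitInfo.REASON_* values.
    enum Reason { ANR, CRASH, CRASH_NATIVE, EXCESSIVE_RESOURCE_USAGE, LOW_MEMORY, EXIT_SELF, OTHER }

    // Process kinds named in the comment above.
    enum ProcessType { MAIN, CONTENT, GPU, UTILITY, EXTENSION }

    // Hypothetical record of one historical process exit.
    static final class ExitRecord {
        final ProcessType processType;
        final Reason reason;

        ExitRecord(ProcessType processType, Reason reason) {
            this.processType = processType;
            this.reason = reason;
        }
    }

    // Only the reasons listed in the comment above are worth reporting.
    static final Set<Reason> REPORTABLE = EnumSet.of(
            Reason.ANR,
            Reason.CRASH,
            Reason.CRASH_NATIVE,
            Reason.EXCESSIVE_RESOURCE_USAGE,
            Reason.LOW_MEMORY);

    // Counts reportable exits per "processType/reason" key, skipping everything else.
    static Map<String, Integer> tally(List<ExitRecord> records) {
        Map<String, Integer> counts = new TreeMap<>();
        for (ExitRecord record : records) {
            if (!REPORTABLE.contains(record.reason)) {
                continue;
            }
            String key = record.processType.name().toLowerCase(Locale.ROOT)
                    + "/" + record.reason.name().toLowerCase(Locale.ROOT);
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }
}
```

Each resulting key/count pair could then be forwarded to whichever reporting tool fits the reason; that routing is left out here because it depends on the Glean and Sentry wiring.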

Another good thing about this historical exit info API is that it can report the PSS/RSS and process importance values (which the OS uses to prioritise which process to kill next) at the time of each process's death. That would also be pretty useful for our analysis, so I will record them too. I'll also check for the up-time of the processes, but it looks like it is not available there, so I may have to measure the tab's uptime with some internal clocks and record it separately.
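One way the separately measured uptime could be sketched, assuming the start time of each process or tab is recorded by our own code and later compared with the exit timestamp (for example the value of `ApplicationExitInfo.getTimestamp()`). `ProcessUptimeTracker` and its methods are hypothetical names, and a real version would also need to persist the start times across sessions, which is omitted here.

```java
import java.util.HashMap;
import java.util.Map;

final class ProcessUptimeTracker {
    // Start times in epoch milliseconds, keyed by a caller-chosen process or tab id.
    private final Map<String, Long> startTimes = new HashMap<>();

    // Called by our own code when a process (or tab) starts.
    void onProcessStarted(String processId, long startMillis) {
        startTimes.put(processId, startMillis);
    }

    // Returns the uptime in milliseconds at the given exit timestamp,
    // or -1 if the start was never recorded or the timestamps are inconsistent.
    long uptimeMillis(String processId, long exitTimestampMillis) {
        Long start = startTimes.get(processId);
        if (start == null || exitTimestampMillis < start) {
            return -1L;
        }
        return exitTimestampMillis - start;
    }
}
```

Passing the timestamps in explicitly, rather than reading a clock inside the class, keeps the sketch easy to test and leaves the choice of clock to the caller.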

One last piece of information that I plan to record along with the process exit reasons, specifically for content processes, is whether the foreground (last visible) tab was present in the killed process or not. That information may not be available to me at that layer, so I may have to pick it up somewhere near here and log it there.

Whiteboard: [geckoview:m115] [geckoview:m116] [geckoview:m117] [fxdroid] [foundation] → [geckoview:m115] [geckoview:m116] [geckoview:m117] [fxdroid] [foundation] [group1]
Blocks: 1859846
Attachment #9395287 - Attachment description: WIP: Bug 1836795 - Introduce getHistoricalProcessExitReasons API and report application exit info telemetry for the processes that exited in the previous sessions. → Bug 1836795 - Report application exit info telemetry for the processes that exited in the previous sessions.
Attached file data review request (obsolete)
Attachment #9410636 - Flags: data-review?(royang)
Attachment #9410636 - Attachment is obsolete: true
Attachment #9410636 - Flags: data-review?(royang)
Attachment #9395287 - Attachment description: Bug 1836795 - Report application exit info telemetry for the processes that exited in the previous sessions. → Bug 1836795 - Record application exit info telemetry for the processes that exited in the previous sessions.
Pushed by kkaya@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/4131ba0bc298 Record application exit info telemetry for the processes that exited in the previous sessions. r=android-reviewers,petru,mobiletest-reviewers
Status: NEW → RESOLVED
Closed: 4 months ago
Resolution: --- → FIXED
Target Milestone: --- → 130 Branch
See Also: → 1909108
See Also: → 1842880
See Also: → 1925362