Report Java exceptions in the Glean crash pings
Categories
(Fenix :: Crash Reporting, enhancement, P3)
Tracking
(firefox116 wontfix, firefox117 fixed)
People
(Reporter: afranchuk, Assigned: afranchuk)
Details
Attachments
(3 files)
The crash pings added in bug 1810951 are minimal and only report native code crashes. We should also report java exceptions (which manifest as an application crash). These should be reported as a "main" process crash, or a new designator should be used, such as "application". The Java exception details could be included, though even omitting them for now will still improve the situation as we are currently under-reporting crashes.
This would affect lib-crash in the android-components project.
Assignee | ||
Comment 1•2 years ago
•
|
||
I want to distinguish java exception crashes from native code crashes in the ping metrics. I've come up with 3 options, but I'm wondering which is preferred (if there is precedence for this sort of thing):
- Use a different
processType
value. The java exception does technically come from a different process, but ultimately the user experiences something equivalent to amain
crash. Since we generally usemain
crashes to determine the severity of crashes (users are always aware ofmain
crashes after all), I'm hesitant to add another case that is the same severity. - Use a new reason code for the crash ping. This would work, though I think the ping reason codes are intended to be used for the reason that the ping is being sent, so this might be considered a bit mismatched (but ultimately this could be okay).
- Make the distinction implicit based on the presence of other metrics in the ping. For example, we're likely going to add an
exception
metric which has the java exception details, which would be mutually exclusive with thestack_traces
metric (which is designed specifically for native code stack traces so shouldn't be used to send e.g. exception stack traces).
chutten: is my assessment of the reason code correct? Opinions?
royang: any opinions or preferences on which would be best for ingestion and processing of ping data? I think they all should be approximately the same effort when it comes to filtering, with (3) being a bit less precise since it doesn't inherently enforce the mutual exclusivity. Though if we put some time into supporting cross-language stack traces the shape of these metrics will have to change anyway.
Note: There are also exceptions which aren't fatal. I'm not sure how we should handle these (are they even considered crashes?). But at the moment lib-crash doesn't call this method for the registered CrashTelemetryService
s so it's not an immediate concern.
Comment 2•2 years ago
|
||
We provide the tools, you decide how to use them : ) But I do have opinions.
I would definitely discourage implicit distinctions. Make it explicit regardless of how so that even if in the future we consolidate on stack_traces
for language-independent stacks, we can still tell Java stacks from not-Java stacks.
As a data consumer myself, I'd not expect to find this information in processType
or reason
. I'd be looking for something like a string crash_language: java|native
or a bool is_exception: true|false
or some other standard metric. Are those acceptable options?
Updated•2 years ago
|
Assignee | ||
Comment 3•2 years ago
•
|
||
(In reply to Chris H-C :chutten from comment #2)
We provide the tools, you decide how to use them : ) But I do have opinions.
I would definitely discourage implicit distinctions. Make it explicit regardless of how so that even if in the future we consolidate on
stack_traces
for language-independent stacks, we can still tell Java stacks from not-Java stacks.As a data consumer myself, I'd not expect to find this information in
processType
orreason
. I'd be looking for something like a stringcrash_language: java|native
or a boolis_exception: true|false
or some other standard metric. Are those acceptable options?
Okay, that's what I figured you'd say. I'm thinking something like cause: java_exception|native_error
. My reasoning is that cause
more accurately describes the domain of the value than crash_language
. I.e., cause
implies that the java_exception
/os_fault
relates to the initial frame/crash reason, whereas crash_language
might be taken as relating to all of the information in the crash, but I think in the future we will try to have cross-language exception info. I don't like is_exception
because it doesn't distinguish the language (though it could).
Assignee | ||
Comment 4•2 years ago
|
||
Comment 5•2 years ago
|
||
Comment on attachment 9342151 [details]
data-review-request.md
DATA COLLECTION REVIEW RESPONSE:
- Is there or will there be documentation that describes the schema for the ultimate data set in a public, complete, and accurate way?
Yes, through the metrics.yaml file and the Glean Dictionary.
- Is there a control mechanism that allows the user to turn the data collection on and off?
Yes. This collection is Telemetry so can be controlled through Firefox's Preferences.
- If the request is for permanent data collection, is there someone who will monitor the data over time?
afranchuk@mozilla.com will be responsible for monitoring the data over time
- Using the category system of data types on the Mozilla wiki, what collection type of data do the requested measurements fall under?
Category 1, Technical data
- Is the data collection request for default-on or default-off?
Default on for all channels.
- Does the instrumentation include the addition of any new identifiers?
No
- Is the data collection covered by the existing Firefox privacy notice?
Yes
- Does the data collection use a third-party collection tool?
No
Result: data-review+
Comment 6•2 years ago
|
||
Assignee | ||
Comment 7•2 years ago
|
||
Comment 8•2 years ago
|
||
Authored by https://github.com/afranchuk
https://github.com/mozilla-mobile/firefox-android/commit/f53afc0a2e421b2eb914ce884f47f1eabcae3183
[main] Bug 1839697 - Report Java exceptions in the Glean crash pings
Comment 9•2 years ago
|
||
I'll remove the qe-verify+ flag, since there is no need for QA manual verification.
Thank you!
Updated•2 years ago
|
Updated•2 years ago
|
Description
•