[@ EMPTY: no frame data available ] instead of Java signature for crash reports from Android
Categories
(Socorro :: Signature, defect, P2)
Tracking
(Not tracked)
People
(Reporter: robwu, Assigned: willkg)
References
(Depends on 1 open bug, Blocks 1 open bug)
Details
Crash Data
Attachments
(3 files)
In bug 1847372, I pinpointed a reliable OOM crash trigger and saw the crash happen on all recent versions (Release 116, Beta 117, Nightly 118). Strangely, the most recent matching entry in crash-stats is associated with Firefox 111:
- bp-f9e509b5-a739-4d14-a0a1-c794e0230316 with signature
[@ java.lang.OutOfMemoryError: at java.util.Arrays.copyOf(Arrays.java) ]
When I tried to trigger a crash report, I was unable to do so due to a regression that broke crash reporting in 115 and 116: bug 1838389. This bug is fixed in Nightly 117.
After running the STR from bug 1847372, I got a crash report, but its signature is [@ EMPTY: no frame data available ]:
- bp-3f3611e7-97e4-4db1-8061-0541b0230806 with signature
[@ EMPTY: no frame data available ].
Both reports have the Java Stack Trace field populated, which feeds my suspicion that this is a bug in Socorro rather than the client side.
Comment 1•2 years ago
Is this a duplicate of bug 1245570? ("crash in EMPTY: no crashing thread identified; no frame data available (Firefox for Android only)")
Comment 2•2 years ago (Reporter)
That other bug is much older, and it was not clearly actionable.
I filed this one because of a specific actionable task: figure out why two similar crashes appear to have different crash signatures. Because both reports carry overlapping metadata that the signature could be extracted from, I think that Socorro is the first place to look, but I wouldn't completely rule out this being a (Firefox for Android) client issue either.
Comment 3•2 years ago (Assignee)
I glanced at the crash report in question and it's weird it picked up that signature. I'll grab this to look into further this week.
Comment 4•2 years ago (Assignee)
What's going on is that there's no JavaStackTrace annotation in bp-3f3611e7-97e4-4db1-8061-0541b0230806 and that's what signature generation uses to generate signatures for Java crash reports.
Rob: Any idea why this crash report is missing JavaStackTrace?
Comment 5•2 years ago (Reporter)
(In reply to Will Kahn-Greene [:willkg] ET needinfo? me from comment #4)
> What's going on is that there's no JavaStackTrace annotation in bp-3f3611e7-97e4-4db1-8061-0541b0230806 and that's what signature generation uses to generate signatures for Java crash reports.
>
> Rob: Any idea why this crash report is missing JavaStackTrace?
Probably the same reason as bug 1838389 (example pasted below). In that bug, an NPE was fixed by wrapping the logic in try-catch and returning null on failure: https://github.com/mozilla-mobile/firefox-android/commit/884a6086756fd35320e49a9d80768a646492477c
getExceptionStackTrace is used here to populate JavaStackTrace: https://github.com/mozilla-mobile/firefox-android/blob/884a6086756fd35320e49a9d80768a646492477c/android-components/components/lib/crash/src/main/java/mozilla/components/lib/crash/service/MozillaSocorroService.kt#L286-L299
Here is an example of a stack trace that causes throwable.getStacktraceAsString to raise an error, copy-pasted from about:crashes. The issue occurred when I tried to submit the crash report from bug 1847372 on Beta.
ddf4650b-cf4a-431c-b461-d920a70eda9e
java.lang.NullPointerException: Attempt to invoke virtual method 'java.lang.String java.lang.Object.toString()' on a null object reference
* New Sentry Instance: https://sentry.io/organizations/mozilla/issues/?project=6295551&query=4414637fdbc2433eb352dca9124104e2
* New Sentry Instance: https://sentry.io/organizations/mozilla/issues/?project=6295551&query=34b5fd80ceb041f0adfbfa4aa6a298d9
----
java.lang.NullPointerException: Attempt to invoke virtual method 'java.lang.String java.lang.Object.toString()' on a null object reference
at java.lang.String.valueOf(String.java:3657)
at java.lang.StringBuilder.append(StringBuilder.java:132)
at java.lang.Throwable.printEnclosedStackTrace(Throwable.java:717)
at java.lang.Throwable.printStackTrace(Throwable.java:682)
at java.lang.Throwable.printStackTrace(Throwable.java:743)
at mozilla.components.support.base.ext.ThrowableKt.getStacktraceAsString$default(Throwable.kt:19)
at mozilla.components.lib.crash.service.MozillaSocorroService.sendCrashData(MozillaSocorroService.kt:607)
at mozilla.components.lib.crash.service.MozillaSocorroService.sendReport$lib_crash_release(MozillaSocorroService.kt:285)
at mozilla.components.lib.crash.service.MozillaSocorroService.report(MozillaSocorroService.kt:5)
at mozilla.components.lib.crash.CrashReporter$submitReport$2.invokeSuspend(CrashReporter.kt:70)
at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:9)
at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:112)
at kotlinx.coroutines.internal.LimitedDispatcher$Worker.run(LimitedDispatcher.kt:4)
at kotlinx.coroutines.scheduling.TaskImpl.run(Tasks.kt:3)
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:96)
Suppressed: kotlinx.coroutines.internal.DiagnosticCoroutineContextException: [StandaloneCoroutine{Cancelling}@337a98a, Dispatchers.IO]
Comment 6•2 years ago (Assignee)
That sounds like an issue you should raise in the fenix crash reporter.
Currently, Socorro needs a JavaStackTrace value for signature generation. Changing that is a project and covered in bug #1693863.
Comment 7•2 years ago (Assignee)
I don't think there's anything I can do here. Unassigning myself.
Comment 8•2 years ago (Assignee)
Oops--the bug for the "let's rethink signatures for Java" is bug #1541120.
Comment 9•2 years ago (Reporter)
Bug 1541120 looks like a larger-scope issue than this one. That one is about being smarter than extracting the signature from JavaStackTrace.
https://crash-stats.mozilla.org/signature/?product=Fenix&signature=EMPTY%3A%20no%20frame%20data%20available#reports (on Nightly 118.0a1 alone, there are 160 such reported crashes in the past 7 days).
In this bug, JavaStackTrace is null, but JavaException is not (MozillaSocorroService.kt sets both at the same time, but the value may sometimes be null as seen in bug 1838389).
What would it take to extract the signature from JavaException when JavaStackTrace is null?
FYI:
- JavaStackTrace is generated by: https://github.com/mozilla-mobile/firefox-android/blob/884a6086756fd35320e49a9d80768a646492477c/android-components/components/support/base/src/main/java/mozilla/components/support/base/ext/Throwable.kt#L22-L28
- JavaException is generated by: https://github.com/mozilla-mobile/firefox-android/blob/884a6086756fd35320e49a9d80768a646492477c/android-components/components/support/base/src/main/java/mozilla/components/support/base/ext/Throwable.kt#L30-L88
Comment 10•2 years ago (Assignee)
Currently, Socorro requires JavaStackTrace to generate a signature. If the crash report doesn't contain a JavaStackTrace, that's a bug with the relevant crash reporter that should get figured out.
Your idea of changing signature generation to factor in JavaException seems reasonable, but it's a much bigger project than a "well, why don't we just ..." because of the way signature generation is implemented. Looks like this affects < 350 crash reports out of 1 million for Fenix in the last month. Unless there's some serious urgency here, I'm not going to get to fixing this any time soon.
The data on Crash Stats is available via APIs. You can unblock your work by writing scripts to manipulate the data to get what you want to see out of it. I have a set of utility commands to make that easier:
https://github.com/willkg/crashstats-tools
Hope that helps!
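As a rough illustration of the API route mentioned above, the sketch below only builds a SuperSearch query URL for a given signature; it does not send the request. The helper name `build_supersearch_url` is hypothetical, and the parameter choices (`_facets`, `_results_number`, the `=` exact-match prefix) are assumptions based on the public SuperSearch API, not something spelled out in this bug.

```python
from urllib.parse import urlencode

# Public Crash Stats SuperSearch endpoint.
API_URL = "https://crash-stats.mozilla.org/api/SuperSearch/"


def build_supersearch_url(signature, start_date, facets=("product",)):
    """Return a SuperSearch URL that facets reports with an exact signature.

    Hypothetical helper for illustration only.
    """
    params = [
        # "=" prefix asks for an exact match on the signature.
        ("signature", f"={signature}"),
        # Only crash reports on or after start_date (YYYY-MM-DD).
        ("date", f">={start_date}"),
        # We only want the facet counts, not individual hits.
        ("_results_number", "0"),
    ]
    params.extend(("_facets", facet) for facet in facets)
    return f"{API_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_supersearch_url("EMPTY: no frame data available", "2023-08-01"))
```

Fetching the resulting URL and reading the `facets` key of the JSON response is left out on purpose; the `supersearchfacet` command from crashstats-tools covers that use case.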
Comment 11•2 years ago
FYI the [@ EMPTY: no frame data available] is currently Fenix' top crasher. It might be worth putting the signature here since those are all Java exceptions missing the Java stack trace, but since it looks like a native crash it's confusing people.
Comment 12•2 years ago
(In reply to Will Kahn-Greene [:willkg] ET needinfo? me from comment #10)
> Currently, Socorro requires JavaStackTrace to generate a signature. If the crash report doesn't contain a JavaStackTrace, that's a bug with the relevant crash reporter that should get figured out.
Will, is there any Socorro work to be done in this bug? Or can I move this bug to the Fenix::Crash Reporting component and use it to investigate what Fenix client changes might be needed to fix crash reports without a JavaStackTrace?
Comment 13•2 years ago (Assignee)
There should definitely be a bug/issue for Fenix and maybe android-components about why there is a JavaException, but no JavaStackTrace.
Since comment #10, it looks like the number of crash reports this affects has increased dramatically and this is now a top crasher signature. I don't think we should move this bug to Fenix::Crash Reporting. I should probably grab this and figure out what I can do about it in socorro.
Comment 14•2 years ago
btw, I suspect the new crashes are bug 1846306. I looked in Sentry for top crash signatures that aren't in Socorro and found that bug. It's the top Sentry crash signature over the last 30 days, by both number of crash events and number of affected users.
The crash volume spike started around August 16, which happened to be the release date for the Fenix 116.0.3 dot release.
Comment 15•2 years ago
116.0.3 included a crash reporter fix to help diagnose bug 1846306.
Comment 16•2 years ago (Assignee)
Comment 17•2 years ago (Assignee)
That fix minimally disrupts things. It should only affect crash reports where we have a JavaException but no JavaStackTrace. It generates a signature just like it would have if there was a JavaStackTrace with the mild caveat that it does the right thing by not including line numbers. The current JavaStackTrace-using code includes the line numbers for non .java files. That's in bug #1851202.
I'll try to get it to production next week. Once I do, I can reprocess all the existing crash reports with the problem and they'll pick up new signatures.
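The fallback described in this comment can be sketched roughly as follows. This is not the code from the actual patch: the function names are hypothetical, the `JavaStackTrace` path is heavily simplified, and the `JavaException` JSON layout (`exception.values[0]` with `module`, `type` and `stacktrace.frames`, crashing frame first) is an assumption made for illustration.

```python
import json
import re

EMPTY_SIGNATURE = "EMPTY: no frame data available"


def signature_from_stack_trace(text):
    """Simplified JavaStackTrace path: exception class plus first 'at' line."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if len(lines) < 2 or not lines[1].startswith("at "):
        return None
    exc_class = lines[0].split(":", 1)[0]
    # Line numbers are stripped for .java files only, mirroring the caveat
    # about the existing JavaStackTrace-based code.
    method = re.sub(r"\.java:\d+\)$", ".java)", lines[1])
    return f"{exc_class}: {method}"


def signature_from_java_exception(raw):
    """Fallback path: same signature shape, never including line numbers.

    Assumes a JSON document shaped like
    {"exception": {"values": [{"module", "type", "stacktrace": {"frames": [...]}}]}}.
    """
    try:
        value = json.loads(raw)["exception"]["values"][0]
        frame = value["stacktrace"]["frames"][0]  # assumed crashing frame
        exc_class = f"{value['module']}.{value['type']}"
        method = f"{frame['module']}.{frame['function']}"
        return f"{exc_class}: at {method}({frame['file']})"
    except (TypeError, ValueError, KeyError, IndexError):
        # Missing, malformed or unexpectedly shaped annotation.
        return None


def generate_java_signature(annotations):
    """Prefer JavaStackTrace; use JavaException only when that yields nothing."""
    stack_trace = annotations.get("JavaStackTrace")
    signature = signature_from_stack_trace(stack_trace) if stack_trace else None
    if signature is None:
        signature = signature_from_java_exception(annotations.get("JavaException"))
    return signature or EMPTY_SIGNATURE
```

Under these assumptions, a report with only a `JavaException` annotation gets a signature of the same shape as one generated from `JavaStackTrace`, and a report with neither annotation still falls through to the EMPTY signature.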
Comment 18•2 years ago (Assignee)
willkg merged PR #6464: "bug 1847429: implement signature generation for JavaException" in b60b65b.
This will automatically deploy to the stage environment. I'll test it there and (hopefully) deploy it next week to production.
Comment 19•2 years ago (Assignee)
Also, since this involves signature generation changes, I'll write an intent-to-ship email on stability and crash-reporting-wg mailing lists before pushing it to production.
Comment 20•2 years ago (Assignee)
I checked stage this morning and the change looks good:
$ supersearchfacet --host=https://crash-stats.allizom.org \
--_facets=product \
--signature='=EMPTY: no frame data available' \
--relative-range=2w --period=daily --format=markdown
| date | -- | Fenix | Focus | total | notes |
|---|---|---|---|---|---|
| 2023-08-22 00:00:00 | 0 | 242 | 5 | 247 | |
| 2023-08-23 00:00:00 | 0 | 271 | 9 | 280 | |
| 2023-08-24 00:00:00 | 0 | 274 | 4 | 278 | |
| 2023-08-25 00:00:00 | 0 | 295 | 4 | 299 | |
| 2023-08-26 00:00:00 | 0 | 280 | 1 | 281 | |
| 2023-08-27 00:00:00 | 0 | 310 | 4 | 314 | |
| 2023-08-28 00:00:00 | 0 | 293 | 2 | 295 | |
| 2023-08-29 00:00:00 | 0 | 355 | 9 | 364 | |
| 2023-08-30 00:00:00 | 0 | 351 | 7 | 358 | |
| 2023-08-31 00:00:00 | 0 | 282 | 7 | 289 | |
| 2023-09-01 00:00:00 | 0 | 331 | 5 | 336 | <-- landed fix late afternoon |
| 2023-09-02 00:00:00 | 0 | 11 | 0 | 11 | |
| 2023-09-03 00:00:00 | 0 | 8 | 0 | 8 | |
| 2023-09-04 00:00:00 | 0 | 4 | 0 | 4 | |
| 2023-09-05 00:00:00 | 0 | 4 | 0 | 4 | |
Currently, there are around 50k crash reports since August 1st with this signature that will change signatures when I reprocess them.
I emailed the stability and crash-reporting-wg mailing lists with the intended deploy and reprocessing.
Comment 21•2 years ago
Thanks for this Will!
Comment 22•2 years ago (Assignee)
I deployed this to prod just now in bug #1851648. I'm reprocessing crash reports from 2023-08-01 through now.
Comment 23•2 years ago (Assignee)
Comment 24•2 years ago (Assignee)
I reprocessed the crash reports in that list. There are still 7,101 Fenix crash reports since 2023-08-01 which have "EMPTY: no frame data available". I spot checked those and they don't have a JavaStackTrace annotation, a JavaException annotation, or a minidump, so ... I think that's the best we're going to do for now.
When we redo signature generation for Java crash reports, we can include information from other annotations like CrashType or something like that which adds some information and differentiates between crash reports that have no frame data.
Marking this as FIXED.
Comment 25•2 years ago (Assignee)
Comment 26•2 years ago (Assignee)
I did a first round of reprocessing for crash reports >= 2023-08-01. We went from 51,320 to 7,232.
Before:
$ supersearchfacet --signature='=EMPTY: no frame data available' --date='>=2023-08-01' \
--_facets=product --format=markdown
| product | count |
|---|---|
| Fenix | 50344 |
| Focus | 947 |
| ReferenceBrowser | 29 |
| total | 51320 |
After:
$ supersearchfacet --signature='=EMPTY: no frame data available' --date='>=2023-08-01' \
--_facets=product --format=markdown
| product | count |
|---|---|
| Fenix | 7104 |
| Focus | 127 |
| ReferenceBrowser | 1 |
| total | 7232 |
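As a quick cross-check of the two tables above, the per-product counts can be summed and compared. The numbers below are copied from the tables; the variable names are only for illustration.

```python
# Counts copied from the "Before" and "After" supersearchfacet tables.
before = {"Fenix": 50344, "Focus": 947, "ReferenceBrowser": 29}
after = {"Fenix": 7104, "Focus": 127, "ReferenceBrowser": 1}

total_before = sum(before.values())
total_after = sum(after.values())

# Reports that picked up a new signature during reprocessing.
resolved = total_before - total_after
resolved_pct = round(100 * resolved / total_before, 1)

if __name__ == "__main__":
    for product in before:
        changed = before[product] - after[product]
        print(f"{product}: {changed} of {before[product]} reports changed signature")
    print(f"total: {resolved} of {total_before} ({resolved_pct}%)")
```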
At Chris' behest, I did a second round of reprocessing for crash reports >= 2023-07-01 and < 2023-08-01. We went from 3,527 to 3,527--it looks like those weren't affected.
$ supersearchfacet --signature='=EMPTY: no frame data available' --date='>=2023-07-01' --date='<2023-08-01' \
--_facets=product --format=markdown
| product | count |
|---|---|
| Fenix | 3493 |
| Focus | 34 |
| total | 3527 |
It looks like none of them have a JavaStackTrace or JavaException.
$ supersearch --signature='=EMPTY: no frame data available' --date='>=2023-07-01' --date='<2023-08-01' --num=all \
| wc -l
3527
$ supersearch --signature='=EMPTY: no frame data available' --date='>=2023-07-01' --date='<2023-08-01' \
--crash_report_keys=JavaStackTrace --crash_report_keys=JavaException --num=all \
| wc -l
0
Comment 27•2 years ago
One interesting bit about JavaException is that it's a bit of a misnomer. It's actually a stack trace, just in a different format compared to JavaStackTrace. Anyway, I feel like the remaining fixes need to happen in Fenix' crash handler. We can close this bug as fixed and open a new one in Fenix crash handler to make sure it tries harder to populate at least one of the two annotations.
Comment 28•2 years ago
I filed bug 1851898 to fix the Fenix crash reporter.
Should crash reports include both JavaException and JavaStackTrace annotations? Or prefer JavaException? Bug 1792902 asks if we should retire JavaStackTrace now that we have JavaException.
Comment 29•2 years ago (Assignee)
Currently, Socorro depends on JavaStackTrace. We'd need to figure out what's involved in changing that and then change it. I wrote up bug #1851903 for that work. Until that work is completed, we need at least JavaStackTrace for the foreseeable future.