Closed Bug 1681846 Opened 3 years ago Closed 11 months ago

Missing MOZ_CRASH reason in crash reports from Android, take two

Categories

(Toolkit :: Crash Reporting, defect, P2)

ARM
Android
defect

Tracking

()

RESOLVED FIXED
115 Branch
Tracking Status
firefox114 --- fixed
firefox115 --- fixed

People

(Reporter: mstange, Assigned: gsvelto)

References

Details

Attachments

(1 file)

This is a follow-up bug to bug 1666383, which did not fix the bug for me.

Steps to reproduce:

  1. In Firefox for Android, navigate to about:crashcontent.
  2. Alternatively, in a build before bug 1681842 appeared, navigate to about:crashparent.

Expected results:
The Socorro crash report for that crash should include a Moz crash reason.

Actual results:
No moz crash reason can be found.

Example reports:

I can't reproduce in the emulator so this must have something to do with the environment we have on the phones. I'm wondering if SELinux might be interfering with the code we run in the exception handler? That would also explain bug 1681842 which I also cannot reproduce in the emulator. One thing I noticed while inspecting some crashes is that some other fields are also missing: the StackTraces field, the Telemetry* as well as the ContentSandbox* ones.

Assignee: nobody → gsvelto
Severity: -- → S2
Status: NEW → ASSIGNED
Priority: -- → P1

This is only happening on ARM, I verified only the 32-bit build but I'll try the 64- one too. On x86-64 - which is what we use for the emulator - the issue doesn't repro.

Hardware: All → ARM

I smell a compiler issue: if I put a print statement in AnnotateMozCrashReason() everything works as advertised, if I remove it then the gMozCrashReason variable is never assigned the string passed to MOZ_CRASH(). The fact that this only happens on ARM and not on x86-64 makes me doubly suspicious about the compiler. Lowering the priority given that I don't have time to investigate into clang ATM.

Assignee: gsvelto → nobody
Status: ASSIGNED → NEW
Priority: P1 → P2

See the comment above

Flags: needinfo?(mh+mozilla)

We updated compilers a few times since, assuming this is no longer an issue (and anyway Mike is still needinfoed).

Severity: S2 → S3

I could never really figure this one out. We have a handful of ARM crashes where the raw crash reason was populated correctly, but two order of magnitude more ARM64 crashes. I suspect this is still an issue and I'm starting to think it's not related to the compiler (otherwise it would always fail, wouldn't it?).

See Also: → 1664138

I just tested this again and 32-bit ARM builds are still affected. Somehow the compiler removes the store to gMozCrashReason. Why does this only happen on 32-bit ARM is beyond my understanding, maybe some arch-specific peephole optimization thinks this is a dead store? Maybe it has to do with the memory model (or the compiler's assumptions about this)? Either way I'm trying to see if turning this into a volatile store fixes the issue (while I loathe using volatile sometimes there's really no way around it).

Assignee: nobody → gsvelto
Status: NEW → ASSIGNED

Making the variable volatile does the trick. Let me know if it's too ugly or if there's a better way.

Pushed by gsvelto@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/4593ab51f340
Ensure MOZ_CRASH() stores the reason in crash reports on 32-bit ARM/Android builds r=glandium
Status: ASSIGNED → RESOLVED
Closed: 11 months ago
Resolution: --- → FIXED
Target Milestone: --- → 115 Branch

FYI I've verified that this fixed the problem, recent versions store the proper value after a crash.

Gabriele, do you think this MOZ_CRASH fix is worth uplifting to a 114 dot release (that is already planned for next week or the week after)? It would make crash reports more useful and seems low risk, since the browser is already crashing.

Flags: needinfo?(gsvelto)

Yes, it's such a small change

Flags: needinfo?(gsvelto)

Comment on attachment 9336075 [details]
Bug 1681846 - Ensure MOZ_CRASH() stores the reason in crash reports on 32-bit ARM/Android builds r=glandium

Beta/Release Uplift Approval Request

  • User impact if declined: Crash reports from 32-bit Android devices will be harder to debug because they won't have the MOZ_CRASH reason string.

The patch author, gsvelto, verified the fix in Nightly: https://bugzilla.mozilla.org/show_bug.cgi?id=1681846#c12. The patch has been baking on Nightly 115 for seven days.

  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): This change is not risky. It's a one-line change in Firefox's crash handler. Firefox is already crashing when this new line of code is run, so nothing worse can happen.
  • String changes made/needed:
  • Is Android affected?: Yes
Attachment #9336075 - Flags: approval-mozilla-release?

Comment on attachment 9336075 [details]
Bug 1681846 - Ensure MOZ_CRASH() stores the reason in crash reports on 32-bit ARM/Android builds r=glandium

Approved for the release branch, thanks.

Attachment #9336075 - Flags: approval-mozilla-release? → approval-mozilla-release+
Flags: needinfo?(mh+mozilla)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: