Closed Bug 1655196 Opened 4 years ago Closed 4 years ago

High ratio of ERROR_NO_MINIDUMP_HEADER

Categories

(GeckoView :: General, defect, P1)

Unspecified
All
defect

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 1644486

People

(Reporter: agi, Unassigned)

References

(Blocks 1 open bug)

Details

While talking to :gsvelto today, he pointed out that on Android we have a high ratio of ERROR_NO_MINIDUMP_HEADER (currently our top crash report) and that there might be something fundamentally broken in our crash reporting.

Flags: needinfo?(gsvelto)

Is there anything recent reporting those issues? I thought this was largely https://github.com/mozilla-mobile/android-components/issues/7129

Limiting to 79b9 gets me 66 crash reports.

Hm, Nightly has nearly 2000 reports since the start of the month, which is well after the A-C fix I mentioned.

Some sort of Nightly only assertion from A-C's libcrash then?

If I restrict the search to recent versions the problem isn't as bad as I thought, but the volume is still higher than it should be. Nightly builds from July have ~5% of the crash reports being empty (see this query). Beta builds have ~10% (see this other one). That's still a lot and it's worth investigating. On desktop platforms empty crash reports don't make it into the top 50 crashes, so this shouldn't be different on Android.

Priority: -- → P1

I've been looking at crashes in the hope of finding a pattern but couldn't. There are two reasons why this might be happening: either we're running out of memory when trying to write out the minidump, or we're running out of disk space. Those are the two major issues I can think of that would lead to a failure in writing it out. An OOM situation should be catastrophic though: the Android kernel should kill the main process right away rather than let some allocations fail, so I tend to think it's the latter. To investigate this I'll try to cook up a patch that detects the amount of free storage on the partition where the minidump is being written and stores it as an annotation. If the data from that annotation is not conclusive then I'll have to modify our crash-writing logic to report more granular errors and store those in an annotation too.
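
For illustration only, here is a minimal sketch of the free-storage check described above. It is written in Python for brevity and is not the actual patch; the real change would live in the crash reporter's native/Java code. The annotation name `MinidumpDirFreeBytes` and the helper `free_space_annotation` are hypothetical, and `-1` is an assumed sentinel for "could not query the partition".

```python
import shutil
import tempfile


def free_space_annotation(minidump_dir, annotations):
    """Store the free bytes on the partition holding minidump_dir.

    The key "MinidumpDirFreeBytes" is a hypothetical annotation name,
    not an existing crash annotation.
    """
    try:
        # disk_usage reports total/used/free for the filesystem
        # that contains the given path.
        free_bytes = shutil.disk_usage(minidump_dir).free
    except OSError:
        # The directory is missing or unreadable; record a sentinel
        # so the report still shows that the check ran.
        free_bytes = -1
    annotations["MinidumpDirFreeBytes"] = str(free_bytes)
    return annotations


if __name__ == "__main__":
    # Use the system temp directory as a stand-in for the minidump directory.
    print(free_space_annotation(tempfile.gettempdir(), {}))
```

If empty reports cluster around very small values of such an annotation, that would point at disk space; if they don't, the more granular error reporting mentioned above would be the next step.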

Flags: needinfo?(gsvelto)
Assignee: nobody → gsvelto
Status: NEW → ASSIGNED

I filed bug 1666733 to track the work required on Breakpad. I will use this bug for the analysis when that's done.

Assignee: gsvelto → nobody
Status: ASSIGNED → NEW
Depends on: 1666733

Quick update here: we're replacing the affected code with Rust. The first bits landed in bug 1620993 but they're not enabled on GeckoView/Fenix because we've enabled the new code only on x86/x86-64. ARM/AArch64 support is underway; once it lands I'm curious to see the crash volume here, and if it's still high we'll add the relevant checks in the code to figure out what's going on. Working with Rust will make this much easier.

Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → DUPLICATE