Open Bug 1644486 Opened 4 years ago Updated 2 months ago

Crash in [@ EMPTY: no crashing thread identified; ERROR_NO_MINIDUMP_HEADER] for Android

Categories

(Fenix :: Crash Reporting, defect, P3)

Product: Fenix
Component: Crash Reporting
Platform: All / Android
Type: defect
Priority: P3
Severity: S2

Tracking

(firefox78 wontfix, firefox79 wontfix, firefox80 wontfix, firefox96 wontfix, firefox97 wontfix, firefox98 wontfix, firefox100 wontfix, firefox101 wontfix, firefox102 wontfix, firefox106 wontfix, firefox107 wontfix, firefox108 wontfix)

Status:

NEW

Tracking Flags:

Flag        Tracking  Status
firefox78   ---       wontfix
firefox79   ---       wontfix
firefox80   ---       wontfix
firefox96   ---       wontfix
firefox97   ---       wontfix
firefox98   ---       wontfix
firefox100  ---       wontfix
firefox101  ---       wontfix
firefox102  ---       wontfix
firefox106  ---       wontfix
firefox107  ---       wontfix
firefox108  ---       wontfix

People

(Reporter: fluffyemily, Unassigned)

References

(Depends on 1 open bug)

Details

(Keywords: crash, Whiteboard: [geckoview:2022h2?])

Emily Toop (:fluffyemily)

Reporter

Description

•

4 years ago

This bug is for crash report bp-abe9b05f-1857-42c0-9836-04e6c0200609.

Stefan Arentz | :st3fan | ⏰ EST | he/him

Comment 1

•

4 years ago

I may have identified this bug as being related to mozilla.components.browser.engine.gecko.fetch.GeckoViewFetchClient or geckoview.GeckoWebExecutor.fetch.

For some reason this one does not show up properly in either Sentry or Socorro but the Play Store has good crash reports, all in the area of Fetch.

Here is one trace:

java.lang.IllegalArgumentException:
  at org.mozilla.geckoview.GeckoWebExecutor.fetch (GeckoWebExecutor.java:12)
  at mozilla.components.browser.engine.gecko.fetch.GeckoViewFetchClient.fetch (GeckoViewFetchClient.kt:70)
  at mozilla.components.feature.downloads.AbstractFetchDownloadService.performDownload$feature_downloads_release (AbstractFetchDownloadService.kt:12)
  at mozilla.components.feature.downloads.AbstractFetchDownloadService.startDownloadJob$feature_downloads_release (AbstractFetchDownloadService.kt:3)
  at mozilla.components.feature.downloads.AbstractFetchDownloadService$onStartCommand$1.invokeSuspend (AbstractFetchDownloadService.kt:5)
  at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith (ContinuationImpl.kt:2)
  at kotlinx.coroutines.DispatchedTask.run (DispatchedTask.kt:19)
  at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely (CoroutineScheduler.kt:1)
  at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run (CoroutineScheduler.kt:14)

More at https://play.google.com/apps/publish/?account=7083182635971239206#AndroidMetricsErrorsPlace:p=org.mozilla.firefox_beta&appid=4972447553788559254&appVersion=2015744455,2015744453,2015744451,2015744449&clusterName=apps/org.mozilla.firefox_beta/clusters/f47f35ec&detailsAppVersion=2015744455,2015744453,2015744451,2015744449&detailsSpan=7

Stefan Arentz | :st3fan | ⏰ EST | he/him

Comment 2

•

4 years ago

If these Fetch crashes are indeed bundled under the NativeCodeCrash in Sentry then we may want to address this ASAP with an update because the volume has increased 5x in the past day.

Emily Toop (:fluffyemily)

Reporter

Updated

•

4 years ago

Assignee: nobody → agi

Severity: -- → S2

Priority: -- → P1

[ex-Mozilla] Agi Sferro | :agi

Comment 3

•

4 years ago

I don't think Fetch is solely responsible for this. We have 2200 reports for this signature in nightly in the last week (https://crash-stats.mozilla.org/topcrashers/?product=Fenix&version=0.0a1), but the Play Console doesn't show any crashes for Fetch in the same timeframe.

Emily Toop (:fluffyemily)

Reporter

Comment 4

•

4 years ago

We have seen a huge reduction in the incidents of this bug since June 7.

[ex-Mozilla] Agi Sferro | :agi

Updated

•

4 years ago

Crash Signature: [@ EMPTY: no crashing thread identified; ERROR_NO_MINIDUMP_HEADER] → [@ EMPTY: no crashing thread identified; ERROR_NO_MINIDUMP_HEADER] [@ EMPTY: no crashing thread identified]

status-firefox78: --- → affected

status-firefox79: --- → affected

status-firefox80: --- → affected

[ex-Mozilla] Agi Sferro | :agi

Updated

•

4 years ago

Depends on: 1655196

Kevin Brosnan [Ex-Mozilla]

Updated

•

4 years ago

See Also: → https://github.com/mozilla-mobile/android-components/issues/7129

Wayne Mery (:wsmwk)

Updated

•

4 years ago

OS: Unspecified → Android

Summary: Crash in [@ EMPTY: no crashing thread identified; ERROR_NO_MINIDUMP_HEADER] → Crash in [@ EMPTY: no crashing thread identified; ERROR_NO_MINIDUMP_HEADER] for fenix

Amedyne Moya [:amoya]

Updated

•

4 years ago

Whiteboard: [geckoview:m83]

Amedyne Moya [:amoya]

Updated

•

4 years ago

Whiteboard: [geckoview:m83]

[ex-Mozilla] Agi Sferro | :agi

Updated

•

4 years ago

Severity: S2 → S3

Priority: P1 → P2

[ex-Mozilla] Agi Sferro | :agi

Comment 5

•

4 years ago

Mostly waiting for Bug 1666733 at this point.

Assignee: agi → nobody

(No longer employed by Mozilla) Aaron Klotz

Comment 8

•

4 years ago

Gabriele, do we have any new information about this?

Flags: needinfo?(gsvelto)

Gabriele Svelto [:gsvelto]

Comment 9

•

4 years ago

Sadly not yet. We've implemented error reporting for minidump generation in bug 1666733, but it's enabled only with the oxidized minidump generator. That generator is only implemented for x86 and x86-64 at the moment, and the ARM/AArch64 implementation in bug 1689358 is stuck because we first need to upgrade the libc crate in Gecko. So sadly I must report that this has stalled for now, but I hope we'll be able to move it forward soon(ish).

Flags: needinfo?(gsvelto)

Gabriele Svelto [:gsvelto]

Comment 10

•

3 years ago

This is a signature change caused by switching Socorro's stack walker to the new oxidized version. On the topic of oxidation, we haven't enabled ARM minidump generation yet (and thus proper error recording), but we should be able to do it before the end of January.

Crash Signature: [@ EMPTY: no crashing thread identified; ERROR_NO_MINIDUMP_HEADER] [@ EMPTY: no crashing thread identified] → [@ EMPTY: no crashing thread identified; EmptyMinidump] [@ EMPTY: no crashing thread identified; ERROR_NO_MINIDUMP_HEADER] [@ EMPTY: no crashing thread identified]

Chris Peterson [:cpeterson]

Updated

•

3 years ago

status-firefox96: --- → affected

status-firefox97: --- → affected

status-firefox98: --- → affected

Depends on: 1689358

Dianna Smith [:diannaS]

Updated

•

3 years ago

status-firefox78: affected → wontfix

status-firefox79: affected → wontfix

status-firefox80: affected → wontfix

status-firefox96: affected → wontfix

[ex-Mozilla] Agi Sferro | :agi

Comment 11

•

3 years ago

This crash is blowing up in nightly right now. Raising priority.

Looks like the increase started with 20220209095640.

Severity: S3 → S2

Priority: P2 → P1

[ex-Mozilla] Agi Sferro | :agi

Comment 12

•

3 years ago

Enabling the GPU process landed within that nightly, it might be related. https://hg.mozilla.org/mozilla-central/rev/f93a4ff5c045531102de678c93951deb095137d6

Emily Toop (:fluffyemily)

Reporter

Updated

•

3 years ago

Whiteboard: [geckoview:m99]

[ex-Mozilla] Agi Sferro | :agi

Comment 13

•

3 years ago

I can reliably reproduce this crash on a low-end device (Samsung A5) doing this:

1. pm clear org.mozilla.fenix
2. Open Fenix, load cnn.com, put Fenix in the background
3. Open Gmail, load a few emails
4. Go back to Fenix -> "Sorry Fenix has crashed" tab

If I disable the GPU process, I don't get a crash. My guess is that we're not handling GPU process kills correctly.

[ex-Mozilla] Agi Sferro | :agi

Comment 14

•

3 years ago

We rolled back Bug 1331109 which should hopefully make this crash go away.

[ex-Mozilla] Agi Sferro | :agi

Comment 15

•

3 years ago

Opened Bug 1755375 to handle GPU process kills correctly. The incorrect handling is what caused the spike in crashes in this bug.

[ex-Mozilla] Agi Sferro | :agi

Updated

•

3 years ago

Priority: P1 → P2

Whiteboard: [geckoview:m99]

Gabriele Svelto [:gsvelto]

Updated

•

3 years ago

See Also: → 1757854

Chris Peterson [:cpeterson]

Updated

•

3 years ago

Crash Signature: [@ EMPTY: no crashing thread identified; EmptyMinidump] [@ EMPTY: no crashing thread identified; ERROR_NO_MINIDUMP_HEADER] [@ EMPTY: no crashing thread identified] → [@ EMPTY: no crashing thread identified] [@ EMPTY: no crashing thread identified; EmptyMinidump] [@ EMPTY: no crashing thread identified; ERROR_NO_MINIDUMP_HEADER] [@ EMPTY: no crashing thread identified; MissingThreadList]

Component: General → Stability

Product: GeckoView → Fenix

Summary: Crash in [@ EMPTY: no crashing thread identified; ERROR_NO_MINIDUMP_HEADER] for fenix → Crash in [@ EMPTY: no crashing thread identified; ERROR_NO_MINIDUMP_HEADER] for Android

Gabriele Svelto [:gsvelto]

Comment 16

•

3 years ago

FYI we'll soon have detailed error information about the most common error that causes failures in minidump generation on Android. Given enough nightly users, a couple of weeks from now we should get to the bottom of this.

Chris Peterson [:cpeterson]

Comment 17

•

3 years ago

(In reply to Gabriele Svelto [:gsvelto] from comment #16)

FYI we'll soon have detailed error information about the most common error that causes failures in minidump generation on Android. Given enough nightly users, a couple of weeks from now we should get to the bottom of this.

Hi Gabriele, do you see any changes in Android minidump errors? AFAICT, Socorro reports roughly the same number of "EmptyMinidump" crash reports from Android Nightly 102.0a1 (933) as 101.0a1 (954):

https://crash-stats.mozilla.org/search/?release_channel=nightly&signature=EmptyMinidump&product=Focus&product=Fenix&date=%3E%3D2021-11-23T17%3A22%3A00.000Z&date=%3C2022-05-23T17%3A22%3A00.000Z&_facets=product&_facets=version&_facets=signature&_facets=build_id&_sort=-date&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-version

status-firefox100: --- → affected

status-firefox101: --- → affected

status-firefox102: --- → affected

Flags: needinfo?(gsvelto)

Gabriele Svelto [:gsvelto]

Comment 18

•

3 years ago

Yes, we finally have the reasons why we're failing to write out the minidumps, see this.

So out of the 34 recent crashes, one third have a "No threads left to suspend (out of X)" error and another third have "Error during init phase: IO error for file /proc/<pid>/auxv: Permission denied (os error 13)" errors.

Regarding the first error it's probably happening because it's too late to write a minidump. I wonder if we could experiment with sending a SIGSTOP instead ~~or use ptrace() with PTRACE_ATTACH~~. Regarding the latter error we can probably do away with the contents of the auxiliary vector and still write a mostly complete minidump. Additionally we might ptrace the auxiliary vector directly out of the crashed process if reading the corresponding /proc file fails. I'll file bugs for both.

Edit: We already use PTRACE_ATTACH on every thread. Sending it to the PID only stops that one thread, not the whole process, but it's possible to SIGSTOP a process and then attach once the threads have already been stopped.

Flags: needinfo?(gsvelto)

Gabriele Svelto [:gsvelto]

Comment 19

•

3 years ago

Filed minidump-writer issue #27 for handling the auxiliary vector. That should be a relatively easy fix.
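To illustrate the idea from comment 18 of writing a mostly complete minidump even when /proc/<pid>/auxv cannot be read, here is a minimal Rust sketch. It is not the minidump-writer implementation: `parse_auxv` and `read_auxv_or_empty` are hypothetical names, and the sketch assumes the target process has the same word size and endianness as the reader. The point is only that a read failure (such as the "Permission denied" seen above) is returned alongside an empty vector instead of aborting the whole dump.

```rust
use std::fs;
use std::io;

/// One (key, value) entry of the ELF auxiliary vector.
type AuxvEntry = (u64, u64);

/// Parse raw auxv bytes as native-endian pairs of machine words.
/// Parsing stops at the AT_NULL terminator (key 0) or at a truncated pair.
/// Assumption: the bytes come from a process with the reader's word size.
fn parse_auxv(bytes: &[u8]) -> Vec<AuxvEntry> {
    const WORD: usize = std::mem::size_of::<usize>();
    let mut entries = Vec::new();
    for pair in bytes.chunks_exact(2 * WORD) {
        let mut key = [0u8; WORD];
        let mut value = [0u8; WORD];
        key.copy_from_slice(&pair[..WORD]);
        value.copy_from_slice(&pair[WORD..]);
        let key = usize::from_ne_bytes(key);
        if key == 0 {
            break; // AT_NULL marks the end of the vector
        }
        entries.push((key as u64, usize::from_ne_bytes(value) as u64));
    }
    entries
}

/// Hypothetical helper: try to read /proc/<pid>/auxv, but treat any I/O
/// failure as "no auxiliary vector available" rather than a fatal error.
/// The error is handed back so the caller can still record it.
fn read_auxv_or_empty(pid: u32) -> (Vec<AuxvEntry>, Option<io::Error>) {
    match fs::read(format!("/proc/{}/auxv", pid)) {
        Ok(bytes) => (parse_auxv(&bytes), None),
        Err(err) => (Vec::new(), Some(err)),
    }
}
```

A caller would continue with whatever entries it got and attach the returned error, if any, to the crash report, so the failure reason is still visible in crash data.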

Gabriele Svelto [:gsvelto]

Comment 20

•

3 years ago

... and filed minidump-writer issue #28 for the thread suspension problem.
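The ordering proposed in the edit to comment 18 (stop the whole process first, then attach to its threads) can be sketched as follows. This is only an illustration of the call order, not how minidump-writer does it: the `ProcessControl` trait, `SuspendError`, `suspend_all`, and `FakeControl` are all hypothetical names. On Linux the three trait methods would presumably map to `kill(pid, SIGSTOP)`, listing /proc/<pid>/task, and `ptrace(PTRACE_ATTACH, tid)`; the sketch hides them behind a trait so the logic can be exercised without special privileges.

```rust
use std::fmt;

/// Hypothetical abstraction over the OS calls a dump writer would make.
trait ProcessControl {
    /// Stop every thread of the process (e.g. by sending SIGSTOP).
    fn stop_process(&mut self, pid: u32) -> Result<(), String>;
    /// Return the thread IDs currently belonging to the process.
    fn list_threads(&mut self, pid: u32) -> Vec<u32>;
    /// Attach to a single thread (e.g. with PTRACE_ATTACH).
    fn attach_thread(&mut self, tid: u32) -> Result<(), String>;
}

#[derive(Debug, PartialEq)]
enum SuspendError {
    StopFailed(String),
    NoThreadsLeft { total: usize },
}

impl fmt::Display for SuspendError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SuspendError::StopFailed(why) => write!(f, "Could not stop process: {}", why),
            SuspendError::NoThreadsLeft { total } => {
                write!(f, "No threads left to suspend (out of {})", total)
            }
        }
    }
}

/// Stop the process first, then attach to each thread that is still there.
/// Returns the thread IDs that were attached successfully.
fn suspend_all<C: ProcessControl>(ctrl: &mut C, pid: u32) -> Result<Vec<u32>, SuspendError> {
    // Step 1: stop the whole process before looking at individual threads.
    ctrl.stop_process(pid).map_err(SuspendError::StopFailed)?;
    // Step 2: enumerate threads only after the stop request was sent.
    let tids = ctrl.list_threads(pid);
    // Step 3: attach one by one, keeping the threads that could be attached.
    let mut attached = Vec::new();
    for &tid in &tids {
        if ctrl.attach_thread(tid).is_ok() {
            attached.push(tid);
        }
    }
    if attached.is_empty() {
        return Err(SuspendError::NoThreadsLeft { total: tids.len() });
    }
    Ok(attached)
}

/// In-memory fake that only records the order of calls, for demonstration.
struct FakeControl {
    threads: Vec<u32>,
    log: Vec<String>,
}

impl ProcessControl for FakeControl {
    fn stop_process(&mut self, pid: u32) -> Result<(), String> {
        self.log.push(format!("stop {}", pid));
        Ok(())
    }
    fn list_threads(&mut self, _pid: u32) -> Vec<u32> {
        self.threads.clone()
    }
    fn attach_thread(&mut self, tid: u32) -> Result<(), String> {
        self.log.push(format!("attach {}", tid));
        Ok(())
    }
}
```

With the fake, calling `suspend_all` on a process with two threads records one stop followed by two attaches. Whether stopping first actually removes the "No threads left to suspend" failures on real devices is exactly what issue #28 would need to establish.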

Chris Peterson [:cpeterson]

Updated

•

3 years ago

Hardware: Unspecified → All

See Also: → https://github.com/rust-minidump/minidump-writer/issues/27, https://github.com/rust-minidump/minidump-writer/issues/28

Chris Peterson [:cpeterson]

Updated

•

2 years ago

Whiteboard: [geckoview:2022h2?]

Chris Peterson [:cpeterson]

Updated

•

2 years ago

Depends on: 1588530

Dianna Smith [:diannaS]

Updated

•

2 years ago

status-firefox100: affected → wontfix

status-firefox101: affected → wontfix

status-firefox102: affected → wontfix

status-firefox97: affected → wontfix

status-firefox98: affected → wontfix

Gabriele Svelto [:gsvelto]

Updated

•

2 years ago

Depends on: 1793784

Gabriele Svelto [:gsvelto]

Comment 21

•

2 years ago

Signature change

Crash Signature: [@ EMPTY: no crashing thread identified] [@ EMPTY: no crashing thread identified; EmptyMinidump] [@ EMPTY: no crashing thread identified; ERROR_NO_MINIDUMP_HEADER] [@ EMPTY: no crashing thread identified; MissingThreadList] → [@ EMPTY: no crashing thread identified] [@ EMPTY: no crashing thread identified; EmptyMinidump] [@ EMPTY: no crashing thread identified; ERROR_NO_MINIDUMP_HEADER] [@ EMPTY: no crashing thread identified; MissingThreadList] [@ EMPTY: no frame data ava…

Olivia Hall [:olivia]

Updated

•

2 years ago

See Also: → 1710940

Chris Peterson [:cpeterson]

Updated

•

2 years ago

Component: Stability → Crash Reporting

Chris Peterson [:cpeterson]

Updated

•

2 years ago

status-firefox106: --- → affected

status-firefox107: --- → affected

status-firefox108: --- → affected

Chris Peterson [:cpeterson]

Updated

•

2 years ago

See Also: → 1245570

Sylvestre Ledru [:Sylvestre]

Updated

•

2 years ago

Duplicate of this bug: 1803899

Sylvestre Ledru [:Sylvestre]

Comment 23

•

2 years ago

Gab, should bug 1360392 be merged into this one?

Gabriele Svelto [:gsvelto]

Comment 24

•

2 years ago

No, they're different issues as the fixes are different for mobile and desktop. Unfortunately the crash signatures are the same as we can't tell them apart.

Kevin Brosnan [Ex-Mozilla]

Updated

•

2 years ago

No longer duplicate of this bug: 1803899

Sylvestre Ledru [:Sylvestre]

Comment 25

•

2 years ago

OK, sorry, I missed the information in the subject!

Chris Peterson [:cpeterson]

Comment 26

•

2 years ago

Dropping priority from P2 to P3 because this bug is not currently actionable for the Android engineering team. We're waiting for the new ARM minidump writer in bug 1689358.

Priority: P2 → P3

Dianna Smith [:diannaS]

Comment 27

•

1 year ago

Any new priority on this for mobile now that bug 1689358 is resolved?

Flags: needinfo?(royang)

Gabriele Svelto [:gsvelto]

Comment 28

•

1 year ago

Yeah, I'm now actively working on bug 1620998, which should allow me to eliminate the last part of the minidump generation pipeline that causes these failures. It will take a few months as it's quite a bit of work, but it's being actively worked on.

Dianna Smith [:diannaS]

Comment 29

•

1 year ago

thanks gsvelto!

Flags: needinfo?(royang)

Dianna Smith [:diannaS]

Updated

•

1 year ago

status-firefox106: affected → wontfix

status-firefox107: affected → wontfix

status-firefox108: affected → wontfix

Comment hidden (offtopic)

Comment hidden (offtopic)

Comment hidden (obsolete)

Comment hidden (offtopic)

Ryan VanderMeulen [:RyanVM]

Comment 34

•

2 months ago

Denis, please stop adding links to all these individual reports. We have access to all of them already and it isn't helping to resolve this problem. Per comment 28, the root cause of this specific issue is already understood and being worked on.

Denis Müller [:Webworkr]

Comment 35

•

2 months ago

(In reply to Ryan VanderMeulen [:RyanVM] from comment #34)

Denis, please stop adding links to all these individual reports. We have access to all of them already and it isn't helping to resolve this problem. Per comment 28, the root cause of this specific issue is already understood and being worked on.

Thanks for the feedback!
OK, if it doesn't add any value, then of course I won't do it in future. It also saves me work.
