Fenix crashes on startup on Android 5 devices
Categories
(Fenix :: General, defect, P1)
Tracking
(firefox113 unaffected, firefox114+ verified, firefox115+ verified)
Version | Tracking | Status |
---|---|---|
firefox113 | --- | unaffected |
firefox114 | + | verified |
firefox115 | + | verified |
People
(Reporter: mlobontiuroman, Assigned: zmckenney)
References
(Regression)
Details
(Keywords: crash, regression)
Attachments
(6 files)
Steps to reproduce
- Install Fenix Beta 114.0b8, or Beta 114.0b9, or the latest Nightly 115.0a1, on a device with Android 5.
- Open Fenix.
Expected behavior
Fenix can be opened.
Actual behavior
Fenix crashes. Cannot open about:crashes to get the details.
Device information
- Firefox version: Nightly 115.0a1 from 5/26, Beta 114.0b8, Beta 114.0b9
- Android devices: Samsung Galaxy Tab A6 (Android 5.1.1), and Xiaomi mi4i (Android 5.0.2)
- NOT reproducible on RC 113.2.0
Reporter | ||
Comment 1•1 years ago
|
||
I've added a crash log, maybe it helps.
Comment 2•1 years ago
|
||
Hello, the issue is only reproducible with:
- 114.0b8 and 114.0b9.
We were not able to reproduce it with:
- 114.0b1, 114.0b3, 114.0b4, 114.0b7;
- 113.0b8, 113.0b9;
- RC 113.2.0.
Tested with:
- Huawei MediaPad M2 (Android 5.1.1)
- Samsung Galaxy Tab A6 (Android 5.1.1)
Reporter | ||
Comment 3•1 years ago
|
||
This crash is now reproducible also on RC 114.0 build 2, with Samsung Galaxy Tab A6 (Android 5.1.1).
An old LG Leon (Android 5.1.1) cannot run the latest Nightly 115.0a1 either. Firefox just fails to start, and there is no crash report.
Comment 6•1 years ago
|
||
The bug is marked as tracked for firefox114 (beta). However, the bug still isn't assigned.
:amoya, could you please find an assignee for this tracked bug? If you disagree with the tracking decision, please talk with the release managers.
For more information, please visit BugBot documentation.
Assignee | ||
Comment 7•1 years ago
|
||
First update while I continue to investigate this issue:
This is occurring on both x86 and AArch64 at API level 22 (API level 21 was tested only on x86, where the crash was also confirmed). The crash signature does not look the same as the interposer issue we have seen before when running local debug builds of GV in Fenix. Unfortunately, the error given is not helpful yet.
I'm doing a manual bisection of GV local builds to find which revision caused the crashing and will update as soon as I have answers.
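The manual bisection mentioned here can be pictured as a binary search over an ordered list of builds. The sketch below is only an illustration, not the procedure actually followed in this bug: `FirstBadBuild` and the `crashes` callback are hypothetical names, and it assumes that the first entry is known good, the last entry is known bad, and every build after the regressing one also crashes.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Returns the index of the first build for which `crashes` returns true.
// Assumptions (hypothetical helper, not Mozilla tooling):
//   - builds has at least two entries, ordered oldest to newest
//   - builds.front() is known good and builds.back() is known bad
//   - once the regression lands, every later build crashes too
std::size_t FirstBadBuild(
    const std::vector<std::string>& builds,
    const std::function<bool(const std::string&)>& crashes) {
  std::size_t good = 0;                 // newest build known to start
  std::size_t bad = builds.size() - 1;  // oldest build known to crash
  while (bad - good > 1) {
    const std::size_t mid = good + (bad - good) / 2;
    if (crashes(builds[mid])) {
      bad = mid;   // regression is at mid or earlier
    } else {
      good = mid;  // regression is after mid
    }
  }
  return bad;
}
```

Fed the Beta results reported in Comment 2 (114.0b1, 114.0b3 and 114.0b7 start; 114.0b8 and 114.0b9 crash), a search like this would stop at 114.0b8. Bisecting individual revisions works the same way, with one local build and launch per probe.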
Assignee | ||
Comment 9•1 years ago
|
||
I checked a lot of builds today, and there are two crashes that can occur which involve the earlier interposer work. Previously, when crashing around the interposer, we would see a call to get or set the env from any library, and it would crash on startup. The stack trace would show that a native library (it could be any) called one of these methods, and we would know to look at the interposer.
In these crashes the only stack trace we could gather was:
2023-06-01 11:06:21.977 17661-17690 libc org.mozilla.fenix.debug A Fatal signal 11 (SIGSEGV), code 128, fault addr 0x0 in tid 17690 (Gecko)
It was difficult to get the tombstones from the CI builds, which were failing as well, but this is what the backtrace with symbols showed:
backtrace:
#00 pc 0000007568e1106c <unknown>
#01 pc 0000000000502250 /data/app/~~lJMmSH_Uxma-2dA9gAtNMg==/org.mozilla.geckoview_example-DF6N6UOkyAofIMBPgoUCxg==/lib/arm64/libxul.so!libxul.so (offset 0x502000) (BuildId: 4200bd12824e0527d37033861c395d5d68ef562f)
It seems the interposer fix, which allows a direct libc lookup when libmozglue has not been linked, now crashes on Android 5.0 and 5.1 (which only throw the generic libc error message above). The crash location is HERE.
Backing out both interposer revisions fixes the issue on Android 5.0 and 5.1 for an updated default branch.
Comment 10•1 years ago
|
||
Looping in Alexandre as well in case he can help.
Comment 11•1 years ago
|
||
So it was a bit unclear how to get an android build reproducing, but I think I have something. I'll see how much I can try and help there.
06-02 12:33:54.760 1183 1183 I DEBUG : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
06-02 12:33:54.760 1183 1183 I DEBUG : Build fingerprint: 'Android/sdk_google_phone_x86/generic_x86:5.1.1/LMY48X/6695563:userdebug/test-keys'
06-02 12:33:54.760 1183 1183 I DEBUG : Revision: '0'
06-02 12:33:54.760 1183 1183 I DEBUG : ABI: 'x86'
06-02 12:33:54.760 1183 1183 I DEBUG : pid: 4531, tid: 4551, name: Gecko >>> org.mozilla.geckoview_example <<<
06-02 12:33:54.760 1183 1183 I DEBUG : signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x63626974
06-02 12:33:54.761 1183 1183 I DEBUG : eax 6362696c ebx ae410bd4 ecx 93af7d42 edx 00000000
06-02 12:33:54.761 1183 1183 I DEBUG : esi b77d23dc edi ae281ba7
06-02 12:33:54.762 1183 1183 I DEBUG : xcs 00000073 xds 0000007b xes 0000007b xfs 0000006f xss 0000007b
06-02 12:33:54.762 1183 1183 I DEBUG : eip ae308f34 ebp a0bff988 esp a0bff960 flags 00010292
06-02 12:33:54.762 1183 1183 I DEBUG :
06-02 12:33:54.762 1183 1183 I DEBUG : backtrace:
06-02 12:33:54.762 1183 1183 I DEBUG : #00 pc 00020f34 /data/app/org.mozilla.geckoview_example-1/lib/x86/libmozglue.so
06-02 12:33:54.762 1183 1183 I DEBUG : #01 pc 0002ddef /data/app/org.mozilla.geckoview_example-1/lib/x86/libmozglue.so
06-02 12:33:54.762 1183 1183 I DEBUG : #02 pc 0002bdac /data/app/org.mozilla.geckoview_example-1/lib/x86/libmozglue.so
06-02 12:33:54.762 1183 1183 I DEBUG : #03 pc 00b9acca /data/dalvik-cache/x86/data@app@org.mozilla.geckoview_example-1@base.apk@classes.dex
06-02 12:33:54.798 1183 1183 I DEBUG :
06-02 12:33:54.798 1183 1183 I DEBUG : Tombstone written to: /data/tombstones/tombstone_00
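One detail in the register dump above may be worth decoding. Read as little-endian bytes, `eax` (0x6362696c) spells the ASCII text "libc", and the fault address 0x63626974 is exactly `eax + 8`. That would be consistent with string data being dereferenced as if it were a pointer, although nothing in this thread confirms that reading. The helper below is a small illustrative sketch; `RegisterAsAscii` is a hypothetical name, not part of any Mozilla or Android tooling.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Interprets a 32-bit register value as four little-endian bytes and
// returns them as text. Non-printable bytes are shown as '.'.
// Hypothetical helper for reading tombstones by hand.
std::string RegisterAsAscii(std::uint32_t value) {
  std::string out;
  for (int i = 0; i < 4; ++i) {
    // Lowest byte first, matching x86 memory order.
    const unsigned char byte =
        static_cast<unsigned char>((value >> (8 * i)) & 0xFFu);
    const bool printable = byte >= 0x20 && byte < 0x7F;
    out.push_back(printable ? static_cast<char>(byte) : '.');
  }
  return out;
}
```

Calling `RegisterAsAscii(0x6362696c)` yields "libc". When a register or fault address decodes to readable text like this, it is often a hint to check whether a string and a handle were mixed up somewhere along the call path.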
Comment 12•1 years ago
|
||
I am setting it as a release blocker since we officially support Android 5 and have a few users.
Comment 13•1 years ago
|
||
Finally have been able to get a working Android Studio ... Crash is at https://searchfox.org/mozilla-central/rev/87ba454e5c68ff77dff9acb9d7b0b51d6df12d11/mozglue/linker/ElfLoader.cpp#122 for me, which is indeed called by https://searchfox.org/mozilla-central/rev/87ba454e5c68ff77dff9acb9d7b0b51d6df12d11/mozglue/interposers/InterposerHelper.h#47
Comment 14•1 years ago
|
||
So I think we should just have a proper call to the wrapper on https://searchfox.org/mozilla-central/rev/87ba454e5c68ff77dff9acb9d7b0b51d6df12d11/mozglue/interposers/InterposerHelper.h#44
Comment 15•1 years ago
|
||
At least with void* handle = __wrap_dlopen("libc.so", RTLD_LAZY); it's opening geckoview_example.
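A minimal sketch of the idea in Comments 14 and 15, assuming the choice is made by API level, as the title of the patch later attached to this bug ("Use dlopen() wrapper for Android <= 22") suggests. Everything in it is a stand-in: `FakeWrapDlopen` and `FakeSystemDlopen` are hypothetical placeholders for `__wrap_dlopen` and the system `dlopen`, and `ChooseDlopen` is not the actual InterposerHelper.h code.

```cpp
#include <cassert>

// Signature shared by dlopen-like functions.
using OpenFn = void* (*)(const char* path, int flags);

// Hypothetical placeholders so the sketch builds without mozglue.
// In the real tree these roles would be played by __wrap_dlopen and dlopen.
void* FakeWrapDlopen(const char*, int) { return nullptr; }
void* FakeSystemDlopen(const char*, int) { return nullptr; }

// Assumption taken from the patch title: API levels up to and including 22
// (Android 5.0 and 5.1) go through the wrapper.
constexpr int kLastApiLevelNeedingWrapper = 22;

// Picks which opener to use for a given Android API level.
OpenFn ChooseDlopen(int api_level) {
  return api_level <= kLastApiLevelNeedingWrapper ? FakeWrapDlopen
                                                  : FakeSystemDlopen;
}
```

Under this assumption, API levels 21 and 22 get the wrapper and everything newer keeps calling the system function directly, so behavior on newer Android versions is unchanged.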
Comment 16•1 years ago
|
||
Comment 17•1 years ago
|
||
Comment 18•1 years ago
|
||
I am setting the tracking flag back from blocking to + as we have a temporary mitigation in place for next week (block installs and updates for Android 5 devices) and a real fix incoming that we will test in nightly and beta next week and can include in our weekly Android update of the app.
Comment 19•1 years ago
|
||
bugherder |
Assignee | ||
Comment 20•1 years ago
|
||
Flagging this early while I continue testing this patch across API levels. So far I've found that on API levels 28 and 29 it no longer crashes at launch, but I'm getting a 100% reproducible crash in regular browsing behavior. It looks like we're crashing in libc for multiple different reasons (GPU process crash, Android UI crash). Without the patch this does not appear to occur (or it could potentially be flaky; I need to continue testing without the patch). Below is the STR. I've attached a video of it occurring, and I'm also attaching stack traces.
STR in GV example:
- Download diff for revision
- Update default branch
- git apply your_diff_file.patch
- Setup testing device: Pixel 6 pro - Arm64 - API 28
- ./mach build
- Build and run GV example
- Browse to slickdeals.net
- When the iframe for "Sign in with Google" pops up click the X to close
Assignee | ||
Comment 21•1 years ago
|
||
Assignee | ||
Comment 22•1 years ago
|
||
Comment 23•1 years ago
|
||
(In reply to Zac McKenney [:zmckenney] from comment #20)
Your stack shows an unrelated crash: 2023-06-02 15:25:27.985 17581-17627 MOZ_Assert org.mozilla.geckoview_example A Assertion failure: sideBits == hit.mNode->GetFixedPosSides() (Fixed position side bits do not match), at /Users/mozilla/StudioProjects/gecko/gfx/layers/apz/src/WRHitTester.cpp:230
Assignee | ||
Comment 25•1 years ago
|
||
I agree that this appears unrelated, which is why I had to triple check myself. I was finding multiple different reasons for the crash as well; I'll add another stack trace for an issue that at first glance also seems unrelated. Unfortunately, checking out default, doing a clean mach build and then running GV example never crashes, but as soon as I make the changes in the patch for this bug it's 100% reproducible.
Maybe someone else could confirm that their default checkout does not crash and then run my STR? I really didn't believe it to be related either, and began writing a new bug before realizing that it was only occurring for me with this patch. If someone else is able to confirm, that would help.
Assignee | ||
Comment 26•1 years ago
|
||
Comment 27•1 years ago
|
||
You do realize that on API levels > 22, the current patch does not make any changes? We still call dlopen() directly (with one level of indirection maybe, but it's static inline).
Either way, it's the weekend and I am attending an event, so I can't investigate this.
Assignee | ||
Comment 28•1 years ago
|
||
Sorry for the added confusion. I'm on family vacation as well, but now that it has officially landed I tested to see if it still crashes, and it no longer does. I recognize that it was only supposed to run in API levels > 22, and I'm not sure why this was happening before when applying the diff for this, but it does appear unrelated.
Comment 29•1 years ago
|
||
No problem, good to know it was just a simple mistake :)
Reporter | ||
Comment 30•1 years ago
|
||
Verified on the latest Fenix Nightly 116.0a1 from 6/6, and Beta 115.0b1 with the following devices:
- Huawei MediaPad M2 (Android 5.1.1), and
- Samsung Galaxy Tab A6 (Android 5.1.1).
Both apps could be opened and used; no crash occurred.
Comment 31•1 years ago
|
||
Zac, could you request uplift to mozilla-release please? Thanks
Comment 32•1 years ago
|
||
Or Alexandre, as the patch author, Thanks
Comment 33•1 years ago
|
||
Comment on attachment 9337246 [details]
Bug 1835231 - Use dlopen() wrapper for Android <= 22 r?gsvelto!
Beta/Release Uplift Approval Request
- User impact if declined: instant crash
- Is this code covered by automated tests?: No
- Has the fix been verified in Nightly?: Yes
- Needs manual test from QE?: Yes
- If yes, steps to reproduce: install, try to start
- List of other uplifts needed: None
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): fix easy
- String changes made/needed: no
- Is Android affected?: Yes
Comment 34•1 years ago
|
||
Comment on attachment 9337246 [details]
Bug 1835231 - Use dlopen() wrapper for Android <= 22 r?gsvelto!
Beta/Release Uplift Approval Request
- User impact if declined: instant crash
- Is this code covered by automated tests?: No
- Has the fix been verified in Nightly?: Yes
- Needs manual test from QE?: Yes
- If yes, steps to reproduce: install, try to run
- List of other uplifts needed: None
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): easy fix
- String changes made/needed: no
- Is Android affected?: Yes
Comment 35•1 years ago
|
||
Comment on attachment 9337246 [details]
Bug 1835231 - Use dlopen() wrapper for Android <= 22 r?gsvelto!
Approved for our 114.0.1 release, thanks.
Comment 36•1 years ago
|
||
bugherder uplift |
Comment 37•1 years ago
|
||
Verified as fixed on the latest RC 114.1.0 and on latest Fenix Nightly 116.0a1 from 06/09 and Beta 115.0b3 as well with the following devices:
- Huawei MediaPad M2 (Android 5.1.1)
- Samsung Galaxy Tab A6 (Android 5.1.1)