Closed Bug 1598068 Opened 5 years ago Closed 4 years ago

SIGSEGV; avc denied open for /dev/ashmem when targeting SDK version 29+

Categories

(GeckoView :: General, defect, P1)

70 Branch
Unspecified
All
defect

Tracking

(firefox75 fixed)

RESOLVED FIXED
mozilla75
Tracking Status
firefox75 --- fixed

People

(Reporter: jess.schallenberg, Assigned: snorp)

References

Details

(Whiteboard: [geckoview:m75][geckoview:m76])

Attachments

(3 files)

Attached file logcat

User Agent: Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:70.0) Gecko/20100101 Firefox/70.0

Steps to reproduce:

Created a new blank app in Android Studio, then followed the getting-started guide from the docs (https://mozilla.github.io/geckoview/consumer/docs/geckoview-quick-start). I tried both the latest stable and nightly builds.

Actual results:

The app crashes immediately with a SIGSEGV after failing to open /dev/ashmem (see logcat).

Expected results:

It should load the page passed to the session.loadUri() call.

Flags: needinfo?(agi)

Jess, what device are you running this on?

Flags: needinfo?(agi) → needinfo?(jess.schallenberg)

(In reply to :Agi | ⏰ PST | he/him from comment #1)

Jess, what device are you running this on?

It's a OnePlus 6T, OxygenOS / Android 10 (all stock, no root, locked bootloader)

Flags: needinfo?(jess.schallenberg)
Rank: 1
Priority: -- → P3

I have the same problem on several different devices.

Priority: P3 → --
Priority: -- → P2
Whiteboard: [geckoview:m75]

This can be reproduced by building with targetSdkVersion 29 (Android 10). 28 seems to be the last version that does not break.

Summary: SIGSEGV; avc denied open for /dev/ashmem → SIGSEGV; avc denied open for /dev/ashmem when targeting SDK version 29+

Indeed, the Android 10 release notes state [1]:

Apps targeting Android 10 cannot directly use ashmem (/dev/ashmem) and must instead access shared memory via the NDK’s ASharedMemory class. In addition, apps cannot make direct IOCTLs to existing ashmem file descriptors and must instead use either the NDK’s ASharedMemory class or the Android Java APIs for creating shared memory regions. This change increases security and robustness when working with shared memory, improving performance and security of Android overall.

Guess we need to start using ASharedMemory.

[1] https://developer.android.com/about/versions/10/behavior-changes-10

Assignee: nobody → snorp

Apps targeting SDK 29 are not allowed to open /dev/ashmem directly, and
instead must use NDK functions. Those functions are only available in
SDK 26 and higher, so we need this shim to use the functions if they
are available, and otherwise fall back to opening /dev/ashmem directly.
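
A minimal sketch of the kind of shim described above, not the code that actually landed. It assumes the NDK function ASharedMemory_create(const char*, size_t) exported by libandroid.so on API 26+, resolved at run time. The names LookupCreate and ashmem_shim_create are hypothetical, and the ioctl numbers are local copies of the kernel's ashmem definitions so that the sketch also compiles off-device (where both paths fail and -1 is returned).

```cpp
#include <dlfcn.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <cstddef>
#include <cstring>

// Local copies of the kernel's ashmem ioctl numbers (normally <linux/ashmem.h>).
#define SHIM_ASHMEM_NAME_LEN 256
#define SHIM_ASHMEM_SET_NAME _IOW(0x77, 1, char[SHIM_ASHMEM_NAME_LEN])
#define SHIM_ASHMEM_SET_SIZE _IOW(0x77, 3, size_t)

using CreateFn = int (*)(const char*, size_t);

// Hypothetical helper: resolve ASharedMemory_create once; nullptr if absent.
CreateFn LookupCreate() {
  static CreateFn fn = []() -> CreateFn {
    void* lib = dlopen("libandroid.so", RTLD_LAZY | RTLD_LOCAL);
    if (!lib) return nullptr;
    return reinterpret_cast<CreateFn>(dlsym(lib, "ASharedMemory_create"));
  }();
  return fn;
}

// Hypothetical shim entry point: returns a shared-memory fd, or -1 on failure.
int ashmem_shim_create(const char* name, size_t size) {
  if (CreateFn create = LookupCreate()) {
    // API 26+: the only route permitted when targeting SDK 29.
    return create(name, size);
  }
  // Older devices: fall back to the legacy device node.
  int fd = open("/dev/ashmem", O_RDWR);
  if (fd < 0) return -1;
  char buf[SHIM_ASHMEM_NAME_LEN] = {0};
  if (name) strncpy(buf, name, sizeof(buf) - 1);
  if (ioctl(fd, SHIM_ASHMEM_SET_NAME, buf) != 0 ||
      ioctl(fd, SHIM_ASHMEM_SET_SIZE, size) != 0) {
    close(fd);
    return -1;
  }
  return fd;
}
```

Callers would then mmap() the returned fd as before. Other ashmem operations that need ioctls on the fd would need the same two-path treatment, which this sketch leaves out.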

Pushed by jwillcox@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/eba60d849030
Add ashmem abstraction to mozglue and use it everywhere r=glandium,jld
https://hg.mozilla.org/integration/autoland/rev/5fb8d24977eb
Increase Android targetSdk to 29 r=geckoview-reviewers,aklotz

I'm not the toolchain expert here, but… that's not supposed to happen with weak symbols?

Yeah, it looks like LTO blew this up, and I don't understand why. I thought perhaps adding "used" to the attributes would help, but it did not [1]. Glandium, any ideas here?

[1] https://treeherder.mozilla.org/#/jobs?repo=try&revision=0d8c6a0cc453477fdf0ae57d3f7fab19ac4e0ca2
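
For readers unfamiliar with the technique under discussion, here is a rough sketch of the usual weak-symbol pattern. It is an assumption that the first patch looked anything like this. The declaration of ASharedMemory_create follows the NDK signature, and the helper names HaveASharedMemory and CreateIfAvailable are hypothetical.

```cpp
#include <cstddef>

// Weak declaration: if no loaded library defines the symbol, its address is
// null at run time instead of the link or load failing.
extern "C" int ASharedMemory_create(const char* name, size_t size)
    __attribute__((weak));

// Hypothetical helper: true when the NDK function is present at run time.
bool HaveASharedMemory() {
  return ASharedMemory_create != nullptr;
}

// Hypothetical wrapper: returns an fd, or -1 when the function is absent.
int CreateIfAvailable(const char* name, size_t size) {
  return HaveASharedMemory() ? ASharedMemory_create(name, size) : -1;
}
```

The appeal is that no explicit dlopen()/dlsym() calls are needed, but the pattern depends on the linker handling undefined weak references correctly, which is the part that appears to have gone wrong here.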

Flags: needinfo?(snorp) → needinfo?(mh+mozilla)

It looks like a bug in BFD ld. It doesn't happen with gold, and I presume it wouldn't happen with more recent versions of GNU ld.

Flags: needinfo?(mh+mozilla)

(In reply to Mike Hommey [:glandium] from comment #14)

It looks like a bug in BFD ld. It doesn't happen with gold, and I presume it wouldn't happen with more recent versions of GNU ld.

Assuming we don't want to change the toolchain, will you approve a patch with the dlopen() approach?

Flags: needinfo?(mh+mozilla)

I'm tempted to say add -fuse-ld=gold to the LDFLAGS for mozglue only, on those specific Android (aarch64?) builds.

Flags: needinfo?(mh+mozilla)

This is being worked on, so bumping priority to P1

Rank: 1
Priority: P2 → P1

Apparently there's still some problem, as cppunittest and xpcshell are segfaulting.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=9a7bbbd3daca58ca182dd5cd1c3dab6839b3f2e0&selectedJob=289947487

I can't seem to repro this locally so far.

glandium, this is segfaulting in the system linker for some reason, and I have no idea why. How do you feel about dlopen() now?

02-26 23:30:09.330 3115 3115 F libc : Fatal signal 11 (SIGSEGV), code 1, fault addr 0x61b60 in tid 3115 (TestPrintf)
02-26 23:30:09.330 1298 1298 W : debuggerd: handling request: pid=3115 uid=0 gid=0 tid=3115
02-26 23:30:09.382 3116 3116 F DEBUG : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
02-26 23:30:09.382 3116 3116 F DEBUG : Build fingerprint: 'Android/sdk_phone_x86_64/generic_x86_64:7.0/NYC/4174735:userdebug/test-keys'
02-26 23:30:09.382 3116 3116 F DEBUG : Revision: '0'
02-26 23:30:09.382 3116 3116 F DEBUG : ABI: 'x86_64'
02-26 23:30:09.382 3116 3116 F DEBUG : pid: 3115, tid: 3115, name: TestPrintf >>> /data/local/tests/cppunittests/b/TestPrintf <<<
02-26 23:30:09.382 3116 3116 F DEBUG : signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x61b60
02-26 23:30:09.383 3116 3116 F DEBUG : rax 00007c8b12fc82b0 rbx 0000000000000028 rcx 00007c8b12a7c2ff rdx 00007c8b12fc82c0
02-26 23:30:09.383 3116 3116 F DEBUG : rsi 000003743f85e3c0 rdi 0000000000000028
02-26 23:30:09.383 3116 3116 F DEBUG : r8 00007c8b12fc82c0 r9 00007fff52827d28 r10 000000000000004d r11 00007c8b12b23808
02-26 23:30:09.383 3116 3116 F DEBUG : r12 0000000000000000 r13 0000000000000001 r14 0000000000000000 r15 00007c8b10f83a28
02-26 23:30:09.383 3116 3116 F DEBUG : cs 0000000000000033 ss 000000000000002b
02-26 23:30:09.383 3116 3116 F DEBUG : rip 0000000000061b60 rbp 00007fff52826770 rsp 00007fff52826758 eflags 0000000000000206
02-26 23:30:09.384 3116 3116 F DEBUG :
02-26 23:30:09.384 3116 3116 F DEBUG : backtrace:
02-26 23:30:09.384 3116 3116 F DEBUG : #00 pc 0000000000061b60 <unknown>
02-26 23:30:09.384 3116 3116 F DEBUG : #01 pc 0000000000004bfa /system/lib64/libbase.so
02-26 23:30:09.384 3116 3116 F DEBUG : #02 pc 0000000000004bef /system/lib64/libbase.so
02-26 23:30:09.384 3116 3116 F DEBUG : #03 pc 0000000000004bf0 /system/lib64/libbase.so
02-26 23:30:09.384 3116 3116 F DEBUG : #04 pc 0000000000011866 /system/bin/linker64 (__dl__ZN6soinfo10call_arrayEPKcPPFvvEmb+374)
02-26 23:30:09.384 3116 3116 F DEBUG : #05 pc 000000000000f888 /system/bin/linker64 (__dl__ZN6soinfo17call_constructorsEv+136)
02-26 23:30:09.384 3116 3116 F DEBUG : #06 pc 000000000000f888 /system/bin/linker64 (__dl__ZN6soinfo17call_constructorsEv+136)
02-26 23:30:09.384 3116 3116 F DEBUG : #07 pc 000000000000f888 /system/bin/linker64 (__dl__ZN6soinfo17call_constructorsEv+136)
02-26 23:30:09.384 3116 3116 F DEBUG : #08 pc 000000000000f888 /system/bin/linker64 (__dl__ZN6soinfo17call_constructorsEv+136)
02-26 23:30:09.384 3116 3116 F DEBUG : #09 pc 000000000000f888 /system/bin/linker64 (__dl__ZN6soinfo17call_constructorsEv+136)
02-26 23:30:09.384 3116 3116 F DEBUG : #10 pc 000000000000f888 /system/bin/linker64 (__dl__ZN6soinfo17call_constructorsEv+136)
02-26 23:30:09.384 3116 3116 F DEBUG : #11 pc 000000000000f888 /system/bin/linker64 (__dl__ZN6soinfo17call_constructorsEv+136)
02-26 23:30:09.384 3116 3116 F DEBUG : #12 pc 0000000000014e32 /system/bin/linker64 (__dl__ZL29__linker_init_post_relocationR19KernelArgumentBlocky+3554)
02-26 23:30:09.384 3116 3116 F DEBUG : #13 pc 0000000000013fbd /system/bin/linker64 (__dl___linker_init+605)
02-26 23:30:09.384 3116 3116 F DEBUG : #14 pc 000000000000bba7 /system/bin/linker64 (_start+7)
02-26 23:30:09.384 3116 3116 F DEBUG : #15 pc 0000000000000000 <unknown>

Flags: needinfo?(mh+mozilla)

Can you do another try with no other change than switching to gold?

Flags: needinfo?(mh+mozilla) → needinfo?(snorp)

(In reply to Mike Hommey [:glandium] from comment #20)

Can you do another try with no other change than switching to gold?

You mean for the whole build? I tried that with --enable-linker=gold, and the configure step fails in automation[1] (but works here). What else should I try?

[1] https://treeherder.mozilla.org/logviewer.html#?job_id=290430445&repo=try

Flags: needinfo?(snorp) → needinfo?(mh+mozilla)

I mean, just the -fuse-ld=gold part of the patch.

Flags: needinfo?(mh+mozilla) → needinfo?(snorp)

(In reply to Mike Hommey [:glandium] from comment #22)

I mean, just the -fuse-ld=gold part of the patch.

Ah. Try run going here: https://treeherder.mozilla.org/#/jobs?repo=try&revision=13ac5456fd4b9f3189abdf20b23ad2c329db0371

It looks like cppunittest has already completed, so this is only a problem when there are weak symbols.

Flags: needinfo?(snorp)
Flags: needinfo?(mh+mozilla)

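I guess go with the dlsym version, then :(
I'd suggest making it thread-safe by wrapping the static variables in functions, e.g.:

// Function-local statics are initialized exactly once, thread-safely (C++11).
void* libhandle() {
  static void* handle = dlopen(...);
  return handle;
}

// A function-pointer type can't be spelled directly as a return type,
// so give it an alias first.
using func_t = void (*)(...);

func_t func() {
  static func_t f = reinterpret_cast<func_t>(dlsym(libhandle(), "func"));
  return f;
}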
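
A runnable instance of the pattern suggested above, using a libc symbol so it can be tried off-device. The names SelfHandle and ResolvedGetPid are hypothetical, and looking up getpid is purely illustrative. The real shim would presumably resolve the ASharedMemory functions from libandroid.so instead.

```cpp
#include <dlfcn.h>
#include <unistd.h>

using GetPidFn = pid_t (*)();

// Handle for the running program and its already-loaded dependencies.
void* SelfHandle() {
  static void* handle = dlopen(nullptr, RTLD_LAZY);
  return handle;
}

// Resolved on the first call only. C++11 guarantees that initialization of a
// function-local static is thread-safe, so concurrent callers cannot race.
GetPidFn ResolvedGetPid() {
  static GetPidFn fn =
      reinterpret_cast<GetPidFn>(dlsym(SelfHandle(), "getpid"));
  return fn;
}
```

A caller checks the returned pointer for null before invoking it, which is where a fallback path (such as opening /dev/ashmem on older devices) would go.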
Flags: needinfo?(mh+mozilla)
Whiteboard: [geckoview:m75] → [geckoview:m75][geckoview:m76]

Pushed by jwillcox@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/2209084da439
Add ashmem abstraction to mozglue and use it everywhere r=glandium,jld

Status: UNCONFIRMED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla75

Pushed by jwillcox@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/777d941a0f0e
Increase Android targetSdk to 29 r=geckoview-reviewers,aklotz