Closed
Bug 1108228
Opened 10 years ago
Closed 10 years ago
test_tone.cpp failed with new Bionic branch for ICS devices on B2G
Categories
(Core :: Audio/Video, defect)
Tracking
()
RESOLVED
FIXED
2.2 S3 (9jan)
Tracking | Status | |
---|---|---|
b2g-v2.2 | --- | fixed |
People
(Reporter: _AtilA_, Assigned: _AtilA_)
References
Details
Attachments
(2 files)
Bug 1056337 is about upgrading ICS toolchain to a newer gcc-4.8 based one, in order to upgrade it, we needed to apply some upstream patches to Bionic so we decided to branch the original one [1].
Now test_tone.gcc unit test is failing because some kind of heap corruption (I guess):
Remote debugging from host 127.0.0.1
__dl__start () at bionic/linker/arch/arm/begin.S:35
35 mov r0, sp
(gdb) c
Continuing.
stream started
stream stopped
Program received signal SIGSEGV, Segmentation fault.
0x42bf4910 in ?? ()
(gdb) bt
#0 0x42bf4910 in ?? ()
#1 0x4007cbf4 in __cxa_finalize (dso=0x0) at bionic/libc/stdlib/atexit.c:158
#2 0x4007cf90 in exit (status=0) at bionic/libc/stdlib/exit.c:57
#3 0x4007cf90 in exit (status=0) at bionic/libc/stdlib/exit.c:57
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb)
This is the only test failing so far, and I'm using an ICS emulator build with some patches for the mainfest file to point to our Bionic ICS branch.
I'm going to investigate how things are working in Fennec, because they must be using a similar enviornment.
[1] https://github.com/mozilla-b2g/platform_bionic/tree/b2g-4.0.4_r2.1
Comment 2•10 years ago
|
||
Hi Wayne:
Please help check this problem first.
Thanks!!
Shawn
Flags: needinfo?(sku) → needinfo?(waychen)
Comment 3•10 years ago
|
||
(In reply to Juan Gomez [:_AtilA_] from comment #0)
> Program received signal SIGSEGV, Segmentation fault.
> 0x42bf4910 in ?? ()
> (gdb) bt
> #0 0x42bf4910 in ?? ()
> #1 0x4007cbf4 in __cxa_finalize (dso=0x0) at bionic/libc/stdlib/atexit.c:158
> #2 0x4007cf90 in exit (status=0) at bionic/libc/stdlib/exit.c:57
> #3 0x4007cf90 in exit (status=0) at bionic/libc/stdlib/exit.c:57
Can you post the results of running 'info shared' in gdb when this crash happens?
Flags: needinfo?(atilag)
Assignee | ||
Comment 4•10 years ago
|
||
Ok, this is the link to the memory map when crash happens: http://pastebin.mozilla.org/8099004
it always point to this memmory area.
The crash is always happening at the end of the program, when main() exits and all destructors and callbacks registered with the atexit() or __cxa_atexit() functions are being called.
I checked that no one is calling atexit() (neither __cxa_atexit).
To call these destructors/callbacks, __cxa_finalize() iterate over an array of elements (structs) that have function pointers [1], in reverse order (from 260 to 0). It always crash in the second iteraction (array index 259) because the function pointer points to 0x42bf4910 which is not a valid memory region.
Examining the array before it crashes, showed me that the next element (258) is invalid too, but the third one (257) is ok as well as the rest of the elements (256, 255, 254 ... at least a bunch of them).
[1] https://github.com/mozilla-b2g/platform_bionic/blob/b2g-4.0.4_r2.1/libc/stdlib/atexit.c#L137
Flags: needinfo?(atilag)
Comment 5•10 years ago
|
||
__cxa_finalize runs the static destructors, so this probably means that the module that registered the static destructor has somehow been unloaded before we exit from main. Can you do an info shared at the beginning of main or even better after having called cubeb_init() to see if there is something there that disappears by the time we crashed? And can you also please get a memory map listing at that time?
Comment 6•10 years ago
|
||
Let's just drop the dependency here, disable this test, and sort it out after-the-fact. I mentioned this in #developers, but I think what you want to do is:
if not (CONFIG['MOZ_WIDGET_TOOLKIT'] == 'gonk' and CONFIG['ANDROID_VERSION'] == '15'):
GeckoCppUnitTests([
'test_tone'
])
Assignee | ||
Comment 7•10 years ago
|
||
Ok, found the problem.
As Eshan pointed out, libcubeb is dynamically loading libOpenSLES.so in cubeb_init() and unloading it in cubeb_destroy(). In fact, the pointer that causes the crash is pointing into the memory address range previously ocuppied by this library before it gets unloaded via cubeb_destroy() -> dlclose().
This is a Bionic bug, so I'm now investigating if this was already reported (it should) and how to bring a possible fix to our fork.
Assignee | ||
Comment 8•10 years ago
|
||
UPDATED: Fixed, and even found a valid commit in the AOSP repository, so we just need to cherry-pick.
The fix basically makes sure that __on_dlclose()[1] is inserted into the FINI_ARRAY so it gets called by dlclose(). There's only one file needed to patch: crtbegin_so.c. This file only exists in the two branches created for bug 1056337.
I'll PR with the fixes in a few hours.
[1] __on_dlclose() calls __cxa_finalize().
Assignee | ||
Updated•10 years ago
|
Assignee: nobody → atilag
Assignee | ||
Comment 9•10 years ago
|
||
Cherry-picked from AOSP: 5ed48a4d7fece002afbbd2bd981563aea6e52e24
Attachment #8543431 -
Flags: review?(mwu)
Updated•10 years ago
|
Flags: needinfo?(waychen)
Comment 11•10 years ago
|
||
Ugh, typos! Great find, Juan!
Comment 12•10 years ago
|
||
(In reply to Juan Gomez [:_AtilA_] from comment #8)
> The fix basically makes sure that __on_dlclose()[1] is inserted into the
> FINI_ARRAY so it gets called by dlclose(). There's only one file needed to
> patch: crtbegin_so.c.
If this is part of the crtstuff that's linked into every shared library, doesn't that mean that all shared libraries that might be dlclosed need to be relinked? And, if so, does that apply only to the dlclosed library itself, or would its dependencies (if not otherwise needed) also be implicitly dlclosed? This wouldn't be a problem on the emulator, where we have source for everything, but physical devices typically have some closed-source libraries which could be affected.
Assignee | ||
Comment 13•10 years ago
|
||
Yes, all shared libraries should be relinked if they were wrong, but they are not, so we don't need to. Let me explain, we needed to fork from AOSP Bionic used in ICS because some other PIC-compatible problem showed up while compiling with gcc-4.8. We fixed this problem applying a patch from other cherry-pick (maybe a commit for JB releases, I didn't check it out) but it looks like this new patch has introduced another bug, the one we are trying to fix here. The master branch from Bionic is not affected by this bug so all the precompiled libreries are fine, because they were linked against this version.
Comment 14•10 years ago
|
||
Comment on attachment 8543433 [details] [review]
PR: make sure __on_dlclose() actually gets called on dlclose() (branch: b2g-4.0.4_r2.1)
Nice!
Attachment #8543433 -
Flags: review?(mwu) → review+
Updated•10 years ago
|
Attachment #8543431 -
Flags: review?(mwu) → review+
Assignee | ||
Updated•10 years ago
|
Keywords: checkin-needed
Comment 15•10 years ago
|
||
b2g_ics_strawberry: https://github.com/mozilla-b2g/platform_bionic/commit/1a2a32eda22ef2cd18f57f423a5e7b22a105a6f8
b2g-4.0.4_r2.1: https://github.com/mozilla-b2g/platform_bionic/commit/e2b3733ba3fa5e3f404e983d2e4142b1f6b1b846
Status: NEW → RESOLVED
Closed: 10 years ago
status-b2g-v2.2:
--- → fixed
Keywords: checkin-needed
Resolution: --- → FIXED
Target Milestone: --- → 2.2 S3 (9jan)
You need to log in
before you can comment on or make changes to this bug.
Description
•