Closed Bug 959931 Opened 11 years ago Closed 11 years ago

Unwinder loads libmozglue.so a second time, resulting in dangling sigaction hook

Categories

(Core :: Gecko Profiler, defect)

All
Android
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla30

People

(Reporter: jchen, Assigned: jchen)

References

Details

Attachments

(1 file, 2 obsolete files)

The special case [1] in the Android unwinder tries to dlopen libmozglue.so using a full path. This was to ensure that faulty.lib could locate the library. However, this is no longer necessary: because libmozglue.so is always loaded, if we simply pass in "libmozglue.so", faulty.lib will return the handle to the already loaded library.

The problem with passing in a full libmozglue path is that, on newer Android versions, the path "/data/data/org.mozilla.fennec/lib" is actually a symlink to e.g. "/data/app-lib/org.mozilla.fennec-1". Therefore, the real full path for libmozglue.so is actually "/data/app-lib/org.mozilla.fennec-1/libmozglue.so". If we dlopen("/data/data/org.mozilla.fennec/lib/libmozglue.so"), faulty.lib will get confused about the symlink. Instead of returning the already loaded library, it will try to load libmozglue.so a second time using the symlinked path.

Loading libmozglue.so a second time is problematic, because during its static initialization, it hooks into the libc.so sigaction function [2]. The hooked sigaction jumps straight into a function inside libmozglue.so [3]. Because the second libmozglue.so is only used by the unwinder, it's quickly unloaded when the unwinder finishes. However, it leaves behind the hooked sigaction function that still tries to jump into the now-unloaded libmozglue.so. Any future calls to sigaction result in segfaults inside what appears to be unallocated memory.

There are several possible solutions to this issue. We can pass in a relative "libmozglue.so" to dlopen in the unwinder code. We can fix faulty.lib to handle symlinks. We can make libmozglue unhook sigaction when it's unloaded. I think the simplest way is to just remove the special case in the unwinder.

[1] http://mxr.mozilla.org/mozilla-central/source/tools/profiler/local_debug_info_symbolizer.cc?rev=7d5863279812#99
[2] http://mxr.mozilla.org/mozilla-central/source/mozglue/linker/ElfLoader.cpp?rev=8c947073f4ea#975
[3] http://mxr.mozilla.org/mozilla-central/source/mozglue/linker/ElfLoader.cpp?rev=8c947073f4ea#1110
This patch fixes the above crash when unwinding is turned on in the ANRReporter. I also verified that a valid handle is still returned when passing the relative path "libmozglue.so" to dlopen.
Attachment #8360203 - Flags: review?(jseward)
I can't verify the results, using M-C of a couple of hours ago.

Firstly, M-C by itself no longer segfaults for me.  I think that
might be because of the landing of the 3rd (smallest) patch of
bug 951431, although I am not sure about that.

Secondly .. handling of libmozglue.so is apparently broken anyway
and the patch makes it worse.  I did the following 4 tests:

(1) without patch, execute direct from APK
(2) without patch, MOZ_LINKER_EXTRACT=1
(3) with patch, execute direct from APK
(4) with patch, MOZ_LINKER_EXTRACT=1

and only in case (2) did symbol data from libmozglue.so get read
successfully.  In all other cases, it fails with

E/Profiler( 4572): 2014-01-16 17:48:40: local_debug_info_symbolizer.cc:121:
INFO: ReadSymbolData_APK: Unable to get size for ELF file 'libmozglue.so'

Can you verify the above (1) .. (4) ?  I think I did the tests right, but
there's always some doubt ..
Comment on attachment 8360203 [details] [diff] [review]
Don't special-case libmozglue.so when unwinding (v1)

Review of attachment 8360203 [details] [diff] [review]:
-----------------------------------------------------------------

r- because it causes regressions as per comment 2, per my testing.
Attachment #8360203 - Flags: review?(jseward) → review-
Okay I see now that the previous patch was incomplete.

This patch uses a different approach to avoid loading libmozglue.so a second time. If a full path is given to dlopen() but the custom linker only knows a file name for that library, the linker will compare file names instead of trying to compare full paths (which would always fail because the linker doesn't have a full path to compare to). In addition, the linker will also update its records to have the full path, so that future __dl_get_mappable_length() and __dl_mmap() calls will succeed.
Attachment #8360203 - Attachment is obsolete: true
Attachment #8365431 - Flags: review?(mh+mozilla)
Comment on attachment 8365431 [details] [diff] [review]
Try harder to find existing library (v1)

Review of attachment 8365431 [details] [diff] [review]:
-----------------------------------------------------------------

Too hackish for my taste. I'd rather see ReadSymbolData_ANDROID use plain open/mmap/close for libmozglue.so instead, until the linker has the right support to make this unnecessary (which is not trivial, but necessary for other reasons anyways, so it will happen some day).
Attachment #8365431 - Flags: review?(mh+mozilla) → review-
> Too hackish for my taste. I'd rather see ReadSymbolData_ANDROID use plain
> open/mmap/close for libmozglue.so instead, until the linker has the right
> support to make this unnecessary (which is not trivial, but necessary for
> other reasons anyways, so it will happen some day).

Okay. It's hackish either way :)  But here's the patch to use open/mmap.
Attachment #8365431 - Attachment is obsolete: true
Attachment #8367421 - Flags: review?(jseward)
Comment on attachment 8367421 [details] [diff] [review]
Use open/mmap for libmozglue unwinding (v1)

Looks OK to me, and appears to work per, my brief testing.
Attachment #8367421 - Flags: review?(jseward) → review+
https://hg.mozilla.org/mozilla-central/rev/c6d28c1dcfee
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla30
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: