Closed
Bug 1484828
Opened 6 years ago
Closed 6 years ago
The linux shared library code doesn't preserve enough information to do accurate symbolication of Android libraries whose first mapping has a non-zero vaddr
Categories
(Core :: Gecko Profiler, enhancement)
Tracking
()
RESOLVED
FIXED
mozilla64
People
(Reporter: mstange, Assigned: mstange)
References
Details
Attachments
(3 files)
There's a problem with how the code in shared-libraries-linux.cc fills the fields of the SharedLibrary objects. This currently makes accurate symbolication impossible for certain Android system libraries, like libESXGLESv2_adreno.so. Here's our dl_iterate_callback: https://searchfox.org/mozilla-central/rev/83562422ecb0f5683da7a9a26ce05ce62bc0c882/tools/profiler/core/shared-libraries-linux.cc#129-156 It records the range of addresses that are covered by a library's mappings, but it does not remember the offset to the library's base address. But during symbolication, for every pc address we need to subtract the library's base address in order to get a value that can be meaningfully compared to the addresses in the library's symbol table. Or, more specifically: Our SharedLibrary object has these fields: uintptr_t mStart; uintptr_t mEnd; uintptr_t mOffset; And our current dl_iterate_callback makes it so that mOffset is always zero. It needs to not be zero for libraries whose first mapping has a non-zero vaddr. Notes: > Current status: > > filteredThread.libs.find(l => l.name == "libESXGLESv2_adreno.so") > { > name: "libESXGLESv2_adreno.so", > path: "/system/vendor/lib/egl/libESXGLESv2_adreno.so", > debugName: "libESXGLESv2_adreno.so", > debugPath: "/system/vendor/lib/egl/libESXGLESv2_adreno.so", > breakpadId: "E83024E3490FD596F3EBD25DCE1A68370", > arch: "", > start: 0xa57b4000, > end: 0xa5c00ff9, > offset: 0, > } > > /proc/9201/maps: a57b4000-a5ab5000 r-xp 00000000 fd:00 3008 /system/vendor/lib/egl/libESXGLESv2_adreno.so > /proc/9201/maps: a5ab5000-a5bf2000 r--p 00300000 fd:00 3008 /system/vendor/lib/egl/libESXGLESv2_adreno.so > /proc/9201/maps: a5bf2000-a5c00000 rw-p 0043d000 fd:00 3008 /system/vendor/lib/egl/libESXGLESv2_adreno.so > > ph [2] for /system/vendor/lib/egl/libESXGLESv2_adreno.so with base address 0xa57a2000, offset 0x0, vaddr 0x12000, memsz 0x300750 > ph [3] for /system/vendor/lib/egl/libESXGLESv2_adreno.so with base address 0xa57a2000, offset 0x3007d8, vaddr 0x3137d8, memsz 0x14b821 > > $ .mozbuild/android-ndk-r15c/toolchains/arm-linux-androideabi-4.9/prebuilt/darwin-x86_64/bin/arm-linux-androideabi-readelf -l libESXGLESv2_adreno.so > [...] > Program Headers: > Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align > PHDR 0x000034 0x00012034 0x00012034 0x00120 0x00120 R 0x4 > INTERP 0x000154 0x00012154 0x00012154 0x00013 0x00013 R 0x1 > [Requesting program interpreter: /system/bin/linker] > LOAD 0x000000 0x00012000 0x00012000 0x300750 0x300750 R E 0x1000 > LOAD 0x3007d8 0x003137d8 0x003137d8 0x14a5dc 0x14b821 RW 0x1000 > DYNAMIC 0x43b750 0x0044e750 0x0044e750 0x00150 0x00150 RW 0x4 > NOTE 0x000168 0x00012168 0x00012168 0x00038 0x00038 R 0x4 > GNU_STACK 0x000000 0x00000000 0x00000000 0x00000 0x00000 RW 0 > EXIDX 0x1e71a0 0x001f91a0 0x001f91a0 0x08a30 0x08a30 R 0x4 > GNU_RELRO 0x3007d8 0x003137d8 0x003137d8 0x13c828 0x13c828 RW 0x8 > > Example frame address in process virtual memory: 0xa584ede3 > > 0xa584ede3 - 0xa57a2000 = 0xacde3 > > 000acda0 T EsxContext::GlClear(unsigned int) <-- correct! > 000ace00 T EsxContext::GlClearColor(float, float, float, float) > > So, want: > > { > name: "libESXGLESv2_adreno.so", > path: "/system/vendor/lib/egl/libESXGLESv2_adreno.so", > debugName: "libESXGLESv2_adreno.so", > debugPath: "/system/vendor/lib/egl/libESXGLESv2_adreno.so", > breakpadId: "E83024E3490FD596F3EBD25DCE1A68370", > arch: "", > start: 0xa57a2000 + 0x00012000, > end: 0xa57a2000 + 0x00012000 + 0x300750, > offset: 0x00012000, > }
Assignee | ||
Comment 1•6 years ago
|
||
Assignee | ||
Comment 2•6 years ago
|
||
Depends on D3834
Assignee | ||
Comment 3•6 years ago
|
||
I don't understand what's happening in this code and would appreciate some guidance. It's possible that there are two different meanings of the word "offset" that are clashing here. SharedLibrary::GetOffset() means firstMappingStart - baseAddress. I get a feeling that the "zero offset" check here cares more about the mapping offset phdr.p_offset, but I'm not sure. Anyway, the purpose of this patch is the following: Now that the previous patch sets SharedLibrary::mOffset to something non-zero, this check breaks unwinding for the affected libraries. With the check removed, unwinding seems to work mostly fine, as evidenced by this profile: https://perfht.ml/2MBKzON But removing the check might break something else that I'm not seeing. Depends on D3835
Comment 4•6 years ago
|
||
It took some time staring at this code and trying to remember what was going on with the profiler 5 years ago, but I think I've got it. (In reply to Markus Stange [:mstange] from comment #3) > I don't understand what's happening in this code and would appreciate some > guidance. It's possible that there are two different meanings of the word > "offset" that are clashing here. > SharedLibrary::GetOffset() means firstMappingStart - baseAddress. > I get a feeling that the "zero offset" check here cares more about the > mapping offset phdr.p_offset, but I'm not sure. Yes, the problem I was having (and what that TODO comment about the dynamic linker was referring to) was with the case where no mapping covers file offset 0 (p_offset, not p_vaddr), meaning that I couldn't access the ELF headers to look for the PT_ARM_EXIDX. And I couldn't just get the phdrs directly from dl_iterate_phdr because we didn't have dl_iterate_phdr on B2G[1]. There are magic number checks in the EHTable constructor, so if it gets a pointer to something that isn't an ELF file header it should skip the library. It would be nice to explicitly avoid that case if possible, but it's not essential. As for the case we're encountering here, where p_offset 0 is loaded at non-zero p_vaddr: I thought we had to deal with that on B2G (ICS libc, maybe?) and had gotten it working, but I suppose either it bit-rotted or I misremember. [1] https://searchfox.org/mozilla-central/rev/42fe9dcb23ad00e54140dc5ae4ad3a068f040056/tools/profiler/shared-libraries-linux.cc#84-85
Comment 5•6 years ago
|
||
Comment on attachment 9002596 [details] Bug 1484828 - Set SharedLibrary::mOffset to firstMappingStart - baseAddress. r=glandium Mike Hommey [:glandium] has approved the revision.
Attachment #9002596 -
Flags: review+
Assignee | ||
Comment 6•6 years ago
|
||
Thanks for the fast reviews! And sorry that I'm so slow to update these patches. I'm looking at them again now.
Assignee | ||
Comment 7•6 years ago
|
||
I've addressed the comments. Not sure if moz-phab re-requested review on the patches.
Assignee | ||
Comment 8•6 years ago
|
||
I haven't found out how to re-request review in Phabricator yet - Jed, could you take another look?
Flags: needinfo?(jld)
Comment 10•6 years ago
|
||
Pushed by mstange@themasta.com: https://hg.mozilla.org/integration/autoland/rev/9d52b6dad6f5 Set SharedLibrary::mOffset to firstMappingStart - baseAddress. r=glandium https://hg.mozilla.org/integration/autoland/rev/5af47e5b67f0 Rename some fields in EHABIStackWalk.cpp. r=jld https://hg.mozilla.org/integration/autoland/rev/9b246aa2b9cd Don't skip libraries with non-zero offsets. r=jld
Comment 11•6 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/9d52b6dad6f5 https://hg.mozilla.org/mozilla-central/rev/5af47e5b67f0 https://hg.mozilla.org/mozilla-central/rev/9b246aa2b9cd
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
status-firefox64:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla64
Updated•6 years ago
|
You need to log in
before you can comment on or make changes to this bug.
Description
•