Open Bug 1342076 Opened 7 years ago Updated 2 years ago

make symbolication work on Linux

Categories

(Core :: mozglue, enhancement, P5)

People

(Reporter: ehsan.akhgari, Unassigned)

Attachments

(6 obsolete files)

dladdr(), as used here, isn't really good enough: it only gives you symbol names for dynamic symbols, and most of our symbols aren't dynamic.

We should probably switch to using the addr2line library (https://docs.rs/addr2line/0.2.1/addr2line/) instead.
Two things to weigh:
- addr2line does its work with DWARF info. We never ship DWARF info (and it's massive).
- Using something else, it would be possible to resolve symbols from the .symtab section instead of the .dynsym section, but that requires the .symtab section to be there in the first place, and it's not there on release and beta (and maybe aurora; I never remember if it has profiling enabled).
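To make the second point concrete, here is a minimal sketch of checking whether a library has a .symtab section at all. It assumes a 64-bit ELF file already mapped read-only into memory; FindSectionByType is a hypothetical helper name, not an existing mozglue function, and the extended section numbering case (e_shnum == 0) is not handled.

```cpp
#include <elf.h>

#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not an existing mozglue API): returns the first
// section header of type aType in a mapped ELF64 image, or nullptr.
// Assumes aImage is suitably aligned (e.g. it came from mmap).
const Elf64_Shdr* FindSectionByType(const void* aImage, size_t aSize,
                                    uint32_t aType) {
  if (aSize < sizeof(Elf64_Ehdr)) {
    return nullptr;
  }
  const auto* ehdr = static_cast<const Elf64_Ehdr*>(aImage);
  if (std::memcmp(ehdr->e_ident, ELFMAG, SELFMAG) != 0 ||
      ehdr->e_ident[EI_CLASS] != ELFCLASS64 ||
      ehdr->e_shentsize != sizeof(Elf64_Shdr)) {
    return nullptr;
  }
  // Bounds-check the section header table before touching it.
  uint64_t tableSize = uint64_t(ehdr->e_shnum) * sizeof(Elf64_Shdr);
  if (ehdr->e_shoff > aSize || tableSize > aSize - ehdr->e_shoff) {
    return nullptr;
  }
  const auto* shdrs = reinterpret_cast<const Elf64_Shdr*>(
      static_cast<const char*>(aImage) + ehdr->e_shoff);
  for (size_t i = 0; i < ehdr->e_shnum; i++) {
    if (shdrs[i].sh_type == aType) {
      return &shdrs[i];
    }
  }
  return nullptr;
}
```

A caller would try SHT_SYMTAB first and fall back to SHT_DYNSYM when the build has been stripped, which is the release/beta situation described above.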
Type: defect → enhancement
Priority: -- → P5

I want to solve this so that I can locally symbolicate Gecko profiles for bug 1569077. As I discovered (Mike knows the details), there are difficulties with adding Rust code to mozglue/, which is where StackWalk.cpp lives. I ended up writing some C++ (sorry!) to parse .symtab and DWARF .debug_line info. I tried to write as little as possible, but it still ended up being ~1500 lines. Happy to remove it later if/when the Rust issue is resolved. Technically only the .symtab parsing is required for the Gecko profile symbolication, but I figured I may as well get the line information too so I could fix Linux stack traces while I'm at it.

I'll want to use it from mozglue/baseprofiler/.

Summary: Consider using addr2line in MozDescribeCodeAddress POSIX implementation instead of dladdr → make symbolication work on Linux
Attachment #9084226 - Attachment is obsolete: true
Attachment #9084227 - Attachment is obsolete: true
Attachment #9084228 - Attachment is obsolete: true
Attachment #9084229 - Attachment is obsolete: true
Attachment #9084230 - Attachment is obsolete: true
Attachment #9084231 - Attachment is obsolete: true

After discussing with Mike, I've realised that we don't really need to fix up symbolication for callers of MozWalkStack in mozglue/. So we could indeed use Rust in libxul somewhere, and change all callers of MozDescribeCodeAddress in libxul to call some new function that calls into Rust.

In the C++ code I wrote, I was careful to include a mode where no allocations were made. I thought it was best to avoid allocations if we're trying to symbolicate during a stack trace under the signal handler, since our state might be compromised. I figured if I could limit myself to a few system calls -- fstat, fopen, mmap, munmap -- then there's a much lower chance of running into trouble. Some things wouldn't work in this mode, e.g. if the debug sections are compressed, or if we didn't have enough address space to map the library into memory, but that's probably OK.

addr2line doesn't have a mode where it performs no allocations. I can write my own code calling into gimli that avoids allocations. But the ELF parser used, goblin, uses vectors to store some of the results of parsing ELF headers. So I would be wary of using that under ah_crap_handler.

No longer blocks: 1569077

For now I've given up on solving this for MozDescribeCodeAddress and just added what I needed in bug 1573090.

No longer blocks: 1573090
Severity: normal → S3