Closed Bug 1725990 Opened 10 months ago Closed 9 months ago

Dump symbols for executable areas in ELF files to clarify crash signatures with frames that fall within those areas

Categories

(Toolkit :: Crash Reporting, task)

Tracking

()

RESOLVED FIXED
93 Branch
Tracking Status
firefox93 --- fixed

People

(Reporter: kbrosnan, Assigned: gsvelto)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

https://crash-stats.mozilla.org/signature/?signature=%3Cunknown%20in%20libxul.so%3E

Brought this up in #stability and Andrew McCreight said it should be added to the skiplist.

Here's an example crash: bp-e83a08af-8b9c-464d-984d-985c30210816

I think this one would end up as Interpret instead. The crashes I looked at were mostly normal JS engine crashes in Interpret or the like. I'm not sure whether something changed with crash reporting or with the JS engine to cause this change.

When you say "skiplist", what do you mean specifically? Do you want signature generation to add the frame to the signature and continue to the next frame (prefix list)? Or do you want signature generation to ignore the frame and continue to the next frame (irrelevant list)?
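
To make the distinction concrete, here is a minimal Python sketch of the two behaviours. It is not Socorro's actual implementation: the function name `generate_signature`, the contents of both lists, and the simplified stopping rule are assumptions chosen for illustration.

```python
# Hypothetical, simplified model of signature generation. The names and
# list contents below are illustrative assumptions, not Socorro's real lists.
IRRELEVANT = {"<unknown in libxul.so>"}  # skipped, never appears in the signature
PREFIX = {"mozalloc_abort"}              # added to the signature, then keep walking


def generate_signature(frames, irrelevant=IRRELEVANT, prefix=PREFIX):
    """Build a signature from a list of frame names, innermost frame first."""
    parts = []
    for frame in frames:
        if frame in irrelevant:
            # Irrelevant list: ignore the frame and continue to the next one.
            continue
        parts.append(frame)
        if frame not in prefix:
            # The first frame that is neither irrelevant nor a prefix
            # ends the signature.
            break
    return " | ".join(parts)


stack = ["<unknown in libxul.so>", "Interpret", "js::RunScript"]
print(generate_signature(stack))
print(generate_signature(stack, irrelevant=set(),
                         prefix={"<unknown in libxul.so>"}))
```

Under these assumptions, putting the frame on the irrelevant list buckets the example stack under the next frame, while putting it on the prefix list keeps the unknown frame in the signature and appends the next frame to it.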

Flags: needinfo?(kbrosnan)

The irrelevant list. Sorry, I got confused about the name.

Flags: needinfo?(kbrosnan)
Blocks: 1726576

From Matrix on Monday August 16th:

gsvelto: @kbrosnan: I've double-checked the symbol file: we've got one lone "unknown in libxul.so" symbol at the very end of the module that covers the area past the end of the code for which we have symbols. So I'm 100% sure that the stack trace is bogus: the stack scanner finds a pointer that goes into libxul.so (but not into libxul.so's code) and jumps there. It probably picked up the address of a global variable from the stack and mistook it for an instruction pointer.

willkg: if that's the case, maybe we don't want to add <unknown in libxul.so> to the irrelevant list.

gsvelto: I'm wondering if we could change dump_syms to emit something more sensible for that particular symbol. "Probably invalid address in library_name" might be more helpful. Or something along those lines.

Gabriele: What do you want to do about this?

Flags: needinfo?(gsvelto)

I prefer modifying dump_syms because the crashes here are useless anyway: the first two frames on the stack are happening somewhere unknown (possibly Android's JIT'd code) and once we enter libxul.so we're clearly not where we're supposed to be. We have two scanned frames, then a CFI frame (which is probably serendipitous), and then it's stack scanning again. I'd rather have these crashes carry a big red flag in the signature that says "This isn't a good stack" than paper over it by jumping over a frame, which will only lead us to another misleading one. Can you keep this on hold until I roll a fix in dump_syms? I'll do it straight away.

Flags: needinfo?(gsvelto)

I filed a dump_syms issue to fix this. After analyzing the stacks here and the object from which we generated the symbol files, I figured out that those frames are in the PLT. So those frames are likely wrong, as I had guessed, but the stack walker picks them up anyway because the PLT is considered a CODE section in an ELF object, and thus it's part of the executable mapping. I'll change the PLT label emitted by dump_syms to make it clear what's going on. This will change the signatures here (and probably others too).
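
The mechanism described above can be sketched roughly as follows. This is a toy model, not the real stack walker: the addresses, the function sizes, and the helper names `scan_stack` and `symbolicate` are all made up for illustration. It only shows why a word pointing into a symbol-less code section is accepted by a scanner and then ends up with an "unknown" label.

```python
# Toy model of stack scanning and symbolication. All addresses, sizes and
# helper names are illustrative assumptions.
from bisect import bisect_right

EXEC_START, EXEC_END = 0x1000, 0x3000  # executable mapping of the module
# (start, size, name) for functions that have symbols. In this model the
# range from 0x2000 onward stands in for the PLT, which has no symbol.
FUNCS = [(0x1000, 0x800, "Interpret"), (0x1800, 0x400, "js::RunScript")]


def scan_stack(words):
    """Return the stack words a scanner would accept as return addresses."""
    # Without unwind information, any word that points into an executable
    # mapping looks like a plausible instruction pointer.
    return [w for w in words if EXEC_START <= w < EXEC_END]


def symbolicate(addr):
    """Name the function containing addr, or fall back to an unknown label."""
    starts = [start for start, _, _ in FUNCS]
    i = bisect_right(starts, addr) - 1
    if i >= 0:
        start, size, name = FUNCS[i]
        if addr < start + size:
            return name
    return "<unknown in libxul.so>"


for word in scan_stack([0x10, 0x2010, 0x9000, 0x1810]):
    print(hex(word), symbolicate(word))
```

In this model the word `0x2010` passes the scanner's check because it lies inside the executable mapping, yet no function symbol covers it, so the frame is reported with the uninformative fallback name.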

I like that plan a lot! I didn't intend to nag you--just wanted to make sure it didn't get dropped.

Do we have a better Bugzilla product/component for things like this where the fix is going to be in the symbols tooling? Is that Tecken::Symbols?

Let's move it to crash reporting because the dump_syms changes ultimately need to land in m-c. I'll rename the bug accordingly.

Component: Processor → Crash Reporting
Product: Socorro → Toolkit
Summary: [skiplist] add <unknown in libxul.so> → Dump a symbol for the PLT in ELF files to clarify crash signatures with frames that fall within that area
Summary: Dump a symbol for the PLT in ELF files to clarify crash signatures with frames that fall within that area → Dump symbols for executable areas in ELF files to clarify crash signatures with frames that fall within those areas
Assignee: nobody → gsvelto
Status: NEW → ASSIGNED

I've landed a fix in dump_syms that dumps symbols for all executable areas that don't have one. Now it's just a matter of pulling it into m-c.

Improvements include:

  • Symbols are generated for executable sections that don't have them (e.g.
    .plt, .plt.got but also .fini and .init depending on the compiler being
    used)
  • PUBLIC symbols are not inserted at addresses covered by the range of an
    existing FUNC entry. This cuts down the size of symbols for GeckoView/Fenix
    builds by over 20%
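
The two improvements listed above could be sketched like this. It is a simplified illustration rather than dump_syms' actual code: records are plain tuples, FUNC ranges are assumed not to overlap, "no symbol" is interpreted as "no FUNC or PUBLIC address inside the section", and the `<.plt ELF section in libxul.so>` style label is an assumed placeholder.

```python
# Simplified sketch. The tuple layouts, helper names and the section label
# format are assumptions for illustration, not dump_syms' real structures.
from bisect import bisect_right


def filter_publics(funcs, publics):
    """Drop PUBLIC entries whose address lies inside an existing FUNC range.

    funcs:   list of (address, size, name), assumed non-overlapping
    publics: list of (address, name)
    """
    ranges = sorted((addr, addr + size) for addr, size, _ in funcs)
    starts = [start for start, _ in ranges]
    kept = []
    for addr, name in publics:
        i = bisect_right(starts, addr) - 1
        covered = i >= 0 and addr < ranges[i][1]
        if not covered:
            kept.append((addr, name))
    return kept


def add_section_symbols(sections, funcs, publics, module="libxul.so"):
    """Add a PUBLIC entry for each executable section containing no symbol.

    sections: list of (name, start, size, executable)
    """
    addresses = [addr for addr, _, _ in funcs] + [addr for addr, _ in publics]
    extra = []
    for name, start, size, executable in sections:
        has_symbol = any(start <= a < start + size for a in addresses)
        if executable and not has_symbol:
            extra.append((start, f"<{name} ELF section in {module}>"))
    return sorted(publics + extra)


funcs = [(0x1000, 0x100, "f")]
publics = filter_publics(funcs, [(0x1000, "f_alias"), (0x1080, "f_inner"), (0x1100, "g")])
sections = [(".text", 0x1000, 0x200, True), (".plt", 0x2000, 0x100, True),
            (".data", 0x3000, 0x100, False)]
print(add_section_symbols(sections, funcs, publics))
```

Here the two PUBLIC entries that fall inside the range of `f` are dropped, and the symbol-less `.plt` section gains a descriptive entry, so a frame landing there no longer resolves to a generic unknown label.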
Pushed by gsvelto@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/0ee7ea150df4
Import a new dump_syms version r=calixte
Status: ASSIGNED → RESOLVED
Closed: 9 months ago
Resolution: --- → FIXED
Target Milestone: --- → 93 Branch