Open Bug 1763710 Opened 3 years ago Updated 1 year ago

debug tab should list any modules where corrupt_symbols is true

Categories

(Socorro :: Webapp, task, P2)

Tracking

(Not tracked)

People

(Reporter: willkg, Unassigned)

Details

Attachments

(1 file)

The Debug tab should list the modules where corrupt_symbols is true in the json_dump. That'd help us isolate problematic symbols files faster.

For example:

"modules": [
    ...
    {
      "base_addr": "0x000000010b1c7000",
      "cert_subject": null,
      "code_id": "d5ccb67ff046395ab3274c88e8c4c1c9",
      "corrupt_symbols": true,
      "debug_file": "XUL",
      "debug_id": "D5CCB67FF046395AB3274C88E8C4C1C90",
      "end_addr": "0x0000000112ad6000",
      "filename": "XUL",
      "loaded_symbols": true,
      "missing_symbols": false,
      "symbol_url": null,
      "version": null
    },
Priority: -- → P2

We should also denote that the symbol file is corrupt in the Modules tab. Right now, it suggests the symbols file is missing in the case where it was found, but not parseable.

I want more visibility into what happened during symbolication in the stackwalker so we can suss out issues especially during the GCP migration.

Processing should emit metrics for the outcomes of fetching and parsing symbols files. Or maybe we should make it available in Elasticsearch somehow. Or in postgres.

Blocks: 1687802
Assignee: nobody → willkg
Status: NEW → ASSIGNED

I can't figure out how to build a sym file that causes stackwalker to emit corrupt_symbols: true. From the code, it looks like it should do that with a ParseError and there are a few comments around emitting ParseError that suggest how to build a sym file that is corrupt. However, all the attempts I tried result in the module being marked as corrupt_symbols: false and missing_symbols: true.

I looked through the last week of crash reports and didn't see any with corrupt_symbols: true.

Maybe it doesn't happen? Maybe parse errors get treated as missing files?

I wrote up an issue in the rust-minidump repo for some help: https://github.com/rust-minidump/rust-minidump/issues/676

Markus triggered a parse error one of the ways I had tried. However, he was using a symbol path (e.g. symbols were on disk) and I was using a symbol server (e.g. symbols were fetched over http) and I'm pretty sure when the symbol supplier gets a ParseError, it tries the next symbol server and treats it as a missing symbol. I wrote up a comment here:

https://github.com/rust-minidump/rust-minidump/issues/676

Thus the stackwalker will never get corrupt_symbols: true in crash ingestion. I'm going to push this off until that gets figured out.

Assignee: willkg → nobody
Status: ASSIGNED → NEW

willkg merged PR #6197: "bug 1763710: prep work for looking at corrupt_symbols stuff" in bd257bc.

Figured I'd land some prep work I did. Pushing the rest of this off until issue 676 gets figured out.

Removing this from the GCP migration.

No longer blocks: 1687802
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: