Closed Bug 1900788 Opened 1 year ago Closed 1 year ago

Rework the Windows system symbols scraper

Categories: Toolkit :: Crash Reporting, enhancement
Status: RESOLVED FIXED
Target Milestone: 128 Branch
Tracking Status: firefox128 --- fixed
Reporter: gsvelto
Assignee: gsvelto
References: Blocks 1 open bug
Attachments: 1 file

Bug 1847520 changed the Windows symbol scraper to stop using the list of missing symbols provided by the symbol server, because that list was removed. Since we could no longer rely on it, we reverted to sampling Socorro, but because of Socorro's rate limiting and the script's poor ability to retry downloads, fetching crashes takes a very long time. This forced us to sample only 100 crashes per day, which is insufficient to cover the majority of missing symbols, so we often take several days to catch up with a new Windows release or the like. The existing script also lacks a good failure mode, so it's hard to tell whether it found no symbols or something went wrong.
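
To make the retry problem concrete, here is a minimal sketch of a download wrapper that backs off when the server throttles requests. It is not the code from the existing script: `RateLimited`, `fetch_with_retry`, and the default attempt count and delay are all hypothetical names and values chosen for illustration.

```python
import time


class RateLimited(Exception):
    """Hypothetical error a fetcher raises when the server throttles us."""


def fetch_with_retry(fetch, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it succeeds, doubling the wait after each throttle.

    `fetch` is any zero-argument callable that returns the downloaded data
    or raises RateLimited. The defaults are illustrative, not tuned values.
    """
    for attempt in range(attempts):
        try:
            return fetch()
        except RateLimited:
            if attempt == attempts - 1:
                raise  # out of retries: surface the failure to the caller
            sleep(base_delay * 2 ** attempt)
```

Passing `sleep` as a parameter keeps the helper testable without real delays. Once the retries are exhausted the error is re-raised instead of being swallowed, so the caller can tell "nothing found" apart from "something went wrong".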

A new script should:

  • Sample at least 1k crashes, possibly across multiple channels
  • Print out expected errors (missing PDB files, missing CFI information, etc.)
  • Fail hard when encountering actual issues (e.g. dump_syms controlled crashes)
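
One way to separate expected errors from real failures, as the list above asks, is to classify each dump_syms run by its exit status and error output. This is only a sketch: the message patterns below are hypothetical placeholders, not actual dump_syms output, and `classify` and `UnexpectedFailure` are illustrative names.

```python
import re

# Hypothetical message patterns; the real dump_syms output may differ.
EXPECTED_PATTERNS = {
    "missing_pdb": re.compile(r"pdb .*not found", re.IGNORECASE),
    "missing_cfi": re.compile(r"no cfi", re.IGNORECASE),
}


class UnexpectedFailure(Exception):
    """Raised for any failure that is not on the expected list."""


def classify(returncode, stderr):
    """Return a label for a successful or expected outcome, else raise."""
    if returncode == 0:
        return "ok"
    for label, pattern in EXPECTED_PATTERNS.items():
        if pattern.search(stderr):
            return label  # expected: report it and keep going
    # Anything else (e.g. a crash in the tool itself) aborts the run.
    raise UnexpectedFailure(f"exit {returncode}: {stderr.strip()}")
```

Expected outcomes come back as labels that can be printed and counted, while anything unrecognized raises and stops the job.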

What Socorro rate limitations? Are you using the missing_symbols facet we implemented in bug #1862204?

Flags: needinfo?(gsvelto)

(In reply to Will Kahn-Greene [:willkg] ET needinfo? me from comment #1)

What Socorro rate limitations?

When downloading raw crashes without a token.

Are you using the missing_symbols facet we implemented in bug #1862204?

Not yet, I was just adapting one of the scripts I had been using locally to supplement the one in TC. Thanks for reminding me, I should be able to get rid of the sampling entirely and just use the facet.
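
For reference, a facet-based query could look roughly like the sketch below. It only builds the request URL and reads the facet buckets out of an already-decoded JSON response. The field name `missing_symbols` is taken from this thread. The other query parameters and the response layout are assumptions about the SuperSearch API and should be checked against the Socorro documentation.

```python
from urllib.parse import urlencode

SUPERSEARCH_URL = "https://crash-stats.mozilla.org/api/SuperSearch/"


def build_facet_url(product="Firefox", size=1000):
    """Build a SuperSearch URL that requests only the facet, no crash hits."""
    params = {
        "product": product,
        "_facets": "missing_symbols",  # field name as given in this thread
        "_facets_size": size,          # assumed parameter: number of buckets
        "_results_number": 0,          # assumed parameter: skip the hits
    }
    return SUPERSEARCH_URL + "?" + urlencode(params)


def extract_terms(response):
    """Return (term, count) pairs from a decoded response dictionary.

    Assumes facets are returned as {"facets": {field: [{"term", "count"}]}}.
    """
    buckets = response.get("facets", {}).get("missing_symbols", [])
    return [(bucket["term"], bucket["count"]) for bucket in buckets]
```

A single request of this kind would replace downloading and inspecting individual crashes, which is what makes dropping the sampling attractive.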

Flags: needinfo?(gsvelto)

Why are you downloading raw crash data? How does that help in figuring out missing symbols? Can I see the code for this?

  • Sample 2k crashes, 500 for each release channel; this is somewhat slow as
    we're not using an API token (but we can add it later)
  • Use dump_syms' logic to fetch the files from the symbol servers, rather than
    using the Python logic
  • Fail hard if one of the steps fails, including unexpected dump_syms crashes
  • Print out a summary of all the actions that have been taken
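
The sampling and reporting behaviour listed above can be sketched as follows. This is an illustration, not the patch itself: the channel names are an assumption (the list only says 500 crashes per release channel, 2k in total), and `sampling_plan`, `run_steps`, and `format_summary` are hypothetical helpers.

```python
from collections import Counter

# Assumed channel names; four channels at 500 crashes each gives 2k samples.
CHANNELS = ("release", "beta", "nightly", "esr")


def sampling_plan(channels=CHANNELS, per_channel=500):
    """Map each channel to the number of crashes to sample from it."""
    return {channel: per_channel for channel in channels}


def run_steps(steps):
    """Run (name, callable) pairs in order and count what each one did.

    Each callable returns how many items it handled. Exceptions are not
    caught, so the first failing step aborts the whole run.
    """
    summary = Counter()
    for name, step in steps:
        summary[name] += step()
    return summary


def format_summary(summary):
    """Render the per-step counts, one line per step, in execution order."""
    return "\n".join(f"{name}: {count}" for name, count in summary.items())
```

Letting exceptions propagate out of `run_steps` gives the fail-hard behaviour, and the summary is only printed if every step completed.
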
Assignee: nobody → gsvelto
Status: NEW → ASSIGNED

(In reply to Will Kahn-Greene [:willkg] ET needinfo? me from comment #3)

Why are you downloading raw crash data? How does that help in figuring out missing symbols? Can I see the code for this?

I meant processed crash, not raw crash. I'm fresh out of a new bout of COVID and I still have some trouble concentrating, so I mix things up. Check the patch; it's very simple, I'm just downloading a bunch of processed crashes.

Blocks: 1900808
Pushed by gsvelto@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/b1a0569e7545 Rewrite the Windows symbol scraper to improve coverage r=gerard-majax
Status: ASSIGNED → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → 128 Branch