Occasional crash reports with NULL debug ids for Mozilla-specific modules, maybe only on content process
Categories
(Toolkit :: Crash Reporting, defect, P3)
People
(Reporter: smichaud, Unassigned)
This bug is spun off from bug 1741287. It eventually became clear that bug 1741287 covers two distinct, unrelated bugs, as follows:
- The build process for official builds sometimes fails to copy Mozilla-specific symbols for that build to the symbol server.
- Sometimes the debug_id for Mozilla-specific modules (like XUL) is zeroed out in crash reports (perhaps only with content-process crashes). This prevents these modules from being symbolicated in those crash reports.
Bug 1741287 has now been DUPed to bug 1658531, which covers only issue #1. I'm opening this bug to deal with issue #2.
Neither issue exists on Windows. Issue #2 (this bug) may exist on Linux, though I haven't seen it. But I've seen many examples on macOS. So, at least for the moment, I'm limiting this report to macOS.
There's no way to search or facet on module debug ids, so it's difficult to search for crash reports that match this bug. At best you can search for crash reports that could match either issue #1 or issue #2, and look through them by hand to find examples of one or the other.
Here's the most recent example I can find for a mozilla-central nightly:
bp-756ce220-911c-41fc-b452-3b29c0211207
Here's a snippet from its "Modules" tab, showing Mozilla-specific modules with NULL debug ids:
SafariSafeBrowsing 0.0.0.0 5513EB53B5393D8D801DE64175D3A03C0 SafariSafeBrowsing
Ø libnss3.dylib 0.1.0.0 000000000000000000000000000000000 libnss3.dylib
Ø libmozglue.dylib 0.1.0.0 000000000000000000000000000000000 libmozglue.dylib
liblgpllibs.dylib 0.1.0.0 000000000000000000000000000000000 liblgpllibs.dylib
Ø XUL 0.1.0.0 000000000000000000000000000000000 XUL
libcorecrypto.dylib 0.1000.140.4 D211160DE22F344080541F5824519C7F0 libcorecrypto.dylib
This may be a bug in Breakpad code. When I have time I'll look through it for possible causes.
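The by-hand check described above (scanning a report's "Modules" tab for all-zero debug ids) can be sketched in a few lines of Python. This is only an illustration: it assumes the rows look like the pasted snippet (optional "Ø" marker, filename, version, debug id, debug filename), and the function name `modules_with_null_debug_id` is made up for this example, not part of Socorro or Breakpad.

```python
import re

# One row of the "Modules" tab as pasted above (assumed layout):
# optional "Ø" marker, filename, version, debug id, debug filename.
ROW = re.compile(
    r"^(?:Ø\s+)?(?P<filename>\S+)\s+(?P<version>[\d.]+)\s+"
    r"(?P<debug_id>[0-9A-Fa-f]+)\s+(?P<debug_file>\S+)$"
)


def modules_with_null_debug_id(modules_text):
    """Return the filenames of modules whose debug id consists only of zeros."""
    hits = []
    for line in modules_text.splitlines():
        match = ROW.match(line.strip())
        # A "NULL" debug id is one where every hex digit is 0.
        if match and set(match.group("debug_id")) == {"0"}:
            hits.append(match.group("filename"))
    return hits


SNIPPET = """\
SafariSafeBrowsing 0.0.0.0 5513EB53B5393D8D801DE64175D3A03C0 SafariSafeBrowsing
Ø libnss3.dylib 0.1.0.0 000000000000000000000000000000000 libnss3.dylib
Ø libmozglue.dylib 0.1.0.0 000000000000000000000000000000000 libmozglue.dylib
liblgpllibs.dylib 0.1.0.0 000000000000000000000000000000000 liblgpllibs.dylib
Ø XUL 0.1.0.0 000000000000000000000000000000000 XUL
libcorecrypto.dylib 0.1000.140.4 D211160DE22F344080541F5824519C7F0 libcorecrypto.dylib
"""

if __name__ == "__main__":
    print(modules_with_null_debug_id(SNIPPET))
```

Run against the snippet from this comment, it should flag the four Mozilla modules and leave the two system libraries alone.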
Comment 1•4 years ago
The rust-minidump version of this crash report also reports "null" debug_ids, so this is unlikely to be a bug in the processor:
https://crash-stats.allizom.org/report/index/756ce220-911c-41fc-b452-3b29c0211207#tab-modules
Comment 2•4 years ago
(In reply to Steven Michaud [:smichaud] (Retired) from comment #0)
Here's a snippet from its "Modules" tab, showing Mozilla-specific modules with NULL debug ids:
SafariSafeBrowsing 0.0.0.0 5513EB53B5393D8D801DE64175D3A03C0 SafariSafeBrowsing
Ø libnss3.dylib 0.1.0.0 000000000000000000000000000000000 libnss3.dylib
Ø libmozglue.dylib 0.1.0.0 000000000000000000000000000000000 libmozglue.dylib
liblgpllibs.dylib 0.1.0.0 000000000000000000000000000000000 liblgpllibs.dylib
Ø XUL 0.1.0.0 000000000000000000000000000000000 XUL
libcorecrypto.dylib 0.1000.140.4 D211160DE22F344080541F5824519C7F0 libcorecrypto.dylib
This may be a bug in Breakpad code. When I have time I'll look through it for possible causes.
Yeah, they're all empty. This definitely smells like a bug in the minidump writer. It's curious that it specifically affects Mozilla's libraries but not the system ones.
Reporter
Comment 3•4 years ago
This bug may be limited to the content process. A quick search (of necessity by hand) didn't turn up any on the parent process.
Comment 4•4 years ago
Here's another interesting data point: I found a few content process crashes with these NULL debug_ids but no parent process crashes (and I've looked at several dozen). When we're doing out-of-process minidump generation and the module is not in the dyld shared cache, we bail out early if we can't find the ID. Maybe we have to double-check the error handling in there to be sure we're not bailing out too early.
[edit] Hadn't seen Steven's comment, glad we came to the same conclusion.
Comment 5•4 years ago
Here are some parent crashes for Firefox for release channel from 12/8/2021:
- bp-efe27f97-f732-4924-9c0d-9937d0211208
- bp-93e7a3fc-d2c0-4e06-ae13-a66090211208
- bp-7e52e713-f180-4e78-9b83-beb260211208
Here are some content crashes:
- bp-7446f41e-b51e-4eaa-882e-92c2f0211208
- bp-d0503ddc-9dc3-417f-8415-819570211208
- bp-d284750c-42d5-4e7c-a98f-d8dcc0211208
I threw together an STMO query. I don't know offhand what the access requirements are for it:
Reporter
Comment 6•4 years ago
(In reply to Will Kahn-Greene [:willkg] ET needinfo? me from comment #5)
Here are some parent crashes for Firefox for release channel from 12/8/2021:
- bp-efe27f97-f732-4924-9c0d-9937d0211208
- bp-93e7a3fc-d2c0-4e06-ae13-a66090211208
- bp-7e52e713-f180-4e78-9b83-beb260211208
Here are some content crashes:
- 7446f41e-b51e-4eaa-882e-92c2f0211208
- d0503ddc-9dc3-417f-8415-819570211208
- d284750c-42d5-4e7c-a98f-d8dcc0211208
None of these are missing Mozilla-specific symbols, or have NULL debug ids for Mozilla-specific modules.
I threw together an STMO query. I don't know offhand what the access requirements are for it:
I don't seem to have permission to see these results (or to perform the query). If you're able, in this custom query, to search on module debug ids, I'd specify the following search criteria (to be ANDed together):
- Signature contains "XUL@"
- Release channel is "release" or "nightly"
- Product is not "SeaMonkey"
- XUL module debug id contains "000000000000000000000000000000000"
SeaMonkey needs to be excluded because its build process never copies Mozilla-specific symbols to the symbol server.
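For anyone who can get at the raw data, the four criteria above could be applied as a plain filter, roughly as sketched below. This is a sketch under assumptions, not the STMO query itself: the dict keys (`signature`, `release_channel`, `product`, `modules`, `filename`, `debug_id`) are an assumed shape for a crash-report record, and `matches_issue_2` is a made-up helper name.

```python
def is_null_debug_id(debug_id):
    """True if the debug id is non-empty and consists only of zeros."""
    return bool(debug_id) and set(debug_id) == {"0"}


def matches_issue_2(report):
    """Apply the four search criteria, ANDed together, to one report dict.

    The key names used here are assumptions about the record layout.
    """
    if "XUL@" not in report.get("signature", ""):
        return False
    if report.get("release_channel") not in ("release", "nightly"):
        return False
    # SeaMonkey never uploads Mozilla-specific symbols, so exclude it.
    if report.get("product") == "SeaMonkey":
        return False
    return any(
        module.get("filename") == "XUL"
        and is_null_debug_id(module.get("debug_id", ""))
        for module in report.get("modules", [])
    )


if __name__ == "__main__":
    example = {
        "signature": "XUL@0x1234",
        "release_channel": "release",
        "product": "Firefox",
        "modules": [{"filename": "XUL", "debug_id": "0" * 33}],
    }
    print(matches_issue_2(example))
```

The check on the XUL module tests for an all-zero id rather than a fixed-length substring, which avoids depending on the exact number of digits in the debug id.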
Comment 7•4 years ago
Thanks Will! I've narrowed down the query to only macOS crashes with a NULL XUL debug id and this is what I get: https://sql.telemetry.mozilla.org/queries/83221
They're all content crashes save for a handful coming from a single machine. The assertion message in those parent process minidumps points to a potentially corrupted Firefox installation, so given that they all come from a single user, I'm fairly convinced this is a content-specific issue.
Comment 8•4 years ago
(In reply to Steven Michaud [:smichaud] (Retired) from comment #6)
Release channel is "release" or "nightly"
Product is not "SeaMonkey"
I haven't added those but all crashes I found come from Firefox and are on the release channel.
Comment 9•4 years ago
The severity field is not set for this bug.
:gsvelto, could you have a look please?
For more information, please visit auto_nag documentation.