Closed Bug 1910438 Opened 8 months ago Closed 7 months ago

Stack traces are missing from almost all Glean crash pings from the windows crash reporter client

Categories

(Toolkit :: Crash Reporting, defect)

Firefox 130
All
Windows
defect

Tracking

()

RESOLVED FIXED
131 Branch
Tracking Status
firefox130 --- fixed
firefox131 --- fixed

People

(Reporter: afranchuk, Assigned: afranchuk)

Details

Attachments

(7 files)

As shown in the plot, the orange line (Windows Glean crash pings with stack traces) is way, way lower than expected (though oddly also non-zero).

There's no logic specific to the Glean implementation that would result in this behavior, since any error converting a value results in the entire ping being discarded (as opposed to only the field). However, if the stack traces were simply missing, I would expect legacy Telemetry to look the same. At the moment, my theory is that Glean itself may be internally discarding the field for some reason.

I have confirmed in the logs that we encounter:

[WARN glean_core::error_recording] crash.stack_traces: Value did not match predefined schema

This is due to Windows having an extra unloaded_modules field. I will change my approach to copy all data to avoid extra fields breaking the expected schema.

This ensures that we always match the metric schema.
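To illustrate the failure mode described above, here is a minimal sketch (not the landed patch) of building a fresh object that contains only the keys the schema knows about, so an extra field such as unloaded_modules can't cause a schema mismatch. The `Value` type, the `SCHEMA_KEYS` list, and `to_schema_object` are all hypothetical names for illustration; the real metric schema and the real crash reporter types may differ.

```rust
use std::collections::BTreeMap;

/// Minimal JSON-like value, a stand-in for whatever the client parses
/// (hypothetical; not the crash reporter's actual type).
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Str(String),
    List(Vec<Value>),
    Object(BTreeMap<String, Value>),
}

/// Hypothetical list of top-level keys accepted by the metric schema.
const SCHEMA_KEYS: &[&str] = &["crash_info", "modules", "threads"];

/// Copy only schema-known keys into a new object. Any other key
/// (for example `unloaded_modules`) is left behind.
fn to_schema_object(raw: &BTreeMap<String, Value>) -> BTreeMap<String, Value> {
    SCHEMA_KEYS
        .iter()
        .filter_map(|key| {
            raw.get(*key)
                .map(|value| ((*key).to_string(), value.clone()))
        })
        .collect()
}
```

With this shape, adding a new field to the source data never changes what is recorded until the key list is updated deliberately.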

Having multiple glean tests exposed some issues and races (since glean
is a global value, it must be tested serially). This changes mock data
to be an Arc as we can't depend on glean deterministically accessing
the data we pass to it, unfortunately.
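A rough sketch of the two ideas in this commit message, using only the standard library: a process-wide lock so tests that touch a global run one at a time, and mock data held in an Arc so it stays alive regardless of when the consumer reads it. The names `GLOBAL_TEST_LOCK`, `serialize_test`, `MockData`, and `consume_later` are hypothetical, and the spawned thread is only a stand-in for a global that accesses data at a time the test doesn't control.

```rust
use std::sync::{Arc, Mutex, MutexGuard};
use std::thread;

/// Process-wide lock (hypothetical): each test touching the global
/// takes this first, so such tests run serially.
static GLOBAL_TEST_LOCK: Mutex<()> = Mutex::new(());

/// Acquire the lock, recovering from poisoning so that one panicking
/// test doesn't make every later test fail on `lock()`.
fn serialize_test() -> MutexGuard<'static, ()> {
    GLOBAL_TEST_LOCK
        .lock()
        .unwrap_or_else(|poisoned| poisoned.into_inner())
}

/// Hypothetical mock data handed to the global consumer.
#[derive(Debug)]
struct MockData {
    build_id: String,
}

/// Stand-in for a consumer that reads the data later, on its own thread.
/// Taking an `Arc` keeps the data alive until that read happens.
fn consume_later(data: Arc<MockData>) -> thread::JoinHandle<usize> {
    thread::spawn(move || data.build_id.len())
}
```

A test would hold the guard returned by `serialize_test()` for its whole body and pass `Arc::clone(&mock)` to the consumer; the test can then drop its own handle without invalidating the data.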

Pushed by afranchuk@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/7ef5ebb06679
pt1 - Improve the population of glean object metrics r=gsvelto
https://hg.mozilla.org/integration/autoland/rev/5a587f1aa35d
pt2 - Improve test behavior r=gsvelto
https://hg.mozilla.org/integration/autoland/rev/00f2c12f3a27
pt3 - Matching changes to the CrashManager stack traces impl r=gsvelto
Status: ASSIGNED → RESOLVED
Closed: 7 months ago
Resolution: --- → FIXED
Target Milestone: --- → 131 Branch

The patch landed in nightly and beta is affected.
:afranchuk, is this bug important enough to require an uplift?

  • If yes, please nominate the patch for beta approval.
  • If no, please set status-firefox130 to wontfix.

For more information, please visit BugBot documentation.

Flags: needinfo?(afranchuk)

This ensures that we always match the metric schema.

Original Revision: https://phabricator.services.mozilla.com/D218215

Attachment #9418905 - Flags: approval-mozilla-beta?

Having multiple glean tests exposed some issues and races (since glean
is a global value, it must be tested serially). This changes mock data
to be an Arc as we can't depend on glean deterministically accessing
the data we pass to it, unfortunately.

Original Revision: https://phabricator.services.mozilla.com/D218216

Attachment #9418906 - Flags: approval-mozilla-beta?
Attachment #9418907 - Flags: approval-mozilla-beta?

beta Uplift Approval Request

  • User impact if declined: No stack traces in most Windows crash reports, and potentially on other platforms (making it more difficult for us to fix user bugs)
  • Code covered by automated testing: yes
  • Fix verified in Nightly: yes
  • Needs manual QE test: no
  • Steps to reproduce for manual QE testing: N/A
  • Risk associated with taking this patch: Low
  • Explanation of risk level: The changes fix a bug in stack trace glean metric recording. They do not affect any other code.
  • String changes made/needed: None
  • Is Android affected?: yes
Flags: needinfo?(afranchuk)
Attachment #9418905 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #9418906 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #9418907 - Flags: approval-mozilla-beta? → approval-mozilla-beta+