Closed Bug 1594577 Opened 6 years ago Closed 6 years ago

Try to record BHR hangs which precede forced shutdowns

Categories

(Core :: XPCOM, enhancement)

Type: enhancement
Priority: Not set
Severity: normal

Tracking

RESOLVED FIXED
mozilla72
Tracking Status
firefox72 --- fixed

People

(Reporter: alexical, Assigned: alexical)

References

(Blocks 1 open bug)

Attachments

(2 files)

Currently, if a user forcibly terminates the main process because it's unresponsive, we won't collect the BHR hang. I think we can remedy this by writing hangs out to disk once they pass a certain threshold of, say, eight seconds (the current definition of a "permahang" in BHR terms). At that point the overhead of persisting the stack to disk should be a drop in the bucket, and it could provide some very valuable data about egregious hangs that users experience.

Assignee: nobody → dothayer
Status: NEW → ASSIGNED

In short - if a user forcibly terminates the browser because it seems
to be permanently hung, we currently do not get a chance to record the
hang. This is unfortunate, because these likely represent the most
egregious hangs in terms of user frustration. This patch seeks to
address that.

If a hang exceeds 8192ms (the current definition of a "permahang" in
existing BHR terms), then we decide to immediately persist it to disk,
in case we never get a chance to return to the main thread and
submit it. On the next start of the browser, we read the file from
disk on a background thread, and just submit it using the normal
mechanism.
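
The flow described above can be sketched roughly as follows. This is a minimal
illustration and not the code from the patch: MaybePersistHang,
TakePersistedHang, and the use of std::ofstream / std::ifstream are all
hypothetical stand-ins, and deleting the file after reading it is an
assumption. Only the 8192ms threshold comes from the description.

```cpp
#include <cstdint>
#include <cstdio>
#include <fstream>
#include <iterator>
#include <optional>
#include <string>

// From the description: a hang longer than 8192ms counts as a "permahang".
constexpr uint32_t kPermahangThresholdMs = 8192;

// Hypothetical helper, called while the hang is still in progress.
// Writes the already-serialized hang to aPath once the threshold is exceeded.
// Returns true only if the hang was written out successfully.
bool MaybePersistHang(uint32_t aHangDurationMs, const std::string& aPath,
                      const std::string& aSerializedHang) {
  if (aHangDurationMs <= kPermahangThresholdMs) {
    return false;  // Not a permahang; leave it to the normal reporting path.
  }
  std::ofstream out(aPath, std::ios::binary | std::ios::trunc);
  if (!out) {
    return false;
  }
  out.write(aSerializedHang.data(),
            static_cast<std::streamsize>(aSerializedHang.size()));
  out.flush();
  return out.good();
}

// Hypothetical helper, meant to run on a background thread at next startup.
// Returns the persisted bytes, if any, so the caller can submit them through
// the normal mechanism. Removing the file afterwards is an assumption made
// here so that the same hang is not picked up twice.
std::optional<std::string> TakePersistedHang(const std::string& aPath) {
  std::ifstream in(aPath, std::ios::binary);
  if (!in) {
    return std::nullopt;  // Nothing was persisted during the last session.
  }
  std::string data((std::istreambuf_iterator<char>(in)),
                   std::istreambuf_iterator<char>());
  in.close();
  std::remove(aPath.c_str());
  return data;
}
```

In this sketch the write happens off the hung thread, so it does not depend on
the main thread ever becoming responsive again.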

Regarding the handling of the file itself, I tried to do the simplest
thing I could - as far as I can tell there is no standard simple
serialization mechanism available directly to C++ in Gecko, so I just
serialized it by hand. I didn't take any special care with endianness
or anything as I can't think of a situation in which we really care
at all about these files being transferable between architectures. I
directly used PR_Write / PR_Read instead of doing something fancy
like memory mapping the file, because I don't think performance is a
critical concern here and it offers a simple protection against
reading out of bounds.
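
To illustrate the kind of hand-rolled serialization described above, here is
a rough sketch. It is not the patch's actual format: HangRecordSketch and its
fields are hypothetical, the size cap kMaxCount is an assumption, and C stdio
(fwrite / fread) stands in for PR_Write / PR_Read so the example compiles
without NSPR. As in the description, values are written in native byte order.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record; the real hang data carries more than this.
struct HangRecordSketch {
  uint32_t durationMs = 0;
  std::string threadName;
  std::vector<uint64_t> frames;
};

namespace {
// Assumed sanity limit so a corrupt length field cannot trigger a huge
// allocation.
constexpr uint32_t kMaxCount = 1u << 16;

// Both helpers fail on a short transfer, so a truncated file is reported as
// an error instead of leaving fields partially filled.
bool WriteBytes(FILE* aFile, const void* aData, size_t aLen) {
  return aLen == 0 || fwrite(aData, 1, aLen, aFile) == aLen;
}
bool ReadBytes(FILE* aFile, void* aData, size_t aLen) {
  return aLen == 0 || fread(aData, 1, aLen, aFile) == aLen;
}
}  // namespace

// Layout: duration, name length, name bytes, frame count, frames.
bool WriteHang(const char* aPath, const HangRecordSketch& aHang) {
  FILE* file = fopen(aPath, "wb");
  if (!file) {
    return false;
  }
  uint32_t nameLen = static_cast<uint32_t>(aHang.threadName.size());
  uint32_t frameCount = static_cast<uint32_t>(aHang.frames.size());
  bool ok =
      WriteBytes(file, &aHang.durationMs, sizeof(aHang.durationMs)) &&
      WriteBytes(file, &nameLen, sizeof(nameLen)) &&
      WriteBytes(file, aHang.threadName.data(), nameLen) &&
      WriteBytes(file, &frameCount, sizeof(frameCount)) &&
      WriteBytes(file, aHang.frames.data(), frameCount * sizeof(uint64_t));
  return (fclose(file) == 0) && ok;
}

// Reads into a temporary and only assigns aOut if every field was read.
bool ReadHang(const char* aPath, HangRecordSketch& aOut) {
  FILE* file = fopen(aPath, "rb");
  if (!file) {
    return false;
  }
  HangRecordSketch hang;
  uint32_t nameLen = 0;
  uint32_t frameCount = 0;
  bool ok = ReadBytes(file, &hang.durationMs, sizeof(hang.durationMs)) &&
            ReadBytes(file, &nameLen, sizeof(nameLen)) && nameLen <= kMaxCount;
  if (ok) {
    hang.threadName.resize(nameLen);
    ok = ReadBytes(file, &hang.threadName[0], nameLen);
  }
  ok = ok && ReadBytes(file, &frameCount, sizeof(frameCount)) &&
       frameCount <= kMaxCount;
  if (ok) {
    hang.frames.resize(frameCount);
    ok = ReadBytes(file, hang.frames.data(), frameCount * sizeof(uint64_t));
  }
  fclose(file);
  if (ok) {
    aOut = std::move(hang);
  }
  return ok;
}
```

Because every read goes through a call that reports how many bytes it actually
transferred, a short or corrupt file shows up as a failed read, which is the
simple out-of-bounds protection the explicit read calls provide over a memory
mapping.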

Attached file Data Review Request
Attachment #9108837 - Flags: data-review?(chutten)
Comment on attachment 9108837 [details] Data Review Request

DATA COLLECTION REVIEW RESPONSE:

Is there or will there be documentation that describes the schema for the ultimate data set available publicly, complete and accurate?
Yes. This collection is Telemetry and is documented in [its documentation](https://firefox-source-docs.mozilla.org/toolkit/components/telemetry/data/backgroundhangmonitor-ping.html).

Is there a control mechanism that allows the user to turn the data collection on and off?
Yes. This collection is Telemetry so can be controlled through Firefox's Preferences.

If the request is for permanent data collection, is there someone who will monitor the data over time?
Yes, Doug Thayer is responsible.

Using the category system of data types on the Mozilla wiki, what collection type of data do the requested measurements fall under?
Category 1, Technical.

Is the data collection request for default-on or default-off?
Default on for Nightly only.

Does the instrumentation include the addition of any new identifiers?
No.

Is the data collection covered by the existing Firefox privacy notice?
Yes.

Does there need to be a check-in in the future to determine whether to renew the data?
No. This collection is permanent.

---
Result: datareview+
Attachment #9108837 - Flags: data-review?(chutten) → data-review+
Pushed by dothayer@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/bd42216f7b63 Record hangs which precede forced shutdowns r=froydnj
Pushed by dothayer@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/c7b4e89f5ee6 Record hangs which precede forced shutdowns r=froydnj
Pushed by dothayer@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/eb8741186210 Record hangs which precede forced shutdowns r=froydnj
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla72
Regressions: 1599024