Closed Bug 1593140 Opened 5 years ago Closed 5 years ago

Keep track of replaying process progress to detect hangs

Categories

(Core Graveyard :: Web Replay, enhancement)

enhancement
Not set
normal

Tracking

(firefox72 fixed)

RESOLVED FIXED
mozilla72
Tracking Status
firefox72 --- fixed

People

(Reporter: bhackett1024, Assigned: bhackett1024)

Details

Attachments

(1 file)

Hang detection is pretty crude: if we send a replaying process a manifest and it doesn't finish within 30 seconds, the process is considered hung and is forcibly terminated. Many record/replay crash reports are due to these hangs, and many or all of them look like false positives: one or more threads are busy doing work rather than sitting and waiting for a lock.

It would be nice to overhaul this so that a replaying process can take any amount of time without being marked as hung, as long as it is making measurable progress. The patch I'll attach in a bit uses this strategy: when a replaying process is processing a manifest and can't rewind, we send it ping messages periodically and get back a response describing how much progress it has made. Pings are sent at least 2 seconds apart, and if we send 10 without the process making any progress, we can be more confident that it has hung. If the child might rewind then we can't send it messages, so we use a strategy similar to our current one. There are only a few cases where a child is allowed to rewind, so this limitation shouldn't be a problem.
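
For illustration only, here is a rough sketch of the bookkeeping described above. It is not the code in the patch. The class and function names are hypothetical, and it assumes the ping response carries a single integer progress counter that only increases while the child does work. The 2 second interval, the 10 ping limit, and the 30 second fallback are the only values taken from this bug.

```python
from dataclasses import dataclass
from typing import Optional

PING_INTERVAL_SECONDS = 2.0      # from this bug: pings are at least 2 seconds apart
MAX_STALLED_PINGS = 10           # from this bug: 10 pings sent without any progress
MANIFEST_TIMEOUT_SECONDS = 30.0  # from this bug: the existing fixed timeout


@dataclass
class PingHangDetector:
    """Hypothetical per-child state for a process that cannot rewind."""

    last_ping_time: Optional[float] = None
    last_progress: int = 0   # assumed: an integer counter reported by the child
    stalled_pings: int = 0   # pings sent since progress was last observed

    def should_ping(self, now: float) -> bool:
        # Never ping more often than once per interval.
        if self.last_ping_time is None:
            return True
        return now - self.last_ping_time >= PING_INTERVAL_SECONDS

    def on_ping_sent(self, now: float) -> None:
        self.last_ping_time = now
        self.stalled_pings += 1

    def on_ping_response(self, progress: int) -> None:
        # Only a strictly larger counter counts as progress; an unchanged
        # counter (or no response at all) leaves the stall count alone.
        if progress > self.last_progress:
            self.last_progress = progress
            self.stalled_pings = 0

    def is_hung(self) -> bool:
        return self.stalled_pings >= MAX_STALLED_PINGS


def manifest_timed_out(sent_at: float, now: float) -> bool:
    """Fallback for a child that might rewind and so cannot be pinged."""
    return now - sent_at >= MANIFEST_TIMEOUT_SECONDS
```

With this shape, a child that keeps reporting a growing counter is never flagged no matter how long the manifest takes, while a child that is silent or stuck at the same counter is flagged after the tenth ping, roughly 20 seconds at the minimum interval.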

Pushed by bhackett@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/43b1b62048cd
Keep track of replaying process progress to detect hangs, r=jlast.
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla72
Regressions: 1593437
Product: Core → Core Graveyard