Keep track of replaying process progress to detect hangs
Categories
(Core Graveyard :: Web Replay, enhancement)
Tracking
(firefox72 fixed)
| | Tracking | Status |
|---|---|---|
| firefox72 | --- | fixed |
People
(Reporter: bhackett1024, Assigned: bhackett1024)
Details
Attachments
(1 file)
Hang detection is pretty crude: if we send a replaying process a manifest and it doesn't finish in 30 seconds, it is considered hung and forcibly terminated. Many record/replay crash reports are due to these hangs, and many or all of them seem like false positives: one or more threads are busy doing work rather than sitting and waiting for a lock.
It would be nice to overhaul this so that a replaying process can take any amount of time without being marked as hung, as long as it is making measurable progress. The patch I'll attach in a bit uses this strategy: when a replaying process is processing a manifest and can't rewind, we send it ping messages periodically and get back a response describing how much progress it has made. Pings are sent at least 2 seconds apart, and if we send 10 without the process making any progress, we can be more confident that it has hung. If the child might rewind then we can't send it messages, so we use a strategy similar to our current one. There are only a few cases where a child is allowed to rewind, so this limitation shouldn't be a problem.
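As a rough illustration of the ping strategy described above, here is a minimal Python sketch. It is not the code from the patch: the `HangDetector` class, its method names, and the integer progress counter are all hypothetical. Only the 2-second spacing and the 10-ping threshold come from the description. The sketch assumes the controlling process calls `tick()` from a timer and `on_response()` whenever a ping reply arrives.

```python
from dataclasses import dataclass
from typing import Optional

PING_INTERVAL_SECONDS = 2.0  # from the description: pings at least 2 seconds apart
MAX_STALLED_PINGS = 10       # from the description: 10 pings with no progress


@dataclass
class HangDetector:
    """Hypothetical per-process state for ping-based hang detection."""

    last_progress: int = 0               # highest progress value reported so far
    pings_without_progress: int = 0      # pings sent since progress last advanced
    last_ping_time: Optional[float] = None

    def tick(self, now: float) -> str:
        """Decide what to do at time `now`: 'wait', 'ping', or 'hung'."""
        if (self.last_ping_time is not None
                and now - self.last_ping_time < PING_INTERVAL_SECONDS):
            return "wait"  # too soon since the previous ping
        if self.pings_without_progress >= MAX_STALLED_PINGS:
            return "hung"  # enough unanswered or stalled pings to give up
        # The caller is expected to actually send the ping message now.
        self.last_ping_time = now
        self.pings_without_progress += 1
        return "ping"

    def on_response(self, progress: int) -> None:
        """Record a ping reply; only an increase in progress resets the count."""
        if progress > self.last_progress:
            self.last_progress = progress
            self.pings_without_progress = 0
```

Counting pings sent, rather than replies received, means a process that never answers is treated the same as one that answers without advancing. A process that keeps reporting larger progress values is never flagged, no matter how long the manifest takes.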
Comment 1•5 years ago
Pushed by bhackett@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/43b1b62048cd
Keep track of replaying process progress to detect hangs, r=jlast.
Comment 3•5 years ago
bugherder
Updated•4 years ago