ShutDownKill content process mitigation work

NEW
Unassigned

Core
DOM: Content Processes
2 years ago
2 years ago

People

(Reporter: jimm, Unassigned)

Tracking

(Depends on: 1 bug, {meta})

Trunk

Firefox Tracking Flags

(e10s+, firefox49 affected)

Details

(Reporter)

Description

2 years ago
Preliminary data indicates that around 85% of the content crashes we see when running under e10s are triggered by the parent terminating the content process because shutdown took too long.


https://gist.github.com/chutten/870e97c789096edb2752a67b23d3a079
(Reporter)

Updated

2 years ago
Depends on: 1269998
(Reporter)

Updated

2 years ago
Depends on: 1271333
(Reporter)

Updated

2 years ago
tracking-e10s: --- → ?
Summary: ShutDonwKill content process mitigation work → ShutDownKill content process mitigation work

Comment 1

We'll have to figure out how far we want to go here. If a web page is running slowly, it's going to take some time to shut down the content process. We could try to stop all content JS. That would be a decent amount of work.
Depends on: 1270308, 1270628
(Reporter)

Updated

2 years ago
Depends on: 1218576
(Reporter)

Comment 2

2 years ago
Why don't we turn this timeout + kill off on Nightly and see how bad the problem is? I've been running with this for a day and I'm not seeing any issues.
(Reporter)

Comment 3

2 years ago
dom.ipc.tabs.shutdownTimeoutSecs
(Reporter)

Updated

2 years ago
tracking-e10s: ? → +
Keywords: meta
Whiteboard: meta

Comment 4 (Bill McCloskey (:billm))

Wouldn't that just transform one of these ShutDownKill crashes into a shutdown hang in the parent process? I guess the difference is that we wait 60 seconds in the parent versus 5 seconds in the child. Maybe we could just boost the timeout to 15 seconds or something?
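
For readers unfamiliar with the mechanism being discussed, here is a minimal Python sketch of a "wait for the child, then force-kill on timeout" policy. It is not Firefox's implementation: `shut_down_child` and `slow_child` are hypothetical names, and treating a timeout of 0 or less as "no kill timer" is an assumption of this sketch. It only illustrates the trade-off above: a short timeout turns slow shutdowns into kills, while a long or disabled timeout makes the parent wait instead.

```python
import subprocess
import sys


def slow_child(seconds):
    """Hypothetical stand-in for a content process that needs `seconds` to exit."""
    return [sys.executable, "-c", f"import time; time.sleep({seconds})"]


def shut_down_child(child_cmd, timeout_secs):
    """Start a child, wait for it to exit, and kill it if it overruns.

    Assumption of this sketch: timeout_secs <= 0 means "no kill timer",
    so the parent waits for as long as the child takes.
    Returns "clean" if the child exited on its own, "killed" otherwise.
    """
    child = subprocess.Popen(child_cmd)
    try:
        child.wait(timeout=timeout_secs if timeout_secs > 0 else None)
        return "clean"
    except subprocess.TimeoutExpired:
        # The child took too long: terminate it, then reap it.
        child.kill()
        child.wait()
        return "killed"
```

In this sketch, raising `timeout_secs` (for example from 5 to 15) lets a slow child finish on its own at the cost of a longer wait in the parent.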
(Reporter)

Updated

2 years ago
Depends on: 1182927
(Reporter)

Comment 5

2 years ago
(In reply to Bill McCloskey (:billm) from comment #4)
> Wouldn't that just transform one of these ShutDownKill crashes into a
> shutdown hang in the parent process? I guess the difference is that we wait
> 60 seconds in the parent versus 5 seconds in the child. Maybe we could just
> boost the timeout to 15 seconds or something?

Yep, in triage we discussed moving it out a bit on Nightly to see if the ShutDownKill numbers fall off. IMHO, content occasionally taking a little more than 5 seconds to shut down isn't surprising. That timeout feels a little aggressive.
(Reporter)

Updated

2 years ago
Depends on: 1269961
(Reporter)

Comment 6

2 years ago
Here's a try push with the timeout turned off:

https://treeherder.mozilla.org/#/jobs?repo=try&author=jmathies@mozilla.com&selectedJob=20665837

I don't see a lot of issues here. The shutdown failures that do show up appear to be common intermittents.
Depends on: 1276383
Depends on: 1277067