Preliminary data indicates that around 85% of the content crashes we see when running under e10s are triggered by the parent terminating the content process because shutdown took too long. https://gist.github.com/chutten/870e97c789096edb2752a67b23d3a079
Summary: ShutDonwKill content process mitigation work → ShutDownKill content process mitigation work
We'll have to figure out how far we want to go here. If a web page is running slowly, it's going to take some time to shut down the content process. We could try to stop all content JS. That would be a decent amount of work.
Why don't we turn this timeout + kill off on nightly and see how bad the problem is? I've been running with this for a day and I'm not seeing any issues.
tracking-e10s: ? → +
Wouldn't that just transform one of these ShutDownKill crashes into a shutdown hang in the parent process? I guess the difference is that we wait 60 seconds in the parent versus 5 seconds in the child. Maybe we could just boost the timeout to 15 seconds or something?
(In reply to Bill McCloskey (:billm) from comment #4)
> Wouldn't that just transform one of these ShutDownKill crashes into a
> shutdown hang in the parent process? I guess the difference is that we wait
> 60 seconds in the parent versus 5 seconds in the child. Maybe we could just
> boost the timeout to 15 seconds or something?

Yep, in triage we discussed moving it out a bit on Nightly to see if the ShutDownKill numbers fall off. IMHO it isn't surprising that content occasionally takes a little more than 5 seconds to shut down. That timeout feels a little aggressive.
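To make the trade-off concrete, here is a minimal sketch of the wait-then-kill pattern being discussed. It is plain Python, not Gecko code: `shut_down_child`, `spawn_slow_child`, and the sleeping child process are hypothetical stand-ins, and only the 5 and 15 second values come from the comments above.

```python
import subprocess
import sys


def spawn_slow_child(shutdown_s):
    """Hypothetical stand-in for a content process that needs
    shutdown_s seconds to finish shutting down."""
    code = f"import time; time.sleep({shutdown_s})"
    return subprocess.Popen([sys.executable, "-c", code])


def shut_down_child(child, timeout_s):
    """Wait up to timeout_s seconds for the child to exit on its own.

    Returns "clean" if it exited in time, or "killed" if the parent
    had to terminate it (the analogue of a ShutDownKill report).
    """
    try:
        child.wait(timeout=timeout_s)
        return "clean"
    except subprocess.TimeoutExpired:
        child.kill()
        child.wait()  # reap the killed process
        return "killed"


if __name__ == "__main__":
    # A child that needs about 7 seconds is killed under a 5 second
    # timeout but would exit cleanly under a 15 second one.
    print(shut_down_child(spawn_slow_child(7), timeout_s=5))
```

Raising `timeout_s` only moves the cutoff: a child that never exits is still killed, just later, which is why the question of how long the parent itself is willing to wait still matters.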
Here's a try push with the timeout turned off - https://firstname.lastname@example.org&selectedJob=20665837 I don't see a lot of issues here. The shutdown failures that do show up appear to be common intermittents.