Bug 574905 - Increase hang detector timeout
Status: VERIFIED FIXED
Keywords: verified1.9.2
Product: Core
Classification: Components
Component: Plug-ins
Version: 1.9.2 Branch
Platform: All All
Importance: -- normal
Target Milestone: mozilla2.0b1
Assigned To: Justin Dolske [:Dolske]
Depends on: 560932
Blocks: 562630 574906
Reported: 2010-06-25 20:36 PDT by Justin Dolske [:Dolske]
Modified: 2014-05-06 08:01 PDT
.6+
.6-fixed


Attachments
Patch v.1 (878 bytes, patch)
2010-06-25 20:36 PDT, Justin Dolske [:Dolske]
shaver: review+
shaver: superreview+
shaver: approval1.9.2.6+
shaver: approval1.9.2.7+

Description Justin Dolske [:Dolske] 2010-06-25 20:36:07 PDT
Created attachment 454218
Patch v.1

+++ This bug was initially created as a clone of Bug #560932 +++

Over in bug 560932 we discussed increasing the "hung plugin" timeout, but ultimately decided to leave it set at 10 seconds, as the value of OOPP is decreased if it takes painfully long to kill a hung plugin.

However, now that 3.6.4 has shipped, we are seeing an increasing number of reports that some users are unable to play Farmville, because Farmville hangs the browser long enough for our timeout to trigger and kill it.

As an interim solution, we can increase the timeout and then look for a better solution. The SUMO article suggests that 30 seconds is a good value, so let's hit this with a big hammer and make it 45s.
Comment 1 Benjamin Smedberg AWAY UNTIL 2-AUG-2016 [:bsmedberg] 2010-06-25 20:49:47 PDT
What about the situation has changed since the first time? Don't we think that users would notice if Firefox were unresponsive for 20, or 30, or 45 seconds?

It's possible that the Flash app is still responsive, if it's in windowed mode: it's also possible that it can still do animation and audio even when it's not responsive. But I think we have a problem where OOPP is allowing Flash to eat a lot more CPU than it used to, and changing the timeout is a really unfortunate tradeoff.

That said, I don't know how we can diagnose the Flash-consuming-CPU thing. Maybe we need some tbeachball-ish numbers for IPP and OOPP (OOPP needs beachball numbers for both processes).
Comment 2 Justin Dolske [:Dolske] 2010-06-25 21:01:03 PDT
My (second-hand) information is that we're seeing a significant volume of support requests for the issue, whereas the scope of the problem wasn't clear before. The SUMO article recommends increasing the timeout to 30 seconds, which I assume has been found to work by users experiencing this problem (Cww, can you confirm?).

http://support.mozilla.com/en-US/kb/The+Adobe+Flash+plugin+has+crashed

So this is the small+simple fix suitable for a rapid chemspill release; shaver asked me to generate the patch ASAP.

Does it seem sensible to land this now for the chemspill, and look at a better way to fix the problem for the next 3.6.x release?
Comment 3 Mike Shaver (:shaver -- probably not reading bugmail closely) 2010-06-25 21:01:39 PDT
Users would notice, and they do with Farmville per their support, but then it finishes initialization and they're happily on their way.  They're fine with it taking a while to start up their game (many games take more than 10s to load outside of the browser!) and aren't trying to do other things with their browser in parallel.

Why do we think that this is related to elevated CPU consumption?
Comment 4 Benjamin Smedberg AWAY UNTIL 2-AUG-2016 [:bsmedberg] 2010-06-25 21:08:11 PDT
In all of the (two) cases I've been able to catch so far (which usually involve hulu + farmville or similar windowless cases), Flash is spinning its event loop pretty regularly, but the event loop is so backed up that we don't get around to delivering our message until the timeout has expired.
Comment 5 Mike Shaver (:shaver -- probably not reading bugmail closely) 2010-06-25 21:58:06 PDT
I don't understand why we would start timing before we deliver the message whose reply we're waiting for, but I am merely a simple caveman.  I *do* understand that there's a massive regression in user experience for a meaningful number of users of an extremely popular property, and that a raised timeout remedies it.  We can and should investigate better event loop behaviour, perhaps even for the next non-chemspill release, but for now we need to get out of a bad hole before it gets any worse.
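
The timing question raised in comments 4 and 5 can be illustrated with a toy model. This is only a sketch under stated assumptions: the function name, the backlog size, and the per-event costs are all hypothetical numbers chosen for illustration, not measurements from Firefox or Flash. It compares a timer started when the message is sent with one started when the plugin actually begins handling it.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    reply_at: float             # seconds after send when the reply arrives
    killed_from_send: bool      # timer started when the message was sent
    killed_from_delivery: bool  # timer started when handling began


def simulate(backlog_events: int, secs_per_event: float,
             handle_secs: float, timeout_secs: float) -> Outcome:
    """Toy model of a plugin event loop with a backlog (hypothetical)."""
    # The message waits behind every event already queued in the plugin.
    delivered_at = backlog_events * secs_per_event
    reply_at = delivered_at + handle_secs
    return Outcome(
        reply_at=reply_at,
        killed_from_send=reply_at > timeout_secs,
        killed_from_delivery=handle_secs > timeout_secs,
    )


if __name__ == "__main__":
    # Assumed numbers: 48 queued events at 0.25 s each, 0.5 s to handle.
    for timeout in (10, 45):
        print(timeout, simulate(48, 0.25, 0.5, timeout))
```

With these assumed numbers, a responsive but backed-up plugin is killed by a 10-second send-based timer and survives a 45-second one, while a delivery-based timer would not fire in either case.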
Comment 6 Reed Loden [:reed] (use needinfo?) 2010-06-25 23:17:23 PDT
GECKO1924_20100413_RELBRANCH:
http://hg.mozilla.org/releases/mozilla-1.9.2/rev/d58a3937538c

1.9.2 default:
http://hg.mozilla.org/releases/mozilla-1.9.2/rev/fcc48654336d
Comment 7 Martijn Wargers [:mwargers] (not working for Mozilla) 2010-06-26 01:53:41 PDT
When a plugin is really slow in Google Chrome, I get a dialog with the option to close the plugin or continue.
Comment 8 Marc Rost 2010-06-26 02:41:03 PDT
My $0.02
Originally a 10s timeout made a lot of sense considering that we had no actual data to go with. AFAIK that number didn't come from the mountain top on a stone tablet. Now that we have information that it's prematurely killing working plug-ins, it's obviously time to re-evaluate.
We have to increase the timeout value. My preference would be to go with 30s and wait for feedback. But whatever value we try next, once we have real information that the timeout is too long we will again need to re-evaluate.
Comment 9 Justin Dolske [:Dolske] 2010-06-26 14:47:09 PDT
Pushed to trunk: http://hg.mozilla.org/mozilla-central/rev/18c4baeeba95
Comment 10 Daniel Cater 2010-06-26 17:20:38 PDT
This goes way beyond my understanding of IPC, but is there really nothing that the browser can do or say whilst a plugin is hanging? Some indication that waiting for the timeout will bring the browser back to life would be good, rather than the user clicking a few times until he decides to (or the OS offers to) kill it.
Comment 11 - 2010-06-27 04:09:40 PDT
I can't believe Farmville is solely responsible for a Firefox update *facepalm*
Comment 12 Jan 2010-06-27 05:29:49 PDT
Using the priority idea from https://bugzilla.mozilla.org/show_bug.cgi?id=558555 could perhaps make detection more reliable. I'd hate to have to wait 45 seconds until I can use firefox again. Most users will probably assume it crashed during that time and kill it.
Comment 13 Mike Beltzner [:beltzner, not reading bugmail] 2010-06-27 09:38:14 PDT
(In reply to comment #11)
> I can't believe Farmville is solely responsible for a Firefox update *facepalm*

It wasn't just happening in Farmville.
Comment 14 Jeff Rivett 2010-06-27 10:47:06 PDT
(In reply to comment #11)
> I can't believe Farmville is solely responsible for a Firefox update *facepalm*

Actually, I think this could be a good example of how a change to the impact of a problem can push it to the top of the queue.  A lot of people play Farmville.  To ignore those people for any length of time could have a significant effect on Firefox's share of browser users.  The problem already existed, but the perceived impact suddenly changed, giving it a much higher priority.

http://www.knowledgetransfer.net/dictionary/ITIL/en/Impact.htm
Comment 15 mt 2010-06-27 11:11:35 PDT
(In reply to comment #13)
> (In reply to comment #11)
> > I can't believe Farmville is solely responsible for a Firefox update *facepalm*
> 
> It wasn't just happening in Farmville.

It's happening to people like me who are actually doing some work on a not-so-old machine, like compiling in a bunch of VMs while browsing. In my case it's not a matter of a single/dual-core CPU but of a very busy hard disk.
Comment 16 qpconsulting 2010-06-27 14:24:25 PDT
I'm a flex developer and I use Flex builder to debug flash programs. In previous Firefox versions when I run the Flex builder debugger to debug through a flash plug-in it doesn't have the timeout problem. Now with this release 3.6.6, all of a sudden when I stop at a breakpoint inside the Flex debugger, after some seconds (45 according to the description?) the Flash plug-in just reports a "crash" and thus disconnects the Flex debugging session. This is detrimental to my development. Is there a way to revert back to previous version or turn off this timer to not allow a timeout? Thanks
Comment 17 Ryan Jones 2010-06-27 14:27:02 PDT
(In reply to comment #16)
> I'm a flex developer and I use Flex builder to debug flash programs. In
> previous Firefox versions when I run the Flex builder debugger to debug through
> a flash plug-in it doesn't have the timeout problem. Now with this release
> 3.6.6, all of a sudden when I stop at a breakpoint inside the Flex debugger,
> after some seconds (45 according to the description?) the Flash plug-in just
> reports a "crash" and thus disconnects the Flex debugging session. This is
> detrimental to my development. Is there a way to revert back to previous
> version or turn off this timer to not allow a timeout? Thanks

If it is for your own version you can set the dom.ipc.plugins.timeoutSecs key to any value you want in your about:config.
Comment 18 XtC4UaLL [:xtc4uall] 2010-06-27 14:47:02 PDT
(In reply to comment #17)
> If it is for your own version you can set the dom.ipc.plugins.timeoutSecs key
> to any value you want in your about:config.

As an Addition: Setting dom.ipc.plugins.timeoutSecs to -1 disables the Hang Detector.

Further Questions should be directed to https://support.mozilla.com as this is a Bugtracker and not a Forum. Else you're free to file a new Bugreport.
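
For anyone who prefers a profile `user.js` file over about:config, the setting described in comments 17 and 18 could be written out roughly as follows. This is a minimal sketch: `plugin_timeout_pref` is a hypothetical helper, not part of Firefox, and only the pref name and the -1 convention come from this bug. Rejecting values below -1 is an assumption, since their behavior is not described here.

```python
def plugin_timeout_pref(seconds: int) -> str:
    """Return a user.js line setting the plugin hang detector timeout.

    Hypothetical helper for illustration. Per comment 18, -1 disables
    the hang detector; values below -1 are rejected here as an
    assumption, since this bug does not say what they would do.
    """
    if seconds < -1:
        raise ValueError("use -1 to disable, or a non-negative number of seconds")
    return f'user_pref("dom.ipc.plugins.timeoutSecs", {seconds});'


if __name__ == "__main__":
    print(plugin_timeout_pref(45))  # the value chosen in this bug
    print(plugin_timeout_pref(-1))  # hang detector disabled
```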
Comment 19 Emil Ivanov 2010-06-27 15:44:06 PDT
(In reply to comment #0)
> However, now that 3.6.4 has shipped, we are seeing an increasing number of
> reports that some users are unable to play Farmville, because Farmville hangs
> the browser long enough for our timeout to trigger and kill it.

Why not let the user decide if the page is taking too long?
Like the script alert for slow JavaScript, there could be a prompt ("Do you want to terminate this plugin?") and this would be solved once and for all...
Comment 20 sdmarathe 2010-06-28 00:03:31 PDT
45 sec is too long a timeout for a hung script - not to mention it takes that much longer to crash the firefox and may hog more and more memory in vain. Why not ask the user in case of a long running flash script - just like we ask the user input for a long running Javascript?
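
The prompt-based approach suggested in comments 19 and 20 might look something like the sketch below. It is purely illustrative and assumes a simplified model: `wait_for_plugin` and the `ask_user_to_kill` callback are hypothetical names, and a real browser would have to show the dialog while the main thread is blocked, which this model ignores.

```python
from typing import Callable


def wait_for_plugin(response_after_secs: float, timeout_secs: float,
                    ask_user_to_kill: Callable[[float], bool]) -> str:
    """Ask the user each time the timeout elapses instead of killing outright.

    Hypothetical model: the plugin replies after `response_after_secs`;
    `ask_user_to_kill(waited)` returns True if the user chooses to terminate.
    """
    waited = 0.0
    # Each expired timeout slice triggers one prompt.
    while waited + timeout_secs < response_after_secs:
        waited += timeout_secs
        if ask_user_to_kill(waited):
            return f"killed after {waited:g}s"
    return f"responded after {response_after_secs:g}s"


if __name__ == "__main__":
    # A patient user keeps waiting; an impatient one kills at the first prompt.
    print(wait_for_plugin(25.0, 10.0, lambda waited: False))
    print(wait_for_plugin(25.0, 10.0, lambda waited: True))
```

The design choice here is that the timeout becomes a prompt interval rather than a kill deadline, so a short value no longer punishes slow-starting plugins.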
Comment 21 Martin Stránský 2010-06-28 03:42:59 PDT
Do all the calls into the plug-in have to be synchronous? Some plug-in calls (like NPP_SetWindow/HandleEvent) do not seem to return any important value (except some error), so why do we need to wait until they're finished?
Comment 22 Mike Beltzner [:beltzner, not reading bugmail] 2010-06-28 08:58:44 PDT
(In reply to comment #20)
> 45 sec is too long a timeout for a hung script - not to mention it takes that
> much longer to crash the firefox and may hog more and more memory in vain. Why
> not ask the user in case of a long running flash script - just like we ask the
> user input for a long running Javascript?

Please file a bug on that issue; I think it's worth looking into.
Comment 23 Kevin Newman 2010-06-28 09:10:24 PDT
Has anyone tested that this is not a new problem from Flash 10.1? I have noticed that Flash 10.1 performs about half as well as Flash 10 in plugin-based browsers, vs. IE (30fps on the linked example, vs. 60 in IE). Before spending too much time adding a workaround for this in Firefox, you may want to look into that.

Suggestion: downgrade to Flash 10 (mind the security hole) and test. Then upgrade to Flash 10.1 and retest.

Note the issue I've been seeing is on Windows - on Mac Flash 10.1 runs great in Firefox, but lousy in everything else.

You can check that with this example if that helps (it's my blog, but I don't have any ads):

http://www.unfocus.com/2010/06/23/the-pixels-explode-explode/

direct link to the swf (if it's useful, please feel free to add it to the ticket):

http://www.unfocus.com/PixelExploder/Explode.swf
Comment 24 Michael Kraft [:morac] 2010-06-28 12:02:25 PDT
I agree with comment #23.  I think the hang crashes are more of a Flash bug than a Firefox bug.

I've noticed Flash 10.1 has severely degraded performance so the majority of the problems probably had less to do with the Flash hang detection and more to do with people upgrading to Flash 10.1

As an example, Hulu now takes a minute to load videos with Flash 10.1 and uses a whopping 1 GB of memory compared with 256 MB with Flash 10.  Even increasing the timeout to 45 seconds doesn't always allow Hulu videos to load.
Comment 25 qpconsulting 2010-06-28 12:08:19 PDT
(In reply to comment #18)
> (In reply to comment #17)
> > If it is for your own version you can set the dom.ipc.plugins.timeoutSecs key
> > to any value you want in your about:config.
> 
> As an Addition: Setting dom.ipc.plugins.timeoutSecs to -1 disables the Hang
> Detector.
> 
> Further Questions should be directed to https://support.mozilla.com as this is
> a Bugtracker and not a Forum. Else you're free to file a new Bugreport.

This solution works great for my purpose. Thanks
Comment 26 Tony Chung [:tchung] 2010-06-28 13:10:39 PDT
Verified interval change to 45 on 3.6.6
Comment 27 Tony Chung [:tchung] 2010-06-30 22:28:58 PDT
Verified also on Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2.7pre) Gecko/20100630 Namoroka/3.6.7pre
and 
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.7pre) Gecko/20100630 Namoroka/3.6.7pre (.NET CLR 3.5.30729)
Comment 28 tiberiu.ovidiu.dumitrescu 2010-07-02 01:14:53 PDT
I don't know if anyone is following what's happening on the support forum:
https://support.mozilla.com/en-US/forum/1/710861#threadId715141

Maybe this is not such a good fix...
Comment 29 bugzilla 2010-07-09 05:16:00 PDT
As a result of this plugin handling, it has created #574905
Comment 30 bugzilla 2010-07-09 05:20:51 PDT
I'm a retard. See #577656
Comment 31 JD 2010-10-08 12:12:17 PDT
Please re-open as I am still getting premature timeouts, and
FF says something like ... did not respond ...

I am on FF firefox-3.6.10-1.fc13.i686

There is another reason why this timeout should be user settable in Preferences.

I run a firewall which dumps packets that are not part of an ESTABLISHED session
that was established by FF, for example.
So, when FF times out prematurely relative to the speed of a web site's response
time, the response packet arrives, and since the session was terminated by FF,
the firewall dumps the packet as unsolicited incoming packet.
My log shows hundreds of such dumped packets from slow web sites.

Please reopen this bug and allow the user to adjust timeout in Preferences, according to user's needs.
Comment 32 Josh Matthews [:jdm] 2010-10-08 12:14:07 PDT
JD: this is a bug about plugins hanging, and not related to the problem you're describing.
Comment 33 JD 2010-10-08 12:26:37 PDT
In that case, should I open a new bug?
