Last Comment Bug 340345 - DOM timeouts can fire during a sync XMLHttpRequest
: DOM timeouts can fire during a sync XMLHttpRequest
Status: RESOLVED FIXED
[needed to fix bug 479143]
: fixed1.9.1
Product: Core
Classification: Components
Component: DOM (show other bugs)
: Trunk
: x86 Linux
: P1 normal with 4 votes (vote)
: mozilla1.9.1
Assigned To: Ben Turner (not reading bugmail, use the needinfo flag!)
:
Mentors:
http://landfill.mozilla.org/ryl/testX...
: 412418 (view as bug list)
Depends on: 479143 480793 481955 562901
Blocks: 333198
  Show dependency treegraph
 
Reported: 2006-06-04 15:21 PDT by Boris Zbarsky [:bz]
Modified: 2013-04-04 13:53 PDT (History)
25 users (show)
jst: blocking1.9.1+
jst: wanted1.9.1+
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
Testcase (721 bytes, text/html)
2006-06-04 15:22 PDT, Boris Zbarsky [:bz]
no flags Details
Testcase that includes timings (824 bytes, text/html)
2006-06-04 15:37 PDT, Boris Zbarsky [:bz]
no flags Details
Simple XUL overlay that can resolve the problem (not relevant to this bug) (1.08 KB, application/vnd.mozilla.xul+xml)
2007-01-03 12:36 PST, Dan Fabulich
no flags Details
Simple cross-browser test case (599 bytes, text/html)
2008-08-21 21:28 PDT, BAM
no flags Details
Reentrant click test (938 bytes, text/html)
2008-11-05 12:38 PST, robinpelgrim
no flags Details
Patch (4.77 KB, patch)
2008-12-08 14:41 PST, robinpelgrim
bzbarsky: review-
Details | Diff | Splinter Review
Patch, v1 (8.75 KB, patch)
2009-02-10 00:58 PST, Ben Turner (not reading bugmail, use the needinfo flag!)
no flags Details | Diff | Splinter Review
Patch, v1.1 (7.95 KB, patch)
2009-02-10 15:31 PST, Ben Turner (not reading bugmail, use the needinfo flag!)
jst: review-
jst: superreview-
Details | Diff | Splinter Review
Patch, v1.2 (8.68 KB, patch)
2009-02-12 13:16 PST, Ben Turner (not reading bugmail, use the needinfo flag!)
jonas: review+
jst: review+
jst: superreview+
Details | Diff | Splinter Review
backout patch for 1.9.1 (8.75 KB, patch)
2009-03-02 12:36 PST, Olli Pettay [:smaug]
no flags Details | Diff | Splinter Review

Description Boris Zbarsky [:bz] 2006-06-04 15:21:40 PDT
BUILD: Current trunk build (but the issue predates the threadmanager landing)

STEPS TO REPRODUCE:
1)  Download attached testcase to a file:// URI (so the enablePrivilege call will
    work)
2)  Load it from file://.
3)  Click the button.
4)  Grant UniversalBrowserRead

EXPECTED RESULTS:  The output says:

  starting
  finished
  timeout

Or at least this is what's expected given run-to-completion.

ACTUAL RESULTS:  The output says:

  starting
  timeout
  finished
Comment 1 Boris Zbarsky [:bz] 2006-06-04 15:22:13 PDT
Created attachment 224380 [details]
Testcase
Comment 2 Brendan Eich [:brendan] 2006-06-04 15:27:32 PDT
$64K question is what does IE do.  I speculated recently in another bug that it violates run-to-completion.  I was thinkingn of XHR in particular.

/be
Comment 3 Boris Zbarsky [:bz] 2006-06-04 15:31:58 PDT
One other note.  I suspect there _was_ a slight behavior change with threadmanager here.  Before threadmanager landed, I bet we fired the timeout right before returning from the send() call.  Now we fire it as scheduled.

I'll try to write a testcase to see whether this is indeed what happens.
Comment 4 Boris Zbarsky [:bz] 2006-06-04 15:37:39 PDT
Created attachment 224381 [details]
Testcase  that includes timings

This shows that we fire the timeout as scheduled, both before and after threadmanager.
Comment 5 Boris Zbarsky [:bz] 2006-06-04 16:12:33 PDT
http://landfill.mozilla.org/ryl/testXMLHttpRequestSync1.html has a testcase that works in both IE and Mozilla that can be used for testing.  For what it's worth, IE output looks something like:

starting: 0
readyState = 1
readyState = 2
readyState = 3
readyState = 4
finished: 3154
timeout: 3164

The fact that we don't fire onreadystatechanged for a sync XMLHttpRequest is a known issue.  But it looks like IE in fact fires the timeout after the XMLHttpRequest completes.
Comment 6 Brendan Eich [:brendan] 2006-06-04 16:31:55 PDT
(In reply to comment #5)
> The fact that we don't fire onreadystatechanged for a sync XMLHttpRequest is a
> known issue.  But it looks like IE in fact fires the timeout after the
> XMLHttpRequest completes.

Good for IE, my suspicion was wrong.

What's the bug on the onreadystatechange difference, does anyone know the number?

/be
Comment 7 :Gavin Sharp [email: gavin@gavinsharp.com] 2006-06-04 16:38:01 PDT
(In reply to comment #6)
> What's the bug on the onreadystatechange difference, does anyone know the
> number?

Bug 313646.
Comment 8 Robert Sayre 2006-06-04 21:11:42 PDT
Comment #5 was data from IE7. IE6 is similar:

starting: 0
readyState = 1
readyState = 2
readyState = 3
readyState = 4
finished: 3187
timeout: 3234

Comment 9 Boris Zbarsky [:bz] 2006-06-04 21:14:13 PDT
Note that timeouts set before we put up an alert or prompt have a similar issue, as expected.
Comment 10 Jesse Ruderman 2006-06-05 04:43:12 PDT
Does the same problem exist for networking events (image loads, async xhr) triggering JavaScript events during sync xhr?
Comment 11 Darin Fisher 2006-06-05 07:40:49 PDT
> This shows that we fire the timeout as scheduled, both before and after
> threadmanager.

bz: because timers always posted their event to the youngest event queue.
Comment 12 Boris Zbarsky [:bz] 2006-06-05 08:19:38 PDT
Jesse:  I'd guess yes, but someone would need to create a testcase to verify.
Comment 13 Dan Fabulich 2007-01-03 12:36:28 PST
Created attachment 250369 [details]
Simple XUL overlay that can resolve the problem (not relevant to this bug)

This XUL overlay certainly isn't the "right" way to fix this bug; I'd be happy to re-code it if only I knew where to begin.
Comment 14 Dan Fabulich 2007-01-03 12:37:26 PST
ack, I attached that to the wrong bug, sorry.  :-(
Comment 15 BAM 2008-08-21 07:16:20 PDT
There's been no movement on this in over a year.  It is now a live issue in Firefox 3 (although it works fine in Firefox 2).  Is this issue going to be corrected?  Is there some way to override the behavior to make it perform like Firefox 2/IE?
Comment 16 Jonas Sicking (:sicking) PTO Until July 5th 2008-08-21 14:43:50 PDT
Honestly, I'm tempted to say that we should define this behavior as the intended one. I think safari does the same, and I don't see how we could reliably stop everything during sync load without going back to the old behavior of having the whole browser lock up while we're sync loading.
Comment 17 BAM 2008-08-21 21:28:49 PDT
Created attachment 334995 [details]
Simple cross-browser test case

This is a simple test page that works on most browsers.
Expected result: 0123456789[TIMEOUT]
Safari 3, IE6, and Firefox 2 produce the expected result
Firefox 3 actual result: 0123[TIMEOUT]456789
Comment 18 Unknown W. Brackets 2008-08-21 21:37:37 PDT
(In reply to comment #17)
> Firefox 3 actual result: 0123[TIMEOUT]456789

Strange, for me I see 0[TIMEOUT]123... and, additionally, Firefox 3 is the only one that takes a while to complete the test.  The others (Safari, IE6, IE7, even older Firefox) block the browser, but complete in a fraction of the time.

-[Unknown]
Comment 19 BAM 2008-08-21 21:40:58 PDT
As can be seen in my attached test case, Safari does in fact block the timeout from executing until the thread has completed, as does Firefox 2 and IE7.  Unless someone can point me to a doc that specifies otherwise, I feel the behavior of all major browsers EXCEPT Firefox 3 should be considered the intended behavior of setTimeout and this issue should be corrected to perform as it did in previous versions of Firefox.

As per my actual results for Firefox 3: it fires the timeout randomly.  Download it to your own system and the callback request will execute faster, so the timeout will fire later.
Comment 20 Jonas Sicking (:sicking) PTO Until July 5th 2008-08-22 01:16:41 PDT
There is no specification one way or another on this, but I think that's irrelevant. What matters is getting a useful interoperable behavior.

What about other types of events? Does for example mouse events fire during the sync load?
Comment 21 Arnaud 2008-08-22 01:29:06 PDT
I think that in most "desktop" GUI frameworks, timer events and other GUI events are serialized. (The only exception may be during nested event loop for modal dialogs.)
This would be nice to have a similar behavior in FF3.
Comment 22 Jonas Sicking (:sicking) PTO Until July 5th 2008-08-22 02:12:40 PDT
Synchronous XHR is exactly a nested event loop. And it has to be since we want to allow the user to interact with the browser while the synchronous load is in progress.

So my question remains, does safari and/or opera fire UI events such as mouse events during synchronous XHR?

I'm pretty sure IE doesn't, but in order to implement this they freeze the entire UI which their users hate.
Comment 23 BAM 2008-08-22 02:14:46 PDT
As my example showed, Safari does freeze, just like IE and Firefox 2.
Comment 24 Arnaud 2008-08-22 02:31:42 PDT
(In reply to comment #22)
> Synchronous XHR is exactly a nested event loop. And it has to be since we want
> to allow the user to interact with the browser while the synchronous load is in
> progress.
> 
Yes, but a synchronous XHR is not a GUI operation like a modal dialog, so it should interfere with the event loop.
Howevere I agree that it is not a good idea to have blocking calls in the main GUI thread.
More generally, JS code should not be able to block the browser GUI, either with synchronous IO or with long computation loops. Maybe the JS engine should live in its own thread, not in the main GUI thread of the browser, but I suppose it would have a large impact on the existing platform.
(NB: It seems that IE8 tabs can live in their own processes, so the browser chrome is probably isolated from blocking JS code.)
Comment 25 BAM 2008-08-28 13:47:25 PDT
Testing with IE8 beta 2, my test page returns 0[TIMEOUT]123456789
If the thread had blocked the setTimeout calls, it would have returned 0123456789[TIMEOUT]

Seems IE8 performs as Firefox 3 does.  So much for my sjax email client!

Question: is there any way for me to cause a delay in a js thread in Firefox 3, like using some Firefox specific method to access the underlying NSPR thread functionality, like PR_WaitCondVar, to freeze the javascript thread without spiking the cpu or making any network calls?
Comment 26 Boris Zbarsky [:bz] 2008-08-28 14:19:59 PDT
"the javascript thread" is the UI thread.  So no, there is not, because that would completely lock the user out of interacting the the browser.  Something we try hard to avoid.
Comment 27 Jesse Ruderman 2008-09-12 02:52:01 PDT
See also bug 333198, Suppress Input events for web content during synchronous XMLHttpRequest loads.
Comment 28 Laurens Holst 2008-09-15 07:38:40 PDT
So I guess then effectively synchronous loads have been removed. You should remove the ‘async’ callback syntax then, as the synchronous one is more intuitive to use. Or at least no longer recommend people to use it.
Comment 29 Boris Zbarsky [:bz] 2008-09-15 08:35:12 PDT
You're confusing "synchronous" and "non-reentrant".  The load is still synchronous: control is not returned to the calling code until the load completes.  This bug is about the fact that the new setup allows reentrancy, which is indeed a bug because web JS doesn't expect it.  But the presence of this bug doesn't mean the load is "async".
Comment 30 BAM 2008-09-15 08:38:41 PDT
Since IE8 seems to also allow reentancy, should this issue be defined as official new browser behavior and marked wontfix?
Comment 31 Sjoerd Mulder 2008-09-24 04:41:57 PDT
Testing it on my machine (winXP with IE 8.0.6001.18241 beta 2) the result is:
0123456789[TIMEOUT]

So IE8 seems behaving just fine.
Comment 32 robinpelgrim 2008-11-03 02:33:03 PST
Well, our web-application is broken.
Please, stay compatible with other browsers.

That the entire browser is blocking during sync xmlhttprequest has nothing to do with xmlhttprequest itself. This is just an easy (but wrong) patch.

We don't have (synchronized) {
}
in javascript. So do not make it more complex!!!!

(Sometimes you just have to wait)
Comment 33 Laurens Holst 2008-11-04 02:06:49 PST
Ditto for us, this is a known Firefox 3 compatibility issue in Backbase, which we can not resolve ourselves. It would be nice if this were fixed sometime soon.
Comment 34 BAM 2008-11-04 06:33:26 PST
This needs to be clarified:  according to the previous discussion in this ticket, and the observed behavior of IE8 (which displays the same behavior), this is NOT A BUG.  This is the future of web browsers apparently, designed to allow UI interaction without freezing the browser.  This is designed to provide a richer UI experience in pages that require synchronous retrieval of data from a server.  This will NOT be fixed by the browser vendor, and must be reported to software vendors whose products do not function correctly with the new behavior so that they can adjust their software to accommodate for this behavior.

Side note: I would like nothing more than for this to never have been introduced in Firefox 3 or IE8; I had a SJAX email client that I had to extensively modify to accommodate for this behavior.  However, this change does make sense... locking a browser while doing a synchronous callback resembles the browser crashing, which is not acceptable to most mainstream users.
Comment 35 Arnaud 2008-11-04 07:10:55 PST
Reentrant event handling is not an effective solution to the GUI locking problem. You can still freeze with an infinite computation loop during event handling. Ideally no JS code should be executed in the browser EDT (like Chrome?).

Also the current FF3 behavior does not really match http://www.whatwg.org/specs/web-apps/current-work/multipage/browsers.html#event:

> An event loop must continually run through the following steps for as long as it exists:
>   1. Run the oldest task on one of the event loop's task queues. The user agent may pick any task queue.
>   2. Remove that task from its task queue.
>   3. If necessary, update the rendering or user interface of any Document or browsing context to reflect the current state.
>   4. Return to the first step of the event loop.
Comment 36 Sjoerd Mulder 2008-11-04 07:15:58 PST
(In reply to comment #34)
> This needs to be clarified:  according to the previous discussion in this
> ticket, and the observed behavior of IE8 (which displays the same behavior),
> this is NOT A BUG.  This is the future of web browsers apparently, designed to
> allow UI interaction without freezing the browser.  This is designed to 


Can you please read all the comments ;-)... This is a severe issue in Firefox 3 ONLY! The behavior does NOT happen in IE8. (tested it just again to make sure,

Result IE8: 0123456789[TIMEOUT]
Result Fx3: 0[TIMEOUT]123456789
Comment 37 BAM 2008-11-04 07:21:58 PST
I don't know what build ur testing with, but back in August when I created that simple test page, I tested it and got the following (perhaps u should be more dilligent in reading the comments, lol, jk):

------- Comment #25 From BAM 2008-08-28 13:47:25 PST -------
Testing with IE8 beta 2, my test page returns 0[TIMEOUT]123456789
If the thread had blocked the setTimeout calls, it would have returned
0123456789[TIMEOUT]

Seems IE8 performs as Firefox 3 does.  So much for my sjax email client!
Comment 38 Sjoerd Mulder 2008-11-04 07:33:31 PST
Can somebody else confirm that this issue is not (or is) happening on IE8... on my machine (winXP with IE 8.0.6001.18241 beta 2) The behavior of IE8 is: 0123456789[TIMEOUT], like it should...
Comment 39 Boris Zbarsky [:bz] 2008-11-04 07:45:21 PST
I would like to urge everyone here to read https://bugzilla.mozilla.org/page.cgi?id=etiquette.html

As things stand, I'm going to stop following this bug (and therefore be less likely to work on a fix), because the e-mail spew is just not worth it.  Unfortunately, I'm the reporter, so can't remove myself from the cc, but I will simply filter out all emails about this bug, as well as all bugzilla mail from the last several commenters.

I _would_ like to point out that comment 34 is wrong, as is comment 35 (a 5-second test would have shown that we interrupt long-running loops as needed).  The point about running pages in a separate thread or process from the UI is valid, of course, and is the long-term plan.
Comment 40 Arnaud 2008-11-04 08:13:04 PST
> I _would_ like to point out that comment 34 is wrong, as is comment 35 (a
> 5-second test would have shown that we interrupt long-running loops as needed).

I have tried this:

<html>
<body>
<button onclick="myHandler();">Loop</button>
<script>
function myHandler() {
	while (true);	
}
</script>
</body>
</html>

When you click "Loop", the GUI is frozen for at least 2 minutes in FF3 and the button is displayed as "pressed".
Comment 41 robinpelgrim 2008-11-05 12:38:51 PST
Created attachment 346516 [details]
Reentrant click test

I adjusted the timeout test a bit, to create a more 'realistic' situation in webapplications. 
The simple onclick - synch htmlrequest combination leads now to reentrant behaviour.

Suppose the repeated clicks are selection on subsequent treeview/listbox items.
In de Firefox 3 situation the first click ends last. [I you are lucky, the first selected treeview/listbox item is selected (i.s.o. the last), if not,
everything is messed up :(. ]


To get this example manageable for real world situations i'll have to:
-check after every sync XHR for reentrancy and abort
or
-skip reentrancy events or queue them (redispatch?)
or
-split function in two parts and replace sync XHR in async XHR

You see, all options means a lot of work in existing applications depending on a 'real' sync httprequest.
(We have to delay our planned Firefox support because of this issue)

But more important:

Firefox2, Google Chrome, IE 6, IE 7, IE 8, Safari and Opera behave the expected way (and did it since XHR was there)!!!

In my opinion the Firefox 3 behaviour is a serious regression!

(It is up to web developers to make the web application responsive.
The use of 'real' sync XHR in general does not make an application less responsive when responses are quick (enough).)
Comment 42 robinpelgrim 2008-12-08 14:41:34 PST
Created attachment 351989 [details] [diff] [review]
Patch

Don't expect too much of this patch, but at least the normal behaviour (that what was used in FireFox and is used in other browsers) is back.
It's a start.
(Today was my first c++ experience in years)
Comment 43 Jeff Walden [:Waldo] (remove +bmo to email) 2008-12-09 01:17:11 PST
Please take a look at the last two sections of:

http://developer.mozilla.org/en/docs/Getting_your_patch_in_the_tree

Posting a patch without requesting review of someone won't necessarily get the patch reviewed and in the tree, although if someone happens to be watching you might get lucky.  If you have any questions that document doesn't clear up, ask me here or via email.
Comment 44 Boris Zbarsky [:bz] 2009-01-16 07:55:57 PST
So... does that patch bring back the "UI is locked up during the sync XHR" behavior?  I would expect that it does, right?
Comment 45 robinpelgrim 2009-01-16 08:04:08 PST
Yes, the behaviour every other browser has/had!
Comment 46 BAM 2009-01-16 08:07:16 PST
IE8 beta has the same behavior as Firefox 3.  Like it or not, the ui and javascript run in seperate threads in the next generation of browsers.  I highly doubt this behavior will be changed.
Comment 47 Boris Zbarsky [:bz] 2009-01-16 08:08:36 PST
Comment on attachment 351989 [details] [diff] [review]
Patch

That's not acceptable behavior from a user point of view, sorry.

I'm fine with suppressing timeouts during the sync XHR, but that will have to be handled similarly to the way bug 111097 is handled.
Comment 48 robinpelgrim 2009-01-16 09:47:44 PST
Sorry, the new behaviour gives buggy web pages and makes sync requests not usefull anymore. This will create pages that sometimes will work and sometimes will not, depending on the users clicking speed.
Don't expect me to keep track of an event queue.

Why is it not acceptable in a user point of view. It was before.

That the browser does not respond is the browser's problem. Not the/my application's (web) page. If I want a sync request on my page it should be sync and not some kind of 'sync' which isn't.
The current solution just doesn't make sense at all. It only breaks the web.

Maybe a 'better' solution will be that only timeout events/page load events are handled (since they are controlable), and ui events are blocked.
In this case sync means "no user action handled".

Where can I find the discussion on why this new behaviour is implemented?
It's totally backward incompatible.

I just don't get this. Sorry.
Comment 49 robinpelgrim 2009-01-16 10:37:43 PST
(In reply to comment #46)
> IE8 beta has the same behavior as Firefox 3.  Like it or not, the ui and
> javascript run in seperate threads in the next generation of browsers.  I
> highly doubt this behavior will be changed.

This is not true:
(Simple cross-browser test case)
Firefox 3: 0[TIMEOUT]123456789
IE8 Beta: 0123456789[TIMEOUT]
Comment 50 Boris Zbarsky [:bz] 2009-01-16 10:48:22 PST
> Why is it not acceptable in a user point of view. It was before.

It wasn't.  The fact that the UI froze up was a considered a serious bug, and a lot of effort went into fixing it.  See more on this below.

> That the browser does not respond is the browser's problem. Not the/my
> application's (web) page.

Did I say that it should be your problem?

> The current solution just doesn't make sense at all. It only breaks the web.

Did I say that we don't need to fix this bug?  It needs to be fixed.  It's just that what you posted is not the right fix...

> In this case sync means "no user action handled".

Other way around.  Sync should not block user actions like closing the page.  Sync should block interaction with the page in general, and should block timeouts.

> Where can I find the discussion on why this new behaviour is implemented?

Bug 326273 is the bug where nested event queues went away.  I should note that it's not as if this bug were a _desired_ consequence; it was just an oversight.  As far as the freezing issue not being acceptable, see bug     330771, bug 190313, bug 273578.  Note that at least some of these were resolved in the Fx2 timeframe, so reintroducing the freezing would be a regression from that.  Unfortunately, the way things worked in Fx2 relied on nested event queues, which got removed, per above.

> It's totally backward incompatible.

Yes, agreed.  This bug certainly neeeds to be fixed.
Comment 51 robinpelgrim 2009-01-16 11:50:05 PST
Thanks for your explanation.

I see the problem now.

FYI:
Safari (3.1.2) also is freezing the browser.
Opera and Chromium behave correctly (only freezing the page, not the browser).

I agree we need a better solution but I think that the 'hanging browser' behavior' is a better starting point than the async sync behavior.
Because the 'hanging browser' at (slow) sync requests is (mostly) caused by bad programming an should be replaced (at that point) by async requests.
The async sync behavior punishes everyone.

Well, I have to look deeper into the event handling in FireFox and try to get it behave like in Chromium and Opera.
Comment 52 Boris Zbarsky [:bz] 2009-01-16 12:06:07 PST
For what it's worth, maybe the right approach is something more like this:

1)  Expose nsGlobalWindow::SuspendTimeouts on nsPIDOMWindow.  ResumeTimeouts is
    already exposed there.
2)  When starting the sync XHR loop, get the top window associated with the XHR
    (basically GetTop() on the window the XHR is attached too) and
    SuspendTimeouts on that.
3)  When the sync loop finishes, ResumeTimeouts on that same top window.

That won't help if you have multiple windows and timeouts in one poke another and the other has a sync XHR going.  But it'll be a lot better than now, and shouldn't be too bad to do, I hope.

If you'd rather I tried to do it, I can try to make time, but if you're already looking at this and willing to give it a shot...
Comment 53 Rand Scullard 2009-02-04 15:36:13 PST
While you are considering this issue, I beg you to take a look at bug 444457, which seems to me to be the same issue. That issue forced us to block Firefox 3 users from our application framework, which relies on synchronous AJAX calls for some core functionality. (We support every other major browser; we have only had to block FF3.)
Comment 54 Mike Shaver (:shaver -- probably not reading bugmail closely) 2009-02-04 15:50:02 PST
This seems like a bad regression that we should fix in 3.1 if not 3.0.x -- renominating for blocking after consultation with some content experts!
Comment 55 Jesse Costello-Good 2009-02-04 16:31:11 PST
We are having problems with TIBCO General Interface and Firefox 3.1 beta. When jit is on (as it is by default in about:config) we see race conditions that should be impossible. It appears that this is due to timeouts being called while synchronous XHR requests are pending. Removing all synchronous requests or turning the jit off both fix the issue. 

While we are aware of the performance issues with synchronous XHR requests, we (like Dojo and others) provide for synchronous lazy class loading in our JavaScript library. Therefore it is not always possible to load everything asynchronously. At the same time we use window.setTimeout(0) extensively to improve perceived UI performance. So we consider this a critical show-stopper currently existing in 3.1b2.
Comment 56 Jesse Costello-Good 2009-02-04 16:32:07 PST
If anyone can fix this issue, you can Ben!
Comment 57 Ben Turner (not reading bugmail, use the needinfo flag!) 2009-02-04 16:38:36 PST
Per jst, need sicking's input to make a blocking decision.
Comment 58 Jonas Sicking (:sicking) PTO Until July 5th 2009-02-05 23:40:11 PST
We could certainly fix the narrow definition of this bug as filed, with fairly small amount of code and low risk. If all we do is to block timers for the current window when a sync XHR is in progress. I think jst mentioned something about that we already have an API that allows disabling and reenabling timers for a given window 

This wouldn't fix the more generic issue of events firing during sync XHRs. You'd still get things like mouse events, resize events and events from other async XHRs in progress.

However fixing timers might be enough to fix the most common problems, so it's something that would make sense to try to do for 3.1b3
Comment 59 robinpelgrim 2009-02-06 00:37:18 PST
(In reply to comment #58)
> We could certainly fix the narrow definition of this bug as filed, with fairly
> small amount of code and low risk. If all we do is to block timers for the
> current window when a sync XHR is in progress. I think jst mentioned something
> about that we already have an API that allows disabling and reenabling timers
> for a given window 
> 
> This wouldn't fix the more generic issue of events firing during sync XHRs.
> You'd still get things like mouse events, resize events and events from other
> async XHRs in progress.
> 
> However fixing timers might be enough to fix the most common problems, so it's
> something that would make sense to try to do for 3.1b3

Please, believe me, the most common problems are caused by mouse and click events and NOT by timeout events (since these are in your own control).

When blocking timer events you can still press a button, so what does that solve?
Comment 60 Boris Zbarsky [:bz] 2009-02-06 05:33:19 PST
robinpelgrim@zonnet.nl, mouse and click events are bug 333198, not this bug.  This bug is really about people who are having issues specifically due to timeouts.  Don't hijack bugs.
Comment 61 robinpelgrim 2009-02-06 06:31:58 PST
Well please, join both bugs. Since it's the same issue.
Comment 62 Boris Zbarsky [:bz] 2009-02-06 06:34:55 PST
It's not the same issue, since it needs to be fixed in very different (and largely independent) ways.
Comment 63 Johnny Stenback (:jst, jst@mozilla.com) 2009-02-06 14:55:06 PST
Blocking to do the minimal fix here, to disable timeouts in the XHRs window while doing sync stuff.
Comment 64 Rand Scullard 2009-02-09 06:32:29 PST
Bug 333198 may indeed be the bug related to mouse and click events, but where is the activity around that bug? I see no sign. When we saw that there was progress being made on THIS bug, we had some hope that both issues would be fixed by the same effort. This now does not appear to be the case. We understand the importance of clean issue tracking and have no desire to hijack this bug, and we furthermore have no doubt that fixing the timeout issue is important. That said, we are very anxious to see some attention paid to the larger issue of bug 333198.
Comment 65 BAM 2009-02-09 08:22:13 PST
I circumvented this in two ways... stopped using timeouts (would love to be able to again, but not sure other browsers in the future wouldn't mimic this functionality/bug), and showing a modal dialog while the sync call is occuring, stopping the user from interacting with the screen (not an alert, more like the modal dialog at www.componentart.com)
Comment 66 Brendan Eich [:brendan] 2009-02-09 10:57:43 PST
Rand: I pinged developers and expect an update, but in any event your comment is better placed in bug 333198.

I realize there's a lot of angst about the bugs (or underlying bug, Robin Pelgrim and I are trying to have a productive email exchange about whether and how bugs should match symptoms and causes here). Rest assured that I'm going to keep after the problems reported here and in bug 333198 and make sure they get fixed. In the mean time, please try to avoid adding noise and me-too comments.

/be
Comment 67 Ben Turner (not reading bugmail, use the needinfo flag!) 2009-02-10 00:58:07 PST
Created attachment 361481 [details] [diff] [review]
Patch, v1

First stab. Seems to work just fine.

My first testcase used a timeout to do the sync xhr and I think that was lucky - I was experiencing crashes when reenabling timeouts from a timeout. Worked around that by using a runnable (which seems to have some precedent in the timeout code). I'm also using weak refs mainly not to piss off the cycle collector but also since we use a runnable to reenable the timeouts it seems safer not to hold the window alive if it has already been closed or if we're shutting down.
Comment 68 Olli Pettay [:smaug] 2009-02-10 01:37:40 PST
Have you tested cases when an event fired by XHR adds new timeouts before 
ResumeTimeouts is called?
Comment 69 Boris Zbarsky [:bz] 2009-02-10 06:16:44 PST
Why change SuspendTimeouts from being a void function?
Comment 70 Ben Turner (not reading bugmail, use the needinfo flag!) 2009-02-10 15:31:06 PST
Created attachment 361652 [details] [diff] [review]
Patch, v1.1

Fixed the signature of SuspendTimeouts, and fixed a problem with timeouts created in the xhr event handlers.
Comment 71 Johnny Stenback (:jst, jst@mozilla.com) 2009-02-10 17:20:57 PST
Comment on attachment 361652 [details] [diff] [review]
Patch, v1.1

Looks good!
Comment 72 Jonas Sicking (:sicking) PTO Until July 5th 2009-02-10 18:01:20 PST
Are those nsGlobalWindow::SuspendTimeouts/ResumeTimeouts implementations correct? It seems like ResumeTimeouts does all the work it currently does even if mTimeoutsSuspendDepth is non-zero?

Also, would you mind rather than creating a new runnable, create something like nsRunnableMethod but that calls a function that returns an nsresult? I.e. something exactly like the current nsRunnableMethod but with a different typedef for Method. If you want to get really fancy you might even be able to templetize the return type and use a single class, but that might be overkill.
Comment 73 Johnny Stenback (:jst, jst@mozilla.com) 2009-02-10 18:19:41 PST
Comment on attachment 361652 [details] [diff] [review]
Patch, v1.1

Jonas is right, ResumeTimeout() at least needs to not resume timeouts except on last resume.
Comment 74 Olli Pettay [:smaug] 2009-02-12 06:23:09 PST
Comment on attachment 361652 [details] [diff] [review]
Patch, v1.1

This crashes when testing dom/tests/mochitest/ajax/
#0  0x00000032d7097581 in nanosleep () from /lib64/libc.so.6
#1  0x00000032d70973a4 in sleep () from /lib64/libc.so.6
#2  0x00002aaaaaaf4d99 in ah_crap_handler (signum=11)
    at /home/smaug/mozilla/mozilla_cvs/hg/mozilla/toolkit/xre/nsSigHandlers.cpp:149
#3  0x00002aaaaaaf5958 in nsProfileLock::FatalSignalHandler (signo=11) at nsProfileLock.cpp:216
#4  <signal handler called>
#5  nsGlobalWindow::IsInModalState (this=0x0)
    at /home/smaug/mozilla/mozilla_cvs/hg/mozilla/dom/src/base/nsGlobalWindow.cpp:5719
#6  0x00002aaab0b33c78 in nsGlobalWindow::RunTimeout (this=0x0, aTimeout=0x0)
    at /home/smaug/mozilla/mozilla_cvs/hg/mozilla/dom/src/base/nsGlobalWindow.cpp:7555
#7  0x00002aaab0b367d4 in nsGlobalWindow::TimerCallback (aTimer=<value optimized out>, aClosure=<value optimized out>)
    at /home/smaug/mozilla/mozilla_cvs/hg/mozilla/dom/src/base/nsGlobalWindow.cpp:8055
#8  0x00002aaaab28b587 in nsTimerImpl::Fire (this=0x41d30c0)
    at /home/smaug/mozilla/mozilla_cvs/hg/mozilla/xpcom/threads/nsTimerImpl.cpp:428
#9  0x00002aaaab28c0d6 in nsTimerEvent::Run (this=0x2752fd0)
    at /home/smaug/mozilla/mozilla_cvs/hg/mozilla/xpcom/threads/nsTimerImpl.cpp:520
#10 0x00002aaaab288285 in nsThread::ProcessNextEvent (this=0x66bfa0, mayWait=1, result=0x7fff411b56bc)
    at /home/smaug/mozilla/mozilla_cvs/hg/mozilla/xpcom/threads/nsThread.cpp:510
#11 0x00002aaaab245663 in NS_ProcessNextEvent_P (thread=0x0, mayWait=1) at nsThreadUtils.cpp:230
#12 0x00002aaab09c875a in nsXMLHttpRequest::Send (this=0x33b31a0, aBody=0x33b3230)
    at /home/smaug/mozilla/mozilla_cvs/hg/mozilla/content/base/src/nsXMLHttpRequest.cpp:2805
Comment 75 Ben Turner (not reading bugmail, use the needinfo flag!) 2009-02-12 13:16:12 PST
Created attachment 362088 [details] [diff] [review]
Patch, v1.2

This fixes the reentrant case, oops. Can we do the runnable thing in another bug? It would be nice, for sure, but we need to get this landed soon.
Comment 76 Ben Turner (not reading bugmail, use the needinfo flag!) 2009-02-12 16:25:42 PST
Filed bug 478304 for the nsRunnableMethod thing.
Comment 77 BAM 2009-02-12 23:32:11 PST
I'm all for this fix (heck, I woke this bug up after a year of inactivity).  However, I feel this particular change needs to be communicated out prior to the release of the correction.  Developers need to be warned that the Firefox core will be changing in a way that may require them to handle Firefox 3 in two different ways.  Just adding it to the release notes will possibly break a lot of people's applications because they did not have the time to accommodate for it in their apps by removing special cases for newer, fixed versions of Firefox 3.
Comment 78 Olli Pettay [:smaug] 2009-02-14 13:29:52 PST
(In reply to comment #52)
> That won't help if you have multiple windows and timeouts in one poke another
> and the other has a sync XHR going.
And won't help when plugins call JS.
Comment 79 Olli Pettay [:smaug] 2009-02-14 14:19:23 PST
So Safari sort-of blocks UI, but if events, which are dispatched during sync XHR, use alert(), lock ups or something else random badness may happen.
I need to test what happens if event listener opens new windows. (Will do that tomorrow once I have access to mac again.)

Opera blocks something, but user can interact for example with <select>.
The events related to interaction are just dispatched later.
If the events which are dispatched during sync XHR (readystatechange) open new
windows, those windows don't have the UI blocked in any way, and they can
modify the DOM of their window.opener.
Plugins are *not* disabled, so user can interact with them.
Comment 80 Olli Pettay [:smaug] 2009-02-14 14:37:15 PST
Opera seems to have the blocked UI only in the window which uses sync XHR.
So if the page uses iframe, that iframe UI does work while sync XHR is running.
(and it doesn't matter whether the iframe has a same-domain url or not.)
Comment 81 Ben Turner (not reading bugmail, use the needinfo flag!) 2009-02-17 15:05:31 PST
Pushed changeset 5d6362d85fb4 to mozilla-central.
Comment 82 Ben Turner (not reading bugmail, use the needinfo flag!) 2009-02-17 15:25:20 PST
Pushed changeset 06865a27657b to mozilla-1.9.1.
Comment 83 Jesse Costello-Good 2009-02-18 13:00:57 PST
Instead of unexpected behavior I now get a crash on the nightly builds.

ddeffac3-11cc-4bd2-8154-5b0052090218
Comment 84 Olli Pettay [:smaug] 2009-02-18 13:03:04 PST
Bent, that crash looks like comment 74 http://crash-stats.mozilla.com/report/index/ddeffac3-11cc-4bd2-8154-5b0052090218
Comment 85 Ben Turner (not reading bugmail, use the needinfo flag!) 2009-02-18 15:51:30 PST
Looks like we'll call this crash bug 479143
Comment 86 Ben Turner (not reading bugmail, use the needinfo flag!) 2009-02-20 18:16:37 PST
I backed this out of branch until I can figure these crashes out.
Comment 87 Mike Beltzner [:beltzner, not reading bugmail] 2009-02-25 15:54:42 PST
This is needed to fix bug 479143, which is a P1 blocker, so this needs to be a P1 blocker AIUI.
Comment 89 Olli Pettay [:smaug] 2009-03-02 12:36:04 PST
Created attachment 364986 [details] [diff] [review]
backout patch for 1.9.1
Comment 90 Mike Beltzner [:beltzner, not reading bugmail] 2009-03-03 20:00:54 PST
and back in on branch today:

http://hg.mozilla.org/releases/mozilla-1.9.1/rev/2a69f2f01222
Comment 91 BAM 2009-06-17 07:09:08 PDT
And yes, you are all welcome for me bumping this ticket after a year and a half of inactivity.  Now off to my code to put special cases in to see what version of firefox 3 users have running, woo hoo!!!
Comment 92 David Baron :dbaron: ⌚️UTC+2 (review requests must explain patch) 2010-05-02 16:29:08 PDT
*** Bug 412418 has been marked as a duplicate of this bug. ***

Note You need to log in before you can comment on or make changes to this bug.