Last Comment Bug 273578 - XMLHttpRequest with async = false takes 100% CPU until request finished
: XMLHttpRequest with async = false takes 100% CPU until request finished
Status: RESOLVED FIXED
: fixed1.8, perf, testcase
Product: Core
Classification: Components
Component: XML (show other bugs)
: Trunk
: x86 All
: P2 major with 2 votes (vote)
: mozilla1.8rc1
Assigned To: Darin Fisher
: Ashish Bhatt
Mentors:
Depends on: 190313 277056
Blocks: 218908
  Show dependency treegraph
 
Reported: 2004-12-07 09:32 PST by matiyam
Modified: 2007-09-02 01:28 PDT (History)
16 users (show)
asa: blocking1.8rc1-
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
Use WaitForEvent() instead of ProcessPendingEvent() (903 bytes, patch)
2005-01-22 01:02 PST, Boying Lu
no flags Details | Diff | Review
a testcase (3.61 KB, patch)
2005-01-25 19:35 PST, Boying Lu
no flags Details | Diff | Review
testcase (3.61 KB, text/plain)
2005-02-28 23:28 PST, Boying Lu
no flags Details
testcase (3.61 KB, application/x-gzip)
2005-02-28 23:42 PST, Boying Lu
no flags Details
simple fix (workaround) with 10 millisecond sleep in the loop (873 bytes, patch)
2005-10-14 00:19 PDT, Kees van Ginkel
jst: review+
bzbarsky: superreview+
Details | Diff | Review
v1 patch (1.03 KB, patch)
2005-10-18 12:45 PDT, Darin Fisher
asa: approval1.8rc1+
Details | Diff | Review

Description matiyam 2004-12-07 09:32:09 PST
User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7) Gecko/20040616
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7) Gecko/20040616

Hi!

When I use a XMLHttpRequest object to fetch a file with async = false,
both Firefox and Mozilla take 100% of the CPU until the request
have been completed.

If I ask for a long .xml file, my machine almost freezes until the file
has been downloaded.

I have a small cgi script that just writes a small xml text, but with a delay
of 10 seconds. If the request is made with async=true, everything works flawlessly
(Mozilla just waits until the script have been downloaded).
But if the request is made with async=false, Mozilla takes 100% of the CPU until
the script is downloaded.


I guess it has something to do with the interaction between the parent
process and the thread that makes the new petition. Probably the parent
is using some "while" without using some blocking mechanism to wait
without using CPU.

I attach a small cgi that sleeps for a while, so Mozilla has to wait for a while.

Test cgi script:
xml.cgi
------------------
#!/bin/bash

echo "Content-type: text/xml"
echo
sleep 10
echo "<?xml version='1.0' encoding='ISO-8859-1'?><root></root>"

------------------

Test javascript code:

----------------------
function load_nonasyncronous(url)
{
        var o = XMLHttpRequest();
        o.open( "GET", url, false );
        o.send(null);
        alert("result: " + o.responseText);
}

function load_asyncronous(url)
{
        var o = XMLHttpRequest();
        o.onreadystatechange = function() {
                if (o.readyState == 4)
                        alert("result: " + o.responseText);

        o.open( "GET", url, true );
        o.send(null);
}
-----------------------

Thanks for such a great browser (and such a huge effort).

Reproducible: Always
Steps to Reproduce:
1.Write some js that fetches a xml from a CGI with some delay using
XMLHttpRequest and async = false
2. See CPU usage until the xml is fetched
3.

Actual Results:  
The CPU usage of Mozilla raises up to 100%

Expected Results:  
Mozilla should have waited for the file without using CPU
Comment 1 Heikki Toivonen (remove -bugzilla when emailing directly) 2004-12-07 16:39:48 PST

*** This bug has been marked as a duplicate of 190313 ***
Comment 2 matiyam 2004-12-09 09:39:19 PST
In Comment 1 Heikki marks this bug as a duplicate of Bug 190313 (XMLExtras
syncloading blocks UI (syncloadservice)), but i don't think this is really a
duplicate. It's not that the UI gets blocked (that could be a correct behaviour). 
The problem is that the CPU usage raises to 100%.

There's another bug like this one (Bug 240678 (Mozilla 1.7b (and Firefox 0.8) is
busy waiting on synchronous xmlhttprequest)), but that is already marked as a
duplicate of 190313.

Could someone verify this?

Comment 3 Heikki Toivonen (remove -bugzilla when emailing directly) 2004-12-12 22:29:34 PST
(In reply to comment #2)
> In Comment 1 Heikki marks this bug as a duplicate of Bug 190313 (XMLExtras
> syncloading blocks UI (syncloadservice)), but i don't think this is really a
> duplicate. It's not that the UI gets blocked (that could be a correct behaviour). 
> The problem is that the CPU usage raises to 100%.

The UI should never be blocked by JS running on a page. The reason it is blocked
is because all the XML processing happens in a tight loop that does not yield to
 the UI. Incidentally, that also manifests as 100% CPU.

I still believe this bug is duplicate of bug 190313.

The only case where this would not be a duplicate is if you can use the UI while
the CPU is pegged at 100% in the Mozilla process.
Comment 4 Boying Lu 2005-01-22 01:02:26 PST
Created attachment 172074 [details] [diff] [review]
Use WaitForEvent() instead of ProcessPendingEvent()

I have tested the patch with the nightly codes (2005/01/21) on my sparc
machine, it is one time faster than before
Comment 5 Boying Lu 2005-01-22 01:04:16 PST
Can anyone help me to verify if the patch works for you? Thanks 
Comment 6 Heikki Toivonen (remove -bugzilla when emailing directly) 2005-01-22 23:22:40 PST
(In reply to comment #4)
> Created an attachment (id=172074) [edit]
> Use WaitForEvent() instead of ProcessPendingEvent()
> 
> I have tested the patch with the nightly codes (2005/01/21) on my sparc
> machine, it is one time faster than before 

I believe WaitForEvent() works as well as ProcessPendingEvent(), but AFAIK it
has the same problem - you hit 100% CPU locking the UI until request finishes.
It *may* be slightly more efficient, but it is not the real solution IMO. Look
at bug 190313 for more ideas (I still maintain these are dupes).
Comment 7 Boying Lu 2005-01-23 21:49:03 PST
I think the root cause is in the following codes:
 // If we're synchronous, spin an event loop here and wait
 if (!(mState & XML_HTTP_REQUEST_ASYNC)) { 
    while (mState & XML_HTTP_REQUEST_SYNCLOOPING) {
         modalEventQueue->ProcessPendingEvents();
    }

when in sync mode, this thread will in a busy loop, which will waste a lot of
CPU time. I think the solution is to block current thread until another thread
posts an event in the modalEventQueue.
Comment 8 Heikki Toivonen (remove -bugzilla when emailing directly) 2005-01-25 12:02:24 PST
(In reply to comment #7)
> I think the root cause is in the following codes:

Yes (but I think I have explained this earlier in the dupe or some other bug)
this code was put in so sync loading would also work in embedded builds. Before,
sync loading was done in such a way that it did not block the UI (or cause 100%
CPU) but it only worked in the normal Mozilla builds, not in the embedded case.
As far as I know, the main question which still hasn't been resolved is what
kind of technology to use so that it will work without blocking the UI and still
work in embedded and normal case. The embedded case is problematic because there
Mozilla does not own the event loop.

> when in sync mode, this thread will in a busy loop, which will waste a lot of
> CPU time. I think the solution is to block current thread until another thread
> posts an event in the modalEventQueue.

No. The current thread is the UI, so you must not block it. But maybe it would
be possible to push the whole XMLHttpRequest into the network thread or something.
Comment 9 Boying Lu 2005-01-25 19:35:45 PST
Created attachment 172403 [details] [diff] [review]
a testcase
Comment 10 lilou 2005-02-25 14:10:25 PST
(In reply to comment #5)
> Can anyone help me to verify if the patch works for you? Thanks 

Hello
the patch https://bugzilla.mozilla.org/attachment.cgi?id=172074 works for me (i
apply the patch manually). The opensi application (http://www.opensi.org/) does
not work on windows without this patch (without the patch firefox or mozilla :
100% CPU  / with the patch and mozilla : % of use of CPU is OK).


In french : Description of the problem with opensi application without the patch :
http://forums.opensi.org/viewtopic.php?t=126&sid=d6e823a4e2a3765fc543726d8309fe06
Comment 11 Boying Lu 2005-02-28 23:28:51 PST
Created attachment 175905 [details]
testcase
Comment 12 Boying Lu 2005-02-28 23:42:38 PST
Created attachment 175907 [details]
testcase
Comment 13 Boying Lu 2005-02-28 23:50:38 PST
I don't think we can move XMLHttpRequest related codes to somewhere else.
Because these javascript codes run in javascript interpreter, which runs in the
current thread(UI thread). I don't know if there is some way to let a piece of
Javascript codes run in another thread.
Comment 14 Boris Zbarsky [:bz] 2005-03-01 08:36:43 PST
Boying, what do you mean, exactly?  The code that actually does the sync request
is in C++, not in JS....
Comment 15 Boying Lu 2005-03-07 04:42:41 PST
In the following snippet:
fetcher.open("POST",getServletURL(),false);
fetcher.setRequestHeader("Content-type","text/plain; charset=UTF-8");
try {
  fetcher.send(infotosent);
}catch (e) {
  window.alert("send error:"+e);
}

if (typeof fetcher.responseText=="string"&&fetcher.responseText.length>0) {
  window.alert("time taken:" +     
      (end-start)+"ms\nServerRespons:\n"+fetcher.responseText);
}
where fetcher is an XMLHttpRequest object and previous codes are trigged by a
button embedding in a HTML page. All above codes will be run in UI thread. If we
move the implementation of send() method  into an other thread. The UI thread
still should be blocked at the last "if" statment in sync mode because
fetcher.responseText contains servers response.
Comment 16 Boris Zbarsky [:bz] 2005-03-07 08:29:18 PST
Oh, I see.  That was in response to comment 8....

Yeah, I think we want to keep this on the UI thread.  If we can use the syncload
service, we should.  If we can't, we should probably make the minor tweaks
needed so we _can_ use it and use it.
Comment 17 Darin Fisher 2005-03-07 09:53:22 PST
The sync load service works by calling WaitForEvent.  That method calls PR_Wait
on a monitor.  That means that if called on the UI thread, it will lock up the
UI until some new PLEvent is added to the queue.  This means that UI events will
starve.  It is definitely suboptimal.  ProcessingPendingEvents works by allowing
all PLEvents and native events a chance to run, but it does not block waiting
for either.  What we really need is a way to say: block for PLEvents or native
events.

I think the event queue system should be revised to somehow communicate better
with the appshell to make this possible.

confirming bug
Comment 18 Harald Albrecht 2005-03-23 01:50:27 PST
In my JS-based application I see complete starvation and hang from time to time
when using Mozilla or Firefox builds with SVG enabled (trunk builds from
mid-March 2005 on Win2K): the HTTP request is sent but nothing more happens,
except 100% cpu load. What makes the thing even more worse is, say have Firefox
running and blocking, and now start a Mozilla instance running the same JS
application and using XMLHttpRequest, it blocks and hangs the same as the other
Firefox (!) instance. Note that the server is responding and has no problems, it
is just that both FF and Moz block for no appearant reason. Only after several
minutes of waiting it seems that some semaphore gets released, so FF and Moz
work again.
Comment 19 Brendan Eich [:brendan] 2005-04-18 14:04:14 PDT
bz, darin, anyone: who should own this bug?

/be
Comment 20 Boris Zbarsky [:bz] 2005-04-18 14:35:10 PDT
Someone who understands appshell, preferably...  It sounds like we want to go
back out to the OS-level event loop and wait on OS events, but with an event
queue pushed and without actually calling "return" from the function we're in.
Comment 21 Darin Fisher 2005-04-18 15:35:30 PDT
So, in other words, we need the code from OpenWindowJS ;-)

I can add this to my queue, but I'm already overburdened for Firefox 1.1.  I
would greatly appreciate some help.  Jonas? ;-)
Comment 22 Jonas Sicking (:sicking) 2005-04-18 16:37:10 PDT
Yeah, I'd be interested in working on this since I think it's an important bug.
I'm not sure how much time i'll have available just yet though. Should know more
in a week or two.
Comment 23 alexander :surkov 2005-06-07 03:31:52 PDT
Is it dupe of bug 190313?
Comment 24 Jonas Sicking (:sicking) 2005-06-07 03:47:49 PDT
no
Comment 25 Darin Fisher 2005-10-06 14:34:53 PDT
This is related to bug 309424
Comment 26 Asa Dotzler [:asa] 2005-10-10 16:53:15 PDT
Jonas, have you had any cycles to investigate here? Is there a possibility of a
low-risk fix?
Comment 27 Jonas Sicking (:sicking) 2005-10-11 15:09:35 PDT
I havn't investigated yet, but i really doubt there's anything lowrisk to go on
the branch.
Comment 28 Asa Dotzler [:asa] 2005-10-11 16:00:27 PDT
it's too late for anything but low-risk fixes. the time to take a fix for
something like this was months ago. 
Comment 29 Kees van Ginkel 2005-10-14 00:14:21 PDT
(In reply to comment #27)
> I havn't investigated yet, but i really doubt there's anything lowrisk to go
on the branch.

The fix I tried was a oneliner: Add a PR_SLEEP with 10 miliseconds in the loop.


    while (mState & XML_HTTP_REQUEST_SYNCLOOPING) {
      PR_Sleep( PR_MillisecondsToInterval(10));
      modalEventQueue->ProcessPendingEvents();
    }

Can you add this as a low risk fix for the 100% cpu and use the other defects
for the long term solutions as described by defect 190313 ?
Comment 30 Kees van Ginkel 2005-10-14 00:19:10 PDT
Created attachment 199529 [details] [diff] [review]
simple fix (workaround) with 10 millisecond sleep in the loop
Comment 31 Darin Fisher 2005-10-14 00:36:58 PDT
See also my patch in bug 309424.  Perhaps we should combine these two patches.
Comment 32 Kees van Ginkel 2005-10-17 13:07:49 PDT
(In reply to comment #31)
> See also my patch in bug 309424.  Perhaps we should combine these two patches.

Sounds like a fine idea, Who should review these ?
Comment 33 Johnny Stenback (:jst, jst@mozilla.com) 2005-10-17 16:04:44 PDT
Comment on attachment 199529 [details] [diff] [review]
simple fix (workaround) with 10 millisecond sleep in the loop

   if (!(mState & XML_HTTP_REQUEST_ASYNC)) {
     while (mState & XML_HTTP_REQUEST_SYNCLOOPING) {
+      PR_Sleep( PR_MillisecondsToInterval(20));

Seems reasonable, but that should be 10, not 20, right? And loose the space
after '('.

r=jst
Comment 34 Darin Fisher 2005-10-17 17:34:10 PDT
Comment on attachment 199529 [details] [diff] [review]
simple fix (workaround) with 10 millisecond sleep in the loop

>   // If we're synchronous, spin an event loop here and wait
>   if (!(mState & XML_HTTP_REQUEST_ASYNC)) {
>     while (mState & XML_HTTP_REQUEST_SYNCLOOPING) {
>+      PR_Sleep( PR_MillisecondsToInterval(20));
>       modalEventQueue->ProcessPendingEvents();
>     }

wouldn't it be better to only sleep if you are going to do
another iteration?  i.e., how about this:

      while (mState & XML_HTTP_REQUEST_SYNCLOOPING) {
	modalEventQueue->ProcessPendingEvents();

	// Be sure not to busy wait! (see bug 273578)
	if (mState & XML_HTTP_REQUEST_SYNCLOOPING)
	  PR_Sleep( PR_MillisecondsToInterval(10));
      }
Comment 35 Boris Zbarsky [:bz] 2005-10-17 19:08:12 PDT
Comment on attachment 199529 [details] [diff] [review]
simple fix (workaround) with 10 millisecond sleep in the loop

>Index: nsXMLHttpRequest.cpp
>   if (!(mState & XML_HTTP_REQUEST_ASYNC)) {
>     while (mState & XML_HTTP_REQUEST_SYNCLOOPING) {
>+      PR_Sleep( PR_MillisecondsToInterval(20));
>       modalEventQueue->ProcessPendingEvents();
>     }

Or even:

  if (!(mState & XML_HTTP_REQUEST_ASYNC)) {
    NS_ASSERTION(mState & XML_HTTP_REQUEST_SYNCLOOPING,
		 "Unexpected state");
    do {
      PR_Sleep( PR_MillisecondsToInterval(10));
      modalEventQueue->ProcessPendingEvents();
    } while (mState & XML_HTTP_REQUEST_SYNCLOOPING);

With either that or Darin's suggestion, sr=bzbarsky (note also that 10ms is
probably better than 20ms here, unless you had a good reason to pick the 20ms
number?).
Comment 36 Darin Fisher 2005-10-18 11:04:34 PDT
I'll check this in with the changes that I suggested.
Comment 37 Darin Fisher 2005-10-18 12:45:22 PDT
Created attachment 199966 [details] [diff] [review]
v1 patch

Here's the version of the patch that I commited on the trunk.
Comment 38 Darin Fisher 2005-10-18 12:45:57 PDT
fixed-on-trunk
Comment 39 Asa Dotzler [:asa] 2005-10-18 14:49:27 PDT
let's get this verified on the trunk before we consider taking this. 
Comment 40 Kevin Henrikson 2005-10-19 16:44:20 PDT
We've tested this fix with today's nightly build.  It works perfect.  No more
hangs.  Thanks for pushing this through, it would be a big help to have this in
1.5, as it's a big problem with our complex AJAX app on linux installs.

-Kevin
Zimbra
Comment 41 Darin Fisher 2005-10-19 17:06:40 PDT
This fix is very low risk.  I think we should take it for FF 1.5.
Comment 42 Brendan Eich [:brendan] 2005-10-19 18:32:47 PDT
Yeah, this fix is dope.  We should not be doing risk management by fix
decimation on this particular fix.

/be
Comment 43 Gervase Markham [:gerv] 2005-10-20 03:21:19 PDT
Brendan: I'm not sure what you mean by "this fix is dope". Is that a good thing?
"Dope" in the UK is marijuana. Is it OK to check it into the branch? If so, can
you approve the patch?

Gerv
Comment 44 Boris Zbarsky [:bz] 2005-10-20 07:25:07 PDT
"dope" is marijuana in the US too.  And it's a slang term for "a good thing",
generally... at least when one keeps in mind that Brendan is in California.  ;)
Comment 45 Brendan Eich [:brendan] 2005-10-20 08:20:42 PDT
I'm old school.  http://dope.urbanup.com/1449519

/be
Comment 46 Darin Fisher 2005-10-20 11:29:36 PDT
fixed1.8

Note You need to log in before you can comment on or make changes to this bug.