Closed Bug 146373 Opened 22 years ago Closed 17 years ago

100% CPU while JavaScript modifies page (loooong wait)

Categories

(Core :: DOM: Core & HTML, defect)

x86
All
defect
Not set
major

Tracking

()

RESOLVED INCOMPLETE

People

(Reporter: virkkila, Unassigned)

References

()

Details

(Keywords: perf)

Attachments

(2 files, 1 obsolete file)

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0rc2) Gecko/20020510
BuildID:    2002051009

If I surf to the url posted above, there is a list of classes, and you can
toggle wich you want to look at by clicking on of the checkboxes on or off. 

Doing this will however freeze the entire browser (and mail if it's open). It
will eventually un-freeze (melt?) and start working as usual, but I had to wait
at least 30s for this to happen. The window is not updating nor recieving any
events (typical freeze).

The fact that nothing gets updated (no painting is done etc.) will leave you
thinking your browser froze for good, and you are more likely to kill the
process and start mozilla up again than to actually wait for it to "start
working again".

Reproducible: Always
Steps to Reproduce:
1.Surf to http://www.go-mono.com/class-status-corlib.html
2.Click on "Missing" checkbox
3.Try using the browser, moving it around, having a window over it.

Actual Results:  The window contents are empty, or not redrawn. It will not
respond to anything. It is basically frozen/it hangs (for a while).

Expected Results:  The browser should remain responsive, do the painting, and
maybe the statusbar should say Processing javascript, please wait, which should
be interrupted it by clicking stop.
Confirming bug with Mozilla trunk binary 2002061707 on WinNT.
OS: Linux ---> All. My CPU is pinned at 100% for a good 10 seconds 
when I click on the "Missing" checkbox. In IE6, only 4 seconds;
Note: NN4.7 chokes badly on loading this site; don't try it!

The culprit: this page contains many <div> elements which may contain
many other <div> elements, which contain others... They all get updated
on the checkbox click, by a function which calls itself recursively
to walk the tree of <divs>. I have indicated the recursion below:


function viewAll (elt, dictTypes)
{
  var fView = false;
  var rgImages = getChildrenByTagName (elt, 'IMG');
  var cImages = rgImages.length;

  for (var iImage = 0; iImage < cImages; iImage++)
  {
    var strImage = trimSrc (rgImages [iImage].src);

    if (dictTypes [strImage])
    {
      fView = true;
      break;
    }
  }
	
  var rgElts = getChildrenByTagName (elt, 'DIV');
  var cElts = rgElts.length;
  if (cElts != 0)
  {
    var iElt;

    for (iElt = 0; iElt < cElts; iElt ++)
      fView |= viewAll (rgElts [iElt], dictTypes);  <<<<<<<<<<<<<<< RECURSION
  }

  elt.style.display = fView ? '' : 'none';
  return fView;
}


I do not know if the performance problem is the fault of the JS Engine
not handling the recursion well, or is due to the DOM access that occurs
during each call. In the JS Debugger, I saw the recursion go 5 levels deep !!!

I'm going to guess DOM and reassign there; please reassign back
if this is incorrect and we have a pure JS Engine performance issue -
Assignee: rogerl → jst
Component: JavaScript Engine → DOM Level 0
Keywords: perf
OS: Linux → All
QA Contact: pschwartau → desale
Summary: Javascript freezes program while it's modifying page (loooong wait) → 100% CPU while JavaScript modifies page (loooong wait)
Status: UNCONFIRMED → NEW
Ever confirmed: true
Here is some profiling data from the Mozilla JS Debugger. I loaded
the site, turned on profiling, and clicked the "Missing" checkbox.
The first function called from the click handler is selectMissing(),
and we see it took 12.5 seconds to complete.

We can also see the 5 levels of recursion attained by the viewAll()
function. But the bulk of the time is taken up within this function
by calls to the function getChildrenByTagName(). We see that a total
of 9.96 seconds out of the 12.5 seconds is spent doing that:


clickHandler()          calls 1, total 0ms, min 0ms, max 0ms,
                                                                    avg 0ms
selectMissing()         calls 1, total 12500ms, min 12500ms, max 12500ms,
                                                                    avg 12500ms
toggleFilter()          calls 1, total 12500ms, min 12500ms, max 12500ms,
                                                                    avg 12500ms
filterTree()            calls 1, total 12485ms, min 12485ms, max 12485ms,
                                                                    avg 12485ms
viewAll()               calls 1606 (depth 5), total 12469ms, min 12469ms,
                                                      max 12469ms,  avg 7.76ms
trimSrc()               calls 4176, total 520ms, min 0ms, max 16ms,
                                                                    avg 0.12ms
getChildrenByTagName()  calls 3212, total 9961ms, min 16ms, max 94ms,
                                                                    avg 3.1ms



Here it is:

function getChildrenByTagName (elt, strTag)
{
  strTag = strTag.toLowerCase ();
  var rgChildren = new Array ();

  var eltChild = elt.firstChild;
  while (eltChild)
  {
    if (eltChild.tagName && eltChild.tagName.toLowerCase() == strTag)
      rgChildren.push (eltChild);

    eltChild = eltChild.nextSibling;
  }

  return rgChildren;
}


So it seems to boil down to this. What is taking all the time? 
Is it the JS Engine functionality, or the DOM property access
on |elt.firstChild|, |eltChild.tagName|, |eltChild.nextSibling|?
I believe jst will tell you that it is the firstChild/nextSibling accesses which
are much slower than accessing them via the childNodes array.
Bob's right, but there's more to this perf problem than the way the DOM tree is
traversed...
Attached file quantify result (obsolete) —
quantified on win2k.
Seems thread syncronize waste a lot time?
We only care about what happens on the main thread in cases like this (in 99+%
of them at least) so when running Quantify, please filter out everything that
doesn't run on the main thread. That'll get rid of all the thread
synchronisation fluff that some of the necko threads spend lots of time in and
better show what gecko really does.
Only quantify the main thread, 
the topest cpu eaters are:

js_Invoke()/js_Interpret()/js_InternalInvoke()/js_GetProperty()  99%
nsCOMPtr_base::assign_from_helper()	 46%	
nsCOMPtr_base::~nsCOMPtr_base() 	 19%
XPTC_InvokeByIndex()			 13%
Attachment #91217 - Attachment is obsolete: true
From that data...

F+D %     Calls F %  Function
100.00        0 0.00 .main_0.
98.79    129306 0.07 js_Invoke
98.79       175 0.13 js_Interpret
98.77     89973 0.02 js_InternalInvoke
98.71         5 0.00 JS_CallFunctionValue
89.85    216078 0.10 js_GetProperty
45.53   2199747 28.60 nsCOMPtr_base::assign_from_helper(...)
19.15   4617587 8.34 nsCOMPtr_base::~nsCOMPtr_base(void)
12.80     60170 5.75 XPTC_InvokeByIndex
9.46    1012759 0.04 JS_RemoveRootRT
9.43    1012759 0.09 js_RemoveRoot
9.10    2210473 7.10 PR_Lock
7.74    1012663 0.04 JS_AddNamedRootRT
7.70    1012749 0.12 js_AddRootRT
7.37    2210473 5.89 PR_Unlock
6.01      20112 0.01 js_SetProperty
2.64     136606 1.02 malloc
2.44    2250698 0.55 JS_DHashTableOperate
2.26    5011630 2.26 TlsGetValue
2.15     183801 2.15 HeapAlloc
2.13     113022 0.01 JS_malloc
1.79    1083203 0.09 JS_DHashTableRawRemove
1.70    2174572 1.70 nsQueryInterface::nsQueryInterface(nsQueryInterface const&)
1.70    1105530 1.70 memset
1.70    1083203 0.04 JS_DHashClearEntryStub
1.48      95923 0.16 nsHashtable::Get(nsHashKey *)
1.46     267755 0.09 js_LookupProperty
1.44          9 0.00 JS_MaybeGC
1.44          5 0.03 js_GC
1.44          5 0.00 js_ForceGC
1.44          5 0.00 JS_GC
1.40      43042 0.02 js_NewObject
1.25       8188 0.69 nsStaticCaseInsensitiveNameTable::Lookup(nsACString const&)
1.12    2174572 1.12 nsQueryInterface::nsQueryInterface(nsISupports *,UINT *)
1.09     103246 0.02 js_AllocGCThing
1.05      29832 0.00 js_ValueToObject
1.05      29832 0.00 js_ValueToNonNullObject
1.04      29832 0.00 js_StringToObject
1.01      46743 0.48 new(UINT)

... nothing really interesting there, we execute a *lot* of JS, and that's where
we spend most of the time, the fact that we spend a lot of time in the nsCOMPtr
destructor and constructor (indirectly) suggests that this data was gathered
using a 1.0 branch build, right? A trunk build should not spend nearly as much
time in the nsCOMPtr code while executing JS...
Yes, you are right. I am using a 1.0 release. I do not know what happen
in xpcom of trunk. 

I will to gather data from trunk then.
I have quantified trunk on solaris 8. It seems there is no particular 
hot point. The cpu time is distributed among a lot of functions.

This is the quantified result. All 0.00% stuff is removed in the list.
Mass-reassigning bugs to dom_bugs@netscape.com
Assignee: jst → dom_bugs
The url is not there anymore.
Duplicate of bug 112858 – simple but lengthy Javascript document.write loop hangs the complete browser.
comment 8 and the end of comment 9 suggest, I think, that trunk either had no problem, or at least no obvious room for improvement. Reporter comments  "cause was due to slow dom updating/walking, rather than a document.write. However since there is no testcase, and no useful data, ..."

so I'm closing incomplete
Status: NEW → RESOLVED
Closed: 17 years ago
Resolution: --- → INCOMPLETE
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: