Last Comment Bug 103649 - Performance problems when rendering a page with hundreds of iframes
: Performance problems when rendering a page with hundreds of iframes
Status: NEW
: perf, testcase
Product: Core
Classification: Components
Component: Layout: HTML Frames (show other bugs)
: Trunk
: x86 Windows 98
: P4 normal with 10 votes (vote)
: Future
Assigned To: Nobody; OK to take it and work on it
:
Mentors:
: 249823 (view as bug list)
Depends on:
Blocks: 71668
  Show dependency treegraph
 
Reported: 2001-10-08 08:48 PDT by Damian Yerrick
Modified: 2013-02-28 10:02 PST (History)
22 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
MOZILLA caused an invalid page fault... (499 bytes, text/plain)
2001-10-08 08:49 PDT, Damian Yerrick
no flags Details
Initial test case with 500 iframes (707 bytes, application/zip)
2001-10-08 08:51 PDT, Damian Yerrick
no flags Details
Test case zip file, this time with 1000 iframes (923 bytes, application/octet-stream)
2001-10-09 00:00 PDT, David Gerard
no flags Details
Profile of attachment 52519 (894.48 KB, text/html)
2001-10-09 11:51 PDT, Boris Zbarsky [:bz]
no flags Details
A plot of memory usage on a Athlon 900/ 384Mb ram. Linux kernel 2.4.9. 300 sek to open; another 1200 to close (6.35 KB, image/png)
2001-10-09 16:32 PDT, Dennis Haney
no flags Details
Consolidates context iterations in XPCJSRuntime::SyncXPCContextList (4.47 KB, patch)
2001-10-10 03:24 PDT, David Bradley
no flags Details | Diff | Splinter Review
Updated with jband's comments (4.75 KB, patch)
2001-10-15 08:24 PDT, David Bradley
jband_mozilla: review+
jst: superreview+
Details | Diff | Splinter Review

Description Damian Yerrick 2001-10-08 08:48:17 PDT
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)
BuildID:    2001100503

When I open a page with hundreds of <iframe> elements, Mozilla illegal ops and 
brings up Talkback.

Reproducible: Always
Steps to Reproduce:
1. Set Mozilla as your default Internet browser.
2. Open outer.html from the test case in a forthcoming attachment.


Actual Results:  MOZILLA caused an invalid page fault in
module MSVCRT.DLL at 015f:7800d16a.
(Continued in another forthcoming attachment.)

Expected Results:  Page is rendered, with each iframe usable.

This computer has 64 MB of RAM.
IE 6 has no problem with this page.

Talkback incident ids include
TB36308553G, TB36417833Q, and TB36418544X.
Comment 1 Damian Yerrick 2001-10-08 08:49:33 PDT
Created attachment 52518 [details]
MOZILLA caused an invalid page fault...
Comment 2 Damian Yerrick 2001-10-08 08:51:15 PDT
Created attachment 52519 [details]
Initial test case with 500 iframes
Comment 3 Boris Zbarsky [:bz] 2001-10-08 10:35:23 PDT
worksforme, linux build 2001-10-07-06 linux build.

I _do_ use about 90M ram to view that page, though.  Could you be just running
out of memory?  what are your virtual memory settings?
Comment 4 Christopher Hoess (gone) 2001-10-08 13:56:37 PDT
->evaughan, who took pollmann's bugs.
Comment 5 James Paige 2001-10-08 15:06:55 PDT
Wow. that was fun. 25 minutes of solid disk-thrashing, but no crash. Build
2001110803 Windows 95 (on a machine with 64 mb ram)
Comment 6 Olivier Cahagne 2001-10-08 15:22:17 PDT
wfm using build 2001100803 on Win2k. Albeit slow though.
Comment 7 David Gerard 2001-10-08 23:56:20 PDT
Slow is the word. I just tried a version with 1000 iframes (based on the test
case, but twice as long). WinNT4sp6a, PIII-600, 320 meg memory.

This was a wonderful stress test to watch in action. It took about 20 minutes of
CPU time. The VM size started at 190 meg, peaked at about 250 meg, slooooooowly
dropped to 210 meg, then climbed again to 290 meg. Stayed steady there for a
while and took several minutes before it was ready to display the page.

It would be interesting to know what was happening at what time - what was using
all that memory, what was being freed, etc.

The page with hundreds of iframes scrolls very slowly.
Closing the 1000-iframe page is even more fun: the VM size drops, drip by drip,
slowly, torturously, a few hundred K at a time, from 290 meg to 224 meg, over
the course of 17 minutes CPU time.

Note that opening or closing the page is mostly CPU-intensive - it uses a lot
of memory, but doesn't get the system thrashing; it's the CPU that's getting the
workout.

This test case is a wonderful Bad Example for performance testing under
pathological conditions, and will doubtless be rich in pointers to places that
could do with tightening up. In the meantime, I'm not surprised it died
horribly under Win98 ...

This one is a WFM too; perhaps it should be a different bug, a pathological
test case for anyone wishing to torture their machine.
Comment 8 David Gerard 2001-10-09 00:00:52 PDT
Created attachment 52692 [details]
Test case zip file, this time with 1000 iframes
Comment 9 [:jesup] on pto until 2016/8/1 Randell Jesup 2001-10-09 10:57:47 PDT
Even if this isn't a crasher, it's a perf bug.  Confirming on that basis.
Comment 10 Boris Zbarsky [:bz] 2001-10-09 11:51:23 PDT
Created attachment 52742 [details]
Profile of attachment 52519 [details]
Comment 11 Boris Zbarsky [:bz] 2001-10-09 11:52:42 PDT
Is it me, or are we spending lots of time locking and unlocking?  Not to mention
allocating memory for 500 docshells, and the like, of course....
Comment 12 [:jesup] on pto until 2016/8/1 Randell Jesup 2001-10-09 14:23:15 PDT
NS_NewHTMLContentSink ->
....
XPCJSRuntime::SyncXPCContextList ->
JS_DHashTableEnumerate ->
KillDeadContextsCB ->
JS_ContextIterator ->
js_UnlockRuntime/js_LockRuntime

is close to 1/2 the runtime, most of it in Lock/Unlock.


Much of the rest seems to be creating (root) frames and handling ContentInserted.
Comment 13 Boris Zbarsky [:bz] 2001-10-09 15:06:29 PDT
hmm.. To be precise, we have:

HTMLContentSink::Init ->
IsScriptEnabled ->
nsDocShell::GetScriptGlobalObject ->
nsDocShell::EnsureScriptEnvironment

Most of the time seems to be spent under there.  Over to XPConnect, ccing jst
and jband
Comment 14 Dennis Haney 2001-10-09 16:32:31 PDT
Created attachment 52811 [details]
A plot of memory usage on a Athlon 900/ 384Mb ram. Linux kernel 2.4.9. 300 sek to open; another 1200 to close
Comment 15 David Bradley 2001-10-09 22:17:10 PDT
I've come up with a patch that will reduce the number of calls to
JS_ContextIterator from about 5 mill to a little under 300,000. I need to do
some more testing and timings to make sure the numbers are correct and that
things behave normally.
Comment 16 John Bandhauer 2001-10-10 00:20:54 PDT
I'm happy to look at the code even if it is not for sure. 

Ultimately it *might* be best to just add hooks in the JS engine so xpconnect 
can track JSContext creation and destruction when that happens and not some time 
later.

There were subtle problems recycled JSContext pointers in the past when we we 
not calling SyncXPCContextList as often. 

On the other hand, we are pretty close to being able to not be surprized at all 
by JSContext construction and destruction. One really can't use xpconnect on a 
JSContext for which initClasses or initClassesWithNewWrappedGlobal has not been 
called. And, now that we have nsIXPConnect::releaseJSContext we could make it 
so that *all* destroyers of JSContext (that use xpconnect) call xpconnect to do 
their JSContext destruction. Once we can *trust* that xpconnect is being 
correctly informed about these events then we really don't need to do the 
aggressive work done by SyncXPCContextList (except maybe as a DEBUG only 
verification). I think that may be the best way to go.
Comment 17 David Bradley 2001-10-10 03:22:47 PDT
The following patch seems to reduce the number of context iterations. I think we
still have issues on the cleanup side. This patch basically puts in place a
marking scheme for the context map. Rather than iterating all the context in the
map and then for each one of those iterating the JS's contexts I used a mark and
sweep scheme. This consolidates the iteration of JS context to one loop and the
iteration of the context map to one loop. I got widely different numbers the two
tests runs I made with the original code. They ranged from 5 million to 50
million calls to JS_ContextIterator. In both runs with the patch I came in under
300k of calls. In any case, even going from 5 million to 300k is a decent
improvement.

This patch retains the same logic so is fairly low risk, assuming I coded it
correctly. jband's suggestion is probably the better ultimate solution. So I
think it's worth using this patch until we can validate we won't be surprised by
JSContext construction/destruction.
Comment 18 David Bradley 2001-10-10 03:24:10 PDT
Created attachment 52882 [details] [diff] [review]
Consolidates context iterations in XPCJSRuntime::SyncXPCContextList
Comment 19 David Gerard 2001-10-10 19:59:50 PDT
Just to make performance testers cry: I tried my 1000-iframe example in
IE 5.5. Same machine as before (PIII-600, 320 meg, WinNT4sp6a).

Time to render file: 9 seconds (Mozilla: 20 minutes), not taking 100%
CPU either.

Time to close window: straight away (Mozilla: 17 minutes at 100% CPU).

Ouch.
Anyone got IE 6 to hand?

(I can see it now: Microsoft HTML how-to: "Yes, hundreds of iframes on
a page are the new standard ...")
Comment 20 David Bradley 2001-10-10 20:41:57 PDT
Ok, a little follow up for today's findings.

I ran another series of tests. Basically the patch reduces the number calls to
js_ContextIterator by about 50 million. Quantify says the reduction in overall
time is about 75%. It's hard to get an identical run with Win2k. So many things
can affect performance. I'm a bit skeptical about that number I think it's less.

Leaving the page is supposedly faster, Quantify reports a 66% decrease in time.
Like above this is suspect.

Other than WaitForSingleObject blocking, SetWindowsPos, ReleaseSemaphore, new,
malloc, TlsGetValue are next on the list.

Not much I can do about SetWindowPos. 

The ReleaseSemaphore is used  via, 

NsThreadPool::DispatchRequest -\
                                \
PR_ExitMonitor --------------------> PR_Unlock -> PR_MD_UNLOCK
                                /
Js_UnlockRuntime =============-/

PR_ExitMonitor is called mainly by the GetNewOrUsed and a few other functions in
XPConnect and nsComponentManagerImpl::GetService

I know for XPConnect this locking just needs to be simple mutex/critical
section. I'm wondering why we're using such a heavy locking function to guard
access to maps and such. (I suspect there's something I'm missing)

Some other observations. RuleHash::AppendToTable gets called 329.658 times. This
seems to emanate from EnumRulesMatching. EnumRulesMatching is called 30,000
times, and everything else propigates from there.

The other thing that probably needs to be looked at is the memory consumption.
It takes about 90 megs to display this page. I don't know if that's acceptable
or not.
Comment 21 John Bandhauer 2001-10-11 00:04:01 PDT
dbradley: The patch looks good to me. Seems like a fine place for mark'n'sweep 
to me. Had I been in a mark'n'sweep state of mind when I wrote that code I may 
have thought of it myself. Thanks. I have only nits to complain about...

- might as well rename "KillDeadContextsCB" to "SweepContexts"
- please don't infect xpconnect with "if (" :)
- If it were me, I'd just do "MARK_FLAG = 4" with a comment about avoiding 
overlap if anything changes. There's some (slight) chance some compiler might 
want to use a smaller data type for the enum - no reason to expand unnecessarly.

On locking... xpconnect is using a monitor rather than a lock because there are 
a couple of places where the lock needs to be reentrant. The extra cost (over 
just using the underlying PRLock) is just an extra call to get the current 
thread (maps down to TlsGetValue). Locking and unlocking a PRLock (on Win32) 
maps down to Enter/LeaveCritivalSection. NSPR is structured to use funtion calls 
where we might prefer macros. But this is not really all that heavyweight. It is 
part of the cost we pay for cross platform

On comparing to IE... I speculate that IE is lazier about setting up a JS 
execution environment in iframes. I'd bet that if one ran some JS code in each 
iframe then the difference would be less dramatic (after we fix the glaring 
bits).

This, of course, leads to the question: could we get away with being lazier 
about initializing JSContexts etc. ?
Comment 22 David Bradley 2001-10-15 08:24:51 PDT
Created attachment 53569 [details] [diff] [review]
Updated with jband's comments
Comment 23 David Bradley 2001-10-15 09:31:50 PDT
Regarding the locking. The only thing that peaked my interest was the 511 calls
to ReleaseSemaphore, which seemed to cost a fair bit of time. This is only done
if asked for, so not a problem for the majority of locks/unlocks.
Comment 24 John Bandhauer 2001-10-15 09:57:04 PDT
Comment on attachment 53569 [details] [diff] [review]
Updated with jband's comments

r/sr=jband
Comment 25 Johnny Stenback (:jst, jst@mozilla.com) 2001-10-15 11:25:16 PDT
Comment on attachment 53569 [details] [diff] [review]
Updated with jband's comments

r/sr=jst
Comment 26 David Bradley 2001-10-17 08:44:09 PDT
Patch checked in (53569)

There's still more to be addressed so leaving this open.
Comment 27 David Bradley 2001-10-22 17:58:16 PDT
I did a little more digging using Purify. We are now to the point where we need
to look at reducing the complexity of data structures in some fashion to
conserve memory as well as increase speed. In any case this is outside the realm
of xpconnect and needs to be reassigned. One of the larger users of memory is
RuleHash::AppendRuleToTable. This allocates around 400k for 50 iFrames. I cut
the original sample of 500 iFrames down to 50 so I could run things in a
reasonable amount of time. The following are some other large consumers of memory:

[I] MIU: Memory use of 209865 bytes from 51 blocks allocated in PR_Malloc
        Distribution of blocks in use
        Allocation location
            malloc         [msvcrt.DLL]
            PR_Malloc      [prmem.c:50]
            PresShell::AllocateFrame(UINT,void * *) [nsPresShell.cpp:1789]
            nsStyleContext::SetStyle(nsStyleStructID,nsStyleStruct const&)
[nsStyleContext.cpp:465]
            nsRuleNode::WalkRuleTree(nsStyleStructID,nsIStyleContext
*,nsRuleData *,nsCSSStruct *) [nsRuleNode.cpp:1399]
            nsRuleNode::GetVisibilityData(nsIStyleContext *) [nsRuleNode.cpp:1077]
            nsRuleNode::GetStyleData(nsStyleStructID,nsIStyleContext *)
[nsRuleNode.cpp:4249]
            nsStyleContext::GetStyleData(nsStyleStructID) [nsStyleContext.cpp:384]
            nsBoxFrameInner::CacheAttributes(void) [nsBoxFrame.cpp:420]
            nsBoxFrame::Init(nsIPresContext *,nsIContent *,nsIFrame
*,nsIStyleContext *,nsIFrame *) [nsBoxFrame.cpp:377]

[I] MIU: Memory use of 209865 bytes from 51 blocks allocated in PR_Malloc
        Distribution of blocks in use
        Allocation location
            malloc         [msvcrt.DLL]
            PR_Malloc      [prmem.c:50]
            PresShell::AllocateFrame(UINT,void * *) [nsPresShell.cpp:1789]
            nsRuleNode::CreateRootNode(nsIPresContext *,nsIRuleNode * *)
[nsRuleNode.cpp:370]
            StyleSetImpl::ResolvePseudoStyleFor(nsIPresContext *,nsIContent
*,nsIAtom *,nsIStyleContext *,int,nsICSSPseudoComparator *) [nsStyleSet.cpp:942]
            nsPresContext::ResolvePseudoStyleWithComparator(nsIContent *,nsIAtom
*,nsIStyleContext *,int,nsICSSPseudoComparator *,nsIStyleContext * *)
[nsPresContext.obj:906]
            nsPresContext::ResolvePseudoStyleContextFor(nsIContent *,nsIAtom
*,nsIStyleContext *,int,nsIStyleContext * *) [nsPresContext.obj:884]
            nsCSSFrameConstructor::ConstructRootFrame(nsIPresShell
*,nsIPresContext *,nsIContent *,nsIFrame *&) [nsCSSFrameConstructor.obj:3500]
            StyleSetImpl::ConstructRootFrame(nsIPresContext *,nsIContent
*,nsIFrame *&) [nsStyleSet.cpp:1183]
            PresShell::InitialReflow(int,int) [nsPresShell.cpp:2619]

[I] MIU: Memory use of 208896 bytes from 51 blocks allocated in
nsSupportsArray::GrowArrayBy(int)
        Distribution of blocks in use
        Allocation location
            new(UINT)      [msvcrt.DLL]
            nsSupportsArray::GrowArrayBy(int) [nsSupportsArray.obj:194]
            CSSRuleProcessor::GetRuleCascade(nsIPresContext *,nsIAtom *)
[nsCSSStyleSheet.cpp:4492]
            CSSRuleProcessor::RulesMatching(nsIPresContext *,nsIAtom
*,nsIContent *,nsIAtom *,nsIStyleContext *,nsICSSPseudoComparator
*,nsIRuleWalker *) [nsCSSStyleSheet.cpp:4063]
            EnumPseudoRulesMatching [nsStyleSet.cpp:912]
            StyleSetImpl::ResolvePseudoStyleFor(nsIPresContext *,nsIContent
*,nsIAtom *,nsIStyleContext *,int,nsICSSPseudoComparator *) [nsStyleSet.cpp:945]
            nsPresContext::ResolvePseudoStyleWithComparator(nsIContent *,nsIAtom
*,nsIStyleContext *,int,nsICSSPseudoComparator *,nsIStyleContext * *)
[nsPresContext.obj:906]
            nsPresContext::ResolvePseudoStyleContextFor(nsIContent *,nsIAtom
*,nsIStyleContext *,int,nsIStyleContext * *) [nsPresContext.obj:884]
            nsCSSFrameConstructor::ConstructRootFrame(nsIPresShell
*,nsIPresContext *,nsIContent *,nsIFrame *&) [nsCSSFrameConstructor.obj:3500]
            StyleSetImpl::ConstructRootFrame(nsIPresContext *,nsIContent
*,nsIFrame *&) [nsStyleSet.cpp:1183]
[I] MIU: Memory use of 209865 bytes from 51 blocks allocated in PR_Malloc
        Distribution of blocks in use
        Allocation location
            malloc         [msvcrt.DLL]
            PR_Malloc      [prmem.c:50]
            PresShell::AllocateFrame(UINT,void * *) [nsPresShell.cpp:1789]
            nsRuleNode::Transition(nsIStyleRule *,nsIRuleNode * *)
[nsRuleNode.cpp:414]
            nsRuleWalker::Forward(nsIStyleRule *) [nsRuleWalker.cpp:95]
            ContentEnumFunc [nsCSSStyleSheet.obj:3917]
            RuleHash::EnumerateAllRules(int,nsIAtom *,nsIAtom *,nsVoidArray
const&,(*)(nsICSSStyleRule *,void *),void *) [nsCSSStyleSheet.cpp:634]
            CSSRuleProcessor::RulesMatching(nsIPresContext *,nsIAtom
*,nsIContent *,nsIStyleContext *,nsIRuleWalker *) [nsCSSStyleSheet.cpp:3965]
            EnumRulesMatching [nsStyleSet.cpp:766]
            StyleSetImpl::ResolveStyleFor(nsIPresContext *,nsIContent
*,nsIStyleContext *,int) [nsStyleSet.cpp:866]

[I] MIU: Memory use of 205750 bytes from 50 blocks allocated in PR_Malloc
        Distribution of blocks in use
        Allocation location
            malloc         [msvcrt.DLL]
            PR_Malloc      [prmem.c:50]
            PresShell::AllocateFrame(UINT,void * *) [nsPresShell.cpp:1789]
            nsRuleNode::ComputeBorderData(nsStyleStruct *,nsCSSStruct
const&,nsIStyleContext *,nsRuleNode *,RuleDetail::nsRuleNode const&,int)
[nsRuleNode.cpp:3083]
            nsRuleNode::WalkRuleTree(nsStyleStructID,nsIStyleContext
*,nsRuleData *,nsCSSStruct *) [nsRuleNode.cpp:1411]
            nsRuleNode::GetBorderData(nsIStyleContext *) [nsRuleNode.cpp:1186]
            nsRuleNode::GetStyleData(nsStyleStructID,nsIStyleContext *)
[nsRuleNode.cpp:4249]
            nsStyleContext::GetStyleData(nsStyleStructID) [nsStyleContext.cpp:384]
            nsStyleContext::CalcStyleDifference(nsIStyleContext *,int&,int)
[nsStyleContext.cpp:670]
            CaptureChange  [nsFrameManager.cpp:1551]

[I] MIU: Memory use of 131088 bytes from 2 blocks allocated in
orkinHeap::Alloc(nsIMdbEnv *,UINT,void * *)
        Distribution of blocks in use
        Allocation location
            new(UINT)      [msvcrt.DLL]
            orkinHeap::Alloc(nsIMdbEnv *,UINT,void * *) [orkinHeap.cpp:91]
            morkZone::zone_grow_at(morkEnv *,UINT) [morkZone.cpp:242]
            morkAtomSpace::MakeBookAtomCopyWithAid(morkEnv *,morkFarBookAtom
const&,UINT) [morkAtomSpace.cpp:201]
            morkStore::AddAlias(morkEnv *,morkMid const&,UINT) [morkStore.cpp:999]
            morkBuilder::OnAlias(morkEnv *,morkSpan const&,morkMid const&)
[morkBuilder.cpp:649]
            morkParser::ReadAlias(morkEnv *) [morkParser.cpp:1056]
            morkParser::ReadDict(morkEnv *) [morkParser.cpp:1336]
            ???            [ip=0x08e7e16c]
            morkParser::ReadContent(morkEnv *,BYTE) [morkParser.cpp:1428]
Comment 28 David Bradley 2001-10-23 05:56:40 PDT
Changing subject to better reflect the focus of this bug.
Reassigning, since the XPConnect issue has been addressed.
Purify/Quantify evidence points to CSS/Rules
Comment 29 Phil Schwartau 2001-10-23 11:01:23 PDT
Setting component to Compositor to reflect the new summary and new owner, 
kmcclusk@netscape.com
Comment 30 David Bradley 2001-10-25 05:26:20 PDT
I think bug 104336 may be of interest here.
Comment 31 Kevin McCluskey (gone) 2001-11-29 23:09:38 PST
Not sure why this was set to Compositor?

Changing to HTMLFrames. Downgrading severity from critical and removing crash. 
This is an extreme edge case and it doesn't crash if you have enough memory.

It took 70 seconds to load on AMD 750MHZ running WINXP. It rendered properly
afterward.
Comment 32 Kevin McCluskey (gone) 2002-01-30 11:09:57 PST
Bulk re-assigning all of Eric's HTMLFrame bugs to John.
Comment 33 John Keiser (jkeiser) 2002-12-16 17:22:26 PST
Performance problems with hundreds of iframes is not a priority right now.
Comment 34 Peter van der Woude [:Peter6] 2003-12-08 16:07:04 PST
Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.6b) Gecko/20031124
Firebird/0.7+ (Nova: SVG,MNG,DOMi,Venkman)

The problem starts at around 40-45 iframes, which may look a lot, but I have
some pages where I need more than that .

same problem in Mozilla-Firebird and K-Meleon
Comment 35 Peter van der Woude [:Peter6] 2004-04-29 15:51:27 PDT
(In reply to comment #33)
> Performance problems with hundreds of iframes is not a priority right now.
Would it be now , more than a year later ?
It is not that the iframes won't render, but the iframe content won't, unless
you resize the window they are in

testcase: 
Open
http://home.planet.nl/~pa1six/mag_eur/mag_eur.html
push on the 1st tab (left column)
scroll down the tabs and select on of the last tabs.
You should see "nothing" happening.
Rightclick inside the displayed frame and notice the weird popup behaviour (side
effect)
Now marginally resize the window, there is the content!
Rge rightframe does scoll properly, but somehow Gecko is tolatlly unable to
render the content w/o extra tricks.
Comment 36 Brent Garber 2004-10-04 14:43:30 PDT
*** Bug 249823 has been marked as a duplicate of this bug. ***

Note You need to log in before you can comment on or make changes to this bug.