Open Bug 103649 Opened 23 years ago Updated 2 years ago

Performance problems when rendering a page with hundreds of iframes

Categories

(Core :: Layout: Images, Video, and HTML Frames, defect, P4)

x86
Windows 98
defect

Tracking

()

Future

People

(Reporter: d_yerrick, Unassigned)

References

Details

(Keywords: perf, testcase)

Attachments

(6 files, 1 obsolete file)

From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)
BuildID:    2001100503

When I open a page with hundreds of <iframe> elements, Mozilla illegal ops and 
brings up Talkback.

Reproducible: Always
Steps to Reproduce:
1. Set Mozilla as your default Internet browser.
2. Open outer.html from the test case in a forthcoming attachment.


Actual Results:  MOZILLA caused an invalid page fault in
module MSVCRT.DLL at 015f:7800d16a.
(Continued in another forthcoming attachment.)

Expected Results:  Page is rendered, with each iframe usable.

This computer has 64 MB of RAM.
IE 6 has no problem with this page.

Talkback incident ids include
TB36308553G, TB36417833Q, and TB36418544X.
worksforme, linux build 2001-10-07-06 linux build.

I _do_ use about 90M ram to view that page, though.  Could you be just running
out of memory?  what are your virtual memory settings?
->evaughan, who took pollmann's bugs.
Assignee: pollmann → evaughan
Keywords: crash, testcase
Wow. that was fun. 25 minutes of solid disk-thrashing, but no crash. Build
2001110803 Windows 95 (on a machine with 64 mb ram)
wfm using build 2001100803 on Win2k. Albeit slow though.
Slow is the word. I just tried a version with 1000 iframes (based on the test
case, but twice as long). WinNT4sp6a, PIII-600, 320 meg memory.

This was a wonderful stress test to watch in action. It took about 20 minutes of
CPU time. The VM size started at 190 meg, peaked at about 250 meg, slooooooowly
dropped to 210 meg, then climbed again to 290 meg. Stayed steady there for a
while and took several minutes before it was ready to display the page.

It would be interesting to know what was happening at what time - what was using
all that memory, what was being freed, etc.

The page with hundreds of iframes scrolls very slowly.
Closing the 1000-iframe page is even more fun: the VM size drops, drip by drip,
slowly, torturously, a few hundred K at a time, from 290 meg to 224 meg, over
the course of 17 minutes CPU time.

Note that opening or closing the page is mostly CPU-intensive - it uses a lot
of memory, but doesn't get the system thrashing; it's the CPU that's getting the
workout.

This test case is a wonderful Bad Example for performance testing under
pathological conditions, and will doubtless be rich in pointers to places that
could do with tightening up. In the meantime, I'm not surprised it died
horribly under Win98 ...

This one is a WFM too; perhaps it should be a different bug, a pathological
test case for anyone wishing to torture their machine.
Even if this isn't a crasher, it's a perf bug.  Confirming on that basis.
Blocks: 71668
Status: UNCONFIRMED → NEW
Ever confirmed: true
Keywords: perf
Is it me, or are we spending lots of time locking and unlocking?  Not to mention
allocating memory for 500 docshells, and the like, of course....
NS_NewHTMLContentSink ->
....
XPCJSRuntime::SyncXPCContextList ->
JS_DHashTableEnumerate ->
KillDeadContextsCB ->
JS_ContextIterator ->
js_UnlockRuntime/js_LockRuntime

is close to 1/2 the runtime, most of it in Lock/Unlock.


Much of the rest seems to be creating (root) frames and handling ContentInserted.
hmm.. To be precise, we have:

HTMLContentSink::Init ->
IsScriptEnabled ->
nsDocShell::GetScriptGlobalObject ->
nsDocShell::EnsureScriptEnvironment

Most of the time seems to be spent under there.  Over to XPConnect, ccing jst
and jband
Assignee: evaughan → dbradley
Component: HTMLFrames → XPConnect
QA Contact: amar → pschwartau
I've come up with a patch that will reduce the number of calls to
JS_ContextIterator from about 5 mill to a little under 300,000. I need to do
some more testing and timings to make sure the numbers are correct and that
things behave normally.
Status: NEW → ASSIGNED
I'm happy to look at the code even if it is not for sure. 

Ultimately it *might* be best to just add hooks in the JS engine so xpconnect 
can track JSContext creation and destruction when that happens and not some time 
later.

There were subtle problems recycled JSContext pointers in the past when we we 
not calling SyncXPCContextList as often. 

On the other hand, we are pretty close to being able to not be surprized at all 
by JSContext construction and destruction. One really can't use xpconnect on a 
JSContext for which initClasses or initClassesWithNewWrappedGlobal has not been 
called. And, now that we have nsIXPConnect::releaseJSContext we could make it 
so that *all* destroyers of JSContext (that use xpconnect) call xpconnect to do 
their JSContext destruction. Once we can *trust* that xpconnect is being 
correctly informed about these events then we really don't need to do the 
aggressive work done by SyncXPCContextList (except maybe as a DEBUG only 
verification). I think that may be the best way to go.
The following patch seems to reduce the number of context iterations. I think we
still have issues on the cleanup side. This patch basically puts in place a
marking scheme for the context map. Rather than iterating all the context in the
map and then for each one of those iterating the JS's contexts I used a mark and
sweep scheme. This consolidates the iteration of JS context to one loop and the
iteration of the context map to one loop. I got widely different numbers the two
tests runs I made with the original code. They ranged from 5 million to 50
million calls to JS_ContextIterator. In both runs with the patch I came in under
300k of calls. In any case, even going from 5 million to 300k is a decent
improvement.

This patch retains the same logic so is fairly low risk, assuming I coded it
correctly. jband's suggestion is probably the better ultimate solution. So I
think it's worth using this patch until we can validate we won't be surprised by
JSContext construction/destruction.
Priority: -- → P1
Target Milestone: --- → mozilla0.9.6
Just to make performance testers cry: I tried my 1000-iframe example in
IE 5.5. Same machine as before (PIII-600, 320 meg, WinNT4sp6a).

Time to render file: 9 seconds (Mozilla: 20 minutes), not taking 100%
CPU either.

Time to close window: straight away (Mozilla: 17 minutes at 100% CPU).

Ouch.
Anyone got IE 6 to hand?

(I can see it now: Microsoft HTML how-to: "Yes, hundreds of iframes on
a page are the new standard ...")
Ok, a little follow up for today's findings.

I ran another series of tests. Basically the patch reduces the number calls to
js_ContextIterator by about 50 million. Quantify says the reduction in overall
time is about 75%. It's hard to get an identical run with Win2k. So many things
can affect performance. I'm a bit skeptical about that number I think it's less.

Leaving the page is supposedly faster, Quantify reports a 66% decrease in time.
Like above this is suspect.

Other than WaitForSingleObject blocking, SetWindowsPos, ReleaseSemaphore, new,
malloc, TlsGetValue are next on the list.

Not much I can do about SetWindowPos. 

The ReleaseSemaphore is used  via, 

NsThreadPool::DispatchRequest -\
                                \
PR_ExitMonitor --------------------> PR_Unlock -> PR_MD_UNLOCK
                                /
Js_UnlockRuntime =============-/

PR_ExitMonitor is called mainly by the GetNewOrUsed and a few other functions in
XPConnect and nsComponentManagerImpl::GetService

I know for XPConnect this locking just needs to be simple mutex/critical
section. I'm wondering why we're using such a heavy locking function to guard
access to maps and such. (I suspect there's something I'm missing)

Some other observations. RuleHash::AppendToTable gets called 329.658 times. This
seems to emanate from EnumRulesMatching. EnumRulesMatching is called 30,000
times, and everything else propigates from there.

The other thing that probably needs to be looked at is the memory consumption.
It takes about 90 megs to display this page. I don't know if that's acceptable
or not.
dbradley: The patch looks good to me. Seems like a fine place for mark'n'sweep 
to me. Had I been in a mark'n'sweep state of mind when I wrote that code I may 
have thought of it myself. Thanks. I have only nits to complain about...

- might as well rename "KillDeadContextsCB" to "SweepContexts"
- please don't infect xpconnect with "if (" :)
- If it were me, I'd just do "MARK_FLAG = 4" with a comment about avoiding 
overlap if anything changes. There's some (slight) chance some compiler might 
want to use a smaller data type for the enum - no reason to expand unnecessarly.

On locking... xpconnect is using a monitor rather than a lock because there are 
a couple of places where the lock needs to be reentrant. The extra cost (over 
just using the underlying PRLock) is just an extra call to get the current 
thread (maps down to TlsGetValue). Locking and unlocking a PRLock (on Win32) 
maps down to Enter/LeaveCritivalSection. NSPR is structured to use funtion calls 
where we might prefer macros. But this is not really all that heavyweight. It is 
part of the cost we pay for cross platform

On comparing to IE... I speculate that IE is lazier about setting up a JS 
execution environment in iframes. I'd bet that if one ran some JS code in each 
iframe then the difference would be less dramatic (after we fix the glaring 
bits).

This, of course, leads to the question: could we get away with being lazier 
about initializing JSContexts etc. ?
Regarding the locking. The only thing that peaked my interest was the 511 calls
to ReleaseSemaphore, which seemed to cost a fair bit of time. This is only done
if asked for, so not a problem for the majority of locks/unlocks.
Attachment #52882 - Attachment is obsolete: true
Comment on attachment 53569 [details] [diff] [review]
Updated with jband's comments

r/sr=jband
Attachment #53569 - Flags: review+
Comment on attachment 53569 [details] [diff] [review]
Updated with jband's comments

r/sr=jst
Attachment #53569 - Flags: superreview+
Patch checked in (53569)

There's still more to be addressed so leaving this open.
I did a little more digging using Purify. We are now to the point where we need
to look at reducing the complexity of data structures in some fashion to
conserve memory as well as increase speed. In any case this is outside the realm
of xpconnect and needs to be reassigned. One of the larger users of memory is
RuleHash::AppendRuleToTable. This allocates around 400k for 50 iFrames. I cut
the original sample of 500 iFrames down to 50 so I could run things in a
reasonable amount of time. The following are some other large consumers of memory:

[I] MIU: Memory use of 209865 bytes from 51 blocks allocated in PR_Malloc
        Distribution of blocks in use
        Allocation location
            malloc         [msvcrt.DLL]
            PR_Malloc      [prmem.c:50]
            PresShell::AllocateFrame(UINT,void * *) [nsPresShell.cpp:1789]
            nsStyleContext::SetStyle(nsStyleStructID,nsStyleStruct const&)
[nsStyleContext.cpp:465]
            nsRuleNode::WalkRuleTree(nsStyleStructID,nsIStyleContext
*,nsRuleData *,nsCSSStruct *) [nsRuleNode.cpp:1399]
            nsRuleNode::GetVisibilityData(nsIStyleContext *) [nsRuleNode.cpp:1077]
            nsRuleNode::GetStyleData(nsStyleStructID,nsIStyleContext *)
[nsRuleNode.cpp:4249]
            nsStyleContext::GetStyleData(nsStyleStructID) [nsStyleContext.cpp:384]
            nsBoxFrameInner::CacheAttributes(void) [nsBoxFrame.cpp:420]
            nsBoxFrame::Init(nsIPresContext *,nsIContent *,nsIFrame
*,nsIStyleContext *,nsIFrame *) [nsBoxFrame.cpp:377]

[I] MIU: Memory use of 209865 bytes from 51 blocks allocated in PR_Malloc
        Distribution of blocks in use
        Allocation location
            malloc         [msvcrt.DLL]
            PR_Malloc      [prmem.c:50]
            PresShell::AllocateFrame(UINT,void * *) [nsPresShell.cpp:1789]
            nsRuleNode::CreateRootNode(nsIPresContext *,nsIRuleNode * *)
[nsRuleNode.cpp:370]
            StyleSetImpl::ResolvePseudoStyleFor(nsIPresContext *,nsIContent
*,nsIAtom *,nsIStyleContext *,int,nsICSSPseudoComparator *) [nsStyleSet.cpp:942]
            nsPresContext::ResolvePseudoStyleWithComparator(nsIContent *,nsIAtom
*,nsIStyleContext *,int,nsICSSPseudoComparator *,nsIStyleContext * *)
[nsPresContext.obj:906]
            nsPresContext::ResolvePseudoStyleContextFor(nsIContent *,nsIAtom
*,nsIStyleContext *,int,nsIStyleContext * *) [nsPresContext.obj:884]
            nsCSSFrameConstructor::ConstructRootFrame(nsIPresShell
*,nsIPresContext *,nsIContent *,nsIFrame *&) [nsCSSFrameConstructor.obj:3500]
            StyleSetImpl::ConstructRootFrame(nsIPresContext *,nsIContent
*,nsIFrame *&) [nsStyleSet.cpp:1183]
            PresShell::InitialReflow(int,int) [nsPresShell.cpp:2619]

[I] MIU: Memory use of 208896 bytes from 51 blocks allocated in
nsSupportsArray::GrowArrayBy(int)
        Distribution of blocks in use
        Allocation location
            new(UINT)      [msvcrt.DLL]
            nsSupportsArray::GrowArrayBy(int) [nsSupportsArray.obj:194]
            CSSRuleProcessor::GetRuleCascade(nsIPresContext *,nsIAtom *)
[nsCSSStyleSheet.cpp:4492]
            CSSRuleProcessor::RulesMatching(nsIPresContext *,nsIAtom
*,nsIContent *,nsIAtom *,nsIStyleContext *,nsICSSPseudoComparator
*,nsIRuleWalker *) [nsCSSStyleSheet.cpp:4063]
            EnumPseudoRulesMatching [nsStyleSet.cpp:912]
            StyleSetImpl::ResolvePseudoStyleFor(nsIPresContext *,nsIContent
*,nsIAtom *,nsIStyleContext *,int,nsICSSPseudoComparator *) [nsStyleSet.cpp:945]
            nsPresContext::ResolvePseudoStyleWithComparator(nsIContent *,nsIAtom
*,nsIStyleContext *,int,nsICSSPseudoComparator *,nsIStyleContext * *)
[nsPresContext.obj:906]
            nsPresContext::ResolvePseudoStyleContextFor(nsIContent *,nsIAtom
*,nsIStyleContext *,int,nsIStyleContext * *) [nsPresContext.obj:884]
            nsCSSFrameConstructor::ConstructRootFrame(nsIPresShell
*,nsIPresContext *,nsIContent *,nsIFrame *&) [nsCSSFrameConstructor.obj:3500]
            StyleSetImpl::ConstructRootFrame(nsIPresContext *,nsIContent
*,nsIFrame *&) [nsStyleSet.cpp:1183]
[I] MIU: Memory use of 209865 bytes from 51 blocks allocated in PR_Malloc
        Distribution of blocks in use
        Allocation location
            malloc         [msvcrt.DLL]
            PR_Malloc      [prmem.c:50]
            PresShell::AllocateFrame(UINT,void * *) [nsPresShell.cpp:1789]
            nsRuleNode::Transition(nsIStyleRule *,nsIRuleNode * *)
[nsRuleNode.cpp:414]
            nsRuleWalker::Forward(nsIStyleRule *) [nsRuleWalker.cpp:95]
            ContentEnumFunc [nsCSSStyleSheet.obj:3917]
            RuleHash::EnumerateAllRules(int,nsIAtom *,nsIAtom *,nsVoidArray
const&,(*)(nsICSSStyleRule *,void *),void *) [nsCSSStyleSheet.cpp:634]
            CSSRuleProcessor::RulesMatching(nsIPresContext *,nsIAtom
*,nsIContent *,nsIStyleContext *,nsIRuleWalker *) [nsCSSStyleSheet.cpp:3965]
            EnumRulesMatching [nsStyleSet.cpp:766]
            StyleSetImpl::ResolveStyleFor(nsIPresContext *,nsIContent
*,nsIStyleContext *,int) [nsStyleSet.cpp:866]

[I] MIU: Memory use of 205750 bytes from 50 blocks allocated in PR_Malloc
        Distribution of blocks in use
        Allocation location
            malloc         [msvcrt.DLL]
            PR_Malloc      [prmem.c:50]
            PresShell::AllocateFrame(UINT,void * *) [nsPresShell.cpp:1789]
            nsRuleNode::ComputeBorderData(nsStyleStruct *,nsCSSStruct
const&,nsIStyleContext *,nsRuleNode *,RuleDetail::nsRuleNode const&,int)
[nsRuleNode.cpp:3083]
            nsRuleNode::WalkRuleTree(nsStyleStructID,nsIStyleContext
*,nsRuleData *,nsCSSStruct *) [nsRuleNode.cpp:1411]
            nsRuleNode::GetBorderData(nsIStyleContext *) [nsRuleNode.cpp:1186]
            nsRuleNode::GetStyleData(nsStyleStructID,nsIStyleContext *)
[nsRuleNode.cpp:4249]
            nsStyleContext::GetStyleData(nsStyleStructID) [nsStyleContext.cpp:384]
            nsStyleContext::CalcStyleDifference(nsIStyleContext *,int&,int)
[nsStyleContext.cpp:670]
            CaptureChange  [nsFrameManager.cpp:1551]

[I] MIU: Memory use of 131088 bytes from 2 blocks allocated in
orkinHeap::Alloc(nsIMdbEnv *,UINT,void * *)
        Distribution of blocks in use
        Allocation location
            new(UINT)      [msvcrt.DLL]
            orkinHeap::Alloc(nsIMdbEnv *,UINT,void * *) [orkinHeap.cpp:91]
            morkZone::zone_grow_at(morkEnv *,UINT) [morkZone.cpp:242]
            morkAtomSpace::MakeBookAtomCopyWithAid(morkEnv *,morkFarBookAtom
const&,UINT) [morkAtomSpace.cpp:201]
            morkStore::AddAlias(morkEnv *,morkMid const&,UINT) [morkStore.cpp:999]
            morkBuilder::OnAlias(morkEnv *,morkSpan const&,morkMid const&)
[morkBuilder.cpp:649]
            morkParser::ReadAlias(morkEnv *) [morkParser.cpp:1056]
            morkParser::ReadDict(morkEnv *) [morkParser.cpp:1336]
            ???            [ip=0x08e7e16c]
            morkParser::ReadContent(morkEnv *,BYTE) [morkParser.cpp:1428]
Changing subject to better reflect the focus of this bug.
Reassigning, since the XPConnect issue has been addressed.
Purify/Quantify evidence points to CSS/Rules
Assignee: dbradley → kmcclusk
Status: ASSIGNED → NEW
Summary: Crash when rendering a page with hundreds of iframes → Performance problems when rendering a page with hundreds of iframes
Setting component to Compositor to reflect the new summary and new owner, 
kmcclusk@netscape.com
Component: XPConnect → Compositor
QA Contact: pschwartau → petersen
Status: NEW → ASSIGNED
Target Milestone: mozilla0.9.6 → mozilla1.0
I think bug 104336 may be of interest here.
Target Milestone: mozilla1.0 → mozilla1.0.1
Not sure why this was set to Compositor?

Changing to HTMLFrames. Downgrading severity from critical and removing crash. 
This is an extreme edge case and it doesn't crash if you have enough memory.

It took 70 seconds to load on AMD 750MHZ running WINXP. It rendered properly
afterward.
Assignee: kmcclusk → evaughan
Severity: critical → normal
Status: ASSIGNED → NEW
Component: Compositor → HTMLFrames
Keywords: crash
Priority: P1 → P3
Bulk re-assigning all of Eric's HTMLFrame bugs to John.
Assignee: eric → jkeiser
Performance problems with hundreds of iframes is not a priority right now.
Assignee: jkeiser → frame
Priority: P3 → P4
QA Contact: petersen → amar
Target Milestone: mozilla1.0.1 → Future
Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.6b) Gecko/20031124
Firebird/0.7+ (Nova: SVG,MNG,DOMi,Venkman)

The problem starts at around 40-45 iframes, which may look a lot, but I have
some pages where I need more than that .

same problem in Mozilla-Firebird and K-Meleon
(In reply to comment #33)
> Performance problems with hundreds of iframes is not a priority right now.
Would it be now , more than a year later ?
It is not that the iframes won't render, but the iframe content won't, unless
you resize the window they are in

testcase: 
Open
http://home.planet.nl/~pa1six/mag_eur/mag_eur.html
push on the 1st tab (left column)
scroll down the tabs and select on of the last tabs.
You should see "nothing" happening.
Rightclick inside the displayed frame and notice the weird popup behaviour (side
effect)
Now marginally resize the window, there is the content!
Rge rightframe does scoll properly, but somehow Gecko is tolatlly unable to
render the content w/o extra tricks.
*** Bug 249823 has been marked as a duplicate of this bug. ***
Assignee: core.layout.html-frames → nobody
QA Contact: amar → core.layout.html-frames
Product: Core → Core Graveyard
Component: Layout: HTML Frames → Layout: Images
Product: Core Graveyard → Core
Severity: normal → S3

The severity field for this bug is relatively low, S3. However, the bug has 10 votes.
:emilio, could you consider increasing the bug severity?

For more information, please visit auto_nag documentation.

Flags: needinfo?(emilio)

This needs an up-to-date profile.

Flags: needinfo?(emilio)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: