Closed Bug 145425 Opened 22 years ago Closed 13 years ago

extreme cpu, memory usage and time to load big HTML file (in comparsion with other browsers)

Tracking

()

Status:

RESOLVED WORKSFORME

Milestone:

Future

People

(Reporter: aha, Assigned: waterson)

References

(Depends on 1 open bug)

Details

(4 keywords, Whiteboard: [reflow-refactor])

Attachments

(1 file, 6 obsolete files)

small file to ilustrate HTML code used in big testcase (196 kB) 22 years ago Adam Hauner 191.91 KB, text/html		Details
jprof report for loading a file which was half the size of the example file at ttp://www.hauner.cz/testcasehtml.zip 22 years ago John Morrison 867.95 KB, text/html		Details
probe parent for pseudo before walking frame list, diff -uw 22 years ago Chris Waterson 1.37 KB, patch	bzbarsky : review+	Details \| Diff \| Splinter Review
waterson's patch, updated to trunk. 21 years ago Boris Zbarsky [:bzbarsky] 18.99 KB, patch		Details \| Diff \| Splinter Review
A bit more cleanup... 21 years ago Boris Zbarsky [:bzbarsky] 23.68 KB, patch	roc : review+ roc : superreview+	Details \| Diff \| Splinter Review
Updated jprof of testcase 21 years ago Boris Zbarsky [:bzbarsky] 126.56 KB, application/x-gzip		Details
Bzip2-compressed testcase. 18 years ago Boris Zbarsky [:bzbarsky] 438.03 KB, application/bzip2		Details

Adam Hauner

Reporter

Description

•

22 years ago

Linkchecker Xenu has no problem to generate several megs big HTML file as its
report. Last week generate about 30 MB big HTML files and start Mozilla to
display this report. Mozilla hangs for several minutes and its memory used
growns to stars. 

So I prepared similar structured file (32 MBs, PREformated text with lot of
links) and compared several common used browsers, below are results. Mozilla
needs 13x more time to render page than NN 4.79 and eats near to 2x more memory
than other browsers. In NN4.79 you could scroll document while loading, in
Mozilla you can't anything.

Mozilla 2002051708/trunk:
Memory used before test:		16 MB
Time to 1st displayed char:		4 s
Time to load:				302 s
Memory used peak:			239 MB
Memory on Document Done:		202 MB

Mozilla 2002051006/RC2:
Memory used before test:		16 MB
Time to 1st displayed char:		10 s
Time to load:				292 s
Memory used peak:			256 MB
Memory on Document Done:		202 MB

NN 4.79
Memory used before test:		9 MB
Time to 1st displayed char:		0 s
Time to load:				23 s
Memory used peak:			132 MB
Memory on Document Done:		130 MB

Opera 6.02 (B#1101):
Memory used before test:		10 MB
Time to 1st displayed char:		0 s
Time to load:				90 s
Memory used peak:			151 MB
Memory on Document Done:		151 MB

MSIE 5.0:
Memory used before test:		9.5 MB
Time to 1st displayed char:		1 s
Time to load:				58 s
Memory used peak:			133 MB
Memory on Document Done:		132 MB

Adam Hauner

Reporter

Comment 1

•

22 years ago

Attached file small file to ilustrate HTML code used in big testcase (196 kB) (obsolete) — Details

Adam Hauner

Reporter

Comment 2

•

22 years ago

Big testcase file (ZIP, 1.9MB) http://www.hauner.cz/testcasehtml.zip

Tested on Win2K (SP2), AMD Duron 950 MHz, 512 MB RAM. While test was constant
CPU usage by other running application (just supporting test). All browsers was
started just before separate test with blank page.

Keywords: 4xp, perf, testcase

Matthias Versen [:Matti]

Comment 3

•

22 years ago

-> Layout

Assignee: Matti → attinasi

Component: Browser-General → Layout

QA Contact: imajes-qa → petersen

Boris Zbarsky [:bzbarsky]

Comment 4

•

22 years ago

This is a duplicate.  Please search for the "slow large page" tracking bug
(those exact words).  Then dup to one of its dependencies.

Whiteboard: DUPEME

Zibi Braniecki [:zbraniecki][:gandalf]

Comment 5

•

22 years ago

Boris. 0 bugs found...

I cannot reproduce this bug on Linux. testcase load very, very fast (<1 sec)

Mozilla 1505200207 Linux redHat 7.3 KDE 1 ghz, 128 ram, 8 mbps connection.

Chris Petersen

Comment 6

•

22 years ago

Marking WFM in the OS X May 19th build (2002-05-19-05)

Status: UNCONFIRMED → RESOLVED

Closed: 22 years ago

Resolution: --- → WORKSFORME

Adam Hauner

Reporter

Comment 7

•

22 years ago

Chris: Huh? You marked this bug reported for W2K as WFM, if don't occur on Linux
and OS X? I retried it again using 2002052108/trunk with same result - 5 minutes
in hang, memory usage peak on 270 MBs, after load 205 MBs. Futhermore, I tested
this bug now on Linux comp and opening and displaying file from local disk takes
5,5 minutes (2002051920/RH7.1) and similar amount of memory.
Bug is still present, so I'm REOPENing it.

Zbigniew: Did you try small file atteched with this bug or did you download ZIP
file with real (and big) testcase?

Status: RESOLVED → UNCONFIRMED

Resolution: WORKSFORME → ---

Chris Waterson

Assignee

Comment 8

•

22 years ago

I would be willing to be that we spend a large amount of time looking up the
visited state of each of these links in global history. Reporter: could you try
this with a clean profile and see if there is any difference in performance?

Adam Hauner

Reporter

Comment 9

•

22 years ago

Chris: Using 2002052508/trunk/W2K with new profile and empty history I'm getting
same results (253 MB peak and 205 MB after page load). Do you have any other
idea, where browser could spend time?

Boris: Zbigniew Braniecki tried to find bug to duplicate, but he wasn't lucky.

(I'm changing OS to All because it occurs on Linux too - my comment 7).

OS: Windows 2000 → All

Boris Zbarsky [:bzbarsky]

Comment 10

•

22 years ago

ah, sorry.  "long" not "large".

This bug needs a profile; otherwise we're all just guessing.  I'll try to do 
that in a few weeks.

Blocks: 56854

Keywords: qawanted

Chris Waterson

Assignee

Comment 11

•

22 years ago

Taking for now. What bzbarsky says: we need a profile.

Assignee: attinasi → waterson

Status: UNCONFIRMED → NEW

Ever confirmed: true

John Morrison

Comment 12

•

22 years ago

Attached file jprof report for loading a file which was half the size of the example file at ttp://www.hauner.cz/testcasehtml.zip (obsolete) — Details

jprof was set up as 
  setenv JPROF_FLAGS "JP_DEFER JP_PERIOD=0.0015"

trunk pulled may 25

Wall clock time to load 210 seconds

John Morrison

Comment 13

•

22 years ago

The jprof info seems to point away from checking the links, although I'm not so
good at interpreting jprof output.

However, I did also test making |nsStyleUtil::IsHTMLLink| into an early return
(no-op) and it seemed to reduce wall clock time by about 10%.

(I'll also note in passing that because of 8bit chars in some of the mailto:
urls in that page, we load (on win32) msgcompo.dll and create a new nsIURI to 
parse those mail URLs just to pass them as char* to the global history to check
the link for visited state. Ugh. 4.x linux doesn't appear to bother with
:visited state for mailto: links).

John Morrison

Comment 14

•

22 years ago

(Actually, 8bit-in-mailto is too rare to worry about the extra cycles).

Boris Zbarsky [:bzbarsky]

Comment 15

•

22 years ago

So.. the first obvious culprit is nsFrameList::LastChild.  Would it make any 
sense to add a mLastChild member to nsFrameList?  Because as it stands 
appending frames has to walk the entire existing frame list, so there are
O(n^2) calls to LastChild() on pageload.... (and this starts to quickly suck 
for cases with lots of child frames).  What would the footprint cost of adding 
this member be?

Chris Waterson

Assignee

Comment 16

•

22 years ago

Every container frame would grow by a word, which isn't death but...

It looks like the only reason that we're trying to _use_ the LastChild in
nsCSSFrameConstructor::AppendFrames is to determine if there are :after pseudos
-- a fairly unlikely case which we could probably afford to handle in a less
efficient way.

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Comment 17

•

22 years ago

Did you see my proposal in 144826?

Chris Petersen

Comment 18

•

22 years ago

Changing priority to P2.

Priority: -- → P2

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Comment 19

•

22 years ago

Some thoughts about nsFrameList and adding |mLastChild|:

Do we always go through the nsFrameList appending functions rather than just
using SetNextSibling?  (Note that things done in batches within the frame
constructor already use the last child member of nsFrameItems.)

Would it be better to do something similar to what I did for the rule node child
storage:  use a tagged pointer, where the low bit being unset means that it's
just a first child, and if it's set, it points to a pair of pointers that has
first and last child?  We could then avoid switching to the latter mechanism
until we hit 8 or 16 frames or so.

Chris Waterson

Assignee

Comment 20

•

22 years ago

Attached patch probe parent for pseudo before walking frame list, diff -uw (obsolete) — Details — Splinter Review

dbaron, I'm not sure how you're proposal would make things different in this
case. Wouldn't we still need to get the last frame to look at it and see if it
was the generated frame?

I truncated the big file to 100,000 lines, and did some timing tests:

67s  Current code
59s  `#if 0' LastFrame check; always append content.
61s  Probe parent for :after pseudo, and only then check to see
     if LastFrame is the pseudo frame (this patch).

Boris Zbarsky [:bzbarsky]

Comment 21

•

22 years ago

Chris, any chance of seeing how that patch affects the pageload tests, if at 
all?

Chris Waterson

Assignee

Comment 22

•

22 years ago

On the giant test case, I get:

267s      #if 0
265s (!?) probe
308s      original

I'm just looking at the `Document: Done' number, so this isn't very scientific.
The important thing is that we do start to see an algorithmic win on really,
really, long pages.


As for page load:

Original.

Test id: 3CF6C24902
Avg. Median : 734 msec		Minimum     : 177 msec
Average     : 758 msec		Maximum     : 2345 msec
Test id: 3CF6C6D9C0
Avg. Median : 739 msec		Minimum     : 210 msec
Average     : 753 msec		Maximum     : 2353 msec


Probe.

Test id: 3CF6BCAF43
Avg. Median : 736 msec		Minimum     : 190 msec
Average     : 744 msec		Maximum     : 2292 msec
Test id: 3CF6C03F6F
Avg. Median : 734 msec		Minimum     : 179 msec
Average     : 745 msec		Maximum     : 2331 msec


So it looks like probing might be a slight win, but it's probably just in the noise.

Status: NEW → ASSIGNED

Target Milestone: --- → mozilla1.1alpha

Boris Zbarsky [:bzbarsky]

Comment 23

•

22 years ago

Comment on attachment 85661 [details] [diff] [review]
probe parent for pseudo before walking frame list, diff -uw

+  // See if the eparent has an :after pseudo-element

"the parent" right?

r=bzbarsky with that.  If it does not regress pageload I say land it and then
reprofile.

Attachment #85661 - Flags: review+

Adam Hauner

Reporter

Comment 24

•

22 years ago

Chris, could you retarget this bug as 1.1a is out?

Keywords: footprint

Chris Waterson

Assignee

Comment 25

•

22 years ago

dbaron says we ought to put a bit in the style context that indicates whether or
not a frame has a :before or :after pseudo; that way, we'll avoid doing the
pseudo-probe. Since this is only an issue for really, really, really big pages
(and it may have a negative impact on small pages), I'm going to FUTURE.

Target Milestone: mozilla1.1alpha → Future

Randell Jesup [:jesup] (needinfo me)

Comment 26

•

22 years ago

We should look more at that jprof.  There are some interesting things other than
the Append to examine.  Almost 1/2 the time is spent in laying out the data.

Random example: around 150 hits in nsFont::Equals, similar numbers in converting
text strings to single-byte for getting the width for layout.  We spend much
more than than (>5%-ish if I remember) dealing with creating and recovering
block reflow states.

Adam Hauner

Reporter

Comment 27

•

22 years ago

I re-test (with my primitive technique) giant testcase and I got these _bad_
numbers for actual nightbuild (used hardware is same):

Mozilla 200212108/trunk:
Memory used before test:		21 MB
Time to 1st displayed char:		1 s
Time to load:				689 s
Memory used peak:			313 MB
Memory on Document Done:		313 MB

Boris Zbarsky [:bzbarsky]

Updated

•

21 years ago

Blocks: 23187

Adam Hauner

Reporter

Updated

•

21 years ago

Whiteboard: DUPEME

Boris Zbarsky [:bzbarsky]

Updated

•

21 years ago

Blocks: 174359

Boris Zbarsky [:bzbarsky]

Updated

•

21 years ago

Blocks: 29805

Boris Zbarsky [:bzbarsky]

Updated

•

21 years ago

Blocks: 112858

Boris Zbarsky [:bzbarsky]

Comment 28

•

21 years ago

Attached patch waterson's patch, updated to trunk. (obsolete) — Details — Splinter Review

Attachment #85661 - Attachment is obsolete: true

Boris Zbarsky [:bzbarsky]

Comment 29

•

21 years ago

Comment on attachment 121076 [details] [diff] [review]
waterson's patch, updated to trunk.

roc, would you do the honors?

Attachment #121076 - Flags: superreview?(roc+moz)

Attachment #121076 - Flags: review?(roc+moz)

Boris Zbarsky [:bzbarsky]

Comment 30

•

21 years ago

Attached patch A bit more cleanup... (obsolete) — Details — Splinter Review

Attachment #121076 - Attachment is obsolete: true

Boris Zbarsky [:bzbarsky]

Updated

•

21 years ago

Attachment #121079 - Flags: superreview?(roc+moz)

Attachment #121079 - Flags: review?(roc+moz)

Boris Zbarsky [:bzbarsky]

Updated

•

21 years ago

Attachment #121076 - Flags: superreview?(roc+moz)

Attachment #121076 - Flags: review?(roc+moz)

Boris Zbarsky [:bzbarsky]

Comment 31

•

21 years ago

Comment on attachment 121079 [details] [diff] [review]
A bit more cleanup...

> +  NS_PRECONDITION(aContent, "Must have a content node");

This should actually be removed from nsLayoutUtils::IsGeneratedContentFor

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 32

•

21 years ago

Comment on attachment 121079 [details] [diff] [review]
A bit more cleanup...

OK, that looks fine. You could make HasPseudoStyle inline since it's really
just wrapping a single method call.

Attachment #121079 - Flags: superreview?(roc+moz)

Attachment #121079 - Flags: superreview+

Attachment #121079 - Flags: review?(roc+moz)

Attachment #121079 - Flags: review+

Boris Zbarsky [:bzbarsky]

Comment 33

•

21 years ago

Comment on attachment 121079 [details] [diff] [review]
A bit more cleanup...

removed extra assert, inlined that function, and checked in.

Time to reprofile here, I'd say....

Attachment #121079 - Attachment is obsolete: true

Markus Hübner

Updated

•

21 years ago

Blocks: 203448

mmc

Comment 34

•

21 years ago

Are there other issues implied in this bug or can it be marked fix?

Boris Zbarsky [:bzbarsky]

Comment 35

•

21 years ago

See comment 26 and comment 33 -- we need a new profile to identify any remaining
problems.

Markus Hübner

Comment 36

•

21 years ago

dbradley/bz - could you do a profile to further investigate what can be done 
here.

Boris Zbarsky [:bzbarsky]

Comment 37

•

21 years ago

Attached file Updated jprof of testcase (obsolete) — Details

Looks like a good deal of time is spent in reflow or in GetPrimaryFrameFor (the
latter all on #text nodes, in response to content being appended to those
nodes...)

Attachment #85405 - Attachment is obsolete: true

Boris Zbarsky [:bzbarsky]

Comment 38

•

21 years ago

Perhaps we need a way to avoid triggering reflow off these AppendData calls in
cases when the text node's frame has not yet been constructed (or at least the
text node's parent has not been flushed from the sink)?  Or do we have one?

Markus Hübner

Updated

•

21 years ago

Depends on: 115049

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Updated

•

21 years ago

Blocks: 147894

rbs

Updated

•

20 years ago

Depends on: 233463

Boris Zbarsky [:bzbarsky]

Updated

•

20 years ago

Depends on: 74656

Boris Zbarsky [:bzbarsky]

Updated

•

20 years ago

Depends on: 235255

Alfred Kayser

Comment 39

•

19 years ago

Can someone do a fresh test/analysis of current builds?

Jonathan Evans

Comment 40

•

19 years ago

quick load on Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a1)
Gecko/20050903 Firefox/1.6a1 with 2ghz amd 64 less than 2s. memory usage up 200k.

Marcin Garski

Comment 41

•

19 years ago

Good test file:
http://www.gnu.org/software/libc/manual/html_mono/libc.html
4.2 MB HTML file.

Ryuichi KUBUKI

Comment 42

•

19 years ago

*** Bug 301862 has been marked as a duplicate of this bug. ***

Ryuichi KUBUKI

Comment 43

•

19 years ago

*** Bug 61684 has been marked as a duplicate of this bug. ***

Ryuichi KUBUKI

Updated

•

19 years ago

Blocks: 267525

Worcester12345

Comment 44

•

18 years ago

(In reply to comment #39)
> Can someone do a fresh test/analysis of current builds?
> 

ditto (again)

Boris Zbarsky [:bzbarsky]

Comment 45

•

18 years ago

Not worth it till the reflow branch lands.

Whiteboard: [reflow-refactor]

Boris Zbarsky [:bzbarsky]

Comment 46

•

18 years ago

Attached file Bzip2-compressed testcase. — Details

For what it's worth, I just tested on reflow branch and performance seems largely unaffected.  I'll reprofile after the branch lands.

Attachment #84154 - Attachment is obsolete: true

Attachment #126694 - Attachment is obsolete: true

Boris Zbarsky [:bzbarsky]

Updated

•

18 years ago

Attachment #234142 - Attachment is patch: false

Attachment #234142 - Attachment mime type: text/plain → application/bzip2

Phil Ringnalda (:philor)

Updated

•

15 years ago

QA Contact: chrispetersen → layout

Marco Castelluccio [:marco]

Comment 48

•

13 years ago

Performance in the testcase is better in Firefox 8.0a2 than in IE9. However, adding to MemShrink.

Whiteboard: [reflow-refactor] → [reflow-refactor][MemShrink]

Justin Lebar (not reading bugmail)

Comment 49

•

13 years ago

On my Linux box, we load this page faster than chrome (7s versus 19s) and use almost exactly the same amount of memory.

However, this testcase gives us high (~50%) heap-unclassified.  I imagine this is covered by one of our existing darkmatter bugs, but I'll open a separate bug with the testcase just in case.

Justin Lebar (not reading bugmail)

Updated

•

13 years ago

Status: ASSIGNED → RESOLVED

Closed: 22 years ago → 13 years ago

Resolution: --- → WORKSFORME

Whiteboard: [reflow-refactor][MemShrink] → [reflow-refactor]

Boris Zbarsky [:bzbarsky]

Comment 50

•

13 years ago

I would expect that textruns might account for a lot of the heap here... in any case, please cc me and njn on the new bug?

Justin Lebar (not reading bugmail)

Updated

•

13 years ago

Blocks: 693898

You need to log in before you can comment on or make changes to this bug.