Closed Bug 148338 Opened 22 years ago Closed 3 years ago

Hangs on very large HTML tables (50.000 rows / 26.mb data) (nsCellMap::GetDataAt)

Status:

RESOLVED WORKSFORME

Milestone:

Future

People

(Reporter: henrik, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: hang, perf, testcase)

Attachments

(7 files)

Simple testcase with 50000 lines (gzipp'ed HTML) which takes > 15mins to load with 2002-05-24-08-trunk optimised non-debug build on 333MHz UltraSPARC 22 years ago Roland Mainz 258.63 KB, application/x-gzip		Details
original test case that does not render 22 years ago Henrik Lynggaard Hansen 962.55 KB, application/zip		Details
jprof report for loading 15000 rows 22 years ago Andrew Schultz 607.10 KB, text/html		Details
patch to speed up nsTableRowGroupFrame::CalculateRowHeights 22 years ago karnaze (gone) 2.46 KB, patch	bernd_mozilla : review+ waterson : superreview+	Details \| Diff \| Splinter Review
jprof report for 15000 rows with the patch 22 years ago Andrew Schultz 588.74 KB, text/html		Details
jprof profile of original testcase, pre-patch 22 years ago Boris Zbarsky [:bzbarsky] 136.51 KB, application/x-gzip		Details
eazel profile of about 3600 row table redused from testcase without patch 22 years ago Tomi Leppikangas 11.23 KB, text/html		Details

Henrik Lynggaard Hansen

Reporter

Description

•

22 years ago

After trying to load an HTML page with a *very* large table (that's 50.000 rows from around 26 MB of data), I see these symptoms:
- the 26 MB is fetched from the webserver (which is on a 100 Mbit/s LAN)
- memory rises to around 400 MB (Windows 2000 Task Manager's VM size)
- the browser hangs, and can only be exited by killing it via the Task Manager

Henrik Lynggaard Hansen

Reporter

Comment 1

•

22 years ago

I cannot upload the table in question as it contains company data I cannot disclose. I will try to create a test case later.

Keywords: hang, perf

Adam Hauner

Comment 2

•

22 years ago

Henrik: Bugzilla can't accept attachments bigger than 1 MB, so you have to upload the packed testcase to some other location.

Severity: normal → critical

Boris Zbarsky [:bzbarsky]

Comment 3

•

22 years ago

This needs a testcase that can be profiled....

Blocks: 56854

Roland Mainz

Comment 4

•

22 years ago

Attached file Simple testcase with 50000 lines (gzipp'ed HTML) which takes > 15mins to load with 2002-05-24-08-trunk optimised non-debug build on 333MHz UltraSPARC — Details

Just tried the attached testcase with 50000 table cells, two lines per cell. I waited and waited and waited and killed mozilla-bin after 15mins since it did not react to any user input anymore (this is no real hang, but the Zilla was simply far too busy). Further tests (cutting down the number of cells) showed that it takes ~12mins to reach line 14000, which means that it would need ~45mins to load the whole document, assuming the load time grows linearly (which is not the case...). Ouch.

Henrik Lynggaard Hansen

Reporter

Comment 5

•

22 years ago

I wonder if it changes anything if the 50000 lines were broken into say 1000 tables with 50 lines in each ?

Manoj

Comment 6

•

22 years ago

I downloaded the attachment and unzipped the HTML source. I loaded the local file in both IE and Moz 1.0 - Build 2002053104. Here are the results:
IE - loaded page within six seconds (twice), refreshed in 4 seconds, CPU - 100%
Moz - still loading after ninety seconds, CPU Usage - 100%, VM Size - 26220K. The application has become unresponsive and I cannot see anything in the window; I can't kill it either, must use task manager.
The symptoms are exactly as reported.
OS: WinXP
Reproducible: Always, even in a brand new session

Henrik Lynggaard Hansen

Reporter

Comment 7

•

22 years ago

I have tried the original version (the one I reported the bug on) in:
* IE 6 on Windows 2000
* Netscape 7 on Windows 2000
* Opera 6.02 on Windows
and they all hung after a few minutes and consumed plenty of memory.

Boris Zbarsky [:bzbarsky]

Comment 8

•

22 years ago

I'll try to profile this on June 13 or so... of course people should feel free to do it before that.

Andrew Schultz

Comment 9

•

22 years ago

Some timing from a PII-450, Linux, current trunk CVS, optimization -O3, 392 MB RAM:

thousand rows   seconds
      1             2
      2             5
      5            17
     10            65
     11            89
     12           115
     13           147
     15           213

This would be O(N^2+) scaling... For the case of 15000 rows, jprof says that 82% of the time was spent in nsCellMap::GetDataAt. I can attach the full output if it would be useful.
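One way to put a number on the "O(N^2+)" reading is to fit a straight line to log(seconds) against log(rows): if load time behaves like c * N^k, the slope estimates k. The sketch below does that with the timings from this comment. The function and constant names are made up for illustration and are not part of any Mozilla code.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Least-squares slope of log(seconds) against log(rows). If load time
// behaves like c * N^k, the returned slope is an estimate of k.
double EstimateScalingExponent(const std::vector<double>& rows,
                               const std::vector<double>& seconds) {
  assert(rows.size() == seconds.size() && rows.size() >= 2);
  const double n = static_cast<double>(rows.size());
  double sumX = 0, sumY = 0, sumXX = 0, sumXY = 0;
  for (std::size_t i = 0; i < rows.size(); ++i) {
    const double x = std::log(rows[i]);
    const double y = std::log(seconds[i]);
    sumX += x;
    sumY += y;
    sumXX += x * x;
    sumXY += x * y;
  }
  return (n * sumXY - sumX * sumY) / (n * sumXX - sumX * sumX);
}

// Timings quoted in this comment: thousands of rows and seconds to load.
const std::vector<double> kRowsThousands = {1, 2, 5, 10, 11, 12, 13, 15};
const std::vector<double> kSeconds = {2, 5, 17, 65, 89, 115, 147, 213};
```

Calling EstimateScalingExponent(kRowsThousands, kSeconds) gives an exponent somewhat below 2 over the whole range, while restricting the fit to the 5, 10 and 15 thousand row points gives an exponent above 2, which is consistent with the scaling getting worse as the table grows.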

Henrik Lynggaard Hansen

Reporter

Comment 10

•

22 years ago

Attached file original test case that does not render — Details

This is a test case based on the original file; the only difference is that the swift codes and country names have been replaced by dummy texts. There are approx. 56.000 lines in the 28 MB file. This file does not render: memory usage goes up to about 180 MB and the VM size to about 480 MB, and then moz appears to be hanging.

Boris Zbarsky [:bzbarsky]

Comment 11

•

22 years ago

Andrew, please do attach the full profile.

Henrik Lynggaard Hansen

Reporter

Comment 12

•

22 years ago

I am adding the mozilla1.0.1 keyword in order to nominate this bug for mozilla 1.0.1. The reasons are:
* It has the severity critical, and it hangs (kills) mozilla.
* The bug seems very general: it doesn't happen to just big tables with a jpeg in each third row that has a blue pixel at 233,450, but instead it happens to all big tables.
* We do have test cases and a rough estimate of the scaling (this would be O(N^2+) scaling...), meaning that we do have some good info.
* Not knowing that code, I would say that it looks like Andrew has nailed the problem by finding out that 82% of the CPU time was spent in nsCellMap::GetDataAt.

Keywords: mozilla1.0.1

Andrew Schultz

Comment 13

•

22 years ago

Attached file jprof report for loading 15000 rows — Details

Amarendra Hanumanula

Updated

•

22 years ago

Keywords: testcase

Priority: -- → P2

Boris Zbarsky [:bzbarsky]

Comment 14

•

22 years ago

So it looks like we're calling the nsCellMap::GetDataAt on every incremental reflow. It also looks like that function is walking an array that is O(N) in the number of rows on every call. Would it help to store a flag that says "there is no useful data in the cellmap"?
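The arithmetic behind this observation can be shown with a toy cost model. This is not Mozilla code: it only assumes that rows arrive one at a time, that each arrival triggers one reflow, and that a reflow either walks every row seen so far or, with a hypothetical "nothing useful in the cellmap" flag, does constant work. The function name and the flag are illustrative assumptions.

```cpp
#include <cassert>
#include <cstdint>

// Toy cost model (assumption, not Mozilla code): count how many rows are
// visited in total while a table of rowCount rows is loaded incrementally.
std::uint64_t CountRowVisits(std::uint64_t rowCount, bool skipWalkFlag) {
  // Keep the quadratic sum well inside the range of a 64-bit counter.
  assert(rowCount <= 1000000000ull);
  std::uint64_t visits = 0;
  for (std::uint64_t rowsSoFar = 1; rowsSoFar <= rowCount; ++rowsSoFar) {
    // Without the flag each reflow walks all rows seen so far: O(N) per call.
    // With the flag the walk is skipped and the reflow costs one step.
    visits += skipWalkFlag ? 1 : rowsSoFar;
  }
  return visits;
}
```

In this model the unflagged total is N(N+1)/2, so doubling the row count roughly quadruples the work, while the flagged total stays linear in N. Real reflow costs are of course more complicated than a single counter.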

karnaze (gone)

Comment 15

•

22 years ago

Attached patch patch to speed up nsTableRowGroupFrame::CalculateRowHeights — Details — Splinter Review

I tried the patch on attachment #1 reduced to about 11500 rows, and the time went from 120 seconds to 30 seconds. I think for a larger table the performance gain will be greater than 4x, since the patch avoids traversing all of the rows. I experimented with removing the 0 colspan/rowspan calculations in nsCellMap::GetDataAt and was able to get the time to 20 seconds, but reworking the way 0 colspans/rowspans are handled represents a lot of work that I may do later. It would be nice to get new jprof results with the patch.

karnaze (gone)

Updated

•

22 years ago

Severity: critical → major

Status: NEW → ASSIGNED

Target Milestone: --- → mozilla1.0.1

Chris Waterson

Comment 16

•

22 years ago

Comment on attachment 86520 [details] [diff] [review] patch to speed up nsTableRowGroupFrame::CalculateRowHeights sr=waterson

Attachment #86520 - Flags: superreview+

Andrew Schultz

Comment 17

•

22 years ago

Some time data (seconds):

rows/1000   orig   patched
    5         17      11
   10         65      25
   15        213      48
   20          -      70
   30          -     132
   40          -     210

jprof report to follow.

Andrew Schultz

Comment 18

•

22 years ago

Attached file jprof report for 15000 rows with the patch — Details

down to spending 32% in nsCellMap::GetDataAt

Andrew Schultz

Comment 19

•

22 years ago

OS,Platform=>All

OS: Windows 2000 → All

Hardware: PC → All

Bernd

Comment 20

•

22 years ago

Comment on attachment 86520 [details] [diff] [review] patch to speed up nsTableRowGroupFrame::CalculateRowHeights r=bernd

Attachment #86520 - Flags: review+

Henrik Lynggaard Hansen

Reporter

Comment 21

•

22 years ago

Could we please get a jprof report (both patched and unpatched) for my original testcase, since its table is not just a clean table with only XX in each cell, but also contains form controls on each row?

Andrew Schultz

Comment 22

•

22 years ago

form controls will probably just get you a more drastic version of bug 148636. there's a perf hit there also, but mainly memory consumption.

Henrik Lynggaard Hansen

Reporter

Comment 23

•

22 years ago

That is exactly why I would like to see the results for my attachment: to see if it will render after the patch, and to see how much it differs from the first attachment, which was only a simple test case. I hope it is not too much trouble to create such a jprof run.

karnaze (gone)

Comment 24

•

22 years ago

I checked the patch into the trunk but am leaving the bug open.

Boris Zbarsky [:bzbarsky]

Updated

•

22 years ago

Attachment #86019 - Attachment mime type: application/octet-stream → application/zip

Boris Zbarsky [:bzbarsky]

Comment 25

•

22 years ago

Attached file jprof profile of original testcase, pre-patch — Details

This doesn't show any issues with the cellmap. The slowness due to swapping almost immediately completely overwhelmed whatever else was going on. I let it run for 5 minutes or so wall clock time, during which there were 7385 hits, with 1.5ms between hits in code time. In other words, almost all the time was spent outside Mozilla code.

Boris Zbarsky [:bzbarsky]

Comment 26

•

22 years ago

The profile with this patch is really no different from the profile without the patch on the _original_ testcase. Again, the real speed problem there is the swapping.

Boris Zbarsky [:bzbarsky]

Updated

•

22 years ago

Attachment #85774 - Attachment mime type: application/octet-stream → application/x-gzip

Tomi Leppikangas

Comment 27

•

22 years ago

Attached file eazel profile of about 3600 row table redused from testcase without patch — Details

Here is output from eazel profiler. I cut table to about 3600 rows and it takes around 15min to render.

karnaze (gone)

Comment 28

•

22 years ago

The patch that was checked into the trunk is the biggest gain to be made in tables. I'm not sure who should look at the general footprint/swapping problem, but I'm moving this to future to get it off of my radar.

Target Milestone: mozilla1.0.1 → Future

Boris Zbarsky [:bzbarsky]

Updated

•

22 years ago

Summary: Hangs on very large HTML tables (50.000 rows / 26.mb data) → Hangs on very large HTML tables (50.000 rows / 26.mb data) (nsCellMap::GetDataAt)

Markus Hübner

Comment 29

•

22 years ago

Also related to bug 54542

karnaze (gone)

Comment 30

•

22 years ago

mass reassign to default owner

Assignee: karnaze → table

Status: ASSIGNED → NEW

QA Contact: amar → madhur

Target Milestone: Future → ---

Kevin McCluskey (gone)

Updated

•

22 years ago

Target Milestone: --- → Future

Markus Hübner

Comment 31

•

21 years ago

Might a new profile bring more light into this?

Markus Hübner

Updated

•

21 years ago

Blocks: 54542

Boris Zbarsky [:bzbarsky]

Comment 32

•

21 years ago

How about a profilable testcase first? The only one that's still usefully profilable as far as I can tell just shows bug 148636.

Markus Hübner

Comment 33

•

21 years ago

Will compile a new testcase by the end of next week.

Jason Barnabe (np)

Comment 34

•

21 years ago

*** Bug 226358 has been marked as a duplicate of this bug. ***

Boris Zbarsky [:bzbarsky]

Updated

•

21 years ago

Blocks: 234240

Bernd

Comment 35

•

21 years ago

It seems questionable to me that we call RowIsSpannedInto as soon as we need to seriously update the rowgroup height, and then we loop over all cells and even try to repair cell map holes. The colinfo (http://lxr.mozilla.org/seamonkey/source/layout/html/table/src/nsCellMap.h#53) for every column group has two member variables:

  PRInt32 mNumCellsOrig; // number of cells originating in the col
  PRInt32 mNumCellsSpan; // number of cells spanning into the col via colspans (not rowspans)
                         // for simplicity, a colspan=0 cell is only counted as spanning the
                         // 1st col to the right of where it orginates

and we update them during manipulations of the cellmap (if we fail we crash). It might be worth the effort to do something similar for rows and have a

  struct nsRowInfo {
    int mNumCellsSpanIn;
    int mNumCellsSpanOut;
    int mNumCellsOrig;
  }

and update it when building the cellmap, so that we only look up these numbers when calling RowIsSpannedInto. I think I will be able to do that in the timeframe outlined in http://bugzilla.mozilla.org/show_bug.cgi?id=54542#c140
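A minimal sketch of how such per-row counters might work is below. Only the nsRowInfo field names come from this comment. The RowInfoTable class, AddCell and RowSpansOut are hypothetical, and the meaning given to each counter is an assumption. The sketch also ignores rowspan=0 and cell removal, both of which the real cellmap would have to handle.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Field names follow the struct proposed above; their exact meaning here
// is an assumption made for this sketch.
struct nsRowInfo {
  int mNumCellsSpanIn = 0;   // cells reaching this row via a rowspan from above
  int mNumCellsSpanOut = 0;  // cells originating here that extend into later rows
  int mNumCellsOrig = 0;     // cells originating in this row
};

// Hypothetical helper that keeps the counters current while cells are added.
class RowInfoTable {
 public:
  explicit RowInfoTable(std::size_t rowCount) : mRows(rowCount) {}

  // Record a cell originating in `row` with an explicit rowspan >= 1.
  void AddCell(std::size_t row, std::size_t rowSpan) {
    assert(rowSpan >= 1 && row + rowSpan <= mRows.size());
    ++mRows[row].mNumCellsOrig;
    if (rowSpan > 1) {
      ++mRows[row].mNumCellsSpanOut;
    }
    for (std::size_t r = row + 1; r < row + rowSpan; ++r) {
      ++mRows[r].mNumCellsSpanIn;
    }
  }

  // Constant-time lookup instead of looping over every cell in the row.
  bool RowIsSpannedInto(std::size_t row) const {
    return mRows.at(row).mNumCellsSpanIn > 0;
  }

  bool RowSpansOut(std::size_t row) const {
    return mRows.at(row).mNumCellsSpanOut > 0;
  }

 private:
  std::vector<nsRowInfo> mRows;
};
```

The trade-off is the one the comment already hints at for columns: the lookup becomes a counter check, but every cellmap manipulation has to keep the counters in sync, and a missed update gives wrong answers.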

Rene Pronk

Comment 36

•

21 years ago

*** Bug 239432 has been marked as a duplicate of this bug. ***

Martijn Wargers (dead)

Comment 37

•

19 years ago

I think this has improved quite considerably. Mozilla1.7 takes very long (> 5 minutes) and freezes up completely after a while. Current trunk build takes a few minutes (2 or so I guess) and doesn't freeze up. Only at the end does the UI become a little slow.

Dimitrios

Comment 38

•

17 years ago

Latest results for the simple test case (1st attachment) on a Core Duo 1.86GHz Windows machine:
IE7 loads it in 5 sec
Firefox 3 beta 3 loads it in 20 sec
No freeze, fine scrolling performance after completing loading.

Phil Ringnalda (:philor)

Updated

•

15 years ago

Assignee: layout.tables → nobody

QA Contact: madhur → layout.tables

Nicholas Nethercote [inactive]

Comment 39

•

10 years ago

A current e10s trunk build (FF36) takes about 2 seconds to show the table on my fast desktop Linux machine. The page is blank while it's loading, and then the contents all become visible at once. Scrolling is very smooth once it has loaded. Chromium also takes about 2 seconds, but it loads things progressively -- the start of the table is visible immediately, and the last part of the table takes about 2 seconds to show up. So that's a nicer behaviour.

Sylvestre Ledru [:Sylvestre]

Comment 40

•

6 years ago

Moving to p3 because no activity for at least 1 year(s). See https://github.com/mozilla/bug-handling/blob/master/policy/triage-bugzilla.md#how-do-you-triage for more information

Priority: P2 → P3

Sylvestre Ledru [:Sylvestre]

Comment 41

•

6 years ago

Moving to p3 because no activity for at least 1 year(s). See https://github.com/mozilla/bug-handling/blob/master/policy/triage-bugzilla.md#how-do-you-triage for more information

Andrei Purice

Comment 42

•

3 years ago

Marking this as Resolved > Worksforme since the hang is not occurring anymore using Release 93.0, Beta 94.0b2 and latest Nightly 95.0a1 (2021-10-07) on Windows 10 and Ubuntu 20.04.
If anyone else is able to reproduce it please re-open the issue or file a new one.

Status: NEW → RESOLVED

Closed: 3 years ago

Resolution: --- → WORKSFORME
