Closed Bug 105219 Opened 24 years ago Closed 24 years ago

memory leak with crash

Categories

(Core :: DOM: Core & HTML, defect)

defect
Not set
critical

Tracking

()

VERIFIED FIXED

People

(Reporter: georg, Assigned: jst)

Details

(Keywords: perf)

Attachments

(5 files)

When generating large portions of HTML (>500 kB) via JavaScript, then Mozilla (tested 0.9.3 to 0.9.5 on linux) crashes. In many times it also causes KDE to crash. Before the crash it wastes a lot of time (several minutes) flooding the swap partition. Maybe that the crash happens, when it tries to begin displaying the HTML output.
I'm also seeing a somewhat strange behavior: Mozilla loads the testcase and begins to allocate up to 90 MB of RAM, jumping from 20 MB to 90 MB and back. The script never finished (>5Min) and Mozilla didn't crash (but hang).
Severity: normal → critical
Status: UNCONFIRMED → NEW
Ever confirmed: true
Keywords: hang
The testcase makes my milestone 0.9.5 hang while consuming insane amounts of memory, between 250 and 500MB. ---> NEW
Same problem on Windows; OS : Linux --> All Will attach WinNT stack traces obtained by interrupting the page load.
OS: Linux → All
Hardware: PC → All
How does IE perform on this page? I don't have anything later than IE4.7.
On a Win2k, PIII 700 laptop running IE 6.0 it max's out the CPU and jumps between 12MB and 20MB RAM, settling at ~20MB. It takes about 42 seconds to execute (according to Task Manager). (IE reports 6.0, but I've only installed 5.5 for the detail oriented :)
Sterling: thanks! cc'ing you on this bug if you don't mind. I've got a reduced testcase I will attach below -
I have reduced the original testcase to something which takes the same time for Mozilla to load from a "cold start". That is, 1. Launch Mozilla for the first time 2. Load the reduced testcase Here are timings I got comparing the original testcase to the reduced one, using Mozilla trunk binary 20011016xx on WinNT. Note: I had saved both files LOCALLY when running these tests. All times in seconds: ORIGINAL TESTCASE REDUCED TESTCASE From a cold start: 225 220 Shift + Reload: 215 77 I don't know why the discrepancy on Shift + Reload; need more testing.
Here is the reduced testcase: <html><head><title>Bug 105219</title> <style type="text/css"> table.calendar,table.calendar td{border:0px;padding:0px;border-spacing:1px;margin:0px;empty-cells:show;} table.calendar{position:relative;} table.calendar td{width:4px;height:4px;background-color:#ffffff} body{background-color:#ffffff;color:#000000;font-family:helvetica,arial,sans-ser if;} div{position:fixed;top:50%;left:50%;visibility:hidden;background-color:#ffffd0;b order:1px solid #000000;padding:3px;color:#000000;} p+table{background-color:#808080} </style> <script type="text/javascript"> var START = new Date(); var ROWS = 96; </script> </head> <body> <p>Warning, this requires a modern browser like Mozilla or MSIE. W3C-DOM, MSIE 4-DOM and NN 4-DOM are not supported.</p> <table> <tr> <td> Table A </td> <td> <table class="calendar"> <script type="text/javascript"> var table = ''; for(i=0;i<ROWS;i++) { table += '<tr>'; for(j=0;j<ROWS;j++) table += '<td id="c'+j+'r'+i+'"><\/td>'; table += '<\/tr>'; } document.write(table); </script> </table> </td> </tr> <tr> <td> Table B </td> <td> <table class="calendar"> <script type="text/javascript"> table = ''; for(i=0;i<ROWS;i++) { table += '<tr>'; for(j=0;j<ROWS;j++) table += '<td id="c'+j+'r'+i+'"><\/td>'; table += '<\/tr>'; } document.write(table); </script> </table> </td> </tr> </table> </body> <script> var loadtime = (new Date() - START)/1000; var loadmsg = loadtime + ' seconds to load the page -' alert(loadmsg); </script> </html>
Attached file Reduced testcase
The reduced testcase, apart from some timing code I added, consists of two 96 x 96 <TABLES> that get document.written. Each one has CSS styling: <table class="calendar"> My experience has been this: depending on how long Mozilla takes to load the reduced testcase, you may or may not even see the tables fully rendered; sometimes not at all! I think this bug has to do with performance of document.write(), and so I'm going to assign this to DOM Level 0, not JavaScript Engine. Will also look for duplicate bugs involving document.writing <TABLES>. What I would like to ask the contributors is this: do you think the performance of Mozilla on the reduced testcase is equally bad as on the original, roughly speaking? I have found this to be true, at least when Mozilla is loading it for the first time -
Assignee: rogerl → jst
Component: Javascript Engine → DOM Level 0
QA Contact: pschwartau → amar
Reduced testcase takes just as long to load without the CSS styling on the tables, so I will remove the <STYLE> tags, change <table class="calendar"> to <table>, and reattach below -
Keywords: perf
Note: the ROWS constant equals 96 in the testcases. You can alter this to say, 10, to load quickly and see what the page is supposed to look like. You'll get two 10 x 10 tables instead of two 96 x 96 tables. When you try without any CSS, the tables will look blank because the document.write() is not putting any data into the cells: <td></td>.
Note: I do not hang on the 96 x 96 tables, they just take a long time to load. I also do not crash. On the other hand, the tables don't always render fully...
Keywords: hang
According to the Phill's comments I ran the given testcases and found the following For NS 6.2 build : 2001-10-17-05-0.9.4 It took 305 sec when I ran it from my local machine and 242 sec when I ran it from bugzilla For IE 5.0 it took just 43 sec when I ran it form both local and bugzilla But IE does not render the CSS table cells.. Its empty REDUCED TESTCASE NS6.2 305 sec(local copy) and 242sec( from Bugzilla) IE 42sec( when ran from both local and bugzilla)
Keywords: hang
This has nothing to do with document.write(), or very little at least. If you comment out the two calls to document.write() you'll see that it still takes almost as long to load the page. Most of the time spent on this page is in the JS engine building up the 500k char string that will later be document.write():n (unless you comment that out for testing). Back to the JS engine, but I'm not sure there's alot that can be done for this. I never saw the crash, so I can't comment on that.
Assignee: jst → rogerl
Component: DOM Level 0 → Javascript Engine
Keywords: hang
QA Contact: amar → pschwartau
If CSS doesn't affect the crash, and (assuming jst is right) JS is only a perf issue, then would we be a dupe of bug 104836?
To answer my own question, probably not -- I've looked through five other bugs on table performance issues (bug 104836, bug 39573, bug 63530, bug 54542, and bug 74888) and they seem to focus on one aspect or another of the symptoms, but not all at once. Maybe they can all be mushed into one with some strategic thought...? Anyway, sorry if this doesn't help...I'm just a lurker who's happy to participate where he can :)
jst is right - I was silly not to see this before. I have reduced the HTML testcase to string concatenation and nothing else: <html><head><title>Bug 105219</title></head> <body><script type="text/javascript"> /* * Adjust ROWS to change the size of the test. * It's a double loop of size ROWS x ROWS. */ var ROWS = 96; var splashAlert = ''; splashAlert += 'JS Engine performance test: a double-loop ('; splashAlert += ROWS + ' x ' + ROWS + ') of string concatenations.\n' splashAlert += 'Please wait until the results are printed below -' alert(splashAlert); /* * Start the test - */ var START = new Date(); var str = ''; for(i=0;i<ROWS;i++) { str += '<tr>'; for(j=0;j<ROWS;j++) str += '<td id="c'+j+'r'+i+'"><\/td>'; str += '<\/tr>'; } // Once more, as in the original HTML testcase - str = ''; for(i=0;i<ROWS;i++) { str += '<tr>'; for(j=0;j<ROWS;j++) str += '<td id="c'+j+'r'+i+'"><\/td>'; str += '<\/tr>'; } /* * Show the results - */ var splashText = ''; splashText += '<p>JS Engine performance test: a double-loop ('; splashText += ROWS + ' x ' + ROWS + ') of string concatenations.<br>' splashText += 'Please wait until the results are printed below -</p>' document.writeln(splashText); var loadtime = (new Date() - START)/1000; var loadmsg = '<b>Total time was: ' + loadtime + ' seconds</b>' document.writeln(loadmsg); </script></body></html>
Sorry if I'm stepping out of bounds...Aren't we drifting away from the real focus of this? Perf testing JS is good and all (and I think in this case belongs elsewhere, since we can all create sufficiently large loops that it takes a long time) but in this case the reporter crashed with a mem leak. Georg, can you give these testcases a shot and see if it's still on track?
Sterling is right - here are my results, at any rate: Timings, using Mozilla trunk binary 20011016xx on my WinNT box. Time to load the reduced testcase of string concatenation only: 105 seconds If I modify this to document.write() the two resulting strings right into the document (not inside a <TABLE>): 140 seconds The testcases above have <SCRIPT>document.write()</SCRIPT> inside a <TABLE> contained within a <TABLE>: 220 seconds So although a lot of time is taken up by the string concatenation, another large chunk is taken up by having <SCRIPT> inside a <TABLE> inside a <TABLE>.
I try to test your new scenaries on the machine, which had the crashes tomorrow. May be that you have more memory to not crash. On the crashing machine I have 112 MB usable RAM + 128 MB swap partion, which both run up to 100%. When the swap partion reaches 100%, then the crash happens. Mozilla should clean up it's memory from no more needed stuff to prevent this situation.
Georg, do you also crash on the "string-concatenation-only" testcase?
I try it tomorrow, when I'm back to my work, where the test machine is. In my second test scenary it did not crash. The difference was, the it there also generated the tables dynamically instead of putting the JavaScript into statical HTML tables. In this testcase it run about 15 minutes and then it displayed the tables correctly without crash. The crash testcase run only about 5 minutews and then crashed at least mozilla, in many cases also KDE. In one case it also crashed the X11 server. May be that there is something wrong on that linux, but Mozilla finds that weak point and hits onto it until my working place machine crashes.
Following jst's suggestion, I made a version of the testcase where everything is static. This tests the loading time from Layout alone. The HTML file is too big to attach to this bug (412K). I generated the table rows involved with this program in the standalone JS shell: /* * Adjust ROWS to change the size of the table. * It's a double loop of size ROWS x ROWS. */ var ROWS = 100; var WIDTH = 20; var str = ''; for(i=1; i<ROWS+1; i++) { str += '<tr>\n'; for(j=1; j<ROWS+1; j++) { str += '<td id="c'+j+'r'+i+'"><\/td>'; if (j%WIDTH==0) {str += '\n';} } str += '<\/tr>\n'; } print(str); This made 100 rows with 100 columns each; I cut-and-pasted this static HTML into the above testcases where the dynamic HTML was. Timing, again using Mozilla trunk binary 20011016xx WinNT with 62 MB usable RAM. This static HTML testcase took about 10 seconds to load, comprising two 100 x 100 <TABLES> inside a <TABLE>, including CSS data. This seems to let Layout off the hook as far as this bug is concerned, I would think...
I've now just clickt on that link http://bugzilla.mozilla.org/attachment.cgi?id=54074&action=view and got the crash of mozilla 0.9.5 after about 5 minutes heavy memory flooding. The crash does not come at one, when the swap partition is full. Mozilla requieres about 80 MB RAM + more than 96 MB swap partition memory. May be that the machine has some influence on the situation becoming so extremely critical.
Georg: thanks! Do you have a stack trace for your crash on http://bugzilla.mozilla.org/attachment.cgi?id=54074&action=view ? If so, could you attach it to this bug? So far, I've been unable to crash - thanks.
Where can I find the stack trace? My working place is a SuSE 7.1 linux system. Must I do anything special to activate stack traces, or are they generated automatically? Where are they stored? Talkback does not work for unknown reasons with 0.9.5 With 0.9.3 Talkback works, but the crash is to hard to give Talkback any chance to do something. I many cases the crash also results in a logout because ogf crashing KDE. The machine is configured to use graphical login.
Georg: thanks; I've had problems getting traces with bugs like this, too. There is a clear relationship between this bug and bug 56940, "O(n**2) and O(n**3) growth too easy with JS string concatentaion" js> load('mozilla/js/tests/105219.js') Double loop of 50 x 50 turns This took 2.656 seconds js> load('../../tests/test/test/105219.js') Double loop of 100 x 100 turns This took 43.859 seconds Observe 10,000 total turns = 4 x 2,500 total turns, yet on the timings, 43.859 secs = 16.5 x 2.656 secs. Thus a factor of 4 in the number of string concatenations becomes a factor of 16.5 in the time it takes JS to do them -
cc'ing Brendan to make sure I'm right on this - Despite any inefficiencies in JS string concatenation, the reporter's original point is: Mozilla should exit gracefully when memory is going to run out on the user's box, and not crash. JS Engine has a function JS_ReportOutOfMemory(); I believe it is up to the DOM to police JS loops for this. Compare bug 13350: "DOM needs to police JS infinite loops, schedule garbage collection" I won't dupe now, because that bug specifically addresses JS infinite recursion. But I think the idea is rather similar here: DOM needs to police memory usage by JS. This is a JS Engine bug if JS_ReportOutOfMemory() isn't actually working properly, but I believe it is. See bug 46196 for extra info on this function. I have gone into js.c and adjusted the memory pool, from rt = JS_NewRuntime(8L * 1024L * 1024L); to e.g. rt = JS_NewRuntime(1024L); and I get polite "out of memory" exceptions in the JS shell. I don't crash. Reassigning to DOM Level 0 for consideration -
Assignee: rogerl → jst
Component: Javascript Engine → DOM Level 0
QA Contact: pschwartau → amar
Summary: memory leak with crash → DOM needs to police JS memory usage [WAS: memory leak with crash]
>Thus a factor of 4 in the number of string concatenations >becomes a factor of 16.5 in the time it takes JS to do them - You've just characterized an O(n**2) growth rate bug -- bug 56940. Please try the latest patch in that bug, which should fix things. If that patch does fix this bug's symptoms, please mark this bug a dup of 56940. /be
Hm, on ftp.mozilla.org/pub/js/ I see this: js-1.5-rc3a.tar.gz 994042 07/03/01 12:26:00 pm file ~~~~~~~~ Is there anywhere a newer release of the Spidermonkey engine, or should I use the Rhino for testing outside the browser? I know from other discusions, that some bugs are browser and spidermonkey related only, where other bugs are also presented by Rhino. So what should I use to test outside Mozilla?
In Bug bug 56940 there is written, that the + operator causes the problem, where the += does not. In my original testcase I now replaced this construction and tjhe similiar construction residing in the table generating loop: table += '<td id="c'+j+'r'+i+'"><\/td>'; with this code: table += '<td id="c'; table += j; table += 'r'; table += i; table += '"><\/td>'; I tested again with Mozilla 0.9.5. After exactly 5 minutes Mozilla crashed and also caused KWrite to crash and also caused KDE to crash. In this test case the workaround mentioned in bug 56940 does not help. This might help you to decide whether this bug is really a duplicate of bug 56940 or not.
Georg, please read carefully my analysis in bug 56940. I never wrote that + was exclusively a problem and += was not. I showed how + can lead to O(n**3) growth while += leads to O(n**2) growth (bad enough, just not as bad). Both are bugs. The patch in bug 56940 should land today or tomorrow (pending review), and fixes both cases. Your question about testing is stated oddly, because most people test what they need to use -- Java embedders need Rhino, C or C++ embedders need SpiderMonkey. But we do need another release candidate of SpiderMonkey 1.5, and I'll work with phil to do one that coincides with mozilla0.9.6's release. /be
> You've just characterized an O(n**2) growth rate bug -- bug 56940. > Please try the latest patch in that bug, which should fix things. > If that patch does fix this bug's symptoms, please mark this bug > a dup of 56940. Bug 56940 has now been fixed on the trunk (as of 2001-10-24). So - using Mozilla trunk binary 20011026xx on my WinNT box: (All times in seconds) BEFORE 56940 FIX AFTER 56940 FIX Reduced testcase #1 220 7.5 Reduced testcase #2 210 6.5 Reduced testcase #3 105 1 Similar improvements on my Linux and Mac as well. Marking FIXED based on these astonishing results. Georg, could you try a build dated today or after and confirm that you don't crash? You can get a tarball for example at: ftp://ftp.mozilla.org/pub/mozilla/nightly/2001-10-26-08-trunk If in fact you don't crash, you can mark this bug "Verified". If you do crash, well... I will cry!
Status: NEW → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
Summary: DOM needs to police JS memory usage [WAS: memory leak with crash] → memory leak with crash
I've forwarded this message as ToDo for monday to my working place to do the test on mondy, when I'm back at the critical machine. I hope, that you do not need to cry ;-)
No need to cry! With the linux tarball from ftp://ftp.mozilla.org/pub/mozilla/nightly/2001-10-26-08-trunk no more crashes happen and the page is visible within about 15 seconds (instead of crashing after 5 minutes). Yeah! Thanks a lot.
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: