Closed
Bug 28948
Opened 25 years ago
Closed 25 years ago
Linux Mozilla allocates 324MB to load a page
Categories
(Core :: DOM: HTML Parser, defect, P3)
Core
DOM: HTML Parser
Tracking
VERIFIED
FIXED
People
(Reporter: tenthumbs, Assigned: rickg)
Details
(Whiteboard: [PDT+])
Attachments
(2 files)
1.20 KB, patch
80 bytes, text/html
Page is about 377K, yet Linux Mozilla allocates 324MB and actually uses 311MB of it. The page is just a manual; there doesn't seem to be anything odd about it.
No, it's bad HTML. There are tons of <A NAME="foo"> tags with no </A> tags. I guess Mozilla is nesting each anchor inside the previous one, with insane results. Feeding the page through this perl script fixes the problem:

#!/usr/bin/perl -w
while (<>) {
    print;
    if (/<A NAME=/) {
        $_ = <>;
        print $_, "</A>\n";
    }
}

I guess this is a quirks issue, because I can't find a current browser that fails so spectacularly.
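For readers who don't speak perl, here is the same line-based workaround sketched in Python. This is an illustrative translation, not part of the original report; the function name is made up, and it relies on the same heuristic as the perl script (on the gnuplot page, each <A NAME=...> tag sits on its own line and the anchor text is on the line after it):

```python
import re

def close_anchors(html: str) -> str:
    # After any line containing an <A NAME=...> tag, emit the
    # following line and then a closing </A>, exactly like the
    # perl one-liner in the comment above.
    out = []
    lines = iter(html.splitlines(keepends=True))
    for line in lines:
        out.append(line)
        if re.search(r'<A NAME=', line):
            nxt = next(lines, '')
            out.append(nxt)
            out.append('</A>\n')
    return ''.join(out)
```

This only patches up documents with that exact line layout; it is a quick filter for this page, not a general HTML fixer.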
This is not a layout bug, it's a parser problem (marking as such). Here's what happens. The reason that mozilla can't handle this page is that the parser is building an incorrect and very much overcomplicated content model. This is basically what the gnuplot page contains (simplified testcase):

<HTML>
<BODY>
<A NAME="35">
<PRE>
data
</PRE>
<p>text
<p>more text
<A NAME="135">
<PRE>
data
</PRE>
<p>text
<p>more text
<A NAME="235">
<PRE>
data
</PRE>
<p>text
<p>more text
<A NAME="335">
<PRE>
data
</PRE>
<p>text
<p>more text
</BODY>
</HTML>

There seem to be two major problems with how the parser handles this. One is that the parser doesn't close tags correctly in this case (PRE is one of the problem tags on the gnuplot page); if you cut the gnuplot page down to something like 50k you'll see (after you wait a while) that the page loads but everything is displayed as PRE text. The other is that due to the opening <A NAME="xxx"> tags (with the </A> missing) we end up with a residual style stack that grows deeper and deeper for every <A NAME="xxx"> the parser sees. So, the content model for the sample HTML above is this:

docshell=0x81fcb80 html@0x823c564 refcount=5< head@0x81ce3e4 refcount=2< > Text@0x83a8f38 refcount=3<\n> body@0x83b6994 refcount=3< Text@0x84124a0 refcount=3<\n> a@0x8412dd4 name=35 refcount=3< Text@0x83300c8 refcount=3<\n> > pre@0x833003c refcount=3< a@0x825feb4 name=35 refcount=4< Text@0x8264098 refcount=4< data\n\n> > p@0x82640ec refcount=3< a@0x8264154 name=35 refcount=3< Text@0x8264290 refcount=3<text\n> > > p@0x8264324 refcount=3< a@0x8264364 name=35 refcount=3< Text@0x83896a0 refcount=3<more text\n> > a@0x8389704 name=35 refcount=3< a@0x838984c name=135 refcount=3< Text@0x83899b0 refcount=3<\n> > > > > pre@0x8365124 refcount=3< a@0x8365164 name=35 refcount=4< a@0x836528c name=135 refcount=4< Text@0x83653c8 refcount=4< data\n\n> ...
pre@0x82673a4 refcount=3< a@0x826c864 name=35 refcount=4< a@0x826c98c name=135 refcount=4< a@0x826cab4 name=235 refcount=4< a@0x826cbfc name=335 refcount=4< Text@0x826cd38 refcount=4< data\n\n> > > > > p@0x826cd9c refcount=3< a@0x826cddc name=35 refcount=3< a@0x826cf24 name=135 refcount=3< a@0x826d06c name=235 refcount=3< a@0x826d1b4 name=335 refcount=3< Text@0x826d2f0 refcount=3<text\n> > > > > > p@0x826d394 refcount=3< a@0x826d3d4 name=35 refcount=3< a@0x826d51c name=135 refcount=3< a@0x826d664 name=235 refcount=3< a@0x826d7ac name=335 refcount=3< Text@0x83fd908 refcount=3<more text\n> > > > > > > > >

And then, the further we go, most of the tags end up with things like <p><a><a>... with as many <a>'s as there are <A NAME="xxx"> tags before the tag in the file. A quick grep through the file shows that there are ~350 of them, and thus we end up chewing up loads of memory for the frames of all these <a> tags when viewing this page. The cool thing is that the layout system seems to be able to handle this if you have the resources... As a quick hack solution to this problem I created a patch that closes all <A> tags *without* an HREF attribute immediately after they've been opened (hey, that wouldn't actually be such a bad thing for the parser, would it?). I'll attach the patch.
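A back-of-the-envelope count shows why ~350 unclosed anchors turn into hundreds of thousands of content nodes. This is an illustrative model of the residual-style behavior described above, not actual parser code; `blocks_per_anchor` is a made-up parameter approximating the PRE plus <p> blocks that follow each <A NAME> on the page:

```python
def residual_anchor_nodes(num_anchors: int, blocks_per_anchor: int = 4) -> int:
    # After the k-th unclosed <A NAME>, the residual style stack
    # reopens all k open anchors inside every subsequent block
    # element, so each of that anchor's blocks carries k nested
    # <a> wrapper nodes in the content model.
    total = 0
    for k in range(1, num_anchors + 1):
        total += k * blocks_per_anchor
    return total
```

The count grows quadratically: with 350 anchors and 4 blocks apiece this model gives 245,700 <a> nodes where a correct content model would have 350, which is consistent with the memory blow-up seen on the gnuplot page.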
Assignee: troy → rickg
Component: Layout → Parser
OS: Linux → All
QA Contact: petersen → janc
Hardware: PC → All
The "pre close demo" attachment contains this HTML:

<HTML>
<BODY>
<A HREF="foo"><PRE>pre text</PRE> normal text</A>
</BODY>
</HTML>

Loading this in mozilla shows that the parser doesn't close the PRE tag in this case either (look at the font). Here's the content model generated from the above HTML:

docshell=0x81fc548 html@0x8240104 refcount=5< head@0x82d11f4 refcount=2< > Text@0x8242f88 refcount=3<\n> body@0x83a0d5c refcount=3< Text@0x83c5688 refcount=3<\n> a@0x83bca24 href=foo refcount=3< > pre@0x83bccac refcount=3< a@0x83bccec href=foo refcount=3< Text@0x83e16a8 refcount=3<pre text normal text> > Text@0x82436e0 refcount=3<\n> > > >

Hopefully this will help in solving the problem.
Comment 7•25 years ago
Pretty severe results if you try to load this page on linux. Here is the posting of test results. Yes, bad juju on the gnuplot page that seamonkey does not like: http://www.ucc.ie/gnuplot/gnuplot.html On the current win32 and linux builds I didn't see the large memory hogging, but cpu use went to 100% for a very long time. I eventually had to kill off the mozilla process with the win95 task manager because it was not responding, and I actually had to hit the old power switch on my linux box because loading the page seemed to have driven my whole system into the ground (hp vectra, 200 mhz). Equivalent or larger, but simpler, pages generated with http://komodo.mozilla.org/buster/mkpg.cgi?lines=160000 seem to load ok. beta1?
Keywords: beta1
Just to note that closing all the <A NAME="..."> tags and loading the document drops Mozilla's footprint down to ~28MB, which is just about what it usually is.
Assignee
Comment 10•25 years ago
Ok -- there's a moderate chance that this is the same problem as bug 3944. I'm trying to land that tonight (or tomorrow), and once I do I'll revisit this. It's likely a 1 to 2 day project, given my schedule.
Status: NEW → ASSIGNED
Comment 11•25 years ago
Closing unterminated <A HREF...> tags was described in bug 2406; could the fix for this one be similar? Or rather, should the fix for that one close all anchor tags, instead of just the ones with hrefs?
Assignee
Comment 12•25 years ago
It looks like I have a fix, but I need to spend a bit more time testing. I'll give a new status tomorrow.
Assignee
Comment 14•25 years ago
This bug is related to bug 3944, which I've landed a fix for (that fix corrected the residual style bug). I'll do more testing to see what it does in this case in terms of memory.
Comment 15•25 years ago
I just loaded the gnuplot page and things look normal now; mozilla used ~40MB of memory when loading that page after doing some other surfing. I'd say this is fixed.
Comment 16•25 years ago
Works for me too; the page loaded in 11.5 secs with build 2000-02-27-16 on NT, which seems pretty good considering the file size.
Assignee
Comment 17•25 years ago
I believe this is fixed per my checkin this weekend.
Status: ASSIGNED → RESOLVED
Closed: 25 years ago
Resolution: --- → FIXED
Reporter
Comment 18•25 years ago
Certainly seems to be fixed. The bad HTML page takes just a little longer to load and seems to use a little more space compared to a fixed page, but it's not serious.
Comment 19•25 years ago
Loads fine for me, so based on this and the comments of others I'm marking this verified.
Status: RESOLVED → VERIFIED