Open
Bug 112738
Opened 23 years ago
Updated 2 years ago
large page with unescaped '<' in email addresses takes 137 seconds to load - locks browser until loaded (nested unclosed tags), plaintext served as HTML
Categories
(Core :: Layout, defect)
Core
Layout
Tracking
()
NEW
Future
People
(Reporter: jrspm, Unassigned)
References
()
Details
(Keywords: perf, Whiteboard: [WONTFIX 12/31/2002?][reflow-refactor])
Attachments
(4 files, 2 obsolete files)
Using build 2001112914 on Win2k(SP2) I'm not sure that Parser is the right component for this, but I don't think it is layout either. The page in the URL field took 137 seconds to load. I tried it in IE5.5(SP2) and it took about 3 to 4 seconds to load. BTW, I am on a cable modem. The page has two tags: <html> <pre> It has no other tags and no end tags for the tags that it has. The two tags take up the first 3 lines of the file. The rest of the file is plain text. The size of the file is about 311KB. Steps to reproduce: 1. load file: http://www.theos.com/deraadt/coremail.html 2. Wait a loooooooooooong time (about 137 seconds) with all mozilla windows freezing during that time. 3. Finally view file. Expected: File should load in 3-4 seconds like IE5.5 Actual: File takes 137 seconds to load Jake
Comment 1•23 years ago
|
||
Since this page has unterminated tags, I'm guessing possible dupe of bug 84466.
Comment 2•23 years ago
|
||
This is _so_ layout based on the profile
Assignee: harishd → attinasi
Component: Parser → Layout
Keywords: perf
OS: Windows 2000 → All
QA Contact: moied → petersen
Hardware: PC → All
Comment 3•23 years ago
|
||
Comment 4•23 years ago
|
||
See also 73889.
Comment 5•23 years ago
|
||
I found the problem. I removed all < and > from the document and time to load dropped from 54 seconds (Netscape 6.2) to less than a second. When you have a <pre> tag all further scanning of anything other than a </pre> should be suspended.
Comment 6•23 years ago
|
||
No. You can use <b>, <i>, <span style="font-color:red"> and so forth in the <pre>.... So yes, the "<" should be removed (there is an evangelism bug on that) but that is the responsibility of the site in this case.
Comment 7•23 years ago
|
||
The fact that IE can do the page 40-50 times faster than Mozilla suggests that the responsibility should not be the site but the browser.
Comment 8•23 years ago
|
||
responsibility for correct markup is always the site. this bug is open solely because we _should_ be able to do it faster.
Updated•23 years ago
|
Target Milestone: --- → mozilla1.1
Comment 9•22 years ago
|
||
I accessed the URL:http://www.theos.com/deraadt/coremail.html , but not succeed. Who can tell me where is the testcase now?
Comment 10•22 years ago
|
||
Because the bug cannot recur again,I think it should be closed. Btw I create a html file(>320KB),which include only two tags: <html> and <pre> and a big block of plain text,but the phenomena didn't appear again.
Comment 11•22 years ago
|
||
I reproduced the bug. Even though I download the whole file from the website,and opened it form local machine, the time costed still is same to open the file from original website. The time consumed is about 22s. I tested the file using IE and netscape4.76,the time is only 2-3 seconds. BTW, reproduced the bug using netscape 7.0 and mozilla 1.0 trunk20020618
Comment 12•22 years ago
|
||
I just reprofiled this and it looks about the same as my orginal profile. The complete lack of closing tags is causing us to create what is essentially a linear content model, with a very large depth. Since reflow has to walk down the tree to the node being reflown and then back up to the root, construction of such a content model will be O(n^2) in the number of nodes if you keep reflowing it as you construct it. We could probably make this fast by ignoring open tags inside <pre> if they were not known elements.... but I question whether this one edge case is really worth it.
Comment 13•22 years ago
|
||
I have got a logfile about the reflow process of this file. But it is too huge(>360M) to upload :( you can produce the logfile according to http://lxr.mozilla.org/seamonkey/source/layout/doc/frame_reflow_debug.html
Comment 14•22 years ago
|
||
As a note, I talked to Harish and ignoring these unknown elements is simply not an option... Apparently sites actually depend on this sort of thing and ignoring the elements would break them.
Comment 15•22 years ago
|
||
I think a simplified testcase is necessary to resolve this bug. Can anyone provide one? :)
Comment 16•22 years ago
|
||
After analyzed the testcase,I find the source of bug is the invalid tag,such as<sys/syslog.h> and <Chris_G_Demetriou@LAGAVULIN.PDL.CS.CMU.EDU>. we can verify it by deleting those tags. I think the bug is related to the parser.
Comment 17•22 years ago
|
||
Yes. See comment 12 and comment 5
Comment 18•22 years ago
|
||
Because there exist many invalid tags,So change some of them into eHTMLTag_text, especially such tags: <aaa.bbb.c@ee.ff.com> or <sys/file.h>. The time consumed when loading the testcase is almost same to IE now.
Comment 19•22 years ago
|
||
one idea: if parser meet <pre>, then it will skip all '<' inside it. just take '<' as one char, not begin of another html tag.
Comment 20•22 years ago
|
||
Not an option. See comment 6. In fact, read the whole bug, please.
Comment 19 would not work since markup is allowed within PRE. harishd could probably comment better on the patch in comment 18, although I suspect there are reasons we act as we do.
Updated•22 years ago
|
Attachment #92520 -
Attachment is obsolete: true
Comment 22•22 years ago
|
||
a little modification
Comment 23•22 years ago
|
||
Isn't this going to create a bunch of non-existent tags in the page DOM? I think we'd be better off just sending this to Tech Evangelism.
Assignee: attinasi → harishd
Component: Layout → Parser
QA Contact: petersen → moied
This bug was found relatively late, and there are no duplicates. Only one non-top100 page exhibits this problem. The fix is not universal, and would slow us down in certain cases. Based on this I also think this should be handled by Evangelism. Reassigning.
Assignee: harishd → doron
Component: Parser → US General
Product: Browser → Tech Evangelism
QA Contact: moied → zach
Target Milestone: mozilla1.1alpha → ---
Version: other → unspecified
Comment 25•22 years ago
|
||
We already have an evang bug on this site -- bug 112802. If we're not planning to fix the core problem, whatever we determine that to be, we should just wontfix this.
Assignee: doron → harishd
Component: US General → Parser
Product: Tech Evangelism → Browser
QA Contact: zach → moied
Version: unspecified → other
Comment 26•22 years ago
|
||
I vote for WONTFIX. heikki?
Hmm, let's make a compromise. I'll future this for now, and will wontfix this at the end of the year if we don't find more sites with this problem. If we do find more sites, we'll think of the fix again. That will also let Harish have a look at the proposed fix (he is on sabbatical now).
Whiteboard: [WONTFIX 12/31/2002?]
Target Milestone: --- → Future
Comment 28•22 years ago
|
||
Should we push Evangelism team to take a look at this bug and bug 112802? It seems that there are many such issues and fix them can give us big performance improvement.
Comment 29•22 years ago
|
||
1) Because testcase(http://www.theos.com/deraadt/coremail.html) often is unavialble, I upload the orignal one. 2) Becuase the limitation to the attachment is 300k,and the original testcase is 305k,so I delete some pure text line in orginal testcase. 3) The attachment is the modified testcase.
Comment 30•22 years ago
|
||
Comment 31•22 years ago
|
||
How to do with this bug now? new idea? or old one?
Comment 32•22 years ago
|
||
Attachment #92536 -
Attachment is obsolete: true
Comment 33•22 years ago
|
||
Comment on attachment 107551 [details] [diff] [review] Changed some invalid tags into eHTMLTag_text (format is correct) hi,harishd The patch have been exist for 4 months, can you review it?
Attachment #107551 -
Flags: review?(harishd)
Comment 34•22 years ago
|
||
Leon, your patch basically masks the real problem ( pointed out by boris ). Anyway, based on your patch tag-like-syntax that contain '@', '/', or '.' will be treated as text but what if my tag-like-syntax contained special characters that are different from '@','/', or '.'? Note: '.' and '-' are legal characters in a tag.
Comment 35•22 years ago
|
||
*** Bug 190267 has been marked as a duplicate of this bug. ***
Comment 36•21 years ago
|
||
Comment on attachment 107551 [details] [diff] [review] Changed some invalid tags into eHTMLTag_text (format is correct) This is scary. How much testing have you done? I really don't like the idea of searching for a few special characters just to fix this bug. IMO, this is not a fix but just a hack.
Attachment #107551 -
Flags: review?(harishd) → review-
Comment 37•21 years ago
|
||
then, who can give a better solution?
Comment 38•21 years ago
|
||
parser or layout should do something to do with this issue, I still prefer parser.
Comment 39•21 years ago
|
||
*** Bug 107733 has been marked as a duplicate of this bug. ***
Comment 40•21 years ago
|
||
Note that bug 107733 has this same issue with tags like <control> that could not possibly be screened out in the parser.... Perhaps the real solution is to change the incremental loading logic to reflow less often in cases like this....
Updated•21 years ago
|
Comment 41•20 years ago
|
||
*** Bug 265619 has been marked as a duplicate of this bug. ***
Comment 42•19 years ago
|
||
*** Bug 239330 has been marked as a duplicate of this bug. ***
Comment 43•19 years ago
|
||
*** Bug 298422 has been marked as a duplicate of this bug. ***
Comment 44•19 years ago
|
||
Guys, this bug is one cause of many bad user experiences of Firefox sucking up RAM and bringing systems to their knees "for no reason." It appears that Mozilla needs logic to better handle bad web pages, of which there are many! Just surf with the HTML Validator extension and see for yourselves. Considering that this issue has been abandoned, is there no desire to make Firefox immune to these problems - in the form of more complete exception handling (by which I mean being able to deal with anything that does not obey the rules, rather than expecting the world-at-large to behave themselves) - the root of most performance problems and exploits?
Comment 45•19 years ago
|
||
What exact solution are you proposing here? Ignoring markup on the page? Or what?
Comment 46•19 years ago
|
||
> What exact solution are you proposing here? Ignoring markup on the page? Or
what?
Well, it would be nice to have Firefox use much less memory to render documents,
as is apparent Microsoft was able to do with Internet Explorer.
Comment 47•19 years ago
|
||
Yeah, we could remove support for all sorts of DOM and CSS stuff (like explorer), and then we could use a lot less memory to get the much more limited functionality...
Comment 48•19 years ago
|
||
It's not like you would have to eliminate functionality. Why allocate memory for stuff you are not rendering? Couldn't firefox use some sort of flag or tag-based architecture that tells other modules not to allocate memory for stuff that isn't there? I'm not sure it makes much sense to expend 130+ MB of RAM just to display text. JMHO
Comment 49•19 years ago
|
||
> Why allocate memory for stuff you are not rendering? Because you have to render it, because it may affect the positions of other things that would end up in the viewport. Look up CSS relative positioning, please. > I'm not sure it makes much sense to expend 130+ MB of RAM just to display text. Consider that you have to allocate various DOM objects for all that text, per the DOM spec requirements.
Comment 50•19 years ago
|
||
> Consider that you have to allocate various DOM objects for all that text, per
the DOM spec requirements.
Then, ignoring for a moment that Microsoft does not always obey standards, how
is it that they are able to render and properly display the exact same page
using far less time, CPU power, and memory. Firefox is not even close.
Comment 51•19 years ago
|
||
Because you've ignored the reason that they can do it.
Comment 52•18 years ago
|
||
"a little modification to original testcase" loads in about 20 seconds for me on trunk, about 10 seconds on the reflow branch... For comparison, Opera 8.5 is at about 2 seconds.
Whiteboard: [WONTFIX 12/31/2002?] → [WONTFIX 12/31/2002?][reflow-refactor]
Comment 53•18 years ago
|
||
This page, which is the one that caused me to log this bug: 239330 makes 1.5.0.5 start running away with resources. It does eventually load. But since it consumes 185MB who wants to keep ff running after that. http://www.lambdacs.com/cpt/FAQ.html (not sure, but there could be some irony in the fact that the page is a thread programming FAQ....)
Updated•18 years ago
|
Summary: large plain text page takes 137 seconds to load - locks browser until loaded (nested unclosed tags) → large page with unescaped '<' in email addresses takes 137 seconds to load - locks browser until loaded (nested unclosed tags), plaintext served as HTML
Updated•15 years ago
|
Assignee: harishd → nobody
QA Contact: moied → parser
Comment 54•15 years ago
|
||
Problem persists in Fx 3.5.3. Test URLs yield 100% CPU load, unresponsiveness and crashes.
Comment 55•14 years ago
|
||
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.4) Gecko/20100503 Firefox/3.6.4 http://www.theos.com/deraadt/coremail.html and http://www.lambdacs.com/cpt/FAQ.html still cause hangs and 100% CPU usage.
Comment 56•14 years ago
|
||
I can't switch tabs while either uri of comment 55 is loading, using Firefox 4.0 Beta 7 on Windows 7 (32 bit).
Comment 57•13 years ago
|
||
Sending this back to layout based on the profiles. Whenever the parser tries to protect layout from certain kinds of input, unintended consequences follow.
Component: HTML: Parser → Layout
QA Contact: parser → layout
Updated•2 years ago
|
Severity: normal → S3
Comment 58•2 years ago
|
||
The severity field for this bug is relatively low, S3. However, the bug has 4 duplicates.
:dholbert, could you consider increasing the bug severity?
For more information, please visit auto_nag documentation.
Flags: needinfo?(dholbert)
Comment 59•2 years ago
|
||
The last needinfo from me was triggered in error by recent activity on the bug. I'm clearing the needinfo since this is a very old bug and I don't know if it's still relevant.
Flags: needinfo?(dholbert)
You need to log in
before you can comment on or make changes to this bug.
Description
•