Closed Bug 61842 Opened 24 years ago Closed 24 years ago

Performance degradation: HTML parsing of comments/scripts

Categories

(Core :: DOM: Core & HTML, defect, P2)

x86
Windows 95
defect

CLOSED DUPLICATE of bug 56624

People

(Reporter: bht237, Assigned: harishd)

References

Details

(Keywords: perf, testcase)

Attachments

(1 file)

The attached simple HTML file demonstrates a shocking bug in Netscape 6 and Mozilla build 2000111704: HTML parsing does not appear to stop within non-HTML portions of the document, such as scripts and comments. In practical terms, the "<" character and the double quote character (") cause considerable problems. I have not tested other cases.
On a 300MHz Windows 95 PC, this file takes 24,700 ms to reload from a local disk. In comparison, Navigator 4 needs 220 ms and Internet Explorer 4 needs 50 ms.
This test case is derived from a problem in a live web application. In that application, Netscape 6/Mozilla is typically about one order of magnitude slower than Netscape 4. This of course depends on the content, but double quote and "<" characters are heavily used in scripts and commented-out HTML.
As proof, replace "<abc>" with aaaaaaa: the file then loads in 270 ms on a 300MHz PC.
Attached file Test case
Keywords: perf, testcase
Priority: P3 → P2
Didn't try changing the <abc>, but the attachment took 25000 ms to load on Linux build 2000120206. Yikes!
Whether the line is commented out (//a="<abc>") or live (a="<abc>") makes no difference, apart from the additional JavaScript processing, which is minor compared with the delay caused by the bug. I think this test variation is good proof of the real-life significance of the bug. However, a="<abc" (closing ">" missing) does not cause the delay. In general, the whole issue is very relevant for JavaScript document.write(). I noticed that Mozilla's JavaScript engine is not too bad, but until now I have wondered why it did not perform at all in real-life scenarios. This is a total knockout and I am looking forward to the real thing. How much better could Mozilla be without this? ETA?
Blocks: 23187
Blocks: 29805
This is a parser performance problem, not a DOM problem. Reassigning to the parser owner.
Assignee: jst → harishd
Status: UNCONFIRMED → NEW
Ever confirmed: true
Here is the one-line change to fix the performance problem:
Index: nsHTMLTokens.cpp
===================================================================
RCS /cvsroot/mozilla/htmlparser/src/nsHTMLTokens.cpp,v
retrieving revision 3.177
diff -u -w -r3.177 nsHTMLTokens.cpp
--- nsHTMLTokens.cpp 2000/11/02 22:20:28 3.177
+++ nsHTMLTokens.cpp 2000/12/05 21:41:46
@@ -642,6 +642,7 @@
         //theTermStrPos=theBuffer.RFind(aTerminalString,PR_TRUE,tempOffset,termStrLen+2);
         theTermStrPos=theBuffer.RFind(aTerminalString,PR_TRUE,tempOffset,tempOffset-(theCurrOffset-2)); //bug43513...
         if(theTermStrPos>-1) break;
+        theCurrOffset=tempOffset;
         tempOffset++;
       }
       else break;
Note: This exact change will not land, because a lot of parser performance work that addresses problems like this one will be landing soon, and so the fix will not look the same.
Status: NEW → ASSIGNED
*** This bug has been marked as a duplicate of 56624 ***
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → DUPLICATE
Verifying as a duplicate of bug 56624, the general bug for poor parser performance ('needs a lot of time to load the page').
Status: RESOLVED → VERIFIED
Works like a charm. VERY COOL!
Status: VERIFIED → CLOSED