Closed Bug 133044 Opened 23 years ago Closed 23 years ago

Single line HTML comments get mangled in textarea

Tracking

()

Status:

VERIFIED FIXED

Milestone:

mozilla1.0

People

(Reporter: bugzilla, Assigned: hjtoi-bugzilla)

References

Details

(4 keywords, Whiteboard: [ADT1 RTM][Fixed on the trunk and branch 05/22 ],custrtm-)

Attachments

(4 files, 2 obsolete files)

Simple Test Case 23 years ago Stuart Johnston 183 bytes, text/html		Details
WIP, does not work 23 years ago Heikki Toivonen (remove -bugzilla when emailing directly) 12.83 KB, patch		Details \| Diff \| Splinter Review
patch v1.0 23 years ago harishd 8.57 KB, patch		Details \| Diff \| Splinter Review
testcase 23 years ago Heikki Toivonen (remove -bugzilla when emailing directly) 308 bytes, text/html		Details
patch v1.1 23 years ago harishd 9.73 KB, patch	hjtoi-bugzilla : review+ jst : superreview+ brendan : approval+	Details \| Diff \| Splinter Review
testcase: view source, save as complete 23 years ago Heikki Toivonen (remove -bugzilla when emailing directly) 323 bytes, text/html		Details

Sam Rowe

Reporter

Description

•

23 years ago

When editing HTML in a text area, single line comments have the opening and closing comment tags removed.   The latter turns into: THISLLBREAK This has the potential for pretty big data corruption.

Sam Rowe

Reporter

Comment 1

•

23 years ago

Sorry Build ID: 2002032306

Matthias Versen [:Matti]

Comment 2

•

23 years ago

not critical

Severity: critical → normal

Boris Zbarsky [:bzbarsky]

Comment 3

•

23 years ago

You should escape the "<" and ">" because as the HTML spec says, "comments are markup". Over to parser.

Assignee: rods → harishd

Status: UNCONFIRMED → NEW

Component: HTML Form Controls → Parser

Ever confirmed: true

QA Contact: madhur → moied

Christopher Hoess (gone)

Comment 4

•

23 years ago

Probably related to bug 105973.

Stuart Johnston

Comment 5

•

23 years ago

The problem is that the parser should be ignoring any HTML within the <textarea></textarea>. This is critical as it can result in dataloss! This will make Mozilla unsafe to use for managing webpages, from a webpage. I am attaching a more complete test case. In my experience, multi-line doesn't make a difference (build 2002032803).

Stuart Johnston

Comment 6

•

23 years ago

Attached file Simple Test Case — Details

GAThrawn

Comment 7

•

23 years ago

Possible dupe of bug 64799 or bug 127043 particularly the bits to do with <SCRIPT> tags within TEXTAREAs from those bugs.

Stuart Johnston

Comment 8

•

23 years ago

I still don't find any reference in the HTML 4.01 Specification that says there can't be html within a <textarea> although I did find it in the 3.2 Spec and I suppose that it probably does make sense. However, this is not consistent with any other browsers and WILL RESULT IN DATALOSS in many applications. Therefore, IMHO, it should only be enforced in Strict mode.

Christopher Hoess (gone)

Comment 9

•

23 years ago

Marking "compat". The only tags in which unescaped HTML is not parsed are those with declared content CDATA, such as <script> and <style>. <textarea> is not one of them.

Keywords: compat, testcase

[not reading bugmail]

Comment 10

•

23 years ago

If you are writing for one spec and put it on a different doctype you are gonna get problems.. especially with depreciated tags in newer versions of standards.

harishd

Comment 11

•

23 years ago

Heikki, could you please take a look at this? Thanx.

Assignee: harishd → heikki

Boris Zbarsky [:bzbarsky]

Comment 12

•

23 years ago

*** Bug 135364 has been marked as a duplicate of this bug. ***

Heikki Toivonen (remove -bugzilla when emailing directly)

Assignee

Comment 13

•

23 years ago

Hmm... seems I broke this with my comment optimization :(

Status: NEW → ASSIGNED

Keywords: dataloss, nsbeta1+, regression

OS: Linux → All

Priority: -- → P1

Target Milestone: --- → mozilla1.0

Heikki Toivonen (remove -bugzilla when emailing directly)

Assignee

Updated

•

23 years ago

Whiteboard: [ADT2]

Jesse Ruderman

Comment 14

•

23 years ago

*** Bug 138494 has been marked as a duplicate of this bug. ***

Chuck Roberts

Comment 15

•

23 years ago

I get something similar. Go to http://chuckr.bravepages.com for an example. Sometimes the page is rendered and missing the whole table, a single line comment seems to mess it up. Other times it works. Using Mozilla 1.0rc1. Here is a partial code sample as viewed from Mozilla 1.0rc1. Everything after the BODY tag seems to be commented out, until you get to the end of the table. <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN"> <html> <head> <title>Freezone freeware</title> <meta name="keywords" content="freeware freezone computer software free Unrealed Unrealed2"> <meta NAME="description" CONTENT="Freezone Freeware, 600+ freeware programs"> <meta NAME="author" CONTENT="croberts@gilsongfx.com"> <body bgcolor="#c0c0c0" TOPMARGIN=0 LEFTMARGIN=5 MARGINHEIGHT=0 MARGINWIDTH=5> <!--></noscript></div></xmp></style> <center> <!partner is bravepages 192.168.100.13> <SCRIPT SRC='http://www.bravenet.com/jsbanner.php?size=666'></SCRIPT><SCRIPT SRC='http://www.bravenet.com/jsbanner.php?size=999'></SCRIPT> </center> <table border=1> <tr> <td width=20% bgcolor=lightblue valign=top> <p> <center> <a href="http://www.cdrinfo.com"><img border="0" src="http://www.cdrinfo.com/images/button6.gif" alt="Click here to Visit CDR-Info!"></a> First view the page with IE, and view source in IE. Now do the same in Mozilla. You'll get different results. croberts@gilsongfx.com

Erik Goodlad

Comment 16

•

23 years ago

I support Stuart Johnston in that it's vital that HTML should not be parsed with <textarea> tags. IMHO this should extend beyond HTML comments, and include HTML characters (the ones that begin with "&" and end with ";", and have "nbsp" or "lt" etc. in the middle).

Michael Lucas

Comment 17

•

23 years ago

This is a showstopper for our use of the browser. Can someone specify what we need to do to proceed? If unescaped HTML inside a TEXTAREA is actually in conformance with the HTML spec, do we need to approach W3C to discuss the issue and possibly request a modification the spec?

Boris Zbarsky [:bzbarsky]

Comment 18

•

23 years ago

*** Bug 144013 has been marked as a duplicate of this bug. ***

Heikki Toivonen (remove -bugzilla when emailing directly)

Assignee

Updated

•

23 years ago

Whiteboard: [ADT2] → [ADT1 RTM]

R.K.Aa.

Comment 19

•

23 years ago

*** Bug 145238 has been marked as a duplicate of this bug. ***

Dawn Endico

Updated

•

23 years ago

Blocks: 143200

Heikki Toivonen (remove -bugzilla when emailing directly)

Assignee

Comment 20

•

23 years ago

Attached patch WIP, does not work (obsolete) — Details — Splinter Review

This shows the idea I am after. New comment parsing has been implemented for strict mode (not really new, just create comment string between '<' and '>' and hold iters into the string for actual data start and end). Will do quirks after I see this works in practice. Create comment node with that string and iters. To minimize changes, don't change GetStringValue() member, but expose the new stuff with a new method that gets the iters (or possibly substring). Most callers of GetStringValue are fine with just that value (for example view source and serialization), but the data that is exposed to DOM must be the substring between iters. I think we'll need a new interface, nsINSDOMComment or something, that we use internally for comments instead of nsICOMComment. Additional twist is if someone edits the comment data via script: we must create new data, but retain the old delimiters (for serialization).

Sebastian Schaffert

Comment 21

•

23 years ago

In my opinion, it would be sufficient if there was a clear rule like "HTML has to be quoted in textareas". At the moment, it is neither parsed correctly nor displayed as is. If it was parsed correctly, then a comment would be removed completely, but in the current situation, the parser just removes the opening and closing tags. On the other hand, I would like it if textareas just would allow unparsed content, simply for practical reasons. I run a web form where people can modify XML documents (which are then evaluated) and it would be rather inconvenient to quote every brace and ampersand. P.S. Mozilla only began behaving this way in 0.9.9 (or RC1?) ...

Heikki Toivonen (remove -bugzilla when emailing directly)

Assignee

Comment 22

•

23 years ago

That won't do. There are several things that go wrong, all of which were caused by bug 110544. The things that come to mind right now are: * Comments in textarea elements * View source (noticeable with bad comments) * Serialization My fix will fix all of them. To put it simply, the problem is that we need to retain information about how the comments started and ended in order to make round tripping possible. All of the above problems are caused by this basic problem, or they are variations of it. Currently the code only stores what is between the comment delimiters.

Heikki Toivonen (remove -bugzilla when emailing directly)

Assignee

Comment 23

•

23 years ago

Turns out the full fix seems to be pretty big, with some non-trivial changes. While I am working on that, harishd will be working on making a smaller band-aid fix to fix just the textarea part.

harishd

Comment 24

•

23 years ago

Attached patch patch v1.0 (obsolete) — Details — Splinter Review

This patch should fix the problem with comments in textarea and in viewsource. I've added an extra member to the comment token that stores the complete comment. AFAIK, the only overhead is the extra member there is no extra copy ( I think :-/ ) because the member variable is a slidingsubstring. FYI: The patch needs clean up and extensive testing.

chris hofmann

Comment 25

•

23 years ago

is backing out bug 110544 an option for rc3/1.0? and we could land the optimization and the bigger clean up patches for 1.01?

Heikki Toivonen (remove -bugzilla when emailing directly)

Assignee

Comment 26

•

23 years ago

Backing out is not a good option in my opinion: in addition to some performance, the fix did fix some pages we had never gotten right, including pages on top100 sites, and it made us more standards compliant and more tolerant to really bad comments. Also there have been quite a bit of changes on top of these changes as well. I think harishd's patch would be good enough for RC3/1.0, need to confirm with a bit of testing tomorrow.

Heikki Toivonen (remove -bugzilla when emailing directly)

Assignee

Comment 27

•

23 years ago

Attached file testcase — Details

8 different comments in textarea

Heikki Toivonen (remove -bugzilla when emailing directly)

Assignee

Comment 28

•

23 years ago

I can understand how the patch fixes view source, and that looks good. However, I don't know how this would fix textareas. Maybe something was missing from the patch?

harishd

Comment 29

•

23 years ago

>I can understand how the patch fixes view source, and that looks good. However, >I don't know how this would fix textareas. Maybe something was missing from the >patch? Textarea content is collected in CNavDTD::CollectSkippedContent() by calling AppendSourceTo on the token ( comment in our case ). CCommentToken::AppendSourceTo will provide us with the comment as seen on the source.

harishd

Comment 30

•

23 years ago

Attached patch patch v1.1 — Details — Splinter Review

This patch in addition to fixing the stated problem it also fixes a problem, pertaining to comments, in the viewsource. I've replaced the nsString member, in CCommentToken, with nsSlidingSubstring. This would avoid one extra copy! Heikki: I have also removed a hack that was in place to avoid a hang ( bug 112943 ) since using nsSlidingSubstring avoids the problem.

Attachment #84404 - Attachment is obsolete: true

Attachment #84575 - Attachment is obsolete: true

Heikki Toivonen (remove -bugzilla when emailing directly)

Assignee

Comment 31

•

23 years ago

Comment on attachment 84675 [details] [diff] [review] patch v1.1 >+ mNewlineCount = !(aFlag & NS_IPARSER_FLAG_VIEW_SOURCE) ? mComment.CountChar(kNewLine) : -1; We should count newlines from mCommentDecl member. With that, r=heikki.

Attachment #84675 - Flags: review+

harishd

Comment 32

•

23 years ago

>We should count newlines from mCommentDecl member. Yes it should be. Made that change on my local tree.

Johnny Stenback (:jst)

Comment 33

•

23 years ago

Comment on attachment 84675 [details] [diff] [review] patch v1.1 sr=jst

Attachment #84675 - Flags: superreview+

Heikki Toivonen (remove -bugzilla when emailing directly)

Assignee

Comment 34

•

23 years ago

Oh, and get rid of the old unused member variable in CCommentToken :)

Heikki Toivonen (remove -bugzilla when emailing directly)

Assignee

Comment 35

•

23 years ago

Attached file testcase: view source, save as complete — Details

This testcase shows the minor dataloss problem that still remains after this fix. The issue is that if you have malformed comments and you save page as complete, we always save with the correct comment open/close so your bad markup probably changed to better.

Heikki Toivonen (remove -bugzilla when emailing directly)

Assignee

Updated

•

23 years ago

Keywords: adt1.0.0

Brendan Eich [:brendan]

Comment 36

•

23 years ago

Comment on attachment 84675 [details] [diff] [review] patch v1.1 with all of heikki's comments attended to, a=brendan@mozilla.org, please get this in tonight. /be

Attachment #84675 - Flags: approval+

scottputterman

Comment 37

•

23 years ago

adding adt1.0.0+ for checkin to the 1.0 branch.

Keywords: adt1.0.0 → adt1.0.0+

harishd

Comment 38

•

23 years ago

Fix landed ( 05/22 )on the trunk and branch.

Status: ASSIGNED → RESOLVED

Closed: 23 years ago

Resolution: --- → FIXED

Whiteboard: [ADT1 RTM] → [ADT1 RTM][Fixed on the trunk and branch 05/22 ]

Heikki Toivonen (remove -bugzilla when emailing directly)

Assignee

Updated

•

23 years ago

Keywords: fixed1.0.0

R.K.Aa.

Comment 39

•

23 years ago

*** Bug 146576 has been marked as a duplicate of this bug. ***

Moied

Comment 40

•

23 years ago

Verified fixed with trunk and branch build 20020524 on winxp

Status: RESOLVED → VERIFIED

Jimmy Uen

Updated

•

23 years ago

Whiteboard: [ADT1 RTM][Fixed on the trunk and branch 05/22 ] → [ADT1 RTM][Fixed on the trunk and branch 05/22 ],custrtm-

Moied

Updated

•

23 years ago

Keywords: verified1.0.0

You need to log in before you can comment on or make changes to this bug.