Closed
Bug 133044
Opened 23 years ago
Closed 23 years ago
Single line HTML comments get mangled in textarea
Categories
(Core :: DOM: HTML Parser, defect, P1)
Tracking
()
VERIFIED
FIXED
mozilla1.0
People
(Reporter: bugzilla, Assigned: hjtoi-bugzilla)
References
Details
(4 keywords, Whiteboard: [ADT1 RTM][Fixed on the trunk and branch 05/22 ],custrtm-)
Attachments
(4 files, 2 obsolete files)
183 bytes,
text/html
|
Details | |
308 bytes,
text/html
|
Details | |
9.73 KB,
patch
|
hjtoi-bugzilla
:
review+
jst
:
superreview+
brendan
:
approval+
|
Details | Diff | Splinter Review |
323 bytes,
text/html
|
Details |
When editing HTML in a text area, single line comments have the opening and
closing comment tags removed.
<!--
this is fine, cuz it's multi-line
-->
<!-- THISLLBREAK -->
The latter turns into:
THISLLBREAK
This has the potential for pretty big data corruption.
![]() |
||
Comment 3•23 years ago
|
||
You should escape the "<" and ">" because as the HTML spec says, "comments are
markup".
Over to parser.
Assignee: rods → harishd
Status: UNCONFIRMED → NEW
Component: HTML Form Controls → Parser
Ever confirmed: true
QA Contact: madhur → moied
Comment 4•23 years ago
|
||
Probably related to bug 105973.
Comment 5•23 years ago
|
||
The problem is that the parser should be ignoring any HTML within the
<textarea></textarea>. This is critical as it can result in dataloss! This
will make Mozilla unsafe to use for managing webpages, from a webpage.
I am attaching a more complete test case. In my experience, multi-line doesn't
make a difference (build 2002032803).
Comment 6•23 years ago
|
||
Possible dupe of bug 64799 or bug 127043 particularly the bits to do with
<SCRIPT> tags within TEXTAREAs from those bugs.
Comment 8•23 years ago
|
||
I still don't find any reference in the HTML 4.01 Specification that says there
can't be html within a <textarea> although I did find it in the 3.2 Spec and I
suppose that it probably does make sense.
However, this is not consistent with any other browsers and WILL RESULT IN
DATALOSS in many applications. Therefore, IMHO, it should only be enforced in
Strict mode.
Comment 9•23 years ago
|
||
Marking "compat". The only tags in which unescaped HTML is not parsed are those
with declared content CDATA, such as <script> and <style>. <textarea> is not one
of them.
Comment 10•23 years ago
|
||
If you are writing for one spec and put it on a different doctype you are gonna
get problems.. especially with depreciated tags in newer versions of standards.
Comment 11•23 years ago
|
||
Heikki, could you please take a look at this? Thanx.
Assignee: harishd → heikki
![]() |
||
Comment 12•23 years ago
|
||
*** Bug 135364 has been marked as a duplicate of this bug. ***
Assignee | ||
Comment 13•23 years ago
|
||
Hmm... seems I broke this with my comment optimization :(
Status: NEW → ASSIGNED
OS: Linux → All
Priority: -- → P1
Target Milestone: --- → mozilla1.0
Assignee | ||
Updated•23 years ago
|
Whiteboard: [ADT2]
Comment 14•23 years ago
|
||
*** Bug 138494 has been marked as a duplicate of this bug. ***
Comment 15•23 years ago
|
||
I get something similar. Go to http://chuckr.bravepages.com for an example.
Sometimes the page is rendered and missing the whole table, a single line
comment seems to mess it up. Other times it works. Using Mozilla 1.0rc1. Here is
a partial code sample as viewed from Mozilla 1.0rc1. Everything after the BODY
tag seems to be commented out, until you get to the end of the table.
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN">
<html>
<head>
<title>Freezone freeware</title>
<meta name="keywords" content="freeware freezone computer software free Unrealed
Unrealed2">
<meta NAME="description" CONTENT="Freezone Freeware, 600+ freeware programs">
<meta NAME="author" CONTENT="croberts@gilsongfx.com">
<body bgcolor="#c0c0c0" TOPMARGIN=0 LEFTMARGIN=5 MARGINHEIGHT=0 MARGINWIDTH=5>
<!--></noscript></div></xmp></style>
<center>
<!partner is bravepages 192.168.100.13>
<SCRIPT SRC='http://www.bravenet.com/jsbanner.php?size=666'></SCRIPT><SCRIPT
SRC='http://www.bravenet.com/jsbanner.php?size=999'></SCRIPT>
</center>
<table border=1>
<tr>
<td width=20% bgcolor=lightblue valign=top>
<p>
<center>
<a href="http://www.cdrinfo.com"><img border="0"
src="http://www.cdrinfo.com/images/button6.gif" alt="Click here to Visit
CDR-Info!"></a>
First view the page with IE, and view source in IE. Now do the same in Mozilla.
You'll get different results.
croberts@gilsongfx.com
Comment 16•23 years ago
|
||
I support Stuart Johnston in that it's vital that HTML should not be parsed with
<textarea> tags.
IMHO this should extend beyond HTML comments, and include HTML characters (the
ones that begin with "&" and end with ";", and have "nbsp" or "lt" etc. in the
middle).
Comment 17•23 years ago
|
||
This is a showstopper for our use of the browser. Can someone specify what we
need to do to proceed? If unescaped HTML inside a TEXTAREA is actually in
conformance with the HTML spec, do we need to approach W3C to discuss the issue
and possibly request a modification the spec?
![]() |
||
Comment 18•23 years ago
|
||
*** Bug 144013 has been marked as a duplicate of this bug. ***
Assignee | ||
Updated•23 years ago
|
Whiteboard: [ADT2] → [ADT1 RTM]
Comment 19•23 years ago
|
||
*** Bug 145238 has been marked as a duplicate of this bug. ***
Assignee | ||
Comment 20•23 years ago
|
||
This shows the idea I am after. New comment parsing has been implemented for
strict mode (not really new, just create comment string between '<' and '>' and
hold iters into the string for actual data start and end). Will do quirks after
I see this works in practice. Create comment node with that string and iters.
To minimize changes, don't change GetStringValue() member, but expose the new
stuff with a new method that gets the iters (or possibly substring). Most
callers of GetStringValue are fine with just that value (for example view
source and serialization), but the data that is exposed to DOM must be the
substring between iters. I think we'll need a new interface, nsINSDOMComment or
something, that we use internally for comments instead of nsICOMComment.
Additional twist is if someone edits the comment data via script: we must
create new data, but retain the old delimiters (for serialization).
Comment 21•23 years ago
|
||
In my opinion, it would be sufficient if there was a clear rule like "HTML has
to be quoted in textareas". At the moment, it is neither parsed correctly nor
displayed as is. If it was parsed correctly, then a comment would be removed
completely, but in the current situation, the parser just removes the opening
and closing tags.
On the other hand, I would like it if textareas just would allow unparsed
content, simply for practical reasons. I run a web form where people can modify
XML documents (which are then evaluated) and it would be rather inconvenient to
quote every brace and ampersand.
P.S. Mozilla only began behaving this way in 0.9.9 (or RC1?) ...
Assignee | ||
Comment 22•23 years ago
|
||
That won't do. There are several things that go wrong, all of which were caused
by bug 110544. The things that come to mind right now are:
* Comments in textarea elements
* View source (noticeable with bad comments)
* Serialization
My fix will fix all of them. To put it simply, the problem is that we need to
retain information about how the comments started and ended in order to make
round tripping possible. All of the above problems are caused by this basic
problem, or they are variations of it. Currently the code only stores what is
between the comment delimiters.
Assignee | ||
Comment 23•23 years ago
|
||
Turns out the full fix seems to be pretty big, with some non-trivial changes.
While I am working on that, harishd will be working on making a smaller band-aid
fix to fix just the textarea part.
Comment 24•23 years ago
|
||
This patch should fix the problem with comments in textarea and in viewsource.
I've added an extra member to the comment token that stores the complete
comment. AFAIK, the only overhead is the extra member there is no extra copy (
I think :-/ ) because the member variable is a slidingsubstring.
FYI: The patch needs clean up and extensive testing.
Comment 25•23 years ago
|
||
is backing out bug 110544 an option for rc3/1.0? and we could land the
optimization and the bigger clean up patches for 1.01?
Assignee | ||
Comment 26•23 years ago
|
||
Backing out is not a good option in my opinion: in addition to some performance,
the fix did fix some pages we had never gotten right, including pages on top100
sites, and it made us more standards compliant and more tolerant to really bad
comments. Also there have been quite a bit of changes on top of these changes as
well. I think harishd's patch would be good enough for RC3/1.0, need to confirm
with a bit of testing tomorrow.
Assignee | ||
Comment 27•23 years ago
|
||
8 different comments in textarea
Assignee | ||
Comment 28•23 years ago
|
||
I can understand how the patch fixes view source, and that looks good. However,
I don't know how this would fix textareas. Maybe something was missing from the
patch?
Comment 29•23 years ago
|
||
>I can understand how the patch fixes view source, and that looks good. However,
>I don't know how this would fix textareas. Maybe something was missing from the
>patch?
Textarea content is collected in CNavDTD::CollectSkippedContent() by calling
AppendSourceTo on the token ( comment in our case ).
CCommentToken::AppendSourceTo will provide us with the comment as seen on the
source.
Comment 30•23 years ago
|
||
This patch in addition to fixing the stated problem it also fixes a problem,
pertaining to comments, in the viewsource. I've replaced the nsString member,
in CCommentToken, with nsSlidingSubstring. This would avoid one extra copy!
Heikki: I have also removed a hack that was in place to avoid a hang ( bug
112943 ) since using nsSlidingSubstring avoids the problem.
Attachment #84404 -
Attachment is obsolete: true
Attachment #84575 -
Attachment is obsolete: true
Assignee | ||
Comment 31•23 years ago
|
||
Comment on attachment 84675 [details] [diff] [review]
patch v1.1
>+ mNewlineCount = !(aFlag & NS_IPARSER_FLAG_VIEW_SOURCE) ? mComment.CountChar(kNewLine) : -1;
We should count newlines from mCommentDecl member.
With that, r=heikki.
Attachment #84675 -
Flags: review+
Comment 32•23 years ago
|
||
>We should count newlines from mCommentDecl member.
Yes it should be. Made that change on my local tree.
Comment 33•23 years ago
|
||
Comment on attachment 84675 [details] [diff] [review]
patch v1.1
sr=jst
Attachment #84675 -
Flags: superreview+
Assignee | ||
Comment 34•23 years ago
|
||
Oh, and get rid of the old unused member variable in CCommentToken :)
Assignee | ||
Comment 35•23 years ago
|
||
This testcase shows the minor dataloss problem that still remains after this
fix. The issue is that if you have malformed comments and you save page as
complete, we always save with the correct comment open/close so your bad markup
probably changed to better.
Comment 36•23 years ago
|
||
Comment on attachment 84675 [details] [diff] [review]
patch v1.1
with all of heikki's comments attended to, a=brendan@mozilla.org, please get
this in tonight.
/be
Attachment #84675 -
Flags: approval+
Comment 37•23 years ago
|
||
adding adt1.0.0+ for checkin to the 1.0 branch.
Comment 38•23 years ago
|
||
Fix landed ( 05/22 )on the trunk and branch.
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
Whiteboard: [ADT1 RTM] → [ADT1 RTM][Fixed on the trunk and branch 05/22 ]
Assignee | ||
Updated•23 years ago
|
Keywords: fixed1.0.0
Comment 39•23 years ago
|
||
*** Bug 146576 has been marked as a duplicate of this bug. ***
Comment 40•23 years ago
|
||
Verified fixed with trunk and branch build 20020524 on winxp
Status: RESOLVED → VERIFIED
Whiteboard: [ADT1 RTM][Fixed on the trunk and branch 05/22 ] → [ADT1 RTM][Fixed on the trunk and branch 05/22 ],custrtm-
Keywords: verified1.0.0
You need to log in
before you can comment on or make changes to this bug.
Description
•