Closed Bug 288597 Opened 20 years ago Closed 20 years ago

Editor mangles comments inside HTML tags

Categories

(Core :: DOM: HTML Parser, defect)

x86
Windows XP
defect
Not set
normal

Tracking

()

RESOLVED INVALID

People

(Reporter: nipp2222, Unassigned)

Details

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8b2) Gecko/20050331
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8b2) Gecko/20050331

When typing HTML in Source mode, comments inside HTML tags are not recognised
for what they are. For instance, a structure like

<p -- this is a paragraph -->

is correct HTML and should be left alone by the editor. However, the editor
doesn't seem to know about these kinds of comments and tries to change it into
something that it thinks is correct HTML.

Reproducible: Always

Steps to Reproduce:
1. switch to <HTML> Source and type <p -- this is a paragraph --> </p>
2. switch to Preview or another view mode, and back to Source.


Actual Results:  
<p this="" is="" a="" paragraph="" -=""> </p>

Expected Results:  
<p -- this is a paragraph --> </p>
(no change from what I entered)

This happens both with "Retain original source formatting" and "Reformat HTML
source".
The same bug can be found in NVU 0.90.
The HTML rules for comments are given here (basically just restating the SGML
rules) http://www.w3.org/TR/html401/intro/sgmltut.html#idx-HTML

Another example <!-- this is a comment -- this is not -- this is another comment -->

If you want to experiment, http://validator.w3.org/ is very helpful.
(In reply to comment #1)
> The HTML rules for comments are given here (basically just restating the SGML
> rules) http://www.w3.org/TR/html401/intro/sgmltut.html#idx-HTML
> 
> Another example <!-- this is a comment -- this is not -- this is another
> comment -->

IMHO this is not valid; read the link you posted:
"A common error is to include a string of hyphens ("---") within a comment.
Authors should avoid putting two or more adjacent hyphens inside comments."

btw: We accept "--" in comments, but only in quirks mode, not in Standards mode.
I've done some more searching on the net (which I should have done before, I'm
afraid) but now I'm confused. Comments in HTML start and end with --, that much
is clear; but many pages mention that comments in HTML can only occur within
declarations, although not everybody agrees with that!
And I'm still not convinced either that the official specs exclude comments from
appearing inside HTML tags.
On the other hand, I can now understand where the editor is coming from if it
rejects that kind of comment, and that it's a choice, not an oversight.

Anyway, you raise an interesting point. If, as you say, a comment can contain --
in quirks mode, how does the parser interpret things like this at the very
beginning of a file:

<!-- comment -- DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 ...

before it's clear whether to use quirks or standards mode?
Reporter, there is no comment involved, you seem to have an invalid attribute
in the <p> start tag, videlicet:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
        "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
	<title>Untitled</title>
	<meta name="generator" content="BBEdit 7.1.4">
</head>
<body>
<p -- this is a paragraph -->Text<p/>
</body>
</html>

Has a total of 6 markup errors, of which the significant one is:
File "untitled text 4"; Line 9:  Document type doesn't permit attribute “--”
within element “<p>”.

nVu/Composer is probably acting correctly.
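
To see why the validator reports an attribute rather than a comment, it can help to feed the start tag to any tolerant HTML tokenizer. The sketch below uses Python's standard html.parser purely as a stand-in (an assumption for illustration: it is not the Gecko parser, and its attribute list need not match Composer's output exactly). The class name StartTagDump and the helper dump are hypothetical names introduced here.

```python
from html.parser import HTMLParser


class StartTagDump(HTMLParser):
    """Record start tags (with their attributes) and comments as the parser reports them."""

    def __init__(self):
        super().__init__()
        self.start_tags = []
        self.comments = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for a bare name
        self.start_tags.append((tag, attrs))

    def handle_comment(self, data):
        self.comments.append(data)


def dump(markup):
    """Parse markup and return (start_tags, comments)."""
    parser = StartTagDump()
    parser.feed(markup)
    parser.close()
    return parser.start_tags, parser.comments


start_tags, comments = dump("<p -- this is a paragraph --> </p>")
print(start_tags)  # the words inside the tag show up as attribute names
print(comments)    # nothing inside the start tag is reported as a comment
```

The words between "<p" and ">" come back as valueless attributes of the p element, which is the same reading the validator message above describes.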

I am not sure that it helps you, but if you want to study comments, you
need to be aware that comments are only allowed within markup declarations,
which begin with an MDO (<!), and that the comment delimiter must follow the
MDO without a space. See comment 1 above for more info.
Reporter, since it is April the 1st (the only official Internet holiday), 
I'll add one OT comment.

1. SGML is really hard to understand: Probably only four people in total 
have ever really grasped the whole of it.

2. Comments begin and end with COM, and the closing COM must be followed by 
MDC without intervening white space.

3. If you want to write SGML, it doesn't really matter what 'not everyone' 
agrees with. It does matter that you either keep to the letter of the 
specification (hard) or well within some simple subset of it (easier).

4. If you want to use comments within start tags, it is up to you to match 
your requirement against the actual specification. If (like most people) you 
can confine yourself to comments within markup declaration, you will have 
less work to do!

5. You might find it a good idea to lay your hands on a tame SGML parser such 
as SP, if you want to know real answers to questions about whether markup is 
an SGML document.

6. Your example looks invalid to me. I can repair it by closing the markup
declaration, videlicet:
<!-- comment -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
        "http://www.w3.org/TR/html4/loose.dtd">

You can embed comments within entity and element declarations (and
you should), and I suspect that you can within a document type declaration,
videlicet:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
        "http://www.w3.org/TR/html4/loose.dtd"
    -- comment --
  >

7. The quirks/standards mode is a browser thing, and independently
implemented by browser vendors. As far as I am aware, it has no official
standing.
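
The delimiter rules in point 2 can be sketched as a small checker for comment declarations. This is a simplified illustration, not an SGML parser: it only handles declarations of the form MDO, zero or more "-- ... --" comments separated by white space, then MDC. The function name is_comment_declaration is hypothetical. One assumption to flag: the sketch follows the HTML 4.01 wording (section 3.2.4), which forbids white space between "<!" and the opening "--" but permits it between the closing "--" and ">", so it is slightly looser than point 2 as stated.

```python
def is_comment_declaration(text: str) -> bool:
    """Return True if text is a well-formed comment declaration.

    Simplified grammar (an assumption for illustration):
        "<!" ( "--" ... "--" ( whitespace | "--" ... "--" )* )? ">"
    """
    MDO, MDC, COM = "<!", ">", "--"
    if not (text.startswith(MDO) and text.endswith(MDC)):
        return False
    body = text[len(MDO):-len(MDC)]
    if body == "":
        return True  # "<!>" is the empty comment declaration
    if not body.startswith(COM):
        return False  # COM must follow MDO with no intervening space
    pos = 0
    while pos < len(body):
        if body.startswith(COM, pos):
            # Inside a comment: skip to the next COM, which closes it
            close = body.find(COM, pos + len(COM))
            if close == -1:
                return False  # comment never closed
            pos = close + len(COM)
        elif body[pos].isspace():
            pos += 1  # white space between comments is allowed
        else:
            return False  # any other text outside a comment is an error
    return True


for sample in (
    "<!-- comment -->",
    "<!-- this is a comment -- this is not -- this is another comment -->",
    "<!-- a --- b -->",
    "<! -- x -->",
):
    print(sample, "->", is_comment_declaration(sample))
```

Under these rules the "this is not" example from earlier in the thread is rejected, because the text between the two comments is neither white space nor a comment, and a run of three hyphens is rejected for the same reason.
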
sounds more like parser than editor
Assignee: composer → parser
Component: Composer → HTML: Parser
Product: Mozilla Application Suite → Core
QA Contact: mrbkap
Version: unspecified → Trunk
(In reply to comment #4)

Yes, based on everything I have read, I now feel compelled to agree with this conclusion:

> nVu/Composer is probably acting correctly.

Probably. Comments can occur inside declarations (that is, <! .. > structures)
and there is no real evidence that they can occur inside HTML tags. So because
it's better to be safe than sorry, I shall never attempt to put comments inside
tags again.
I'm now off to check all the HTML documents I ever created.

P.S. how do you change the status of a bug report to "probably not a real bug" then?
You simply change the status of the bug to INVALID, when it's not a real bug.

See http://www.mozilla.org/bugs/ for more detailed info.
Status: UNCONFIRMED → RESOLVED
Closed: 20 years ago
Resolution: --- → INVALID