Bugzilla

Comment 2

•

22 years ago

*** Bug 214475 has been marked as a duplicate of this bug. ***

Comment 3

•

22 years ago

This is invalid. No "--" sequences are permitted within a comment.

Status: UNCONFIRMED → RESOLVED

Closed: 22 years ago

Resolution: --- → INVALID

Comment 4

•

22 years ago

In SGML, "--" starts a comment and "--" ends a comment. HTML just uses SGML comments, with "<!" signalling the start of SGML markup and ">" signalling its end. Therefore: <!-- ----> text --> has two comments; one containing the string " " and one containing the string "> text ". Now if Mozilla is put in quirks mode, we do backwards compatible comment parsing (read "broken comment parsing just like old browsers"). But it sounds like the site in question put Mozilla in standards mode.
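A rough sketch of the pairing rule described in this comment, written in Python purely for illustration. The function name `sgml_comments` and the simplified model (every "--" toggles comment mode, and a ">" outside a comment ends the declaration) are assumptions made for the example; this is not Mozilla's parser code.

```python
def sgml_comments(decl: str) -> list[str]:
    """Return the comments found inside one SGML markup declaration.

    Simplified model (an assumption for illustration): the declaration
    starts with "<!", each "--" alternately opens and closes a comment,
    and a ">" seen outside a comment closes the declaration.
    """
    if not decl.startswith("<!"):
        raise ValueError("not a markup declaration")
    comments = []
    current = None  # None means we are currently outside a comment
    i = 2
    while i < len(decl):
        if decl.startswith("--", i):
            if current is None:
                current = ""              # this "--" opens a comment
            else:
                comments.append(current)  # this "--" closes the comment
                current = None
            i += 2
        elif current is None and decl[i] == ">":
            return comments               # end of the markup declaration
        else:
            if current is not None:
                current += decl[i]        # collect comment text
            i += 1
    raise ValueError("declaration is not terminated")


print(sgml_comments("<!-- ----> text -->"))
```

Under this model an input such as `<!-- a -- b -->` raises the "not terminated" error: the second "--" closes the comment, the third opens a new one, and the final ">" is swallowed as comment text. That is the same mechanism by which the rest of a page can disappear into a comment in standards mode.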

Comment 5

•

22 years ago

Bill, note the difference between what you said and how comment parsing actually works (it's subtle, but important: "--" is the comment delimiter, not just "not allowed inside a comment").

Comment 6

•

22 years ago

BZ, maybe before you CC me yet again on bugs that I don't want to be on, you should take the time to take the W3C HTML group to task: "A common error is to include a string of hyphens ("---") within a comment. Authors should avoid putting two or more adjacent hyphens inside comments." Stop CCing me to be pedantic when I'm quoting a spec, or I swear I'll just stop donating time to Mozilla to triage bugs. I didn't start writing HTML yesterday.

Status: RESOLVED → VERIFIED

Oliver Klee

Comment 7

•

20 years ago

*** Bug 269104 has been marked as a duplicate of this bug. ***

Martijn Wargers (dead)

Comment 8

•

20 years ago

*** Bug 271854 has been marked as a duplicate of this bug. ***

Comment 9

•

20 years ago

*** Bug 271860 has been marked as a duplicate of this bug. ***

Steve England [:stevee]

Comment 10

•

20 years ago

*** Bug 288610 has been marked as a duplicate of this bug. ***

Erik Fabert

Comment 11

•

20 years ago

*** Bug 294614 has been marked as a duplicate of this bug. ***

Erik Fabert

Updated

•

20 years ago

Alias: SGMLComment

Phil Ringnalda (:philor)

Comment 12

•

20 years ago

*** Bug 294796 has been marked as a duplicate of this bug. ***

Richard Brodie

Comment 13

•

19 years ago

*** Bug 307747 has been marked as a duplicate of this bug. ***

Reed Loden [:reed]

Comment 14

•

19 years ago

*** Bug 318411 has been marked as a duplicate of this bug. ***

Erik Fabert

Comment 15

•

19 years ago

*** Bug 320933 has been marked as a duplicate of this bug. ***

Jo Hermans

Comment 16

•

19 years ago

*** Bug 321472 has been marked as a duplicate of this bug. ***

Jo Hermans

Comment 17

•

19 years ago

*** Bug 332516 has been marked as a duplicate of this bug. ***

Jo Hermans

Comment 18

•

19 years ago

*** Bug 338810 has been marked as a duplicate of this bug. ***

Kevin Brosnan

Comment 19

•

19 years ago

*** Bug 340975 has been marked as a duplicate of this bug. ***

Régis Caspar

Comment 20

•

19 years ago

*** Bug 341443 has been marked as a duplicate of this bug. ***

Richard

Comment 21

•

19 years ago

Invalid or not, I'm not sure I understand why Firefox sometimes will render these rogue comments, sometimes will treat them as commenting out entire sections, etc. Regardless of the number of "-" between the start and end of a comment, shouldn't they always not render and not affect anything else? With the number of duplicates, obviously lots of people use this technique to visually block off sections of code.

Comment 22

•

19 years ago

> I'm not sure I understand

Please read the whole bug, esp. comment 4.

g0adragon

Comment 23

•

18 years ago

The following perfectly valid web page is made invalid by having two hyphen characters in a row inside an HTML comment. Enter the following web page code into the HTML validator under "Validate by Direct Input" at the following address: http://validator.w3.org/

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<title>HTML Comments Display on Web pages</title>
</head>
<body>
<p>This line is supposed to be first visible text on web page.</p>
<p>The page works perfectly if the hyphen is split with even a space or removed completely.</p>
</body>
</html>

Phil Ringnalda (:philor)

Comment 24

•

18 years ago

*** Bug 351419 has been marked as a duplicate of this bug. ***

Phil Ringnalda (:philor)

Comment 25

•

18 years ago

*** Bug 352664 has been marked as a duplicate of this bug. ***

ski

Comment 26

•

18 years ago

So if I'm a web developer and I need to allow -- inside comments, what am I to do? FFox renders this directly even with transitional mode, which strikes me as the wrong thing. Or is there some browser-specific way to force "quirks mode" ?

Mike Shaver (:shaver emeritus)

Comment 27

•

18 years ago

> So if I'm a web developer and I need to allow -- inside comments

You can't do that in HTML, if the HTML spec is actually followed.

> Or is there some browser-specific way to force "quirks mode" ?

This is well-documented at http://developer.mozilla.org/en/docs/Mozilla%27s_DOCTYPE_sniffing

I should note that that page is linked directly off http://developer.mozilla.org/en/docs/Mozilla's_Quirks_Mode, which is the first Google hit for "mozilla quirks".

Jesse Ruderman

Comment 28

•

18 years ago

*** Bug 359014 has been marked as a duplicate of this bug. ***

Maik Riechert

Comment 29

•

18 years ago

*** Bug 362663 has been marked as a duplicate of this bug. ***

Anne (:annevk)

Comment 45

•

17 years ago

FWIW, per HTML5 this is a bug in Firefox.

Frank Wein [:mcsmurf]

Comment 46

•

17 years ago

But -- is still not allowed inside a comment, right? Just the error handling is different (http://www.whatwg.org/specs/web-apps/current-work/#bogus)?

Anne (:annevk)

Comment 47

•

17 years ago

Right. (Though you would not end up in the "bogus comment state".) And also, that doesn't make it less of a bug :-)

Comment 48

•

17 years ago

Reopening per Anne's comment.

Severity: major → normal

Status: VERIFIED → UNCONFIRMED

Resolution: INVALID → ---

Mike Shaver (:shaver emeritus)

Updated

•

17 years ago

Assignee: harishd → nobody

Status: UNCONFIRMED → NEW

Ever confirmed: true

QA Contact: dsirnapalli → parser

Reed Loden [:reed]

Updated

•

17 years ago

URL: (page was repaired to avoid problem)

Alan O.

Comment 51

•

17 years ago

If this is because -- is not allowed between <!-- and -->, then it sounds like the specification is inadequate. It defies common sense to use something like that for delimiting when you know full well that it's also used for comments and that -- is commonly used in texts. Use a less common sequence of characters for that purpose, ergo . Don't make the job of a web developer more of a pain because of ridiculously short-sighted standards.

Ryan Jones-Ward [:sciguyryan]

Updated

•

17 years ago

Keywords: html5

tarquin

Comment 52

•

17 years ago

SGML comments are ridiculous and problematic. They confuse authors and break pages. HTML 5 recognises this, and no longer requires browsers to parse comments in SGML format. They are now parsed in a way compatible with all browsers except Firefox. Firefox seems to be insisting on hanging on to the SGML comments, even though others realised they are stupid, and protested against their inclusion in Acid 2. HTML is not SGML, and never has been (even though it was originally supposed to be) - this won't work anywhere, even though it is valid SGML: http://virtuelvis.com/download/162/evilml.html SGML comments were removed from Acid 2 because they are stupid. They were removed from HTML 5 because they are stupid. It's time to remove them from Firefox, and stop breaking pages like the one in this report. For those wondering why some patterns work, and others leave bits on the page: http://www.howtocreate.co.uk/SGMLComments.html

Dotan Cohen

Comment 53

•

17 years ago

If Tarquin is right and SGML comments were removed from HTML5, then Firefox should not treat documents with a valid HTML5 Doctype and SGML comment delimiters as such. For older documents, or non-valid HTML5 documents, the gotcha-but-correct behaviour should be maintained.

Updated

•

17 years ago

Flags: blocking1.9.1?

tarquin

Comment 54

•

17 years ago

Note that this is specified in the parsing section of HTML 5 (the language definition tells authors not to include -- inside comments, but the error handling stage of parsing will allow it): http://www.w3.org/html/wg/html5/#comment3 "U+002D HYPHEN-MINUS (-) Parse error. Append a U+002D HYPHEN-MINUS (-) character to the comment token's data. Stay in the comment end state. ... Anything else Parse error. Append two U+002D HYPHEN-MINUS (-) characters and the input character to the comment token's data. Switch to the comment state."
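To make the quoted rules concrete, here is a small Python sketch of the comment part of the tokenizer. The "comment end" branch follows the text quoted above; the "comment" and "comment end dash" states are filled in from the same draft as I understand it. The function name `tokenize_comment` is made up for the example, and the sketch assumes the input begins just after "<!--", so the special cases of the comment start states are left out.

```python
def tokenize_comment(text: str, start: int = 0):
    """Consume an HTML5-style comment body that begins just after "<!--".

    Returns (data, parse_errors, end), where end is the index just past
    the closing ">" (or len(text) if the input ran out first).
    """
    data = []
    errors = 0
    state = "comment"
    i = start
    while i < len(text):
        ch = text[i]
        i += 1
        if state == "comment":
            if ch == "-":
                state = "comment end dash"
            else:
                data.append(ch)
        elif state == "comment end dash":
            if ch == "-":
                state = "comment end"
            else:
                data.append("-" + ch)   # a lone "-" is ordinary comment text
                state = "comment"
        else:  # "comment end": we have just seen "--"
            if ch == ">":
                return "".join(data), errors, i
            errors += 1                 # "--" not followed by ">" is a parse error
            if ch == "-":
                data.append("-")        # stay in the comment end state
            else:
                data.append("--" + ch)  # keep the dashes and carry on
                state = "comment"
    errors += 1                         # input ended inside the comment
    return "".join(data), errors, len(text)


print(tokenize_comment(" a -- b -->rest"))
```

The point of the design is visible in the result: the "--" in the middle is recorded as a parse error but stays inside the comment data, and only "-->" ends the comment.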

Damon Sicore (:damons)

Comment 55

•

17 years ago

Wouldn't hold the release for this, but I think we should get this on the list for 1.9.1. blocking1.9.1- wanted1.9.1+, P2. Anyone want to volunteer here?

Flags: wanted1.9.1+

Flags: blocking1.9.1?

Flags: blocking1.9.1-

Priority: -- → P2

Comment 56

•

17 years ago

I'll take this.

Assignee: nobody → mrbkap

tarquin

Comment 57

•

17 years ago

"For older documents, or non-valid HTML5 documents, the gotcha-but-correct behaviour should be maintained." This will not help anyone. The broken pages (including the one in this report) will not get fixed. Firefox will remain incompatible with all other browsers. SGML comments were removed from HTML because they are stupid in all cases, not sometimes-stupid. Having comments that nobody understands in HTML 4 standards mode, but not in quirks mode or HTML 5 mode, is beyond confusing. They should be removed in all modes in order to be compatible with existing Web Pages, other browsers, and author expectations, while providing a consistent response to all doctypes.

Updated

•

17 years ago

URL: http://www.w3.org/html/wg/html5/#markup

Status: NEW → ASSIGNED

Comment 58

•

17 years ago

Attached patch: Implement HTML5 comments, v1 (obsolete)

If we're going to replace our comment parsing, we might as well implement HTML5. This patch is a straightforward implementation of the part of the state machine that consumes comments. I haven't tested it very thoroughly (in particular, I need to ensure that the behavior is consistent across packet boundaries) but as far as I can tell, it follows the spec word for word.

Damon Sicore (:damons)

Comment 59

•

17 years ago

Blake, what test framework do we use to test something like this?

Comment 60

•

17 years ago

We have parser mochitests.

Comment 61

•

17 years ago

So, the parser mochitests work, but require a bunch of manual verification. In particular, my patch doesn't affect where we put the comments in the DOM and all of the interesting test cases in our mochitests hit this problem.

Comment 62

•

17 years ago

Doesn't your patch affect where the comment terminates and therefore what Element nodes end up in the document?

Comment 63

•

17 years ago

Sorry, yes. I meant that given the testcase ||, our resulting DOM looks like: HTML HEAD  BODY where the tests want  HTML HEAD BODY and fixing that seems beyond the scope of this bug (unless people say otherwise).

Comment 64

•

17 years ago

Sure. I was assuming we'd add the tests from this bug and/or duplicates to parser/htmlparser/tests/mochitest/regressions.txt or some such. At least that's where I've been adding the parser tests... ;)

Comment 65

•

17 years ago

Attached patch: Implement HTML5 comments, v2 (obsolete)

I made the state machine a little less jumpy and started adding tests. Unfortunately, the tests I've added here all fail. I don't understand how we're serializing these comments.

Attachment #326714 - Attachment is obsolete: true

Comment 66

•

17 years ago

Attached patch: Implement HTML5 comments, v2.5 (obsolete)

This adds a bunch of tests, and I'm pretty sure that I've implemented the spec faithfully. This is ready for review.

Attachment #327975 - Attachment is obsolete: true

Attachment #328128 - Flags: review?(jonas)

Comment 67

•

17 years ago

One thing I've noticed is in the testcase: |<title>foo  I'll fix that.

Jonas Sicking (:sicking) No longer reading bugmail consistently

Comment 68

•

17 years ago

Attached patch: Implement HTML5 comments, v2.7

Here's the interesting part of the interdiff:

diff --git a/parser/htmlparser/src/nsHTMLTokens.cpp b/parser/htmlparser/src/nsHTMLTokens.cpp
--- a/parser/htmlparser/src/nsHTMLTokens.cpp
+++ b/parser/htmlparser/src/nsHTMLTokens.cpp
@@ -900,6 +900,10 @@ CTextToken::ConsumeParsedCharacterData(P
         consumer.AppendSourceTo(theContent.writable());
         mNewlineCount += consumer.GetNewlineCount();
+
+        // If we successfully consumed a comment, end the title after the
+        // comment.
+        aScanner.CurrentPosition(altEndPos);
         continue;
       }
     }

Attachment #328128 - Attachment is obsolete: true

Attachment #329653 - Flags: review?(jonas)

Attachment #328128 - Flags: review?(jonas)

Comment 72

•

16 years ago

Blake, what is the status of this patch? Is it still good to review?

Jonas Sicking (:sicking) No longer reading bugmail consistently

Comment 73

•

16 years ago

(In reply to comment #72)
> Blake, what is the status of this patch? Is it still good to review?

Yeah, the only question that needs answering before review is whether we want to do this at all.

Comment 74

•

16 years ago

Assuming that this makes us follow the HTML5 algorithm I think we should try it. The only reason not to do it would be if we think the HTML5 parser is going to land pretty soon anyway...

pp

Comment 78

•

16 years ago

I was the one opening bug 484036. This bug unfortunately didn't show up when I searched, I'm sorry for that. Just to add to this discussion, putting the url of an IDN-domain between those comment tags triggers this error too. I run several forums based off vBulletin and this application puts the forum's url wrapped in a comment in the footer. One of my forums is using an IDN-domain and vBulletin puts that domain name in punycode format in the footer (www.xn--something-xyz.com) and breaks the page. Not allowing certain valid domains within a comment seems a little far fetched so I hope this problem will get resolved. Other browsers render this correctly.

Comment 79

•

16 years ago

> Not allowing certain valid domains within a comment seems a little far fetched

Er, say what? The two languages are unrelated! You wouldn't complain if a url that contains "*/" ended a CSS comment, would you? This bug should be fixed for compat, but said compat is just a workaround for people sticking things with the comment end delimiter in them inside comments...

pp

Comment 80

•

16 years ago

Not exactly sure what you mean. I was referring to comment #3 where someone claims it's forbidden to have -- enclosed in comment tags. The people inventing punycode and making IDN domains a standard either didn't know this or completely ignored it. In any case it's the users of Firefox that have to pay the price which, after reading this thread, seems to have been forgotten. Standards seem to be more important for Mozilla than user experience. If I have misunderstood this I sincerely apologize.

Tony Mechelynck [:tonymec]

Comment 81

•

16 years ago

> or completely ignored it

The latter. Or assumed people would properly escape their stuff inside comments, of course.

Comment 82

•

16 years ago

In reply to comment #80: If you want to put */ inside a C comment without ending the comment, you have to alter it somehow. Add a space between the star and slash, maybe. Similarly, you can't put a punycode URL, containing two dashes in the middle of text, inside an HTML4 comment -- you have to alter it somehow. If you want the punycode URL to be human-readable, you can replace two dashes by four (but the opposite conversion will have to be done when copying it to the URL bar); or if you want it to be machine-readable, I think you can replace the two dashes by %2D%2D (correct me, someone, if I'm wrong in thinking that such an "escaped" URL, when copied to the URL bar, will be correctly interpreted, first by having each %2D replaced by a dash, then by punycode interpretation). _Four_ dashes in sequence are allowed within an HTML comment, even in HTML4; and %2D%2D means -- in URL syntax but not in HTML comment syntax. IIUC, HTML5 comment syntax differs from HTML4 comment syntax; but I don't know the details. The above paragraph is about HTML4 but this bug is (IIUC, now that it has been reopened, assigned, and given the "html5" keyword) about altering Gecko to handle HTML5 comments correctly.
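The %2D%2D idea can be tried out as a plain string round trip. The Python below is only a sketch: `hide_double_hyphens` is a made-up helper name, and the example shows nothing more than that standard percent-decoding recovers the original text. Whether a browser's URL bar then applies punycode interpretation correctly is the open question raised above and is not tested here.

```python
from urllib.parse import unquote


def hide_double_hyphens(url: str) -> str:
    """Percent-encode every "--" pair so the text is safe in an HTML4 comment."""
    return url.replace("--", "%2D%2D")


original = "www.xn--something-xyz.com"
escaped = hide_double_hyphens(original)

print(escaped)                       # no "--" left in the escaped form
print(unquote(escaped) == original)  # percent-decoding restores the hyphens
```

Single hyphens, such as the one before "xyz", are left alone because only adjacent pairs are a problem for SGML-style comment parsing.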

Adam Nielsen

Comment 83

•

16 years ago

Your example is not quite correct though. */ is the terminating marker for a C comment - nobody is wanting to put --> inside a HTML comment. The problem is putting just part of the comment terminator inside a comment terminates it. It would be like being unable to use a single * inside a C comment. The same argument applies to comment #79. I wouldn't complain if a URL containing */ ended a CSS comment, but I could complain if a URL containing * ended it. Likewise I'm not complaining that "-->" ends a HTML comment, but rather that "--" does. I'm not aware of any other language that uses one character sequence to start a comment, but two or more different sequences to terminate that same comment.

Mike Shaver (:shaver emeritus)

Comment 84

•

16 years ago

You can choose whether you're wrong according to HTML4.01 and SGML: http://htmlhelp.com/reference/wilbur/misc/comment.html or according to HTML5: http://dev.w3.org/html5/spec/Overview.html#comments but unfortunately that's just the way HTML comments work. You can't put -- inside them.

Comment 85

•

16 years ago

Adam, in SGML and HTML the comment terminator sequence is "--", and it can only be used inside an SGML markup declaration. SGML markup declarations start with "<!" and end with ">". Here's an example from the HTML4 DTD at http://www.w3.org/TR/REC-html40/sgml/dtd.html :

<!ATTLIST Q
  %attrs;  -- %coreattrs, %i18n, %events --
  cite  %URI;  #IMPLIED  -- URI for source document or msg --
  >

Here the comments are "%coreattrs, %i18n, %events" and "URI for source document or msg" but the rest of the text is not comment and in fact is the declaration for the allowed attributes on the Q element. I realize this is a bit more complex than the way comments work in C, but that's life. Note that I already said all this in this bug in comment 4, almost 6 years ago...

In any case, this bug is now about ignoring HTML4 and implementing the HTML5 definition of comments, which does indeed start with "<!--" and end with "-->", so I'm really not sure what all the discussion is about at this point.

Murray Crowe

Comment 86

•

16 years ago

I've noticed that -- followed by any sequence of characters and a > within a conditional comment really messes things up. Surely a conditional comment needs to be an exception to this rule, as it may contain a script with valid code. I realise conditional comments aren't best practice coding, but they do exist and they do get parsed by browsers. Here's an example:

Marius Hudea

Comment 87

•

16 years ago

This is starting to get ridiculous... #86 ... perhaps you can use eval and unescape to avoid having -- and > characters in IE only code?

Comment 88

•

16 years ago

Conditional comments are "parsed by browsers" as just comments, with no special treatment, except for IE. Seriously, if you want to shoot yourself in the foot with HTML you can. And I still don't understand what the discussion is about, since the plan is to change behavior here... Can people just shut up and let the bug be until it's fixed?

Lars Gunther

Comment 90

•

16 years ago

#86 Use Conditional compilation instead. Problem solved!

pbyhistorian

Comment 92

•

16 years ago

I submitted duplicate bug 500110 for this, after three failed searches. Boris' comment "... I already said all this ... almost six years ago" (#85) is distressing. Six years! By the time HTML5 is official *and* the browser manufacturers adopt it *and* older browsers like Firefox 3.0.11 disappear from the Internet, I may have retired. I'm in the camp that uses dashes (and others) in comments to visually break my code into sections. A solution that looks good and seems to work well is to replace the non-allowed dashes with character 196 (Alt-196 in Windows).

Mario Rossi

Comment 94

•

16 years ago

ã

Jim Michaels

Comment 96

•

15 years ago

but what I am curious about is, are these valid HTML? Am I to understand that a space is required?  (one - in the middle. in firefox, this is not a usable comment because it has an odd number of -'s.)  (one + in the middle, should be same problem as above, but lexically analyzed differently. I am not sure, but I think in firefox this may not be a usable comment.)  (->-> in the middle. in firefox this is likely not a usable comment because it has 3 -'s.)  (two - in the middle. in firefox, this is a usable comment because it has an even number of -'s.) What I have noticed in the past about the firefox lexer is that it just simply pairs off -'s, which may be the wrong way to lexically analyze them. I was thinking of the following algorithm: char0=0 char1=0 char2=0 match " was coming down the pipe, and I could keep it always 3 or 4 characters full of characters that I newly got. got it? what you would need to implement this in C++ is something like the STL to do this the easiest way. there is already a deque class. unfortunately, I have already stuffed my STL book in a box for moving. an iterator should provide the necessary means of iterating across the data elements of the deque. and they are easy to make. Jim Michaels

Adam Nielsen

Comment 97

•

15 years ago

Well the way I look at it is the tag name is "!--" in the same way as the tag name might be "img". Obviously <imgblah> is not valid (you need a space after the tag name), so I would expect  to be something completely different to a comment.  would be a comment with an uneven number of dashes.

Jonas Sicking (:sicking) No longer reading bugmail consistently

Comment 98

•

15 years ago

Jim, please read comment 85 (and then comment 4). Those explain exactly how SGML and HTML4 work here (and yes, they require simply pairing off "--"). Note that a comment containing "--" is invalid HTML4 and it says so right in the HTML4 specification. HTML5 changes the specified behavior here, which is why this bug is still (or rather again) open.

Updated

•

15 years ago

Attachment #329653 - Flags: review?(jonas)

Jonas Sicking (:sicking) No longer reading bugmail consistently

Comment 101

•

15 years ago

Comment on attachment 329653: Implement HTML5 comments, v2.7

Hopefully all of this code is going away soon, so there is no need to muck around with it at this point. If that plan changes, please re-request review.

Jesse Ruderman

Updated

•

15 years ago

Depends on: html5-parsing

Joseph

Comment 105

•

15 years ago

Not sure about implementation in swallowing, but I think you use the big state machines... while in comments you could break this into three states:

IN_COMMENTS:
    if ( nextchar == "-" ) {
        state = IN_COMMENTS_ONEDASH;
        break;
    }
    // else swallow the rest;
IN_COMMENTS_ONEDASH:
    if ( nextchar == "-" ) {
        state = IN_COMMENTS_TWODASH;
        break;
    } else {
        state = IN_COMMENTS;
        break;
    }
IN_COMMENTS_TWODASH:
    if ( nextchar == ">" ) {
        state = DATA; // or return to whatever state we were in
        break;
    } else if ( nextchar == "-" ) {
        state = IN_COMMENTS_TWODASH; // cause now we have two dashes again
    } else {
        state = IN_COMMENTS;
        // maybe some error here because of dashes and no end to comments
    }

That should work for the state engines and you can keep track of the errors.
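For anyone who wants to try the three-state idea, here is a runnable Python rendering of it. It is only a sketch: `find_comment_end` is a hypothetical helper name, the input is assumed to start just after "<!--", and error bookkeeping is left out.

```python
def find_comment_end(text: str, start: int = 0) -> int:
    """Return the index just past the "-->" that closes the comment, or -1."""
    IN_COMMENTS, ONE_DASH, TWO_DASH = range(3)
    state = IN_COMMENTS
    for i in range(start, len(text)):
        ch = text[i]
        if state == IN_COMMENTS:
            if ch == "-":
                state = ONE_DASH
        elif state == ONE_DASH:
            state = TWO_DASH if ch == "-" else IN_COMMENTS
        else:  # TWO_DASH: the previous two characters were "--"
            if ch == ">":
                return i + 1          # found "-->"
            if ch != "-":
                state = IN_COMMENTS   # "--x": the dashes were just text
            # another "-" keeps us in TWO_DASH (still two trailing dashes)
    return -1                         # comment never closed


body = " www.xn--something-xyz.com -->after"
end = find_comment_end(body)
print(body[end:])
```

With these states a "--" in the middle of the text no longer ends the comment; only "--" immediately followed by ">" does.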

Joseph

Comment 106

•

15 years ago

(In reply to comment #105) http://www.w3.org/TR/2010/WD-html5-20100304/syntax.html#comments (just counting " or ". If white-space (or "!") is encountered such as in COMMENT_END it should do what it does in "default" and not go to COMMENT_END_SPACE (or BANG) like it does. Other than that and other tokenizers, it appears it should work.

Henri Sivonen (:hsivonen)

Updated

•

15 years ago

Assignee: mrbkap → nobody

Updated

•

15 years ago

Status: ASSIGNED → RESOLVED

Closed: 22 years ago → 15 years ago

Resolution: --- → FIXED

Whiteboard: [fixed by the HTML5 parser]

James

Comment 108

•

15 years ago

I thought <!-- and --> delimited comments. At least that is how it works with IE. (Yes, I understand that IE doesn't follow all standards.)