User-Agent: Mozilla/5.0 (X11; U; SunOS sun4u; en-US; rv:1.5a) Gecko/20030610
Build Identifier: Mozilla/5.0 (X11; U; SunOS sun4u; en-US; rv:1.5a) Gecko/20030610
When text is copied to the clipboard and pasted into Mozilla Composer, the Word-specific <![if ...]> ... <![endif]> markers are rewritten into <!--[if ...]--> ... <!--[endif]-->. This is harmless in Mozilla, but Internet Explorer displays them as plain text, as IE does not seem to recognize "<!--[" as the start of a comment. This also happens when using Composer to edit a .htm file produced by Word using Save as Web Page. This makes it hard to use Mozilla to work with Word documents, which makes our users choose against Mozilla.
Reproducible: Always
Steps to Reproduce:
1. Open Composer and click the Source pane.
2. Insert "<![if]>" (without the quotes) in the body section.
3. Switch to "Normal" and then back to "Source".
4. The string above is rewritten into "<!--[if]-->".
Actual Results: The <!--[if]--> is shown as text in Internet Explorer.
Expected Results: It should have left the "<![if]>" (and the like) as-is.
This problem is seen in 1.4b and 1.4 (May 29) on Windows XP and in the current nightly build on Solaris 8.
It is also confirmed to happen in 1.0.2-2 from Debian 3.0r1 on i386. Furthermore, it is independent of the settings "Retain original source formatting" and "Reformat HTML source", which is to be expected as it occurs while editing (well, when switching display mode).
Sounds like a parser issue. Shouldn't <![if ...] be treated as an unknown tag, not a comment? --> Reassigning to parser for further investigation.
Why would it be treated as an unknown tag? The entire premise of this bogosity is that non-IE browsers will treat them as comments, as we do. The real question is "Should Composer be able to preserve and emit an invalid MS markup 'innovation'?", which I tend to feel it shouldn't, although this is of course up to the module owners.
As far as I understand the sgml documents, "<!" is not a comment in itself; it is a "comment declaration" marking a region in the file where comments can occur, and the actual comment starts at the "--" sequence. Hence it is perfectly legal sgml to use the "<! ... >" comment declaration for something else. If Composer keeps "fixing" the "<! ... >" sequences into "<!-- ... -->", then users of M$ Word will not be able to copy-n-paste parts of Word text into Composer and successfully view the result in M$ Internet Explorer. Try for yourselves. IMHO this will stop any attempt to persuade Office users to use Mozilla Composer instead of other web editors, and you could just as well stop developing the Composer part of Mozilla. I would actually vote to remove it. If pages made with Composer cannot reliably be viewed in MSIE, which 95% of the world's surfers use, then there is no need for Mozilla Composer. Just my ¢.02
Had a long IM conversation with Harish about this. We've worked out a solution which he will implement. Thanks, Harish! The gist of the solution is that we will continue to treat markup like <![ ... ]> as comments but add delimiter information to comment nodes so that they can be serialized correctly.
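To make the idea concrete, here is a minimal Python sketch of a comment node that remembers its delimiters. It is not the actual Mozilla implementation; `CommentNode`, `parse_markup_declaration`, and `serialize` are hypothetical names invented for illustration only.

```python
from dataclasses import dataclass


@dataclass
class CommentNode:
    """Hypothetical comment node that remembers how it was delimited."""
    data: str
    open_delim: str = "<!--"
    close_delim: str = "-->"


def parse_markup_declaration(source: str) -> CommentNode:
    """Treat any '<!...>' run as a comment, but record its delimiters."""
    # A regular comment needs at least '<!--' + '-->' (7 characters).
    if len(source) >= 7 and source.startswith("<!--") and source.endswith("-->"):
        return CommentNode(source[4:-3])
    # Anything else starting with '<!' keeps its short delimiters.
    if source.startswith("<!") and source.endswith(">"):
        return CommentNode(source[2:-1], open_delim="<!", close_delim=">")
    raise ValueError("not a markup declaration: %r" % source)


def serialize(node: CommentNode) -> str:
    """Write the node back out with the delimiters it was parsed with."""
    return node.open_delim + node.data + node.close_delim
```

With this shape, `<![if !supportLists]>` round-trips unchanged, while the comment text exposed to other consumers of the node stays free of delimiters.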
Not to start a flame war or anything, but Brian, you said:
> Hence it is perfectly legal sgml to use the "<! ... >" comment declaration
> for something else.
SGML is very specific about what's allowed between "<!" and ">", and what MS Word inserts there is by no means valid SGML. The only thing that's allowed between "<!" and ">" is an optional list of optionally whitespace-separated sequences of "--...--", and nothing else. Therefore, it's not unreasonable for Mozilla to clean up those invalid comments when serializing. Having said that, I'm not arguing that we shouldn't fix this, but we shouldn't pretend that what MS Word is doing is by any means standards compliant, so our fix should apply to quirks mode documents only (as I just discussed with Harish).
Johnny, may I start with the same phrase as you did, "Not to start a flame war or anything"? Your point may be valid for plain HTML within the <body> section as proposed by W3C. But let me remind you that all HTML files already have another markup using the <! ... > sequence, namely <!DOCTYPE ... >. Furthermore, taking it to the next step, XHTML, which is the first convergence towards XML (which is a subset of the larger SGML), allows for the use of <!ENTITY ... > and more, still outside the <body> section. I know that Composer does not allow you to edit the <!DOCTYPE ... > line, which makes my point invalid for this bug as such. I think Composer should be liberal in what it accepts and strict in what it produces by itself. To "correct" what Windows produces, resulting in pages that are unusable in Internet Explorer, would be like sawing off the branch you are sitting on.
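The rule stated in the comment above can be sketched as a small Python check. This is only an approximation of the SGML comment declaration grammar as described in this thread, and `is_sgml_comment_declaration` is a hypothetical helper name.

```python
import re

# '<!' followed by zero or more '--...--' comments, each optionally
# followed by whitespace, then '>'. A comment body may not contain '--'.
_COMMENT_DECL = re.compile(r"<!(?:--(?:(?!--).)*--\s*)*>", re.DOTALL)


def is_sgml_comment_declaration(text: str) -> bool:
    """Return True if text is a well-formed comment declaration."""
    return _COMMENT_DECL.fullmatch(text) is not None
```

Under this check, `<!-- hi -->` and `<!-- a -- -- b -->` are accepted, while the Word markers `<![if]>` and `<![endif]>` are rejected because `[` is neither `--` nor `>`.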
Correct, on all counts. I should have clarified in my comment that what's allowed between "<!" and ">" within the root element is only SGML comments, and what MS Word is doing is thus invalid *ML markup.
Created attachment 126778
patch v1.0 [ not final yet; needs testing etc. )
Do not always add <!-- and --> to a comment on serialization in quirks mode.
ccing parser purity task force.
Comment on attachment 126778
patch v1.0 [ not final yet; needs testing etc. )
>Index: content/base/src/nsXMLContentSerializer.cpp
>@@ -56,6 +56,34 @@
>+InStandardsMode(nsIContent* aContent,
>+ // Should content with no document default to quirks mode?
>+ nsCompatibility mode;
>+ rv = htmldoc->GetCompatibilityMode(mode);
>+ *_retval = (mode != eCompatibility_NavQuirks);
there are now two quirks modes:
eCompatibility_FullStandards = 1,
eCompatibility_AlmostStandards = 2,
eCompatibility_NavQuirks = 3
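The effect of the quoted check can be illustrated with a small Python sketch. The enum mirrors the three values listed above; `in_standards_mode` and `is_full_standards` are hypothetical stand-ins, not the patch's actual code.

```python
from enum import IntEnum


class Compatibility(IntEnum):
    """Mirror of the three compatibility values quoted above."""
    FULL_STANDARDS = 1
    ALMOST_STANDARDS = 2
    NAV_QUIRKS = 3


def in_standards_mode(mode: Compatibility) -> bool:
    """Same shape as the patch's test: everything except NavQuirks."""
    return mode != Compatibility.NAV_QUIRKS


def is_full_standards(mode: Compatibility) -> bool:
    """Stricter alternative that excludes almost-standards mode too."""
    return mode == Compatibility.FULL_STANDARDS
```

The two predicates differ only for `ALMOST_STANDARDS`, so which one the serializer should use is exactly the question this review comment raises.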
Taking. As a note: patch v1.0 isn't quite ready (it "breaks" the DOMI's display of comments, showing "-- hi --" instead of simply " hi ").
Hello, I have the same kind of problem with Mozilla Thunderbird. If I copy-paste text from Word that contains a list, I get something like this in the source code:
<!--[if !supportEmptyParas]--> <!--[endif]--> Points positifs :
<!--[if !supportLists]-->- l’outil aulation<!--[endif]-->
<!--[if !supportLists]-->- la mise en œuv)<!--[endif]-->
<!--[if !supportLists]-->- les essaparaison <!--[endif]-->
<!--[if !supportEmptyParas]--> <!--[endif]-->
This is displayed well in Thunderbird, but when I send it, Outlook or a webmail client doesn't recognize this and displays it, which makes the mail unreadable...
This should be WONTFIX per HTML5 as far as Web-facing parser features go. I suggest providing an editor feature for discarding comments that look like IE conditional comments.
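A rough Python sketch of such an editor feature follows. It assumes the markers have already been rewritten into the `<!--[if ...]-->` and `<!--[endif]-->` forms reported in this bug; `strip_conditional_markers` is a hypothetical name, and real editor code would work on the DOM rather than on strings.

```python
import re

# Matches only comments whose entire body is '[if ...]' or '[endif]'.
_CONDITIONAL_MARKER = re.compile(r"<!--\[(?:if\b[^\]]*|endif)\]-->")


def strip_conditional_markers(html: str) -> str:
    """Drop comments that look like IE conditional markers, keep the rest."""
    return _CONDITIONAL_MARKER.sub("", html)
```

Only the marker comments are removed; the content between them and ordinary comments such as `<!-- note -->` are left alone. Downlevel-hidden blocks of the form `<!--[if ...]> ... <![endif]-->` are deliberately not matched, since stripping their delimiters would expose content that was meant to stay hidden.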