Closed
Bug 184001
Opened 22 years ago
Closed 22 years ago
saving to a file produces invalid xhtml (inserts DTD in a weird spot)
Categories
(Core :: XML, defect, P1)
Core
XML
Tracking
()
RESOLVED
FIXED
mozilla1.3beta
People
(Reporter: brant, Assigned: hjtoi-bugzilla)
References
()
Details
(Keywords: testcase, Whiteboard: document.internalSubset wrong)
Attachments
(2 files)
10.92 KB,
application/zip
|
Details | |
5.72 KB,
patch
|
harishd
:
review+
jst
:
superreview+
asa
:
approval1.3b+
|
Details | Diff | Splinter Review |
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.3a) Gecko/20021205 Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.3a) Gecko/20021205 If I save to a file, Mozilla's xhtml11.dtd is appended to the beginning, but the comment strings are missing. This makes the page invalid. Reproducible: Always Steps to Reproduce: 1. Open about:mozilla 2. Click File > Save. 3. Save it wherever you want. Actual Results: The following is appended: <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd" * The contents of this file are subject to the Mozilla Public * License Version 1.1 (the "License"); you may not use this file * except in compliance with the License. You may obtain a copy of * the License at http://www.mozilla.org/MPL/ * * Software distributed under the License is distributed on an "AS * IS" basis, WITHOUT WARRANTY OF ANY KIND, either express or * implied. See the License for the specific language governing * rights and limitations under the License. * * The Original Code is mozilla.org code. * * The Initial Developer of the Original Code is Netscape * Communications Corporation. Portions created by Netscape are * Copyright (C) 2000 Netscape Communications Corporation. All * Rights Reserved. * * Contributor(s): * Predefined HTML entities to be loaded when parsing XHTML documents. * The contents match mozilla/htmlparser/src/nsHTMLEntityList.h, * except that Navigator entity extensions are not included. ISO 8859-1 entities <!ENTITY nbsp " "> <!ENTITY iexcl "¡"> <!ENTITY cent "¢"> <!ENTITY pound "£"> <!ENTITY curren "¤"> <!ENTITY yen "¥"> <!ENTITY brvbar "¦"> <!ENTITY sect "§"> <!ENTITY uml "¨"> <!ENTITY copy "©"> <!ENTITY ordf "ª"> <!ENTITY laquo "«"> <!ENTITY not "¬"> <!ENTITY shy "­"> <!ENTITY reg "®"> <!ENTITY macr "¯"> <!ENTITY deg "°"> <!ENTITY plusmn "±"> <!ENTITY sup2 "²"> <!ENTITY sup3 "³"> <!ENTITY acute "´"> <!ENTITY micro "µ"> <!ENTITY para "¶"> <!ENTITY middot "·"> <!ENTITY cedil "¸"> <!ENTITY sup1 "¹"> <!ENTITY ordm "º"> <!ENTITY raquo "»"> <!ENTITY frac14 "¼"> <!ENTITY frac12 "½"> <!ENTITY frac34 "¾"> <!ENTITY iquest "¿"> <!ENTITY Agrave "À"> <!ENTITY Aacute "Á"> <!ENTITY Acirc "Â"> <!ENTITY Atilde "Ã"> <!ENTITY Auml "Ä"> <!ENTITY Aring "Å"> <!ENTITY AElig "Æ"> <!ENTITY Ccedil "Ç"> <!ENTITY Egrave "È"> <!ENTITY Eacute "É"> <!ENTITY Ecirc "Ê"> <!ENTITY Euml "Ë"> <!ENTITY Igrave "Ì"> <!ENTITY Iacute "Í"> <!ENTITY Icirc "Î"> <!ENTITY Iuml "Ï"> <!ENTITY ETH "Ð"> <!ENTITY Ntilde "Ñ"> <!ENTITY Ograve "Ò"> <!ENTITY Oacute "Ó"> <!ENTITY Ocirc "Ô"> <!ENTITY Otilde "Õ"> <!ENTITY Ouml "Ö"> <!ENTITY times "×"> <!ENTITY Oslash "Ø"> <!ENTITY Ugrave "Ù"> <!ENTITY Uacute "Ú"> <!ENTITY Ucirc "Û"> <!ENTITY Uuml "Ü"> <!ENTITY Yacute "Ý"> <!ENTITY THORN "Þ"> <!ENTITY szlig "ß"> <!ENTITY agrave "à"> <!ENTITY aacute "á"> <!ENTITY acirc "â"> <!ENTITY atilde "ã"> <!ENTITY auml "ä"> <!ENTITY aring "å"> <!ENTITY aelig "æ"> <!ENTITY ccedil "ç"> <!ENTITY egrave "è"> <!ENTITY eacute "é"> <!ENTITY ecirc "ê"> <!ENTITY euml "ë"> <!ENTITY igrave "ì"> <!ENTITY iacute "í"> <!ENTITY icirc "î"> <!ENTITY iuml "ï"> <!ENTITY eth "ð"> <!ENTITY ntilde "ñ"> <!ENTITY ograve "ò"> <!ENTITY oacute "ó"> <!ENTITY ocirc "ô"> <!ENTITY otilde "õ"> <!ENTITY ouml "ö"> <!ENTITY divide "÷"> <!ENTITY oslash "ø"> <!ENTITY ugrave "ù"> <!ENTITY uacute "ú"> <!ENTITY ucirc "û"> <!ENTITY uuml "ü"> <!ENTITY yacute "ý"> <!ENTITY thorn "þ"> <!ENTITY yuml "ÿ"> Mathematical symbols and Greek letters <!ENTITY fnof "ƒ"> <!ENTITY Alpha "Α"> <!ENTITY Beta "Β"> <!ENTITY Gamma "Γ"> <!ENTITY Delta "Δ"> <!ENTITY Epsilon "Ε"> <!ENTITY Zeta "Ζ"> <!ENTITY Eta "Η"> <!ENTITY Theta "Θ"> <!ENTITY Iota "Ι"> <!ENTITY Kappa "Κ"> <!ENTITY Lambda "Λ"> <!ENTITY Mu "Μ"> <!ENTITY Nu "Ν"> <!ENTITY Xi "Ξ"> <!ENTITY Omicron "Ο"> <!ENTITY Pi "Π"> <!ENTITY Rho "Ρ"> <!ENTITY Sigma "Σ"> <!ENTITY Tau "Τ"> <!ENTITY Upsilon "Υ"> <!ENTITY Phi "Φ"> <!ENTITY Chi "Χ"> <!ENTITY Psi "Ψ"> <!ENTITY Omega "Ω"> <!ENTITY alpha "α"> <!ENTITY beta "β"> <!ENTITY gamma "γ"> <!ENTITY delta "δ"> <!ENTITY epsilon "ε"> <!ENTITY zeta "ζ"> <!ENTITY eta "η"> <!ENTITY theta "θ"> <!ENTITY iota "ι"> <!ENTITY kappa "κ"> <!ENTITY lambda "λ"> <!ENTITY mu "μ"> <!ENTITY nu "ν"> <!ENTITY xi "ξ"> <!ENTITY omicron "ο"> <!ENTITY pi "π"> <!ENTITY rho "ρ"> <!ENTITY sigmaf "ς"> <!ENTITY sigma "σ"> <!ENTITY tau "τ"> <!ENTITY upsilon "υ"> <!ENTITY phi "φ"> <!ENTITY chi "χ"> <!ENTITY psi "ψ"> <!ENTITY omega "ω"> <!ENTITY thetasym "ϑ"> <!ENTITY upsih "ϒ"> <!ENTITY piv "ϖ"> <!ENTITY bull "•"> <!ENTITY hellip "…"> <!ENTITY prime "′"> <!ENTITY Prime "″"> <!ENTITY oline "‾"> <!ENTITY frasl "⁄"> <!ENTITY weierp "℘"> <!ENTITY image "ℑ"> <!ENTITY real "ℜ"> <!ENTITY trade "™"> <!ENTITY alefsym "ℵ"> <!ENTITY larr "←"> <!ENTITY uarr "↑"> <!ENTITY rarr "→"> <!ENTITY darr "↓"> <!ENTITY harr "↔"> <!ENTITY crarr "↵"> <!ENTITY lArr "⇐"> <!ENTITY uArr "⇑"> <!ENTITY rArr "⇒"> <!ENTITY dArr "⇓"> <!ENTITY hArr "⇔"> <!ENTITY forall "∀"> <!ENTITY part "∂"> <!ENTITY exist "∃"> <!ENTITY empty "∅"> <!ENTITY nabla "∇"> <!ENTITY isin "∈"> <!ENTITY notin "∉"> <!ENTITY ni "∋"> <!ENTITY prod "∏"> <!ENTITY sum "∑"> <!ENTITY minus "−"> <!ENTITY lowast "∗"> <!ENTITY radic "√"> <!ENTITY prop "∝"> <!ENTITY infin "∞"> <!ENTITY ang "∠"> <!ENTITY and "∧"> <!ENTITY or "∨"> <!ENTITY cap "∩"> <!ENTITY cup "∪"> <!ENTITY int "∫"> <!ENTITY there4 "∴"> <!ENTITY sim "∼"> <!ENTITY cong "≅"> <!ENTITY asymp "≈"> <!ENTITY ne "≠"> <!ENTITY equiv "≡"> <!ENTITY le "≤"> <!ENTITY ge "≥"> <!ENTITY sub "⊂"> <!ENTITY sup "⊃"> <!ENTITY nsub "⊄"> <!ENTITY sube "⊆"> <!ENTITY supe "⊇"> <!ENTITY oplus "⊕"> <!ENTITY otimes "⊗"> <!ENTITY perp "⊥"> <!ENTITY sdot "⋅"> <!ENTITY lceil "⌈"> <!ENTITY rceil "⌉"> <!ENTITY lfloor "⌊"> <!ENTITY rfloor "⌋"> <!ENTITY lang "〈"> <!ENTITY rang "〉"> <!ENTITY loz "◊"> <!ENTITY spades "♠"> <!ENTITY clubs "♣"> <!ENTITY hearts "♥"> <!ENTITY diams "♦"> Markup-significant and internationalization characters <!ENTITY quot """> <!ENTITY amp "&"> <!ENTITY lt "<"> <!ENTITY gt ">"> <!ENTITY OElig "Œ"> <!ENTITY oelig "œ"> <!ENTITY Scaron "Š"> <!ENTITY scaron "š"> <!ENTITY Yuml "Ÿ"> <!ENTITY circ "ˆ"> <!ENTITY tilde "˜"> <!ENTITY ensp " "> <!ENTITY emsp " "> <!ENTITY thinsp " "> <!ENTITY zwnj "‌"> <!ENTITY zwj "‍"> <!ENTITY lrm "‎"> <!ENTITY rlm "‏"> <!ENTITY ndash "–"> <!ENTITY mdash "—"> <!ENTITY lsquo "‘"> <!ENTITY rsquo "’"> <!ENTITY sbquo "‚"> <!ENTITY ldquo "“"> <!ENTITY rdquo "”"> <!ENTITY bdquo "„"> <!ENTITY dagger "†"> <!ENTITY Dagger "‡"> <!ENTITY permil "‰"> <!ENTITY lsaquo "‹"> <!ENTITY rsaquo "›"> <!ENTITY euro "€">> Expected Results: It should append the file correctly. It looks like Mozilla is trying to include the DTD in place of the SYSTEM reference. However, it is being inserted in the wrong place. Also, since there is already a reference to the SYSTEM DTD, isn't the inclusion of the Mozilla-specific variant somewhat superfluous. Someone tells me that this may already be filed, but I could not find it. Thinking this might be a global problem with application/xhtml+xml XHTML 1.1 files, I tried saving a file of that type and the problem did not occur. Perhaps it is limited to about: URLs.
Comment 1•22 years ago
|
||
to XML. Happens on Linux too.
Assignee: dougt → heikki
Component: Networking → XML
OS: Windows XP → All
QA Contact: benc → rakeshmishra
Hardware: PC → All
Summary: saving to a file produces invalid xhtml → saving to a file produces invalid xhtml (inserts DTD in a weird spot)
Assignee | ||
Comment 2•22 years ago
|
||
Uh, that looks really weird & bad. I'll try to get to this ASAP.
Status: NEW → ASSIGNED
Priority: -- → P1
Target Milestone: --- → mozilla1.3beta
Reporter | ||
Comment 3•22 years ago
|
||
Additionally, you must save it as Web Page - Complete. The HTML Only option doesn't have a problem. Additionally this happens with any XHTML file. Use <http://www.cherokeescouting.org/OtherUnits/Troop545IIN/brant.xhtml> for an example. I feel that this is severe enought to be a blocker. If a user can't save a file completely and be able to open it, they will be dissatisfied. Also, Web developers will have yet another reason to not use XHTML 1.1. 2003011508
Flags: blocking1.3b?
Keywords: clean-report
This is pretty badly broken. Heikki, can you get onto this for 1.3beta? That's Tuesday or Wednesday...
Flags: blocking1.3b? → blocking1.3b+
Reporter | ||
Comment 5•22 years ago
|
||
I will soon be uploading some tests to try some diagnoses: 184001-1.xhtml: XHTML Extension 184001-1-saved.xhtml: Broken 184001-2.xml: XML Extension 184001-2-saved.xml: Broken Conclusion: Extension and therefore MIME type don't make a difference. 184001-3.xhtml: No SYSTEM Doctype location provided 184001-3.xhtml: Broken Conclusion: None. Mozilla apparently needs the SYSTEM Doctype (Or it might be required by spec, not sure). 184001-4.xhtml: No XML Prolog 184001-4-saved.xhtml: Broken Conclusion: XML Prolog is not necessary for reproduction. 184001-4-saved-nodownloadmanager-noprogress.xhtml: Broken Conclusion: Download method makes no difference. Hopefully these tests are helpful. Build used for testing was 2003012508 WinXP SP1. My guess as to what is happening: Mozilla is not a validating XML parser so it does not read external DTDs. Because XHTML defines entities, Mozilla has its own DTD containing those entities. Because other browsers may not be validating parsers either, Mozilla tries to include its DTD when the page is saved. However, it does not put the square brackets around the whole mess as the XML spec requires for internal subsets. Other ideas I haven't tried due to time: standalone="yes" in prolog since this should tell the parser that it doesn't need external resources to render correctly and shouldn't need to include the DTD. Other locations for external DTD. Send page by e-mail.
URL: about:mozilla
Keywords: testcase
Reporter | ||
Comment 6•22 years ago
|
||
Assignee | ||
Comment 7•22 years ago
|
||
The real problem is that document.internalSubset somehow gets the external DTD(s) appended to it. That it happens to do it in a way that makes it invalid is just an added insult. But since it is in the DOM the serializer will naturally serialize it - there is no bug in the serializer. You can test this by loading the about page, and then typing in the URLbar: javascript:alert(document.doctype.internalSubset);
URL: about:
Whiteboard: document.internalSubset wrong
Assignee | ||
Comment 8•22 years ago
|
||
This patch fixes several issues: - this bug - makes one function name self explaining - check return value from one function that could fail - remove unneeded variable - make things more efficient by using smaller doctype buffer - make some lines shorter to fit 80 col displays
Assignee | ||
Comment 9•22 years ago
|
||
Cathleen, could you run startup & open new window performance tests with this patch?
Assignee | ||
Comment 10•22 years ago
|
||
I should also mention that there is one decision that I needed to make concerning the internal subset. Given an internal subset like this: <!ENTITY % regionDTD SYSTEM "chrome://global-region/locale/region.dtd" > %brandDTD; what will end up in DOM is this: <!ENTITY % regionDTD SYSTEM "chrome://global-region/locale/region.dtd" > i.e. %brandDTD; disappeared - there is only a newline in its place. We debated with harishd as to what to do here. The DOM spec leaves this wide open for the implementors (basically anything goes). The reason we chose this approach was that we automatically expand all entities, so if brandDTD defined some entity that was used in this document, the entity will be expanded and there is no longer any need for the external file. If you think we should change our approach, let us know. The possible other approaches I see are: * Insert the %brandDTD% there anyway, even though it is no longer needed. * Make it into a comment: <!-- %brandDTD; --> (we have some precedents about this in Save As Complete...) * Remove all references to external entities from the internal subset.
Assignee | ||
Updated•22 years ago
|
Attachment #113120 -
Flags: superreview?(jst)
Attachment #113120 -
Flags: review?(harishd)
Comment 11•22 years ago
|
||
Comment on attachment 113120 [details] [diff] [review] Proposed fix sr=jst
Attachment #113120 -
Flags: superreview?(jst) → superreview+
Comment 12•22 years ago
|
||
**** PAGELOAD PERFORMANCE **** win2k, 2.4 GHz, 1GB --------------------- baseline: 313 ms w/ patch: 316 ms win2k, 650MHz, 512MB --------------------- baseline: 632 ms w/ patch: 631 ms win98, 200MHz, 64MB --------------------- baseline: 4687 ms w/ patch: 4653 ms Hard to say, possible pageload perf win on slow machine, but again it could also be within the range of up/down swing for a test run.
Comment 13•22 years ago
|
||
Comment on attachment 113120 [details] [diff] [review] Proposed fix >- mDoctypeText.SetCapacity(20480); >+ // setting mDoctypeText's capacity to be 1K ( just a guesstimate! ). >+ mDoctypeText.SetCapacity(1024); > return NS_OK; > } I'm not how the size reduction is going to benifit mozilla when loading pages with large DOCTYPES. Let's hope that 1K is the optimal size. >- NS_NewUTF8ConverterStream(getter_AddRefs(uniIn), in, 1024); >+ rv = NS_NewUTF8ConverterStream(getter_AddRefs(uniIn), in, 1024); >+ >+ if (NS_FAILED(rv)) { >+ return result; >+ } This message will get lost inside expat!
Attachment #113120 -
Flags: review?(harishd) → review+
Comment 14•22 years ago
|
||
Cathleen: I don't think this change will improve html page load performance, however, it should improve startup performance because the changes are xml related.
Comment 15•22 years ago
|
||
**** STARTUP PERFORMANCE **** win2k, 2.4 GHz, 1GB --------------------- baseline (ms): 1652 1386 1496 996 965 1199 1886 1855 1715 (lowest: 965) w/ patch (ms): 1683 1246 1699 1558 1886 1761 1402 1277 1058 (lowest: 1058) win2k, 650MHz, 512MB --------------------- baseline (ms): 2257 2687 2620 1960 2261 2701 2161 2750 2390 (lowest: 1960) w/ patch (ms): 2636 2098 2669 2070 2273 2756 3137 1877 2267 (lowest: 1877) win98, 200MHz, 64MB --------------------- baseline (ms): 12790 12050 12580 11990 12850 12290 12080 12380 (lowest: 11990) w/ patch (ms): 12620 12410 12210 12060 12140 12100 12170 12130 (lowest: 12060) Also, hard to say here. These numbers are in milliseconds, and they are all very close.
Comment 16•22 years ago
|
||
This patch has review, super-review and the perf tests don't seem to suggest any regressions. Is this ready to land?
Assignee | ||
Comment 17•22 years ago
|
||
Comment on attachment 113120 [details] [diff] [review] Proposed fix We used to copy external DTDs into the internal subset when the external DTD was specified on the DOCTYPE line. All XHTML documents having DOCTYPE, and most XUL files do this. The change is to make us observe when we are parsing external DTD and not copy into internal subset. (Because of this I was able to make our internal subset member smaller.) This makes us correct, and in theory should improve startup performance quite a bit. Unfortunately tests don't show that, but they don't show a sure slowdown either.
Attachment #113120 -
Flags: approval1.3b?
Comment 18•22 years ago
|
||
Comment on attachment 113120 [details] [diff] [review] Proposed fix a=asa (on behalf of drivers) for checkin to 1.3beta.
Attachment #113120 -
Flags: approval1.3b? → approval1.3b+
Assignee | ||
Comment 19•22 years ago
|
||
Fixed. I did some additional tests, and found why we didn't see startup win - there are only 5 XUL files that benefit from this during startup. I checked all pref panels and most dialogs etc. I could think of and at most the doctype string length was 266 so 1K is more than enough for XUL. For documents on the web it might be less than optimal, but we can tweak it as we see more XML content.
Status: ASSIGNED → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•