Bug 69799: External entities are not included in XML document
Status: RESOLVED WONTFIX (opened 23 years ago, closed 6 years ago)
Categories: Core :: XML, defect
Target Milestone: Future
People: Reporter: joden, Assigned: peterv
Attachments: 1 file, 1 obsolete file (4.30 KB patch, timeless: review-)
When rendering an XML document (with or without a style sheet) that references external entities, as in:

  <!ENTITY blah SYSTEM "blah.ent">

the browser does not include the external entity in the rendered text when it comes across the entity reference &blah;.

For instance, given the XML document:

  <?xml version="1.0"?>
  <!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN" "dtd/docbook/docbookx.dtd" [
  <!ENTITY blah SYSTEM "blah.ent">
  ]>
  <article>
    <sect1>
      <para> Some text in a paragraph. </para>
      &blah;
    </sect1>
  </article>

and a file blah.ent that contains:

  <para> Blah, blah blah blah...blah blah! </para>

Mozilla renders:

  Some text in a paragraph.

instead of:

  Some text in a paragraph.
  Blah, blah blah blah...blah blah!
This is a known issue. Mozilla does not load external DTDs (or fragments of them, like entities). The only exception is the DTDs for the user interface, in the chrome directory. Moving to Future.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Target Milestone: --- → Future
You are correct, thanks. *** This bug has been marked as a duplicate of 22942 ***
Status: NEW → RESOLVED
Closed: 23 years ago
Resolution: --- → DUPLICATE
Comment 4•22 years ago
I understand this bug to be about external entities and not external DTDs. The best reason that this is not a dupe of bug 22942 is that this doesn't work from file :-(. If you want, I can attach a testcase. I came across this while testing the DocBook XSLT stylesheet, which uses external entities for l10n. Anyway, the external entity references stay in the internalSubset of the doctype, but don't get loaded or substituted, even for file:// URLs.
Status: RESOLVED → REOPENED
OS: Windows NT → All
Hardware: PC → All
Resolution: DUPLICATE → ---
Axel, we use the same mechanism for loading external DTDs/entities. There is special code for file URLs: we look for them under the 'dtd' folder, starting from the directory of the file that we are loading.
Comment 6•22 years ago
I changed

Index: nsExpatTokenizer.cpp
===================================================================
RCS file: /cvsroot/mozilla/htmlparser/src/nsExpatTokenizer.cpp,v
retrieving revision 1.92
diff -u -r1.92 nsExpatTokenizer.cpp
--- nsExpatTokenizer.cpp 2001/11/07 04:12:02 1.92
+++ nsExpatTokenizer.cpp 2001/11/21 16:31:06
@@ -838,7 +838,11 @@
 }
 }
 }
- }
+ }
+
+ if (!isLoadable) {
+ res = (*aDTD)->SchemeIs("file", &isLoadable);
+ }
 return isLoadable;
 }

and get an XML_ERROR_EXTERNAL_ENTITY_HANDLING in return while loading an XML document with external entities. (I've set the URL to the file I'm testing.) It seems this bug is different from bug 22942 after all. I glanced into expat, and that indicates some effort is needed if we want this. My mind isn't set to effort tonight, so I'm just telling you how far I got without any effort ;-)
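For context, here is a minimal sketch of what external entity handling looks like at the Expat level. It uses Python's standard xml.parsers.expat binding rather than Mozilla's C++ code, so it only illustrates the mechanism and is not the patch discussed here. Expat does not fetch external entities itself; the application supplies a callback, and a callback that reports failure leads to XML_ERROR_EXTERNAL_ENTITY_HANDLING. The expand() helper and the in-memory entities mapping are assumptions made for illustration.

```python
import xml.parsers.expat


def expand(doc, entities):
    """Parse doc and return its character data, expanding external entities.

    entities maps a system identifier (e.g. "blah.ent") to replacement text;
    it is a hypothetical in-memory stand-in for files on disk.
    """
    chunks = []
    parser = xml.parsers.expat.ParserCreate()
    parser.CharacterDataHandler = chunks.append

    def on_external_entity(context, base, system_id, public_id):
        text = entities.get(system_id)
        if text is None:
            # Returning 0 makes Expat fail with
            # XML_ERROR_EXTERNAL_ENTITY_HANDLING.
            return 0
        # The child parser inherits the parent's handlers, so character
        # data from the entity is appended to the same list.
        child = parser.ExternalEntityParserCreate(context)
        child.Parse(text, True)
        return 1

    parser.ExternalEntityRefHandler = on_external_entity
    parser.Parse(doc, True)
    # Collapse runs of whitespace for easier reading.
    return " ".join("".join(chunks).split())


DOC = """<?xml version="1.0"?>
<!DOCTYPE article [
<!ENTITY blah SYSTEM "blah.ent">
]>
<article><para>Some text in a paragraph.</para>&blah;</article>"""

print(expand(DOC, {"blah.ent": "<para>Blah, blah blah blah...blah blah!</para>"}))
```

Calling expand(DOC, {}) raises xml.parsers.expat.ExpatError, because the callback cannot supply the entity. The sketch always creates the child parser from the top-level parser, so it does not handle external entities nested inside other external entities.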
Comment 7•22 years ago
Oh, wow. I was wondering what was wrong with my code, why I couldn't get an external entity to work in my XHTML document. Of course, if we ever get around to doing validation by DTDs, this is going to hurt us severely for XHTML 1.1.
*** Bug 130339 has been marked as a duplicate of this bug. ***
Comment 9•22 years ago
Related to Bug 44458: Mozilla understands XHTML character entities if the FPI for XHTML 1.0 Strict/Transitional/Frameset, 1.1, Basic or XHTML 1.1 plus MathML 2.0 is specified, but otherwise it reports an XML parsing error for an undefined entity even if an external DTD subset is present. Try some test documents linked from: http://www.w3.org/People/mimasa/test/xhtml/entities/#xhtml-family

This behavior is against "Well-formedness constraint: Entity Declared" of XML 1.0. See: http://www.w3.org/TR/REC-xml#wf-entdeclared
Could you explain why it is against the wf rules? That part of the XML spec you pointed to says that non-validating parsers are _not obligated to_ read external entities. If we read them in some cases but don't read them in others shouldn't matter according to the spec. Also XHTML spec itself complicates things by suggesting that we should not flag unknown entities as errors, but show them as &foo; which I think is a big mistake.
Comment 11•22 years ago
The last sentence of "Well-formedness constraint: Entity Declared" says (emphasis added by me): for such documents, the rule that an entity must be declared is a well-formedness constraint *only if standalone='yes'*. "2.9 Standalone Document Declaration" of XML 1.0 says "If there are external markup declarations but there is no standalone document declaration, the value "no" is assumed." cf. http://www.w3.org/TR/REC-xml#sec-rmd Thus, when entities are declared in the external subset through the DOCTYPE declaration, a non-validating processor is not obligated to read and process their declarations but MUST NOT report well-formedness errors against those entities. In addition, in the XHTML+MathML+SVG sample, the standalone document declaration is explicitly declared as "no", yet Mozilla reports a WF error. It is acceptable that Mozilla doesn't understand entities in that case, but it is totally unacceptable to report a WF error and stop normal processing against a valid document. cf. http://www.w3.org/People/mimasa/test/xhtml/entities/entities-math-svg.xhtml
I disagree. If the parser can not resolve an entity reference, it must be an error. How should the parser replace an entity it cannot understand?
Comment 13•22 years ago
I believe the rule in XML 1.0 is clear on this, but if you are still not sure, read the rationale behind this described in "Reports from the W3C SGML ERB to the SGML WG and from the W3C XML ERB to the XML SIG" at: http://www.w3.org/XML/9712-reports#ID52 For your convenience, I'll copy relevant parts here. <blockquote cite="http://www.w3.org/XML/9712-reports#ID52"> S.40 Should Entity Declared be a VC or a WFC? Decision: In a standalone document (one without a DTD, one with only an internal subset and no references to external parameter entities, or one with standalone='yes'"), this constraint should be treated as a WFC: i.e. it must be checked by all conforming processors. In a document with a DTD and "standalone='no'", it should be treated as a VC. Unanimous (MMal and EM abstaining). Rationale: it cannot be a WFC without serious injury to the notion of Draconian error handling. As the current draft (97-11-17) makes explicit, a non-validating processor cannot be expected to know whether an entity declaration for an entity being referred to does or does not occur in some external parameter entity or external DTD subset. But if the constraint is a well-formedness constraint, even a non-validating processor should catch the error. So for "standalone='no'", it should be a VC -- a constraint enforceable only if one reads the entire DTD. </blockquote>
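The decision quoted above can be condensed into a small predicate. This is only a sketch of the rule as stated in this thread, not code from any parser; the function name and parameters are made up for illustration. It assumes that `has_external_declarations` is true when the document has an external DTD subset or references external parameter entities, and that `standalone` is "yes", "no", or None when the declaration is absent.

```python
def undeclared_entity_is_wf_error(has_external_declarations, standalone=None):
    """Return True if "Entity Declared" is a well-formedness constraint.

    Hypothetical helper summarising the rule quoted in this thread:
    - standalone="yes": always a well-formedness constraint (WFC).
    - no external declarations (no DTD, or internal subset only): WFC.
    - external declarations with standalone "no" (or absent, which is
      treated as "no"): only a validity constraint (VC).
    """
    if standalone == "yes":
        return True
    return not has_external_declarations


# An external subset without standalone="yes": a non-validating
# processor should not report a well-formedness error.
print(undeclared_entity_is_wf_error(True))
# Internal subset only: an undeclared entity is a well-formedness error.
print(undeclared_entity_is_wf_error(False))
```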
*** Bug 150728 has been marked as a duplicate of this bug. ***
*** Bug 151370 has been marked as a duplicate of this bug. ***
Comment 16•21 years ago
*** Bug 178945 has been marked as a duplicate of this bug. ***
Comment 17•21 years ago
I apologize for asking this question here, but... How does a Bugzilla user determine the release in which a bug will be or has been fixed? I am interested in the fix for this problem.
Updated•21 years ago
QA Contact: petersen → rakeshmishra
Comment 18•21 years ago
There are several related issues here:

If the document is standalone="yes", we should be reading the entire internal subset, not stopping as we do in standalone="no" mode. This is bug 129392.

If the document is standalone="no", we shouldn't be reporting undeclared entities as a well-formedness error, as we do (and should) in standalone="yes" mode. This is bug 204102.

In either case, we should be reading external entities in order to do basic stuff like attribute defaulting, recognising ID attributes, and entity expansion. This is bug 22942.

This, therefore, appears to be a subset of bug 22942. Marking dependency.
Depends on: entities
Comment 19•21 years ago
Thanks for updating this. In the meantime, while waiting for a fix, is there any way to work around this bug?
Comment 20•21 years ago
You could stick the entire list of entities in the internal subset I guess... (Maybe using server side includes to make life easier.)
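A rough sketch of that workaround, done as a preprocessing step instead of server side includes: the entity declarations are pasted into the internal subset before the document reaches the parser. The helper name with_internal_subset and the sample declaration are assumptions for illustration. Python's xml.etree.ElementTree is used only to check that an internal declaration gets expanded.

```python
import xml.etree.ElementTree as ET


def with_internal_subset(root_name, declarations, body):
    """Build a document whose DOCTYPE carries the declarations inline.

    Hypothetical helper: `declarations` is the text that would otherwise
    live in an external DTD or entity file.
    """
    return f"<!DOCTYPE {root_name} [\n{declarations}\n]>\n{body}"


doc = with_internal_subset(
    "test",
    '<!ENTITY entity "hello">',
    "<test>&entity;</test>",
)
# Internal entities are expanded by the parser, so the reference resolves.
print(ET.fromstring(doc).text)
```

This only helps for declarations that can be written as internal entities; it does not make the browser fetch anything.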
Comment 21•21 years ago
It seems to me that this bug is considered an infrequent problem. However, I want to raise the attention on several use cases that frequently happen in a professional environment:

- DocBook documents foster the use of external entities to build a book. External entities allow several people to work on the same document at the same time.
- Web server applications can be built in the following way: an XML page is converted to HTML using XSL to display the main layout, and an external entity loads the core data, which is modified using JSP.

In both cases Mozilla cannot display the pages correctly, even though this is a standards-compliant way of implementing web applications that is frequently used in enterprises.
Updated•21 years ago
QA Contact: rakeshmishra → ashishbhatt
Comment 22•21 years ago
>However, I want to raise the attention on several use cases that
>frequently happen in a professional environment:
And I can add: this bug blocks the creation of simple localisable remote XUL
applications in the same way as is possible for chrome XUL (all strings are in
an external file as external entities).
Comment 23•21 years ago
re #22, localisation depends on the chrome protocol inserting the locale. So this is not gonna enable you to get localisation out of the box. Of course it is a prerequisite for solving that bug in the first place.
Comment 24•21 years ago
re 23: It should. E.g. at http://domain/file.xul you have your XUL application pointing to http://domain/file.php as a DTD with external entities, and there a simple PHP script offers the client the appropriate locale, determined by a cookie or by the Accept-Language HTTP request header. The possibility of having all strings in external files simplifies localization very much.
Comment 25•20 years ago
Here's an example that doesn't include the external entity in the document tree. Open this file:

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE notebook [
<!ELEMENT notebook (page|span)*>
<!ELEMENT page (#PCDATA)>
<!ELEMENT span (#PCDATA)>
<!ENTITY stuff SYSTEM "s1p1.xml">
]>
<notebook>
<span>in the beginning</span>
&stuff;
</notebook>

The external entity s1p1.xml contains three lines:

<page>
this is page one
</page>
Comment 26•20 years ago
There is also another interesting fact when the XHTML 1.1 DTD is extended. I tried the following index.xml:

--
<?xml version="1.0" encoding="iso-8859-1" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd" [
<!ENTITY dasmenu SYSTEM "menux.xml">
<!ENTITY ichglaubsjawohlnicht "bla">
]>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Einbinden von Menus in xhtml &ichglaubsjawohlnicht;</title>
</head>
<body>
<div id="menu">&dasmenu;</div>
<div id="content">CONTENT &ichglaubsjawohlnicht;</div>
</body>
</html>
--

and menux.xml:

--
<?xml version="1.0" encoding="iso-8859-1" ?>
<p>test mit <b>bold</b> zwischendurch :D</p>
--

Interestingly, Mozilla gave out the following lines (it showed the source once I deleted the xmlns= attribute in <html>):

--
<html>
<head>
<title>Einbinden von Menus in xhtml bla</title>
</head>
<body>
<div id="menu">
<!--
 * The contents of this file are subject to the Mozilla Public
 * License Version 1.1 (the "License"); you may not use this file
 * except in compliance with the License. You may obtain a copy of
 * the License at http://www.mozilla.org/MPL/
 *
 * Software distributed under the License is distributed on an "AS
 * IS" basis, WITHOUT WARRANTY OF ANY KIND, either express or
 * implied. See the License for the specific language governing
 * rights and limitations under the License.
 *
 * The Original Code is mozilla.org code.
 *
 * The Initial Developer of the Original Code is Netscape
 * Communications Corporation. Portions created by Netscape are
 * Copyright (C) 2000 Netscape Communications Corporation. All
 * Rights Reserved.
 *
 * Contributor(s):
 -->
<!--
 * Predefined HTML entities to be loaded when parsing XHTML documents.
 * The contents match mozilla/htmlparser/src/nsHTMLEntityList.h,
 * except that Navigator entity extensions are not included.
 -->
<!-- ISO 8859-1 entities -->
<!-- Mathematical symbols and Greek letters -->
<!-- Markup-significant and internationalization characters -->
</div>
<div id="content">CONTENT bla</div>
</body>
</html>
--

It seems as if it simply copied the comments from the DTD into the XML document?
*** Bug 224739 has been marked as a duplicate of this bug. ***
Comment 28•20 years ago
*** Bug 239607 has been marked as a duplicate of this bug. ***
Comment 29•20 years ago
With the emergence of XML this bug is becoming more important, but it seems to have been forgotten. Work needs to be done on this bug.

WORKS:

test.xml:
<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE test[
<!ELEMENT test (#PCDATA)>
<!ENTITY entity "hello">
]>
<test>&entity;</test>

DOES NOT WORK (unknown entity):

test.xml:
<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE test SYSTEM "test.dtd">
<test>&entity;</test>

test.dtd:
<!ELEMENT test (#PCDATA)>
<!ENTITY entity "hello">

ALSO DOES NOT WORK:

test.xml:
<!DOCTYPE test[
<!ELEMENT test (#PCDATA)>
<!ENTITY entity SYSTEM "test.dtd">
]>
<test>&entity;</test>

test.dtd:
<!ENTITY entity "hello">
Comment 30•20 years ago
I would mention that http://bugzilla.mozilla.org/show_bug.cgi?id=219958 is related to the same problem of XML/XSL full support.
Comment 31•20 years ago
It's going to be extremely difficult to hold any Mozilla milestone for this bug if there are no patches for it...
Updated•20 years ago
Flags: blocking1.8a3? → blocking1.8a3-
Comment 32•19 years ago
*** Bug 261516 has been marked as a duplicate of this bug. ***
Comment 33•19 years ago
*** Bug 300389 has been marked as a duplicate of this bug. ***
Suggestion: alter the function IsLoadableDTD() in http://lxr.mozilla.org/seamonkey/source/parser/htmlparser/src/nsExpatDriver.cpp (I believe this function is the only culprit). Before attempting to load from the read-only directory <mozilla bin>/res/dtd, attempt to load from the user-writable directory <user's profile>/res/dtd. This will allow extensions to download DTDs if necessary. It's not a perfect solution, but it's acceptable, isn't it?
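The proposed lookup order can be sketched independently of the Mozilla code. The following is a hypothetical illustration in Python, not the contents of IsLoadableDTD() or of the attached patch; the function name find_dtd and the directory layout are assumptions.

```python
from pathlib import Path


def find_dtd(file_name, search_dirs):
    """Return the first existing DTD file among search_dirs, or None.

    Hypothetical helper: directories are tried in the order given, so
    listing the user-writable directory first gives it priority over
    the read-only application directory.
    """
    for directory in search_dirs:
        candidate = Path(directory) / file_name
        # Only accept a candidate that really exists as a regular file.
        if candidate.is_file():
            return candidate
    return None


# Example call with made-up locations for the two res/dtd directories.
search_order = ["profile/res/dtd", "app/res/dtd"]
print(find_dtd("docbookx.dtd", search_order))
```

Checking each candidate explicitly, rather than assuming the path could be built, avoids silently falling back to an unrelated file when a step fails.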
Comment 36•19 years ago
*** Bug 303664 has been marked as a duplicate of this bug. ***
Is there anyone working on this? I've been trying to contact Heikki Toivonen by mail and IRC (#developers) without success.
Peter, can you take a look at this, including the patch?
Assignee: hjtoi-bugzilla → peterv
Status: REOPENED → NEW
Comment 39•18 years ago
Append calls can fail. If they fail, they will probably leave you with the wrong file reference, which could easily exist and yet will not give you the right behavior. Ensure they succeed.
Attachment #188977 -
Attachment is obsolete: true
Attachment #208187 -
Flags: review-
Fwiw, timeless's patch looks good to me. Just wondering: shouldn't the user directory be checked first?
Comment 41•18 years ago
How can I use an XML fragment and a stylesheet to generate an HTML fragment? I have the same problem in my application. I am trying to load an XML fragment and apply the stylesheet with transformToFragment. Is there a workaround for this? Is this bug going to be fixed in the near future?
Comment 42•18 years ago
timeless: it looks like when you created the patch you marked it r- at the same time; is this correct?
Comment 44•17 years ago
For what it's worth: IE doesn't do much better than Firefox. Still, this should be fixed in both of them. If a DTD doesn't have a public ID, it should be cached in the user directory and loaded from there until it expires (I don't know how that's done for other files, but it could be the same approach).
Comment 45•15 years ago
Just thought this ought to be mentioned: I'm writing my thesis in DocBook, and this bug keeps me from structuring it in different files.
Comment 46•15 years ago
Is there any way to include data from one XML file in another without using a web server? I'm trying to write XML files to display experiment results, and I would like to add notes to the results in another file (so that no one changes the experiment results files) and also display the notes below the experiment results.
Comment 47•15 years ago
This bug has been around for 9 years. Could somebody give the attached patch a try? Thanks.
Updated•15 years ago
QA Contact: ashshbhatt → xml
Comment 48•13 years ago
I think this should be WONTFIXed since this can't be implemented without bug 22942, but since this bug has an assignee, not taking the liberty to actually mark as WONTFIX myself.
Updated•6 years ago
Status: NEW → RESOLVED
Closed: 23 years ago → 6 years ago
Resolution: --- → WONTFIX