Closed
Bug 204102
Opened 22 years ago
Closed 7 years ago
Should not report undeclared entities in standalone="no" documents
Categories
(Core :: XML, defect)
RESOLVED
WONTFIX
People
(Reporter: thomas, Assigned: peterv)
References
(Depends on 1 open bug)
Details
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4a) Gecko/20030401
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4a) Gecko/20030401
Mozilla throws a fatal error the moment it hits an undefined entity in any XML
document. While the wording in the XML spec is **** on this issue, I believe
the correct interpretation is that you *should* allow undefined entities: simply
don't replace them with the string they represent. In fact, there is a major
usability issue here, because not only does a single undefined entity prevent
viewing of an entire document, but you are also always forcing the user into
replacing the entities within the XML document. It often happens that someone
editing/viewing an XML document *doesn't want* the entities replaced. I know this
is the case when I edit XML documents myself, since some entities may have the
same replacement string, and it's impossible to see which entity inserted what
into the document once the entity replacement has actually been done by
the parser.
Reproducible: Always
Steps to Reproduce:
1. Load the URL given above.
Actual Results:
you get the message from the browser:
XML Parsing Error: undefined entity
Location:
ftp://xml.gsfc.nasa.gov/pub/adc/xml_archives/journals/ApJS/126/37/public/J_ApJS_126_37.xml
Line Number 32, Column 14:<email>cordes@spacenet.tn.cornell.edu</email>
-------------^
Expected Results:
You should have just put the entity in the document, e.g. <node>&entity;</node>
and/or allow the user to turn off entity replacement in XML documents for
viewing. This last choice not only solves the bug, but is a usability
enhancement for XML parsing.
Comment 1•22 years ago
The entity in question is (as far as I can tell) not defined in the DTD, making
the document invalid per the validity constraint in:
http://www.w3.org/TR/REC-xml#dt-entref
If we were a validating parser, that would make this bug INVALID.
However, we are supposedly a non-validating parser, so in theory we are only
bound by wellformedness constraints, and the relevant one:
http://www.w3.org/TR/REC-xml#wf-entdeclared
...specifically says:
# Note that if entities are declared in the external subset or in external
# parameter entities, a non-validating processor is not obligated to read and
# process their declarations; for such documents, the rule that an entity must
# be declared is a well-formedness constraint only if standalone='yes'.
...and since the document says standalone="no", in theory, we shouldn't be
reporting this particular error.
Heikki?
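The distinction drawn above can be tried out with a small sketch. It uses Python's bundled expat binding (xml.parsers.expat) purely as a stand-in non-validating parser; it is not Mozilla's parser code, and the document, the entity name "undeclared", and the system id "doc.dtd" are made up for illustration. The external subset is never fetched.

```python
from xml.parsers import expat


def parse(standalone):
    """Parse a tiny document that references an entity no DTD we read declares.

    `standalone` is the value written into the XML declaration ("yes" or "no").
    """
    doc = (
        f'<?xml version="1.0" standalone="{standalone}"?>\n'
        '<!DOCTYPE doc SYSTEM "doc.dtd">\n'  # hypothetical external subset, never loaded
        '<doc>&undeclared;</doc>'
    )
    skipped = []
    parser = expat.ParserCreate()
    # Called when the parser recognizes an entity reference but has no declaration for it.
    parser.SkippedEntityHandler = lambda name, is_param: skipped.append(name)
    try:
        parser.Parse(doc, True)
    except expat.ExpatError as err:
        return "error", expat.ErrorString(err.code)
    return "skipped", skipped
```

With the expat shipped in CPython, parse("no") is expected to report the entity through the skipped-entity handler, while parse("yes") is expected to fail with expat's undefined-entity error, which matches the reading of the well-formedness constraint quoted above.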
Summary: Cant render document with unknown entities. This breaks XML spec, AND usability of mozilla → Should not report undeclared entities in standalone="no" documents
Reporter | ||
Comment 2•22 years ago
Hi again,
After a closer reading of the spec (as per your comment below) I
believe I am still right. The facts are the following:
1. Mozilla is a non-validating XML parser, so only well-formedness
matters.
2. The entity in question is external.
3. The document is standalone="no", so according to the spec
snippet you provided, external entities DON'T fall under the
well-formedness constraint and should therefore NOT be
checked.
So why doesn't Mozilla just pass over the external entities, simply
inserting them as is into the parsed document?
The problem seems to me to be an inconsistency in approach here. On
one hand you are saying that Mozilla is a non-validating parser. Fine.
But then you are quibbling over whether or not the external entity
could invalidate the document, and because you didn't parse it (on
purpose), you won't let the document pass the well-formedness test. The
end result is a nasty catch-22: no XML document with external entities
may be viewed in Mozilla. This is why I filed the bug (and still believe
it is a bug).
=b.t.
Comment 3•22 years ago
Right, that was my understanding too. If the document is standalone="yes", then
the error message is definitely correct, but if the document is standalone="no"
as in this case, then we should not be reporting it as an error.
Exactly what we _should_ do is still unclear to me. I see no justification in
the spec for treating the entity as literal text (what would appear in the
DOM?), but I also see no other text saying what _should_ happen.
Reporter | ||
Comment 4•22 years ago
> Exactly what we _should_ do is still unclear to me. I see no justification in > the spec for treating the entity as literal text (what would appear in the > DOM?), but I also see no other text saying what _should_ happen. Well I believe that simply leaving the entity unchanged, and in place is the correct way to go, e.g. nodes with entities in PCDATA, CDATA and attributes look the same if they are external entities. Perhaps, maybe, you can throw a _warning_ that the document contains external entities but that Mozilla is not currently able to read these and will therefore leave them in place in the document. Please, please change Mozilla. I can't believe we (at NASA) are the only ones in the world with external entities in our XML documents. Functionally Mozilla is crippled as it now stands with regards to XML documents containing external entities (as I keep repeating: there is essentially NO way to load these things into Mozilla right now). -b.t.
Reporter | ||
Comment 5•22 years ago
Ugh. Konqueror screwed up my comment. My apologies, here it is again in a
more readable 'format'.
> Exactly what we _should_ do is still unclear to me. I see no justification in
> the spec for treating the entity as literal text (what would appear in the
> DOM?), but I also see no other text saying what _should_ happen.
Well, I believe that simply leaving the entity unchanged and in place is the
correct way to go, e.g. nodes with entities in PCDATA, CDATA and attributes
look the same if they are external entities. Perhaps you could throw
a _warning_ that the document contains external entities but that Mozilla
is not currently able to read these and will therefore leave them in place
in the document.
Please, please change Mozilla. I can't believe we (at NASA) are the only
ones in the world with external entities in our XML documents. Functionally
Mozilla is crippled as it now stands with regards to XML documents
containing external entities (as I keep repeating: there is essentially NO
way to load these things into Mozilla right now).
-b.t.
Reporter | ||
Comment 7•22 years ago
> The issue of us not handling external entities is bug 22942. No, not really. As I read that bug, it describes loading external DTDs in general. In so far as it focuses on the external entity issue, one comment has a salient point for us here: > About validation, being non-validating just says that we are not required to > load external DTDs, not that we must not. Which means you CAN'T avoid dealing with external entities, even if you are a non-validating parser. Mozilla, IMO, shouldn't behave the way it does currently. Solutions are several in nature: - make Mozilla load external entities - make Mozilla simply pass over the external entities, either inserting them as is or dropping them (I don't like that possibility). A warning *might* be given to the user to let them know that Mozilla is aware that some external entity replacement did not occur. - have a default view of XML documents which are NOT rendered at all (e.g. IE6 approach) [I don't particularly like this one either] Of these solutions, the last is the easiest to implement, the second is the most correct, and the first is the best compromise between the two. =b.t.
Reporter | ||
Comment 8•22 years ago
Argh!!!! Konqueror, I hate you!! (sorry, just feeling emotional there for a moment)
My last message got all screwed up (again). Here it is in more readable text:
> The issue of us not handling external entities is bug 22942.
No, not really. As I read that bug, it describes the problems that arise
from not loading external DTDs in general. In so far as it focuses on the
external entity issue, one comment has a salient point for us here:
> About validation, being non-validating just says that we are not required to
> load external DTDs, not that we must not.
Which means you CAN'T avoid dealing with external entities, even if you
are a non-validating parser that doesn't load external entities.
Solutions are several in nature:
- make Mozilla load external entities
- make Mozilla simply pass over the external entities, either
inserting them as is or dropping them (I don't like that last
possibility). A warning *might* be given to the user to let
them know that Mozilla is aware that some external entity
replacement did not occur.
- have a default view of XML documents which are NOT rendered at
all [I don't particularly like this one either]
Of these solutions, the last is the easiest to implement, the
second is the most correct, and the first is the best compromise
between the two.
=b.t.
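As a rough illustration of the first option in the list above (actually loading the external subset so the entity gets declared), here is a sketch using Python's xml.parsers.expat. Everything named here is an assumption for the example: the DTD_SOURCES mapping stands in for fetching a DTD by system id, and "doc.dtd" and the entity "greeting" are invented. It says nothing about how Mozilla would implement this.

```python
from xml.parsers import expat

# Hypothetical in-memory stand-in for fetching an external subset by system id.
DTD_SOURCES = {"doc.dtd": '<!ENTITY greeting "hello">'}


def parse_with_external_subset(doc):
    """Return the document's character data, expanding entities declared in the DTD."""
    pieces = []
    parser = expat.ParserCreate()
    parser.CharacterDataHandler = pieces.append
    # Ask expat to hand the external subset to ExternalEntityRefHandler.
    parser.SetParamEntityParsing(expat.XML_PARAM_ENTITY_PARSING_UNLESS_STANDALONE)

    def load_external(context, base, system_id, public_id):
        source = DTD_SOURCES.get(system_id)
        if source is not None:
            # Parse the DTD text in a sub-parser that shares the parent's declarations.
            sub = parser.ExternalEntityParserCreate(context)
            sub.Parse(source, True)
        return 1  # nonzero: keep parsing even if nothing could be loaded

    parser.ExternalEntityRefHandler = load_external
    parser.Parse(doc, True)
    return "".join(pieces)


SAMPLE = (
    '<?xml version="1.0" standalone="no"?>\n'
    '<!DOCTYPE doc SYSTEM "doc.dtd">\n'
    '<doc>&greeting; world</doc>'
)
```

Calling parse_with_external_subset(SAMPLE) should yield the text with the entity's replacement text substituted. Note that when the lookup fails the handler still returns 1, so a DTD that cannot be loaded does not by itself abort the parse, which is the fallback case this bug is about.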
Comment 9•22 years ago
(I watch heikki and harishd in lieu of component watching, so I'm removing my CC
for the time being)
Comment 10•22 years ago
Sorry, I meant bug 69799, which is specifically about external entities, and
depends on bug 22942, which is about loading external DTDs in the first place
(whether or not we validate the document using them, which would be bug 196355).
This bug, though, is still valid; even if we try to load external DTDs, we might
not always succeed, so we will always have to be able to deal with recognising
entities that are not declared in documents marked standalone="no".
Status: UNCONFIRMED → NEW
Ever confirmed: true
Comment 11•20 years ago
Original file:
ftp://xml.gsfc.nasa.gov/pub/adc/xml_archives/journals/ApJS/126/37/public/J_ApJS_126_37.xml
Testcase:
http://www.hixie.ch/tests/adhoc/xml/parsing/002.xml
Assignee: hjtoi-bugzilla → core.xml
Comment 12•18 years ago
This bug bit me again yesterday. (And several other people now that XHTML Print is a Recommendation). Three years and still not fixed? PLEASE don't declare unrecognised entities as a well-formedness error when standalone="no".
Comment 14•16 years ago
The specification states what can be done if such an entity is not found because of not checking the external file in Section 4.4.3 at http://www.w3.org/TR/REC-xml/#include-if-valid :
"When an XML processor recognizes a reference to a parsed entity, in order to validate the document, the processor MUST include its replacement text. If the entity is external, and the processor is not attempting to validate the XML document, the processor MAY, but need not, include the entity's replacement text. If a non-validating processor does not include the replacement text, it MUST inform the application that it recognized, but did not read, the entity.
"This rule is based on the recognition that the automatic inclusion provided by the SGML and XML entity mechanism, primarily designed to support modularity in authoring, is not necessarily appropriate for other applications, in particular document browsing. Browsers, for example, when encountering an external parsed entity reference, might choose to provide a visual indication of the entity's presence and retrieve it for display only on demand."
So, to add somewhat to Brian Thomas' points, it seems what can be done to fix the problem is one of the following:
1) Replace the entities (Bug 22942)
2) Provide a visual indication of the entity's presence and retrieve it for display only on demand
3) Or at the very least, it would seem permissible per the spec, to just indicate the entity's presence (Opera does this by indicating the entity in source as though it had been manually indicated as an escaped ampersand + entity text)
There indeed should be no single point of failure here.
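Option 3 above (just indicating the entity's presence) can be sketched with Python's xml.parsers.expat, again only as a stand-in for a non-validating parser and not as Mozilla's code. The sample document, the system id "hypothetical.dtd", and the wording of the notice are assumptions made for the example; the notice list plays the role of "informing the application" from Section 4.4.3.

```python
from xml.parsers import expat


def text_with_entities(doc):
    """Return (text, notices); unread entities stay in the text as '&name;'."""
    pieces, notices = [], []
    parser = expat.ParserCreate()
    parser.CharacterDataHandler = pieces.append

    def on_skipped(name, is_parameter_entity):
        # Keep the reference visible instead of failing or silently dropping it.
        pieces.append(f"&{name};")
        notices.append(f"entity '{name}' was recognized but not read")

    parser.SkippedEntityHandler = on_skipped
    parser.Parse(doc, True)
    return "".join(pieces), notices


SAMPLE = (
    '<?xml version="1.0" standalone="no"?>\n'
    '<!DOCTYPE node SYSTEM "hypothetical.dtd">\n'  # external subset is never fetched
    '<node>before &entity; after</node>'
)
```

Calling text_with_entities(SAMPLE) should return the text with the reference left in place plus one notice, which is roughly the behaviour attributed to Opera above. What a DOM should contain in this case is left open, as it is in the thread.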
Updated•15 years ago
QA Contact: ashshbhatt → xml
Comment 15•15 years ago
Cross-posting on relevant bug pages:
For this bug, and a number of other associated bugs (Bug 204102, Bug 267350, Bug 22942, and to a lesser extent Bug 196355), I've started a pledge drive at http://pledgie.com/campaigns/7732 to try to hire a developer(s) who can work with the Mozilla devs (if they are ineligible themselves) to get these long-standing and niche but important-to-XML-users bugs fixed. Feel free to make a pledge to donate toward these fixes or, if you are a developer, make a bid in the comments there to offer to fix, in conjunction with Mozilla devs, this or any of the other aforementioned XML-related bugs/feature requests.
(If we can get enough momentum, Bug 234485, Bug 98413, Bug 275196, and Bug 94270 might be also nice candidates to get addressed too, but I've started with the (single-point-of-failure-causing) DTD issues.)
Comment 16•7 years ago
No browser supports this feature and it's not worth the added complexity.
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → WONTFIX