Closed Bug 276434 Opened 21 years ago Closed 21 years ago

XML parser rejects valid entity names like "testing:test"

Categories

(Core :: XML, defect)

x86
Windows XP
defect
Not set
blocker

RESOLVED INVALID

People

(Reporter: lwchk2001, Unassigned)

Attachments

(2 files)

As subject. This bug is known to break a lot of extensions, such as RSS Editor (http://rsseditor.mozdev.org).
To verify the validity of this bug, take a look at the XML 1.0 spec: http://www.w3.org/TR/2004/REC-xml-20040204/
Or see this excerpt:
[2] Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF] /* any Unicode character, excluding the surrogate blocks, FFFE, and FFFF. */
[3] S ::= (#x20 | #x9 | #xD | #xA)+
[4] NameChar ::= Letter | Digit | '.' | '-' | '_' | ':' | CombiningChar | Extender
[5] Name ::= (Letter | '_' | ':') (NameChar)*
[68] EntityRef ::= '&' Name ';'
[71] GEDecl ::= '<!ENTITY' S Name S EntityDef S? '>'
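A minimal sketch of productions [4] and [5], restricted to ASCII, shows why "testing:test" counts as a Name under XML 1.0 alone. The function name `is_xml10_name` is made up for illustration, and the non-ASCII Letter, CombiningChar and Extender classes are deliberately left out.

```python
import re

# ASCII-only approximation of XML 1.0 productions [4] NameChar and [5] Name.
# Assumption: non-ASCII Letter, CombiningChar and Extender characters are
# omitted, so this accepts a subset of what the real grammar accepts.
NAME_RE = re.compile(r"[A-Za-z_:][A-Za-z0-9._:\-]*")


def is_xml10_name(candidate: str) -> bool:
    """Return True if candidate matches the ASCII subset of production [5]."""
    return NAME_RE.fullmatch(candidate) is not None


print(is_xml10_name("testing:test"))  # colon is a NameChar in plain XML 1.0
print(is_xml10_name("1test"))         # a digit may not start a Name
```

This only covers the XML 1.0 grammar; it says nothing about the additional constraints a namespace-aware parser applies.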
Attached file Offending document
Works with: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8a6) Gecko/20041214 MultiZilla/1.8.0.0c
Fails with: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8a6) Gecko/20041216 MultiZilla/1.8.0.0c
I think that this is introduced by the patch for bug 276434
Status: UNCONFIRMED → NEW
Ever confirmed: true
> I think that this is introduced by the patch for bug 276434
HJ, what do you mean by "bug 276434"? (I guess you copied a wrong number)
Try bug 192139, "Integrate latest Expat"
See the second note in http://www.w3.org/TR/2004/REC-xml-20040204/#sec-common-syn and, especially, http://www.w3.org/TR/REC-xml-names/#Conformance:
The effect of conformance is that in such a document:
[...]
o No entity names, PI targets, or notation names contain any colons.
Invalid per "Namespaces in XML".
Status: NEW → RESOLVED
Closed: 21 years ago
Resolution: --- → INVALID
(In reply to comment #6)
> invalid per "Namespaces in XML".
Exactly what I had in mind, thanks for the confirmation.
Is this really invalid? Namespaces in XML is a separate spec from the XML 1.0 or 1.1 spec, and it's conceivable that some higher layers use just XML, not NSes. What about comment 0 saying this broke an RSS editor? /be
(In reply to comment #8)
> Is this really invalid? Namespaces in XML is a separate spec from the XML 1.0
> or 1.1 spec, and it's conceivable that some higher layers use just XML, not
> NSes.
I think so, because it was David Hyatt himself who told me, many moons ago now, not to use colons.
> What about comment 0 saying this broke an RSS editor?
Please note that more extensions are broken in Mozilla builds 20041215 and later, not just the RSS editor, and they all broke after the patch for bug 192139.
Biesi: sorry, but no w3c spec can say that we must break compatibility. We can do something else, such as preserve backward compat and allow those upper layers that want Namespaces to enable some runtime option. HJ: your comment says less than it implies. What exactly were the extensions broken by the new expat landing? Were they all broken because of colons? /be
Brendan, what's breaking are XML documents (e.g. XUL dialogs/overlays/whatever that the extension adds) which are not conformant to the XML namespaces draft. Now either we have a namespace-aware XML parser (in which case those documents are in error), or we don't have a namespace-aware XML parser (in which case no namespace features should be available in those documents; in particular, all nodes should be in the null namespace). If we plan to support a non-namespace-enabled-parser mode for Gecko, we can do that, I suppose, but Gecko with that option enabled would effectively have no XUL or XHTML or SVG or whatever support, since all nodes would end up in the null namespace as far as the DOM is concerned.
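The two parser modes described above can be tried outside Gecko. The following is a rough sketch using Python's binding to Expat (`xml.parsers.expat`), not Gecko's own parser setup; the helper name `parses` and the sample document are made up for illustration. Passing `namespace_separator` to `ParserCreate` turns on Expat's namespace processing, and in that mode the colon in the entity name is expected to be rejected, as reported in this bug.

```python
from xml.parsers import expat

# Sample document (an assumption for illustration): declares and references
# an internal entity whose name contains a colon.
DOC = b'<!DOCTYPE r [<!ENTITY testing:test "x">]><r>&testing:test;</r>'


def parses(data: bytes, namespaces: bool) -> bool:
    """Return True if Expat parses data without raising an error."""
    # A non-None namespace_separator enables namespace processing in Expat.
    parser = expat.ParserCreate(namespace_separator=" " if namespaces else None)
    try:
        parser.Parse(data, True)
    except expat.ExpatError:
        return False
    return True


print("plain XML 1.0 mode:", parses(DOC, namespaces=False))
print("namespace mode:", parses(DOC, namespaces=True))
```

The exact error reported in namespace mode depends on the Expat version linked into Python, so the sketch only distinguishes success from failure.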
Sounds like we need a migration story, even if it's just "you are broken as of 1.8a6, change your extensions". Is someone here going to help communicate that and smooth over any rough edges in the transition? /be
(In reply to comment #11)
> HJ: your comment says less than it implies. What exactly were the extensions
> broken by the new expat landing? Were they all broken because of colons?
PrefBar: https://bugzilla.mozilla.org/show_bug.cgi?id=272764#c38
Linky: https://bugzilla.mozilla.org/show_bug.cgi?id=272764#c39
BiDi Mail UI: https://bugzilla.mozilla.org/show_bug.cgi?id=272764#c42
QuickNote: https://bugzilla.mozilla.org/show_bug.cgi?id=272764#c45
And yes, these extensions were broken because the extension authors used colons.
(In reply to comment #13)
> Sounds like we need a migration story, even if it's just "you are broken as of
> 1.8a6, change your extensions". Is someone here going to help communicate that
> and smooth over any rough edges in the transition?
>
> /be
FYI: I already notified the project owners on mozdev.org (newsgroup) and I even fixed some of the broken extensions.
The problem is that we used to do the namespace handling inside Gecko, and we clearly didn't implement enough of the "Namespaces in XML" spec. Now that we've switched to Expat for namespace handling we get all those well-formedness checks for free, but that means we start rejecting invalid documents. Bug 274938 is another example of improved well-formedness checking causing trouble. Bug 103255 and bug 274775 are examples of improved well-formedness checks that could cause trouble. I'd rather bite the bullet and have people fix their documents; we can make a note of these changes in the release notes for the next alpha. The good thing is that with the checkin of the fix for bug 274775 we should be in much better shape already, so the tightening of the rules will happen all at once with the release of the next alpha.
What are colons? Why does Mozilla no longer accept entity names like "name:name"? Is there any RFC that forces Mozilla to fail on the ":" character? Why is this bug invalid? Maybe it's just because of my bad English that I don't completely understand what is happening here... Currently I don't know how I could fix PrefBar at all. I have an update system that only checks in new buttons to the user's database file (prefbar.rdf). I can't replace this file, as this would kill all buttons the user has added on his own. If my updater can't touch the user's file any more, I would have to access this file with text functions to fix it via string manipulation... Umpf...
The ":" character is a colon, and entity names containing it are invalid per the namespaces spec. See comment 6. It's just that Mozilla has not enforced that up to now.
(In reply to comment #17)
> What are colons?
In namespace-enabled XML parsers, namespace separators. In non-namespace-enabled XML parsers, just another Name character.
> Why does Mozilla no longer accept entity names like "name:name"?
Because accepting them violates the Namespaces in XML specification. The fact that we used to accept them is a bug, just like accepting XML with mismatched tags would be a bug.
> Is there any RFC that forces Mozilla to fail on the ":" character?
See link in comment 6.
> Why is this bug invalid?
Because the current behavior is correct.
> Currently I don't know how I could fix PrefBar at all.
Don't use entity names that contain the ':' char... It's too bad that we had a bug in entity name parsing to start with, but now content that relied on this bug will have to deal. :(
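For extension authors checking their own files, a small script can list the entity declarations that need renaming. This is only a sketch: the helper name `colon_entity_names` and the sample DTD text are hypothetical, and the pattern covers ASCII entity names only.

```python
import re

# Matches '<!ENTITY name' and '<!ENTITY % name' and captures the name.
# Assumption: ASCII-only names; the full XML Name grammar is wider.
ENTITY_DECL_RE = re.compile(r"<!ENTITY\s+(?:%\s+)?([A-Za-z_:][A-Za-z0-9._:\-]*)")


def colon_entity_names(dtd_text: str) -> list:
    """Return declared entity names that contain a colon, in document order."""
    names = (m.group(1) for m in ENTITY_DECL_RE.finditer(dtd_text))
    return [name for name in names if ":" in name]


# Hypothetical DTD fragment for illustration.
sample = '<!ENTITY prefbar:title "PrefBar">\n<!ENTITY ok.label "OK">'
print(colon_entity_names(sample))
```

Each reported name also has to be renamed wherever it is referenced (for example `&prefbar:title;`), otherwise the reference will no longer resolve.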
The change (disallowing colons in entity names) might be valid, but it broke one of the most popular extensions: PrefBar. This indicates the validity of bug #258881, which requests an enhancement to integrate PrefBar into the Mozilla Suite (and, I would hope, into Firefox). If PrefBar were already part of the Suite, the change regarding colons would have required also changing PrefBar to maintain compatibility.
*** Bug 279222 has been marked as a duplicate of this bug. ***
*** Bug 301271 has been marked as a duplicate of this bug. ***