501837 - Liberalize XML Names and VersionNum to reflect latest XML 1.0 edition (5th)

Reporter

Description

16 years ago

User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.1) Gecko/20090624 Firefox/3.5
Build Identifier:

The fifth edition of XML, at http://www.w3.org/TR/2008/REC-xml-20081126/ , incorporates errata for edition 4. Among these errata is the significant change to be more permissive for XML Names (i.e., retroactively allowing in XML 1.0 those XML names which XML 1.1 first liberalized): http://www.w3.org/XML/xml-V10-4e-errata#E09

Presently, XML parsing in Firefox does not allow XML names to use these now permitted characters in XML element or attribute names, etc. This is probably the most important reason people wanted to use XML 1.1, so with the latest errata adjustment, it is no longer as necessary, assuming this aspect can be fixed.

(For the record, I personally don't have any compelling need for this, and I am aware that XML as implemented currently allows a full range of XML characters in the character data of an XML document, but I wonder whether this could expand the XML options for others, allowing XML to work as the fully universal semantic and exchange format it was designed to be, especially if it would only require change to a small portion of code where an XML Name was defined.)

Reproducible: Always

Steps to Reproduce:
1. Use a character like 'ϴ' (U+03F4) in an element name

Actual Results: Gives "XML Parsing Error: not well-formed"

Expected Results: Allow the element and document
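For reference, the 5th-edition Name rules can be sketched as a small range check. This is an illustrative Python sketch, not Gecko or expat code: the ranges are transcribed from the NameStartChar and NameChar productions of the 5th edition, and the function names (`is_name_5e` and friends) are made up for this example. It only tests 5th-edition validity; it does not model the larger 4th-edition Letter/Digit/Extender tables.

```python
# Inclusive code point ranges from XML 1.0 (5th edition),
# productions [4] NameStartChar and [4a] NameChar.
NAME_START_RANGES = [
    (0x3A, 0x3A),            # ":"
    (0x41, 0x5A),            # A-Z
    (0x5F, 0x5F),            # "_"
    (0x61, 0x7A),            # a-z
    (0xC0, 0xD6), (0xD8, 0xF6), (0xF8, 0x2FF),
    (0x370, 0x37D), (0x37F, 0x1FFF),
    (0x200C, 0x200D), (0x2070, 0x218F),
    (0x2C00, 0x2FEF), (0x3001, 0xD7FF),
    (0xF900, 0xFDCF), (0xFDF0, 0xFFFD),
    (0x10000, 0xEFFFF),
]

# Characters allowed after the first position, in addition to the above.
NAME_EXTRA_RANGES = [
    (0x2D, 0x2E),            # "-" and "."
    (0x30, 0x39),            # 0-9
    (0xB7, 0xB7),            # middle dot
    (0x300, 0x36F),
    (0x203F, 0x2040),
]


def _in_ranges(ch, ranges):
    """Return True if the code point of ch falls in any inclusive range."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in ranges)


def is_name_start_char_5e(ch):
    return _in_ranges(ch, NAME_START_RANGES)


def is_name_char_5e(ch):
    return is_name_start_char_5e(ch) or _in_ranges(ch, NAME_EXTRA_RANGES)


def is_name_5e(s):
    """Check a string against the 5th-edition Name production."""
    if not s:
        return False
    return is_name_start_char_5e(s[0]) and all(is_name_char_5e(c) for c in s[1:])


# Characters mentioned in this bug.
print(is_name_5e("\u03f4"))  # True: U+03F4 lies in [#x37F-#x1FFF]
print(is_name_5e("\u00b5"))  # False: U+00B5 is in neither production
print(is_name_5e("\uab57"))  # True: U+AB57 lies in [#x3001-#xD7FF]
```

A parser that only knows the pre-5th-edition tables would need the equivalent of this check swapped in wherever it classifies name characters.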

Brett Zamir

Reporter

Updated

16 years ago

Component: General → XML

Paula Robinson

Comment 1

16 years ago

Please consider this bug confirmed. It was among 200+ bugs planted in my computer.

Phil Ringnalda (:philor)

Updated

16 years ago

QA Contact: general → xml

Henri Sivonen (:hsivonen) (on vacation)

Comment 3

13 years ago

After XML 1.1 flopped, the W3C changed the definition of 1.0 to fold some changes from XML 1.1 while pretending that the result is still XML 1.0. Of course, such changes are a really bad idea in a format that has Draconian error handling.

As far as I can tell, upstream expat doesn’t implement XML 1.0 5th edition. See http://blog.jclark.com/2008/10/xml-10-5th-edition.html for the thoughts of the developer of expat.

I think we shouldn’t be in any hurry to change our copy of expat here. Files that Gecko rejects are also rejected by a decade of other software written according to pre-5th ed. XML 1.0, so anyone who is serious about XML interoperability cannot use 5th edition Names that are not also earlier edition Names.

I’m inclined to suggest WONTFIXing this on the grounds that the W3C made a big mistake by changing a Draconian format in an “edition” and if we ever make the XML parser in Gecko more permissive, we should just go ahead and go all the way to XML-ER (aka. XML5).

Status: UNCONFIRMED → NEW

Ever confirmed: true

Anne (:annevk)

Comment 5

12 years ago

It appears Chrome supports this. The other notable change is that <?xml version="1.x"?> with x being one or more code points in the range 0-9 is no longer an error.

Severity: normal → minor

OS: Windows Vista → All

Hardware: x86 → All
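The VersionNum change described in comment 5 can be illustrated with a short Python sketch. The 4th edition defines VersionNum as the literal '1.0', while the 5th edition defines it as '1.' followed by one or more ASCII digits. The function names below are hypothetical and exist only for this illustration.

```python
import re

# 5th edition: VersionNum ::= '1.' [0-9]+
# [0-9] is used rather than \d because \d also matches non-ASCII digits.
VERSION_NUM_5E = re.compile(r"1\.[0-9]+")


def is_version_num_4e(value):
    """4th edition: VersionNum ::= '1.0' (a single literal)."""
    return value == "1.0"


def is_version_num_5e(value):
    """5th edition: any '1.x' where x is one or more ASCII digits."""
    return VERSION_NUM_5E.fullmatch(value) is not None


for candidate in ["1.0", "1.2", "1.10", "1.", "2.0"]:
    print(candidate, is_version_num_4e(candidate), is_version_num_5e(candidate))
```

Under this sketch, "1.2" is rejected by the 4th-edition check and accepted by the 5th-edition one, which is the difference a version="1.2" declaration exercises.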

Henri Sivonen (:hsivonen) (on vacation)

Comment 6

12 years ago

(In reply to Anne (:annevk) from comment #5)
> It appears Chrome supports this.

Test case? Link to their rationale for supporting this?

Anne (:annevk)

Comment 7

12 years ago

data:text/xml,<?xml version="1.2"?><x/>

libxml implements the 5th edition and I suspect Chrome uses that library without much scrutiny.

Anne (:annevk)

Comment 8

11 years ago

Note that our current parser has bugs here, e.g. U+00B5 as an element name (data:text/xml,<%C2%B5/>) is not allowed per either the 4th or 5th edition.

Masatoshi Kimura [:emk]

Updated

9 years ago

Summary: Liberalize XML Names to reflect latest XML 1.0 edition (5th) → Liberalize XML Names and VersionNum to reflect latest XML 1.0 edition (5th)

cscott

Comment 11

7 years ago

This has visible effects in the implementation of Document#createElementNS (and probably other places). See https://github.com/web-platform-tests/wpt/pull/12202#issuecomment-411650590

Gô Shoemake

Comment 12

6 years ago

This also appears (??) to have an impact on data-* attributes; for example document.body.dataset["\uAB57"] (which matches the Name production in XML 5th but not 4th) fails in Firefox (but not in Chrome, and should be allowed per the spec).

This will affect anyone trying to create dataset properties in languages which are encoded in Unicode ranges outside those allowed by XML fourth ed.

BMO Automation

Updated

3 years ago

Severity: minor → S4

Categories

(Core :: XML, defect)

People

(Reporter: brettz9, Unassigned)

Security

(public)
