Closed Bug 277537 Opened 21 years ago Closed 21 years ago

isXMLName() should be properly implemented

Tracking

(Not tracked)

Status:

RESOLVED FIXED

People

(Reporter: martin.honnen, Assigned: igor)

References

Details

Attachments

(1 file, 1 obsolete file)

Fix: just follow E4X 13.1.2.1 and http://w3.org/TR/xml-names11/#NT-NCName 21 years ago Igor Bukanov 3.43 KB, patch
Patch change to work around jikes compiler bug 21 years ago Igor Bukanov 3.36 KB, patch

Martin Honnen

Reporter

Description

•

21 years ago

When I try isXMLName(String.fromCharCode(8364) + '1') with Rhino 1.6 release 1 2004 11 30 it yields true while I think the character with Unicode 8364 (it is the Euro symbol '€') is not allowed as the first character in an XML name, not even allowed in there at all.

Martin Honnen

Reporter

Comment 1

•

21 years ago

The problem with isXMLName doesn't seem to be restricted to that particular case. Here are some tests where Rhino yields true every time while the result should be false, I think:

Rhino 1.6 release 1 2004 11 30
js> isXMLName(String.fromCharCode(8364) + '1')
true
js> isXMLName('-el')
true
js> isXMLName('1el')

So changing summary.

Hmm, I have just looked at the source and indeed the implementation currently is

public boolean isXMLName(Context cx, Object name) {
    // TODO: Check if qname.localName() matches NCName
    return true;
}

so obviously this is a known issue.

Summary: isXMLName(String.fromCharCode(8364) + '1') should give false → isXMLName() should give false for arguments String.fromCharCode(8364) + '1', '-el', isXMLName('1el'), needs to be implemented

Igor Bukanov

Assignee

Comment 2

•

21 years ago

Changing the title to reflect the real nature of the bug: currently isXMLName() in Rhino simply returns true.

Summary: isXMLName() should give false for arguments String.fromCharCode(8364) + '1', '-el', isXMLName('1el'), needs to be implemented → isXMLName() should be properly implemented

Igor Bukanov

Assignee

Updated

•

21 years ago

Blocks: 270779

Igor Bukanov

Assignee

Comment 3

•

21 years ago

Attached patch Fix: just follow E4X 13.1.2.1 and http://w3.org/TR/xml-names11/#NT-NCName (obsolete)

Igor Bukanov

Assignee

Comment 4

•

21 years ago

Attached patch Patch change to work around jikes compiler bug

Igor Bukanov

Assignee

Updated

•

21 years ago

Attachment #171078 - Attachment is obsolete: true

Igor Bukanov

Assignee

Comment 5

•

21 years ago

I committed the fix

Status: NEW → RESOLVED

Closed: 21 years ago

Resolution: --- → FIXED

Igor Bukanov

Assignee

Comment 6

•

21 years ago

(In reply to comment #0)
> When I try
> isXMLName(String.fromCharCode(8364) + '1')
> with Rhino 1.6 release 1 2004 11 30 it yields true while I think the character
> with Unicode 8364 (it is the Euro symbol '€') is not allowed as the first
> character in an XML name, not even allowed in there at all.

BTW, according to http://www.w3.org/TR/xml11#NT-NameStartChar the characters within [#x2070-#x218F] are allowed as first XML name character so € (8364 or 0x20ac) is allowed.
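The XML 1.1 rule referred to above can be sketched in Java as follows. This is only an illustration of the NameStartChar and NameChar productions from the XML 1.1 specification with the colon excluded, as NCName requires; it is not the attached patch. The class and method names (Xml11NameSketch, isNCNameStart, isNCNameChar, isNCName) are hypothetical.

```java
// Hypothetical sketch, not the committed Rhino code.
final class Xml11NameSketch {

    // XML 1.1 NameStartChar, minus ':' because an NCName may not contain a colon.
    static boolean isNCNameStart(int cp) {
        return (cp >= 'A' && cp <= 'Z') || cp == '_' || (cp >= 'a' && cp <= 'z')
            || (cp >= 0xC0 && cp <= 0xD6) || (cp >= 0xD8 && cp <= 0xF6)
            || (cp >= 0xF8 && cp <= 0x2FF) || (cp >= 0x370 && cp <= 0x37D)
            || (cp >= 0x37F && cp <= 0x1FFF) || (cp >= 0x200C && cp <= 0x200D)
            || (cp >= 0x2070 && cp <= 0x218F) || (cp >= 0x2C00 && cp <= 0x2FEF)
            || (cp >= 0x3001 && cp <= 0xD7FF) || (cp >= 0xF900 && cp <= 0xFDCF)
            || (cp >= 0xFDF0 && cp <= 0xFFFD) || (cp >= 0x10000 && cp <= 0xEFFFF);
    }

    // XML 1.1 NameChar, again without ':'.
    static boolean isNCNameChar(int cp) {
        return isNCNameStart(cp) || cp == '-' || cp == '.'
            || (cp >= '0' && cp <= '9') || cp == 0xB7
            || (cp >= 0x300 && cp <= 0x36F) || (cp >= 0x203F && cp <= 0x2040);
    }

    // True if s is a non-empty sequence of one start character
    // followed by zero or more name characters.
    static boolean isNCName(String s) {
        if (s.isEmpty()) {
            return false;
        }
        int cp = s.codePointAt(0);
        if (!isNCNameStart(cp)) {
            return false;
        }
        for (int i = Character.charCount(cp); i < s.length(); i += Character.charCount(cp)) {
            cp = s.codePointAt(i);
            if (!isNCNameChar(cp)) {
                return false;
            }
        }
        return true;
    }
}
```

Under these rules '-el' and '1el' are rejected, while the Euro case is accepted because 0x20AC falls inside [#x2070-#x218F].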

Martin Honnen

Reporter

Comment 7

•

21 years ago

(In reply to comment #6)
> according to http://www.w3.org/TR/xml11#NT-NameStartChar the characters
> within [#x2070-#x218F] are allowed as first XML name character so € (8364 or
> 0x20ac) is allowed

However, E4X edition 1 (ECMA-357) refers only to XML 1.0 and Namespaces for XML, not to XML 1.1, and in XML 1.0 the Euro character '€' is not allowed inside names. Have you now implemented isXMLName following the rules of the XML 1.1 specification? That will break compatibility between Spidermonkey E4X and Rhino E4X then as I think Spidermonkey attempts to implement XML 1.0 rules.

Igor Bukanov

Assignee

Comment 8

•

21 years ago

(In reply to comment #7)
> Have you now implemented isXMLName following the rules of the XML 1.1
> specification? That will break compatibility between Spidermonkey E4X and Rhino
> E4X then as I think Spidermonkey attempts to implement XML 1.0 rules.

You are right, I just followed the XML 1.1 rules while E4X refers to XML 1.0. Now the rules in XML 1.0, http://w3.org/TR/2004/REC-xml-20040204/#NT-Name , are much more complex than in XML 1.1, and implementing them directly would lead to huge bloat. Note that it is not possible AFAICS to use java.lang.Character methods directly, since in JDK 1.4 they refer to Unicode 3.0 and in JDK 1.5 to Unicode 4.0, while XML 1.0 uses Unicode 2.0. In a sense, following XML 1.1 is much simpler but not E4X-compliant.

Cameron McCormack (:heycam)

Comment 9

•

19 years ago

Hi. While the number of distinct ranges that cover the XML 1.0 name characters is greater than that of XML 1.1 name characters, using a lookup table where each bit represents a character in plane 0 shouldn't be too much of a bloat.

For example, see the arrays:
http://svn.apache.org/viewcvs.cgi/xmlgraphics/batik/trunk/sources/org/apache/batik/xml/XMLCharacters.java?rev=216064&view=markup

and the methods to look up the arrays:
http://svn.apache.org/viewcvs.cgi/xmlgraphics/batik/trunk/sources/org/apache/batik/xml/XMLUtilities.java?rev=216064&view=markup

that are used in Batik. You're welcome to use the arrays for Rhino (barring any licence complications).
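A minimal sketch of the bit-per-character lookup idea, in Java. It is not the Batik code and does not reproduce its arrays: the class name NameCharTableSketch and the method isNameStart are hypothetical, and the range list is deliberately a tiny, incomplete subset (ASCII letters, underscore, and the first Latin-1 letter ranges of the XML 1.0 BaseChar production). A real table would have to be filled from the full XML 1.0 Letter production.

```java
// Hypothetical sketch of a plane 0 bit table; the ranges are NOT complete.
final class NameCharTableSketch {

    // One bit per UTF-16 code unit: 65536 bits packed into 2048 ints (8 KiB).
    private static final int[] NAME_START = new int[65536 / 32];

    static {
        // Illustrative subset only, colon omitted as for an NCName.
        int[][] ranges = {
            {0x0041, 0x005A}, // A-Z
            {0x005F, 0x005F}, // underscore
            {0x0061, 0x007A}, // a-z
            {0x00C0, 0x00D6},
            {0x00D8, 0x00F6},
            {0x00F8, 0x00FF},
        };
        for (int[] r : ranges) {
            for (int c = r[0]; c <= r[1]; c++) {
                // Word index is c / 32, bit index is c % 32.
                NAME_START[c >>> 5] |= 1 << (c & 31);
            }
        }
    }

    // Constant-time membership test: one array read, one mask.
    static boolean isNameStart(char c) {
        return (NAME_START[c >>> 5] & (1 << (c & 31))) != 0;
    }
}
```

With one such table for name start characters and a second one for the remaining name characters, the memory cost stays at 8 KiB per table no matter how many distinct ranges the production lists.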


Categories

(Rhino Graveyard :: E4X, defect)
