Closed Bug 277537 Opened 20 years ago Closed 20 years ago

isXMLName() should be properly implemented

Categories

(Rhino Graveyard :: E4X, defect)

1.6R1
x86
Windows XP
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: martin.honnen, Assigned: igor)

Attachments

(1 file, 1 obsolete file)

Calling isXMLName(String.fromCharCode(8364) + '1') in Rhino 1.6 release 1 2004 11 30 yields true. I think the character with Unicode code point 8364 (the Euro symbol '€') is not allowed as the first character of an XML name, nor anywhere else in one.
The problem with isXMLName doesn't seem to be restricted to that particular case. Here are some tests where Rhino yields true in every case, while I think the result should be false:

    Rhino 1.6 release 1 2004 11 30
    js> isXMLName(String.fromCharCode(8364) + '1')
    true
    js> isXMLName('-el')
    true
    js> isXMLName('1el')

So changing summary.

Hmm, I have just looked at the source, and indeed the implementation currently is

    public boolean isXMLName(Context cx, Object name) {
        // TODO: Check if qname.localName() matches NCName
        return true;
    }

so obviously this is a known issue.
Summary: isXMLName(String.fromCharCode(8364) + '1') should give false → isXMLName() should give false for arguments String.fromCharCode(8364) + '1', '-el', isXMLName('1el'), needs to be implemented
Changing the title to reflect the real nature of the bug: currently isXMLName() in Rhino simply returns true.
Summary: isXMLName() should give false for arguments String.fromCharCode(8364) + '1', '-el', isXMLName('1el'), needs to be implemented → isXMLName() should be properly implemented
Blocks: 270779
Attachment #171078 - Attachment is obsolete: true
I committed the fix
Status: NEW → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED
(In reply to comment #0)
> When I try isXMLName(String.fromCharCode(8364) + '1') with Rhino 1.6 release 1 2004 11 30 it yields true while I think the character with Unicode 8364 (it is the Euro symbol '€') is not allowed as the first character in an XML name, not even allowed in there at all.

BTW, according to http://www.w3.org/TR/xml11#NT-NameStartChar the characters within [#x2070-#x218F] are allowed as first XML name character, so € (8364 or 0x20ac) is allowed.
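To make the range argument concrete, here is a minimal Java sketch of a check against the XML 1.1 NameStartChar production. The class and method names are hypothetical and this is not the code that was committed to Rhino; the ranges are copied from the XML 1.1 recommendation. Note that the production also admits ':', which an NCName check such as isXMLName would have to exclude.

```java
// Sketch only: Xml11NameStart and isNameStartChar are hypothetical names,
// not part of Rhino.
final class Xml11NameStart {
    // Inclusive code point ranges of the XML 1.1 NameStartChar production.
    private static final int[][] RANGES = {
        {':', ':'}, {'A', 'Z'}, {'_', '_'}, {'a', 'z'},
        {0xC0, 0xD6}, {0xD8, 0xF6}, {0xF8, 0x2FF},
        {0x370, 0x37D}, {0x37F, 0x1FFF}, {0x200C, 0x200D},
        {0x2070, 0x218F}, {0x2C00, 0x2FEF}, {0x3001, 0xD7FF},
        {0xF900, 0xFDCF}, {0xFDF0, 0xFFFD}, {0x10000, 0xEFFFF},
    };

    // Returns true if the code point may start an XML 1.1 Name.
    static boolean isNameStartChar(int codePoint) {
        for (int[] r : RANGES) {
            if (codePoint >= r[0] && codePoint <= r[1]) {
                return true;
            }
        }
        return false;
    }
}
```

Under these ranges, code point 0x20AC falls inside [#x2070-#x218F] and is accepted, while '-' and '1' are rejected as first characters.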
(In reply to comment #6)
> according to http://www.w3.org/TR/xml11#NT-NameStartChar the characters within [#x2070-#x218F] are allowed as first XML name character so € (8364 or 0x20ac) is allowed

However, E4X edition 1 (ECMA-357) refers only to XML 1.0 and Namespaces for XML, not to XML 1.1, and in XML 1.0 the Euro character '€' is not allowed inside names. Have you now implemented isXMLName following the rules of the XML 1.1 specification? That will break compatibility between Spidermonkey E4X and Rhino E4X, as I think Spidermonkey attempts to implement the XML 1.0 rules.
(In reply to comment #7)
> Have you now implemented isXMLName following the rules of the XML 1.1 specification? That will break compatibility between Spidermonkey E4X and Rhino E4X then as I think Spidermonkey attempts to implement XML 1.0 rules.

You are right, I just followed the XML 1.1 rules while E4X refers to XML 1.0. The rules in XML 1.0, http://w3.org/TR/2004/REC-xml-20040204/#NT-Name , are much more complex than in XML 1.1, and implementing them directly would lead to a huge bloat. Note that it is not possible AFAICS to use java.lang.Character methods directly, since in JDK 1.4 they refer to Unicode 3.0 and in JDK 1.5 to Unicode 4.0, while XML 1.0 uses Unicode 2.0. In a sense following XML 1.1 is much simpler but not E4X-compliant.
Hi. While the number of distinct ranges that cover the XML 1.0 name characters is greater than that of XML 1.1 name characters, using a lookup table where each bit represents a character in plane 0 shouldn't be too much of a bloat. For example, see the arrays: http://svn.apache.org/viewcvs.cgi/xmlgraphics/batik/trunk/sources/org/apache/batik/xml/XMLCharacters.java?rev=216064&view=markup and the methods to look up the arrays: http://svn.apache.org/viewcvs.cgi/xmlgraphics/batik/trunk/sources/org/apache/batik/xml/XMLUtilities.java?rev=216064&view=markup that are used in Batik. You're welcome to use the arrays for Rhino (barring any licence complications).
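The bit-per-character idea can be sketched as follows. This is an illustration under stated assumptions, not the Batik arrays or the Rhino fix: the class name Plane0BitTable and the DEMO_NAME_START table are hypothetical, and the demo ranges are a deliberately incomplete placeholder (ASCII letters and underscore only) rather than the full XML 1.0 name-start set.

```java
// Sketch only: all names here are hypothetical.
final class Plane0BitTable {
    // One bit per plane 0 character: 65536 bits = 2048 ints = 8 KB.
    private final int[] bits = new int[0x10000 >>> 5];

    // Builds the table from inclusive {low, high} ranges within plane 0.
    Plane0BitTable(int[][] inclusiveRanges) {
        for (int[] r : inclusiveRanges) {
            for (int c = r[0]; c <= r[1]; c++) {
                bits[c >>> 5] |= 1 << (c & 31);
            }
        }
    }

    // Constant-time membership test: pick the word, then test the bit.
    boolean contains(char c) {
        return (bits[c >>> 5] & (1 << (c & 31))) != 0;
    }

    // Placeholder ranges for demonstration; a real table would list every
    // range of the relevant XML 1.0 productions.
    static final Plane0BitTable DEMO_NAME_START = new Plane0BitTable(new int[][] {
        {'A', 'Z'}, {'a', 'z'}, {'_', '_'},
    });
}
```

The table costs a fixed 8 KB per character class no matter how many ranges the grammar lists, which is why a larger number of ranges in XML 1.0 does not translate into larger lookup code.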