Closed Bug 277537 Opened 20 years ago Closed 20 years ago

isXMLName() should be properly implemented

Product: Rhino Graveyard :: E4X
Type: defect
Version: 1.6R1
Platform: x86, Windows XP
Priority: Not set
Severity: normal
Tracking: Not tracked
Status: RESOLVED FIXED
Reporter: martin.honnen
Assigned: igor
Attachments: 1 file, 1 obsolete file

When I try
  isXMLName(String.fromCharCode(8364) + '1')
with Rhino 1.6 release 1 2004 11 30 it yields true while I think the character
with Unicode 8364 (it is the Euro symbol '€') is not allowed as the first
character in an XML name, not even allowed in there at all.
The problem with isXMLName doesn't seem to be restricted to that particular
case. Here are some tests where Rhino yields true every time, while I think the
result should be false:

Rhino 1.6 release 1 2004 11 30
js> isXMLName(String.fromCharCode(8364) + '1')
true
js> isXMLName('-el')
true
js> isXMLName('1el')
true

So changing summary.

Hmm, I have just looked at the source and indeed the implementation currently is

    public boolean isXMLName(Context cx, Object name)
    {
        // TODO: Check if qname.localName() matches NCName

        return true;
    }

so obviously this is a known issue.
Summary: isXMLName(String.fromCharCode(8364) + '1') should give false → isXMLName() should give false for arguments String.fromCharCode(8364) + '1', '-el', isXMLName('1el'), needs to be implemented
Changing the title to reflect the real nature of the bug: currently isXMLName()
in Rhino simply returns true.

Summary: isXMLName() should give false for arguments String.fromCharCode(8364) + '1', '-el', isXMLName('1el'), needs to be implemented → isXMLName() should be properly implemented
Blocks: 270779
Attachment #171078 - Attachment is obsolete: true
I committed the fix
Status: NEW → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED
(In reply to comment #0)
> When I try
>   isXMLName(String.fromCharCode(8364) + '1')
> with Rhino 1.6 release 1 2004 11 30 it yields true while I think the character
> with Unicode 8364 (it is the Euro symbol '€') is not allowed as the first
> character in an XML name, not even allowed in there at all.

BTW, according to http://www.w3.org/TR/xml11#NT-NameStartChar the characters
within [#x2070-#x218F] are allowed as first XML name character so € (8364 or
0x20ac) is allowed.
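
To make the XML 1.1 reading concrete, here is a minimal Java sketch of an NCName check built from the XML 1.1 NameStartChar and NameChar productions, with ':' left out because an NCName may not contain a colon. The class name Xml11NameSketch and its methods are made up for illustration; this is not the code committed to Rhino.

```java
// Sketch only: NCName test using the XML 1.1 productions.
// Class and method names are hypothetical, not part of Rhino.
final class Xml11NameSketch {
    // XML 1.1 NameStartChar, minus ':' (not allowed in an NCName).
    private static final int[][] START_RANGES = {
        {'A', 'Z'}, {'_', '_'}, {'a', 'z'},
        {0xC0, 0xD6}, {0xD8, 0xF6}, {0xF8, 0x2FF},
        {0x370, 0x37D}, {0x37F, 0x1FFF}, {0x200C, 0x200D},
        {0x2070, 0x218F}, {0x2C00, 0x2FEF}, {0x3001, 0xD7FF},
        {0xF900, 0xFDCF}, {0xFDF0, 0xFFFD}, {0x10000, 0xEFFFF}
    };

    // Characters that XML 1.1 NameChar allows in addition to NameStartChar.
    private static final int[][] EXTRA_RANGES = {
        {'-', '-'}, {'.', '.'}, {'0', '9'}, {0xB7, 0xB7},
        {0x300, 0x36F}, {0x203F, 0x2040}
    };

    private static boolean inRanges(int cp, int[][] ranges) {
        for (int[] r : ranges) {
            if (cp >= r[0] && cp <= r[1]) {
                return true;
            }
        }
        return false;
    }

    // True if s is a non-empty NCName under the XML 1.1 rules.
    static boolean isNCName(String s) {
        if (s.length() == 0) {
            return false;
        }
        int i = 0;
        while (i < s.length()) {
            int cp = s.codePointAt(i);
            // The first code point must be a start character; later ones
            // may also come from the extra NameChar ranges.
            boolean ok = inRanges(cp, START_RANGES)
                    || (i > 0 && inRanges(cp, EXTRA_RANGES));
            if (!ok) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }
}
```

Under these rules Xml11NameSketch.isNCName("\u20AC1") is true because 0x20AC lies inside [#x2070-#x218F], while "-el" and "1el" are rejected since '-' and '1' are not start characters.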
(In reply to comment #6)
 
> according to http://www.w3.org/TR/xml11#NT-NameStartChar the characters
> within [#x2070-#x218F] are allowed as first XML name character so € (8364 or
> 0x20ac) is allowed

However, E4X edition 1 (ECMA-357) refers only to XML 1.0 and Namespaces in XML,
not to XML 1.1, and in XML 1.0 the Euro character '€' is not allowed inside
names.

Have you now implemented isXMLName following the rules of the XML 1.1
specification? That will break compatibility between Spidermonkey E4X and Rhino
E4X then as I think Spidermonkey attempts to implement XML 1.0 rules.
(In reply to comment #7)
> 
> Have you now implemented isXMLName following the rules of the XML 1.1
> specification? That will break compatibility between Spidermonkey E4X and Rhino
> E4X then as I think Spidermonkey attempts to implement XML 1.0 rules.

You are right, I just followed the XML 1.1 rules while E4X refers to XML 1.0. Now
the rules in XML 1.0, http://w3.org/TR/2004/REC-xml-20040204/#NT-Name , are much
more complex than in XML 1.1 and implementing them directly would lead to a huge
bloat.

Note that it is not possible AFAICS to use java.lang.Character methods directly
since in JDK 1.4 they refer to Unicode 3.0, in JDK 1.5 they refer to Unicode 4.0
while XML 1.0 uses Unicode 2.0. In a sense following XML 1.1 is much simpler but
not E4X-compliant.
Hi.

While the number of distinct ranges that cover the XML 1.0 name characters is greater than that of XML 1.1 name characters, using a lookup table where each bit represents a character in plane 0 shouldn't be too much of a bloat.  For example, see the arrays:

http://svn.apache.org/viewcvs.cgi/xmlgraphics/batik/trunk/sources/org/apache/batik/xml/XMLCharacters.java?rev=216064&view=markup

and the methods to look up the arrays:

http://svn.apache.org/viewcvs.cgi/xmlgraphics/batik/trunk/sources/org/apache/batik/xml/XMLUtilities.java?rev=216064&view=markup

that are used in Batik.  You're welcome to use the arrays for Rhino (barring any licence complications).
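
As a rough illustration of the lookup-table idea (not the Batik code itself), the Java sketch below stores one bit per plane 0 character in an int array, which comes to 65,536 bits or 8 KB per table. The class name NameCharBitmapSketch is hypothetical, and SAMPLE_START holds only '_' plus the first few ranges of the XML 1.0 BaseChar production, so a real table would need the complete Letter, Digit, CombiningChar and Extender ranges from the XML 1.0 specification.

```java
// Sketch only: one bit per BMP character, as suggested above.
// Class name and the sample range table are assumptions for illustration.
final class NameCharBitmapSketch {
    // 0x10000 bits packed 32 per int gives 2048 ints (8 KB).
    private final int[] bits = new int[0x10000 >>> 5];

    // Each entry of ranges is an inclusive {first, last} pair within plane 0.
    NameCharBitmapSketch(int[][] ranges) {
        for (int[] r : ranges) {
            for (int c = r[0]; c <= r[1]; c++) {
                bits[c >>> 5] |= 1 << (c & 31);
            }
        }
    }

    // Constant-time membership test: pick the int, then test the bit.
    boolean contains(char c) {
        return (bits[c >>> 5] & (1 << (c & 31))) != 0;
    }

    // INCOMPLETE sample: '_' plus the opening ranges of XML 1.0 BaseChar.
    static final int[][] SAMPLE_START = {
        {'A', 'Z'}, {'_', '_'}, {'a', 'z'},
        {0xC0, 0xD6}, {0xD8, 0xF6}, {0xF8, 0xFF}
    };
}
```

With a table built as new NameCharBitmapSketch(NameCharBitmapSketch.SAMPLE_START), contains('a') is true and contains('\u20AC') is false. The lookup cost is the same however many ranges the full XML 1.0 productions contribute, so the complexity of the rules shows up only in the data, not in the checking code.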