Closed
Bug 277537
Opened 20 years ago
Closed 20 years ago
isXMLName() should be properly implemented
Categories
(Rhino Graveyard :: E4X, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: martin.honnen, Assigned: igor)
Attachments
(1 file, 1 obsolete file)
3.36 KB, patch
When I try
isXMLName(String.fromCharCode(8364) + '1')
with Rhino 1.6 release 1 2004 11 30, it yields true, but I think the character
with Unicode code point 8364 (the Euro symbol '€') is not allowed as the first
character of an XML name, or indeed anywhere in one.
Reporter
Comment 1•20 years ago
The problem with isXMLName doesn't seem to be restricted to that particular
case. Here are some tests where Rhino yields true in every case, while I think
the result should be false:
Rhino 1.6 release 1 2004 11 30
js> isXMLName(String.fromCharCode(8364) + '1')
true
js> isXMLName('-el')
true
js> isXMLName('1el')
true
So changing summary.
Hmm, I have just looked at the source, and indeed the implementation currently is:
    public boolean isXMLName(Context cx, Object name)
    {
        // TODO: Check if qname.localName() matches NCName
        return true;
    }
So obviously this is a known issue.
Summary: isXMLName(String.fromCharCode(8364) + '1') should give false → isXMLName() should give false for arguments String.fromCharCode(8364) + '1', '-el', isXMLName('1el'), needs to be implemented
Assignee
Comment 2•20 years ago
Changing the title to reflect the real nature of the bug: currently isXMLName()
in Rhino simply returns true.
Summary: isXMLName() should give false for arguments String.fromCharCode(8364) + '1', '-el', isXMLName('1el'), needs to be implemented → isXMLName() should be properly implemented
Assignee
Comment 3•20 years ago
Assignee
Comment 4•20 years ago
Assignee
Updated•20 years ago
Attachment #171078 - Attachment is obsolete: true
Assignee
Comment 5•20 years ago
I committed the fix.
Status: NEW → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED
Assignee
Comment 6•20 years ago
(In reply to comment #0)
> When I try
> isXMLName(String.fromCharCode(8364) + '1')
> with Rhino 1.6 release 1 2004 11 30 it yields true while I think the character
> with Unicode 8364 (it is the Euro symbol '€') is not allowed as the first
> character in an XML name, not even allowed in there at all.
BTW, according to http://www.w3.org/TR/xml11#NT-NameStartChar the characters
within [#x2070-#x218F] are allowed as the first character of an XML name, so €
(8364 or 0x20ac) is allowed.
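For illustration, the XML 1.1 rule referred to here can be sketched in Java roughly as follows. This is only a sketch under stated assumptions, not the code that was committed: the class and method names are hypothetical, and the ranges are the NameStartChar and NameChar productions of XML 1.1 with ':' left out, since isXMLName is meant to test an NCName.

```java
public class Xml11NameSketch {
    // XML 1.1 NameStartChar, minus ':' (hypothetical helper, not Rhino code).
    static boolean isNameStartChar(int c) {
        return (c >= 'A' && c <= 'Z') || c == '_' || (c >= 'a' && c <= 'z')
            || (c >= 0xC0 && c <= 0xD6) || (c >= 0xD8 && c <= 0xF6)
            || (c >= 0xF8 && c <= 0x2FF) || (c >= 0x370 && c <= 0x37D)
            || (c >= 0x37F && c <= 0x1FFF) || (c >= 0x200C && c <= 0x200D)
            || (c >= 0x2070 && c <= 0x218F) || (c >= 0x2C00 && c <= 0x2FEF)
            || (c >= 0x3001 && c <= 0xD7FF) || (c >= 0xF900 && c <= 0xFDCF)
            || (c >= 0xFDF0 && c <= 0xFFFD) || (c >= 0x10000 && c <= 0xEFFFF);
    }

    // XML 1.1 NameChar, minus ':'.
    static boolean isNameChar(int c) {
        return isNameStartChar(c) || c == '-' || c == '.'
            || (c >= '0' && c <= '9') || c == 0xB7
            || (c >= 0x300 && c <= 0x36F) || (c >= 0x203F && c <= 0x2040);
    }

    // True if s is non-empty, starts with a name start character and
    // continues with name characters only; iterates by code point.
    static boolean isNCName(String s) {
        if (s.length() == 0) {
            return false;
        }
        int c = s.codePointAt(0);
        if (!isNameStartChar(c)) {
            return false;
        }
        for (int i = Character.charCount(c); i < s.length(); i += Character.charCount(c)) {
            c = s.codePointAt(i);
            if (!isNameChar(c)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isNCName("\u20AC1")); // Euro sign first: accepted by these XML 1.1 ranges
        System.out.println(isNCName("-el"));     // '-' is not a name start character
        System.out.println(isNCName("1el"));     // a digit is not a name start character
    }
}
```

Under these ranges the Euro sign (0x20AC) falls inside [#x2070-#x218F], which is why the first test case from comment 0 is accepted while '-el' and '1el' are rejected.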
Reporter
Comment 7•20 years ago
(In reply to comment #6)
> according to http://www.w3.org/TR/xml11#NT-NameStartChar the characters
> within [#x2070-#x218F] are allowed as first XML name character so € (8364 or
> 0x20ac) is allowed
However, E4X edition 1 (ECMA-357) refers only to XML 1.0 and Namespaces in XML,
not to XML 1.1, and in XML 1.0 the Euro character '€' is not allowed inside
names.
Have you now implemented isXMLName following the rules of the XML 1.1
specification? That would break compatibility between SpiderMonkey E4X and Rhino
E4X, as I think SpiderMonkey attempts to implement the XML 1.0 rules.
Assignee
Comment 8•20 years ago
(In reply to comment #7)
>
> Have you now implemented isXMLName following the rules of the XML 1.1
> specification? That will break compatibility between Spidermonkey E4X and Rhino
> E4X then as I think Spidermonkey attempts to implement XML 1.0 rules.
You are right: I just followed the XML 1.1 rules, while E4X refers to XML 1.0.
The rules in XML 1.0, http://w3.org/TR/2004/REC-xml-20040204/#NT-Name , are much
more complex than those in XML 1.1, and implementing them directly would lead to
huge bloat.
Note that, AFAICS, it is not possible to use java.lang.Character methods
directly, since in JDK 1.4 they refer to Unicode 3.0 and in JDK 1.5 to Unicode
4.0, while XML 1.0 uses Unicode 2.0. In a sense, following XML 1.1 is much
simpler, but it is not E4X-compliant.
Comment 9•19 years ago
Hi.
While the number of distinct ranges that cover the XML 1.0 name characters is greater than that of XML 1.1 name characters, using a lookup table where each bit represents a character in plane 0 shouldn't be too much of a bloat. For example, see the arrays:
http://svn.apache.org/viewcvs.cgi/xmlgraphics/batik/trunk/sources/org/apache/batik/xml/XMLCharacters.java?rev=216064&view=markup
and the methods to look up the arrays:
http://svn.apache.org/viewcvs.cgi/xmlgraphics/batik/trunk/sources/org/apache/batik/xml/XMLUtilities.java?rev=216064&view=markup
that are used in Batik. You're welcome to use the arrays for Rhino (barring any licence complications).
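To make the lookup-table idea concrete, here is a minimal Java sketch, assuming one bit per plane 0 character packed into an int array. It is not the Batik code: the class and method names are hypothetical, and the ranges listed are only the first few Letter ranges of XML 1.0 plus '_', so a real table would need the complete set from the specification.

```java
public class NameStartTableSketch {
    // One bit per BMP character: 65536 bits = 2048 ints = 8 KB.
    private static final int[] TABLE = new int[0x10000 >>> 5];

    // Deliberately incomplete subset of the XML 1.0 name start ranges
    // (assumption for illustration; the full list is much longer).
    private static final int[][] RANGES = {
        {0x0041, 0x005A}, {0x005F, 0x005F}, {0x0061, 0x007A},
        {0x00C0, 0x00D6}, {0x00D8, 0x00F6}, {0x00F8, 0x00FF},
    };

    static {
        // Set the bit for every character covered by a range.
        for (int[] r : RANGES) {
            for (int c = r[0]; c <= r[1]; c++) {
                TABLE[c >>> 5] |= 1 << (c & 31);
            }
        }
    }

    // Constant-time test: pick the int holding the bit, then mask it.
    static boolean isNameStart(char c) {
        return (TABLE[c >>> 5] & (1 << (c & 31))) != 0;
    }

    public static void main(String[] args) {
        System.out.println(isNameStart('A'));
        System.out.println(isNameStart('\u20AC'));
        System.out.println(isNameStart('1'));
    }
}
```

The table costs 8 KB per character class regardless of how many ranges the specification lists, so the larger number of XML 1.0 ranges affects only the data that fills the table, not the lookup code.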