Closed Bug 134963 Opened 23 years ago Closed 23 years ago

XML element name in gb18030 surrogate characters doesn't work

Categories

(Core :: Internationalization, defect)

x86
Windows XP
defect
Not set
normal

VERIFIED INVALID

People

(Reporter: amyy, Assigned: shanjian)

Details

(Keywords: intl)

Attachments

(3 files)

480 bytes, application/vnd.mozilla.xul+xml
1.03 KB, application/vnd.mozilla.xul+xml
656 bytes, application/vnd.mozilla.xul+xml
Build: 04-02 trunk build on WinXP-SimpChinese. An XML element name containing gb18030 surrogate characters doesn't work, although gb18030 surrogate characters do work as the content of elements with regular names.
Attached file: test page that doesn't work
Attached file: a working test case
This test page works with gb18030 surrogate characters in content but not in the element declaration.
->shanjian
Assignee: yokoyama → shanjian
Summary: XML element name in gb18030 surrogate character doesn't work → XML element name in gb18030 surrogate character doesn't work
Keywords: intl
QA Contact: ruixu → ylong
Can you create test cases with the surrogate characters in UTF-8?
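A UTF-8 counterpart of a gb18030 test case could be generated along these lines. This is only a sketch: the attachments are not reproduced in this report, so U+20000 (a CJK Extension B character) is a hypothetical stand-in for the characters they use, and `build_doc` is an illustrative helper, not part of any Mozilla tooling.

```python
# Hypothetical stand-in character: U+20000 (CJK Extension B), outside the BMP.
CH = "\U00020000"


def build_doc(encoding: str) -> bytes:
    """Return a tiny XML document whose element name is CH, in the given encoding."""
    text = (
        f'<?xml version="1.0" encoding="{encoding}"?>\n'
        f"<{CH}>text</{CH}>\n"
    )
    return text.encode(encoding)


gb_doc = build_doc("gb18030")
utf8_doc = build_doc("utf-8")

# A supplementary-plane character takes four bytes in gb18030 and in UTF-8,
# and a surrogate pair (two 16-bit code units) in UTF-16.
print(len(CH.encode("gb18030")), len(CH.encode("utf-8")))
print(CH.encode("utf-16-be").hex())
```

Both documents carry the same element name; only the byte encoding differs, which helps separate a converter problem from a parser problem.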
OK, here is what happens in XML 1.0. See http://www.w3.org/TR/2000/REC-xml-20001006
A well-formed XML document is defined as follows (http://www.w3.org/TR/2000/REC-xml-20001006#sec-well-formed):
>[1] document ::= prolog element Misc*
If you look at the definition of element (http://www.w3.org/TR/2000/REC-xml-20001006#NT-element):
>[39] element ::= EmptyElemTag | STag content ETag
and at the definition of STag (http://www.w3.org/TR/2000/REC-xml-20001006#NT-STag):
>[40] STag ::= '<' Name (S Attribute)* S? '>' [WFC: Unique Att Spec]
and at the definition of Name (http://www.w3.org/TR/2000/REC-xml-20001006#NT-Name):
>[5] Name ::= (Letter | '_' | ':') (NameChar)*
and at NameChar (http://www.w3.org/TR/2000/REC-xml-20001006#NT-NameChar):
>[4] NameChar ::= Letter | Digit | '.' | '-' | '_' | ':' | CombiningChar | Extender
and at Letter (http://www.w3.org/TR/2000/REC-xml-20001006#NT-Letter):
>[84] Letter ::= BaseChar | Ideographic
you will see that neither Unicode ideograph Extension A (U+3400-U+4DBF) nor Extension B (encoded as surrogates) is listed under BaseChar or Ideographic. Therefore, those characters cannot be used in a Name in XML 1.0. We should talk to the XML authors about this issue, and maybe they will change it in a later version of XML, but until then this is an invalid bug.
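The range argument above can be checked mechanically. The sketch below covers only production [86] Ideographic from the same recommendation; BaseChar is left out because its list is long, so a False result here does not by itself prove a character is outside Letter. The function name and sample characters are illustrative choices.

```python
# Ranges from production [86] Ideographic in XML 1.0 (Second Edition):
#   Ideographic ::= [#x4E00-#x9FA5] | #x3007 | [#x3021-#x3029]
IDEOGRAPHIC_RANGES = [(0x4E00, 0x9FA5), (0x3007, 0x3007), (0x3021, 0x3029)]


def is_xml10_ideographic(ch: str) -> bool:
    """Return True if the single character ch matches production [86]."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in IDEOGRAPHIC_RANGES)


samples = {
    "URO (U+4E2D)": "\u4e2d",
    "Extension A (U+3400)": "\u3400",
    "Extension B (U+20000)": "\U00020000",
}
for label, ch in samples.items():
    print(label, is_xml10_ideographic(ch))
```

Only the first sample falls inside the Ideographic ranges, which matches the conclusion that Extension A and Extension B characters are not permitted in an XML 1.0 Name.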
Status: NEW → RESOLVED
Closed: 23 years ago
Resolution: --- → INVALID
Marking as verified: not a Mozilla problem, according to Frank's comment.
Status: RESOLVED → VERIFIED