Build: 04-02 trunk build on WinXP-SimpChinese
XML element names written in gb18030 surrogate characters don't work, although
gb18030 surrogate characters do work as the content of regular element names.
Created attachment 77286 [details]
doesn't work test page
Created attachment 77287 [details]
a worked test case
This test page works with gb18030 surrogate characters content but not in
Created attachment 77288 [details]
Another not working test page
Can you create test cases with surrogates in UTF-8?
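One way such a UTF-8 test case could be generated, as a rough sketch only: the element name (a single Extension B character, U+20000), the helper name build_test_doc, and the document body are all made up for illustration and are not taken from the attachments.

```python
# Hypothetical element name: one CJK Extension B character (U+20000),
# which needs a surrogate pair in UTF-16.
NAME = "\U00020000"


def build_test_doc(encoding: str) -> bytes:
    """Return a tiny XML document whose element name is NAME, in `encoding`."""
    text = (
        f'<?xml version="1.0" encoding="{encoding}"?>\n'
        f"<{NAME}>text</{NAME}>\n"
    )
    return text.encode(encoding)


# Same document in both encodings, so the two can be compared side by side.
utf8_doc = build_test_doc("UTF-8")
gb_doc = build_test_doc("gb18030")

print(utf8_doc)
print(gb_doc)
```

Writing each byte string to its own file gives a pair of test pages that differ only in encoding.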
OK, here is what happens.
In XML 1.0 (see http://www.w3.org/TR/2000/REC-xml-20001006),
a well-formed XML document is defined as follows:
> document ::= prolog element Misc*
and if you look at the definition of element
> element ::= EmptyElemTag | STag content ETag
and if you look at the definition of STag
> STag ::= '<' Name (S Attribute)* S? '>' [WFC: Unique Att Spec]
and look at the definition of Name
> Name ::= (Letter | '_' | ':') (NameChar)*
and you look at NameChar
> NameChar ::= Letter | Digit | '.' | '-' | '_' | ':' |
>              CombiningChar | Extender
and look at Letter
> Letter ::= BaseChar | Ideographic
you will see that neither Unicode CJK Ideograph Extension A (U+3400-U+4DBF) nor
Extension B (encoded as surrogates) is listed under BaseChar or Ideographic.
Therefore, those characters cannot be used in a Name in XML 1.0.
We should talk to the XML authors about this issue, and maybe they will change
it in a later version of XML,
but until then, this is an invalid bug.
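To make the range argument concrete, here is a minimal sketch that checks code points against the Ideographic production only. The ranges are those of production [86] in the XML 1.0 spec linked above; the function name and sample characters are made up for illustration. BaseChar is not reproduced here because it is a long list, and it contains no range covering Extension A or anything above U+FFFF.

```python
# Ideographic production [86] in XML 1.0 (Second Edition), Appendix B:
#   Ideographic ::= [#x4E00-#x9FA5] | #x3007 | [#x3021-#x3029]
IDEOGRAPHIC_RANGES = [(0x4E00, 0x9FA5), (0x3007, 0x3007), (0x3021, 0x3029)]


def is_xml10_ideographic(ch: str) -> bool:
    """True if the single character `ch` matches the Ideographic production."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in IDEOGRAPHIC_RANGES)


# Illustrative samples: one character from each block discussed above.
samples = {
    "CJK Unified Ideographs U+4E00": "\u4e00",
    "Extension A U+3400": "\u3400",
    "Extension B U+20000": "\U00020000",
}

for label, ch in samples.items():
    print(f"{label}: {is_xml10_ideographic(ch)}")
```

Only the first sample falls inside the production; the Extension A and Extension B samples do not.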
Marking as verified: not a Mozilla problem, according to Frank's comment.