should be serialized to "asdf&#10;qwer" in XML so that an XML parser that
conforms to section 3.3.3 of the XML-1.0 spec
(http://www.w3.org/TR/REC-xml#AVNormalize) will normalize it to "asdf\nqwer"
again. Currently the XML serializer writes out the newline character without
escaping, so a conforming parser normalizes the attribute value to "asdf qwer".
The same applies to tab and carriage-return.
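The normalization described above can be reproduced outside the browser. This is only a sketch: it assumes Python's expat-based `xml.etree.ElementTree` as a stand-in for "a conforming parser", and the element name `e`, attribute name `a`, and the helper `parsed_attr` are made up for illustration.

```python
import xml.etree.ElementTree as ET


def parsed_attr(xml_text: str) -> str:
    """Parse a one-element document and return the value of its 'a' attribute."""
    return ET.fromstring(xml_text).get("a")


# Literal whitespace inside the attribute value is normalized to a space.
literal_newline = parsed_attr('<e a="asdf\nqwer"/>')
literal_tab = parsed_attr('<e a="asdf\tqwer"/>')

# Character references survive normalization and come back as the real character.
escaped_newline = parsed_attr('<e a="asdf&#10;qwer"/>')
escaped_tab = parsed_attr('<e a="asdf&#9;qwer"/>')

print(repr(literal_newline), repr(escaped_newline))
print(repr(literal_tab), repr(escaped_tab))
```

Only the character-reference forms round-trip, which is why the serializer has to emit them.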
I think it should suffice to set kAttrEntities to "
Unfortunately, the workaround of converting the newline to "&#10;" in the
DOM before serializing does not work: the serializer is smart and replaces
the ampersand with "&amp;".
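The same double-escaping effect is easy to see with any serializer that escapes ampersands. The sketch below uses Python's `xml.etree.ElementTree` purely as an analogy (it is not the serializer this bug is about), with a made-up element `e` and attribute `a`.

```python
import xml.etree.ElementTree as ET

elem = ET.Element("e")
# Pre-escaped by hand: the value holds the literal text "&#10;", not a newline.
elem.set("a", "asdf&#10;qwer")

# The serializer escapes the ampersand, so the reference is no longer a reference.
serialized = ET.tostring(elem, encoding="unicode")

# Parsing the output gives back the literal text, still without a newline.
reparsed = ET.fromstring(serialized).get("a")

print(serialized)
print(repr(reparsed))
```

So the escaping has to happen inside the serializer; it cannot be smuggled in through the DOM value.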
Nice, a 12-year-old bug! And still unresolved.
Added a testcase to show what is happening: http://jsfiddle.net/u5ka8f36/
Turns out that a literal newline in attribute value is parsed okay *as long as the parser operates in text/html mode*. For application/xml documents, the literal newline is converted and normalized to a space, as per the original post (and spec).
So clearly we need to explain to the XMLSerializer that it should escape newlines, because the result is not text/html, but rather application/xml.
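For anyone who cannot open the fiddle, here is a rough offline analogue of the HTML-versus-XML difference. It is a sketch under assumptions: Python's `html.parser` (a lenient tokenizer, not a full HTML5 parser) stands in for text/html mode, `xml.etree.ElementTree` stands in for application/xml mode, and the `AttrGrabber` class and the `<p title=...>` markup are invented for the example.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class AttrGrabber(HTMLParser):
    """Collect every attribute value seen on start tags."""

    def __init__(self):
        super().__init__()
        self.values = []

    def handle_starttag(self, tag, attrs):
        self.values.extend(value for _name, value in attrs)


# The same markup, with a literal newline inside the attribute value.
markup = '<p title="asdf\nqwer"></p>'

grabber = AttrGrabber()
grabber.feed(markup)
grabber.close()
html_value = grabber.values[0]  # HTML-style parsing keeps the newline

xml_value = ET.fromstring(markup).get("title")  # XML parsing normalizes it

print(repr(html_value), repr(xml_value))
```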
This is indeed a bug. U+000A and U+0009 should be escaped in XML. (I think it's fine to not escape U+000D, since the HTML serializer doesn't escape it and nobody complains about that.)
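A minimal sketch of the escaping rule proposed here, assuming U+000A and U+0009 are escaped in addition to the usual `&`, `<` and `"`, and U+000D is left alone. The names `ATTR_ESCAPES`, `escape_attr` and `round_trip` are hypothetical and this is not the actual serializer code; the round trip again uses Python's `xml.etree.ElementTree` as the parser.

```python
import xml.etree.ElementTree as ET

# Characters replaced when writing a double-quoted XML attribute value.
# U+000D is deliberately absent, following the suggestion above.
ATTR_ESCAPES = {
    "&": "&amp;",
    "<": "&lt;",
    '"': "&quot;",
    "\n": "&#10;",  # U+000A
    "\t": "&#9;",   # U+0009
}


def escape_attr(value: str) -> str:
    """Escape a string for use inside a double-quoted XML attribute."""
    return "".join(ATTR_ESCAPES.get(ch, ch) for ch in value)


def round_trip(value: str) -> str:
    """Serialize the value as an attribute, parse it back, and return the result."""
    xml_text = f'<e a="{escape_attr(value)}"/>'
    return ET.fromstring(xml_text).get("a")


print(escape_attr("asdf\nqwer"))
print(repr(round_trip("asdf\nqwer\tzxcv")))
```

With this rule a newline or tab in an attribute value survives serialization followed by XML parsing.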