Closed Bug 45552 Opened 24 years ago Closed 24 years ago

XMLSerializer.serializeToString doesn't translate special chars to entity refs

Categories

(Core :: XML, defect, P3)

x86
Windows NT
defect

VERIFIED FIXED
mozilla0.9

People

(Reporter: taras.tielkes, Assigned: hjtoi-bugzilla)

Details

(Keywords: testcase, Whiteboard: [fixinhand, see bug 45627])

Attachments

(2 files)

(1) create an xml document
(2) add a text node containing "&"
(3) serialize
The "&" char doesn't get replaced by the "amp" entity ref (&amp;), making the
resulting XML document string non-well-formed.

Attached testcase also shows that feeding this string to the DOMParser results
in parser errors inside the document tree.

The MSXML serializer automatically translates such characters to entity refs in
a case like this.
Keywords: testcase
Target Milestone: --- → mozilla0.9
I have some kind of a fix for this, but I am not yet sure if it is the correct
one. Basically I just translate & and < to &amp; and &lt; in
nsDOMSerializer::SerializeText(). (Were there some other characters I was
supposed to escape? Should I do this escaping with attribute values as well?
Anything else?)

There is something not quite right, though. I noticed that if
parseFromString got a malformed document, I got doubly escaped text in the alert.

I can't think. Being sick sucks.
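
A minimal sketch of the text-node escaping described in the comment above, assuming a plain std::string in place of Mozilla's string classes. EscapeText is a hypothetical name for illustration; it is not the actual body of nsDOMSerializer::SerializeText().

```cpp
#include <string>

// Hypothetical helper: escapes the two characters named in the comment
// ('&' and '<') so that text-node content stays well-formed XML.
std::string EscapeText(const std::string& text) {
  std::string out;
  out.reserve(text.size());
  for (char c : text) {
    if (c == '&') {
      out += "&amp;";
    } else if (c == '<') {
      out += "&lt;";
    } else {
      out += c;
    }
  }
  return out;
}
```

Running already-escaped text through such a function a second time turns "&amp;" into "&amp;amp;". That is one possible way to end up with doubly escaped output; it is not a diagnosis of the symptom reported above.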
Better fix is in bug 45627.
Whiteboard: [fixinhand, see bug 45627]
I am taking all XMLExtras bugs to make it easier to see what I am working on...
Assignee: vidur → heikki
Fixed.
Status: NEW → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
Reopening.

Heikki, this reaction is kind of late (sorry), but I believe you should replace
five characters with their character entities. (Perhaps you did that in the
patches for 45627, but I didn't check those.)

The five chars to be escaped in XML (outside of CDATA sections):
(not sure whether browsers will do something with my text)

&           &amp;
<           &lt;       
>           &gt;
'           &apos;
"           &quot;

This should be a quick fix, thanks for all the work on the xmlextras.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
The above comment renders as intended in IE5; I guess HTML entities are all of
the numerical form.
Did you test? I think those were fixed as well. The real fix to this bug is in
bug 45627. We are just using the XMLContentSerializer in layout, which escapes
all of those things. We are only escaping " and ' in attribute values. And by
the way, escaping > would be unnecessary even though we do escape it as well...

Here are the entity replacement tables we use:

http://lxr.mozilla.org/seamonkey/source/layout/base/src/nsXMLContentSerializer.cpp#570
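
The table-driven, context-dependent escaping described above might look roughly like the following sketch. The table contents follow this comment (&, < and > everywhere, " and ' only in attribute values). The names EntityTable, MakeTextTable, MakeAttributeTable and EscapeWith are hypothetical, and the tables are not copied from nsXMLContentSerializer.cpp.

```cpp
#include <array>
#include <string>

// One slot per ASCII character; nullptr means "emit the character as is".
using EntityTable = std::array<const char*, 128>;

// Replacements applied to text-node content.
EntityTable MakeTextTable() {
  EntityTable table{};  // value-initialised: every entry starts as nullptr
  table['&'] = "&amp;";
  table['<'] = "&lt;";
  table['>'] = "&gt;";  // not strictly required, but escaped anyway
  return table;
}

// Attribute values additionally escape both quote characters.
EntityTable MakeAttributeTable() {
  EntityTable table = MakeTextTable();
  table['"'] = "&quot;";
  table['\''] = "&apos;";
  return table;
}

// Copies `input`, substituting every character that has a table entry.
std::string EscapeWith(const EntityTable& table, const std::string& input) {
  std::string out;
  out.reserve(input.size());
  for (unsigned char c : input) {
    const char* entity = c < table.size() ? table[c] : nullptr;
    if (entity != nullptr) {
      out += entity;
    } else {
      out += static_cast<char>(c);  // non-ASCII bytes pass through untouched
    }
  }
  return out;
}
```

Keeping two tables lets quotes in ordinary text stay readable while attribute values remain safe regardless of which quote character delimits them.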

I am closing this bug again. If, after testing, you see that it does not work
feel free to reopen.
Status: REOPENED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
I stand corrected. (Not yet tested; getting a recent build now.) Heikki, if you
have a few minutes, could you tell me how nsXMLContentSerializer is hooked up to
nsDOMSerializer from the xmlextras? (I mean which files I should start looking
at, not how it actually works.)

Btw, I get a "document does not exist" error when I try to look at
http://lxr.mozilla.org/mozilla/source/extensions/xmlextras/base/src/nsIContentSerializer.h
(included here:
http://lxr.mozilla.org/mozilla/source/extensions/xmlextras/base/src/nsDOMSerializer.cpp#31 )

Thanks, Taras

Yeah, LXR is not too smart about includes. It always assumes the includes are in
the same directory as the file that includes them. Use the LXR file search to
search for nsIContentSerializer.h.

Ok, so when somebody calls XMLSerializer.serializeToString(), we create a
document encoder (in nsDOMSerializer.cpp), initialize the encoder and then
simply tell it to encode to string. The encoder determines which content
serializer it needs to create to do the actual job (in our case it will be the
XML Content Serializer). Files to look at:

nsDOMSerializer.cpp
nsDocumentEncoder.cpp
nsXMLContentSerializer.cpp
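
The call chain described above can be sketched in a heavily simplified form. Every class and function name below is a hypothetical stand-in, not one of Mozilla's real interfaces, and the "document" is reduced to a single text string so that the example stays self-contained.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical interface: does the format-specific work of producing markup.
class ContentSerializer {
 public:
  virtual ~ContentSerializer() = default;
  virtual void AppendText(const std::string& text, std::string& out) = 0;
};

// Hypothetical XML implementation that escapes text content.
class XmlContentSerializer : public ContentSerializer {
 public:
  void AppendText(const std::string& text, std::string& out) override {
    for (char c : text) {
      switch (c) {
        case '&': out += "&amp;"; break;
        case '<': out += "&lt;"; break;
        case '>': out += "&gt;"; break;
        default: out += c;
      }
    }
  }
};

// Hypothetical encoder: picks a content serializer from the MIME type,
// then delegates the actual work to it.
class DocumentEncoder {
 public:
  explicit DocumentEncoder(const std::string& mime_type) {
    if (mime_type == "text/xml") {
      serializer_ = std::make_unique<XmlContentSerializer>();
    }
  }

  std::string EncodeToString(const std::string& text) {
    assert(serializer_ && "no content serializer for this MIME type");
    std::string out;
    serializer_->AppendText(text, out);
    return out;
  }

 private:
  std::unique_ptr<ContentSerializer> serializer_;
};

// Plays the role of the serializeToString() entry point: create an encoder,
// initialise it, and ask it to encode to a string.
std::string SerializeToString(const std::string& text) {
  DocumentEncoder encoder("text/xml");
  return encoder.EncodeToString(text);
}
```

The point of the indirection is that the entry point never needs to know about escaping rules; they live entirely in whichever content serializer the encoder selects.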
Marking verified per last comments.
Status: RESOLVED → VERIFIED