Closed Bug 230099 Opened 21 years ago Closed 21 years ago

nsRDFXMLSerializer incorrectly serializes newline characters in literals

Categories

(Core Graveyard :: RDF, defect)

x86
Linux
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mrj, Assigned: tingley)

Details

Attachments

(1 file)

User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.5) Gecko/20031016
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.5) Gecko/20031016

When literal containing newline characters serialized into XML newline
characters insert as is into the attribute of RDF:Description. Then, while
reading this value from XML parser treats these newline characters as spaces, so
writing and rereading of XML leads to lost line breaks.

Reproducible: Always

Steps to Reproduce:
text_res = this.RDF.GetLiteral("Some \n text \n goes \n here");
this.ds.Assert(this.selected, this.rdf_text, text_res, true);
this.flush();
Actual Results:  
In the RDF file:
<RDF:Description about="rdf:#$XHK122"
     otl:text="Some 
 text 
 goes 
 here" />



There are two possible ways to resolve this prob:

1) Quote newline characters while serializing literal into the Description's
attribute. &#10; works fine. Heh, I've got no idea about possible issues with
different newline representation in different OS, could it be a prob here?

2) Serialize complex literals as tag, not attribute. I.e.
<otl:text>Some 
 text 
 goes 
 here</otl:text>

Both parses correctly, but second is more readable (IMHO).
I don't have a tree at the moment, but this isn't a dup, and I'm pretty sure
this problem exists as described.  Tentatively confirming.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Attached patch patchSplinter Review
This patch adopts method #2.

I'm not totally thrilled with the design of this whole section of the code,
since  the tests on each property can be run redundantly, but I'd rather not
fix that here.
Attachment #138913 - Flags: review?(rjc)
I think it would be best to see what the XML/RDF specs say, if anything, about this matter.

[
  One interesting, somewhat-related note that I stumbled across:
       http://www.w3.org/TR/REC-xml#sec-line-ends
]
Actually, I looked it up before writing the patch, because I wasn't sure either.
 I'm pretty sure, though, that this is the write thing to do.

Normalization of whitespace in XML attributes is defined here:
	http://www.w3.org/TR/REC-xml#AVNormalize

The first of several steps is the normalization of line breaks as described in
the link you posted.  The critical portion, however, is the third bullet under "3.":
 * For a white space character (#x20, #xD, #xA, #x9), append a space character
(#x20) to the normalized value.

In other words, our attribute parsing is correct -- when reading something of
the form
	<rdf:Description NC:foo="a value
that spans
several lines" />

It should be processed as "a value that spans several lines".

However, this changes the value of the RDF literal!  Plain RDF literals are
simply Unicode strings, and comparison of two literals is accomplished through
bytewise comparison of their respective Unicode characters.

http://www.w3.org/TR/2003/PR-rdf-concepts-20031215/#dfn-literal

Monkeying with literal values is -- as this bug correctly points out -- naughty
behavior.  By serializing the literal as an XML element body, we can preserve
the whitespace as we originally read it.
Comment on attachment 138913 [details] [diff] [review]
patch

Looks acceptable to me, r=rjc
Attachment #138913 - Flags: review?(rjc) → review+
Attachment #138913 - Flags: superreview?(bz-vacation)
Comment on attachment 138913 [details] [diff] [review]
patch

Yeah, seems ok.  sr=bzbarsky.  Use nsnull instead of NULL maybe?  And is
GetValueConst guaranteed to give a valid nonzero ptr that you can deref?
Attachment #138913 - Flags: superreview?(bz-vacation) → superreview+
Taking.
Assignee: rjc → tingley
Fix checked in, with bz's suggested changes.
Status: NEW → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: