Closed Bug 545644 Opened 11 years ago Closed 11 years ago

REGRESSION: HTML "nbsp" entity is not handled correctly in xHTML mode


(Core :: DOM: Serializers, defect)

Not set



Tracking Status
blocking2.0 --- beta1+
blocking1.9.2 --- needed
status1.9.2 --- .7-fixed


(Reporter: nbelaevski, Assigned: laurent)



(Keywords: regression, testcase, verified1.9.2)


(3 files)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/532.5 (KHTML, like Gecko) Chrome/ Safari/532.5
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6

Prototype.js framework uses innerHTML for text escapement. Regression issue in FF 3.6 make innerHTML incorrectly handle "\u00A0" symbol in xHTML mode.

Reproducible: Always

Steps to Reproduce:
Open attached file in FF 3.5.6 & FF 3.6
Actual Results:  
Error is caught in FF 3.6 and alert is shown

Expected Results:  
Error should not occur
Attached file Proposed test case
Attachment #426494 - Attachment mime type: text/html → application/xhtml+xml
Regression range:
Likely a regression from bug 422403.
Blocks: 422403
Severity: normal → major
Component: General → Serializers
Ever confirmed: true
Keywords: regression, testcase
OS: Windows Vista → All
Product: Firefox → Core
QA Contact: general → dom-to-text
Hardware: x86 → All
blocking2.0: --- → ?
blocking1.9.2: --- → ?
status1.9.2: --- → ?
the XHTML serializer now translates entities like the HTML serializer does, and not like the XML serializer does. (methods nsXHTMLContentSerializer::AppendAndTranslateEntities and nsXMLContentSerializer::AppendAndTranslateEntities in the current trunk).

So now, it converts some characters to its entity equivalent. It converts then 0xA0 characters to  .

However, the XML parser doesn't like entities in text we insert in text nodes (which is normal I guess). The HTML parser seems less strict with that, this is why we don't have the error in HTML.

Perhaps I made a mistake by translating entities in XHTML.

Henry, any thought ? You confirm that we shouldn't create entities (except for < > & "), because of the xhtml parser ?
Assignee: nobody → laurent
We'd be interested in a patch, but I don't think this blocks based on my understanding of the impact. bz, you nominated it, if you disagree please let me know why!
blocking1.9.2: ? → -
(In reply to comment #3)
> Henry, any thought ? You confirm that we shouldn't create entities (except for
> < > & "), because of the xhtml parser ?

Confirmed. Creating entity references that aren't pre-defined in XML is unsafe in Web-exposed serialization APIs when outputting any flavor of XML.

In the XMLSerializer case, I explicitly set the flags so that &nbsp; isn't emitted. Perhaps that mode should be the default so that anyone who wants to use internal APIs to emit &nbsp; (etc.) in XML needs to pass an explicit flag.
Ok, I will provide a patch soon.
Duplicate of this bug: 546474
blocking2.0: ? → beta1
Attached patch the patchSplinter Review
With this patch, the XHTML serializer does not support any more these flags: OutputEncodeBasicEntities, OutputEncodeLatin1Entities, OutputEncodeHTMLEntities and OutputEncodeW3CEntities. So it have the same behavior with entities as it had before the rewrite of the xhtml serializer.
Attachment #428398 - Flags: superreview?(jonas)
Attachment #428398 - Flags: review?(Olli.Pettay)
Comment on attachment 428398 [details] [diff] [review]
the patch

>+static const PRUint16 kValNBSP = 160;
>+static const char kEntityNBSP[] = "nbsp";
Make sure the handling of these two is tested both in

>+    for (aStr.BeginReading(iter);
>+          iter != done_reading;
>+          iter.advance(PRInt32(advanceLength))) {
align iter under aStr

>diff --git a/content/base/test/test_bug545644.html b/content/base/test
>+function testInner () {
>+  var div = document.getElementById("test_inner");
>+    try {
>+      div.innerHTML = "some \u00A0 text";
>+      div.innerHTML += "!";
>+      ok (true, "innerHTML in xhtml", "test ok");
>+    } catch (e) {
>+      ok(false, "innerHTML in xhtml", "test failed, unwanted exception "+e);
The test comments are wrong here. The file is html, not xhtml.
Attachment #428398 - Flags: review?(Olli.Pettay) → review+
Attachment #428398 - Flags: superreview?(jonas) → superreview+
Checked in the patch, with adressed comments.
Is this bug fixed?
Yes it is fixed on mozilla-central (gecko 1.9.3)
Closed: 11 years ago
Resolution: --- → FIXED
the patch for gecko 1.9.2 is the same as for the trunk
Attachment #433947 - Flags: approval1.9.2.3?
Will this fix be in 3.6? I checked 3.6.4 beta, and the problem persists there...
Comment on attachment 433947 [details] [diff] [review]
Patch for the 1.9.2 branch

This patch fixes a serious regression in the use of innerHtml in gecko 1.9.2, so it should be applied in the 1.9.2 branch
Attachment #433947 - Flags: approval1.9.2.4? → approval1.9.2.5?
I checked Apr 28 nightly of 3.6.5pre, and the fix is not in that build either.

I guess "should be applied" is a bit ambiguous. Do you mean has already been applied, or that it will be applied sometime in the future? If it has been applied my testing says the problem is not fixed...

V 3.6.4 would be nice, but 3.6.5 should be a certainty!
@marx : stop to check all versions of Firefox. When the patch will be applied (if it will) in the 3.6 branch, there will be a comment here with the url of the changeset.

I'm waiting for an approval to land it in the branch.
blocking1.9.2: - → ?
blocking1.9.2: ? → needed
Comment on attachment 433947 [details] [diff] [review]
Patch for the 1.9.2 branch

a=beltzner for mozilla-1.9.2 default

Thanks for writing tests!
Attachment #433947 - Flags: approval1.9.2.5? → approval1.9.2.5+
Attachment #433947 - Flags: approval1.9.2.5+ → approval1.9.2.6+
landed into mozilla-1.9.2  (gecko
Verified as fixed in 1.9.2 with Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv: Gecko/20100624 Namoroka/3.6.6pre ( .NET CLR 3.5.30729) using testcase.
Keywords: verified1.9.2
You need to log in before you can comment on or make changes to this bug.