U+30D8 causes XML error in UTF-16BE

RESOLVED FIXED in Firefox 21

Status

()

Core
DOM
RESOLVED FIXED
5 years ago
5 years ago

People

(Reporter: Ryan Richards, Assigned: emk)

Tracking

({regression})

20 Branch
mozilla23
x86
Windows Vista
regression
Points:
---
Dependency tree / graph
Bug Flags:
in-testsuite +

Firefox Tracking Flags

(firefox20 wontfix, firefox21+ fixed, firefox22+ fixed, firefox23+ fixed)

Details

(Whiteboard: [qa-])

Attachments

(3 attachments, 1 obsolete attachment)

(Reporter)

Description

5 years ago
Created attachment 735628 [details]
U+30D8 in a minimal XML file

User Agent: Mozilla/5.0 (Windows NT 6.0; rv:20.0) Gecko/20100101 Firefox/20.0
Build ID: 20130326150557

Steps to reproduce:

After upgrading to version 20 I navigated to http://アニメイヘム.com/


Actual results:

I received an XML Parsing Error: not well-formed


Expected results:

The document is well formed and should have displayed.
Last good nightly: 2012-11-08
First bad nightly: 2012-11-09

Pushlog:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=36e99ea02c05&tochange=90cea19e27e2

bug 801402 is most likely the culprit here.
I bisect this if required
Status: UNCONFIRMED → NEW
status-firefox20: --- → affected
status-firefox21: --- → affected
status-firefox22: --- → affected
tracking-firefox21: --- → ?
tracking-firefox22: --- → ?
tracking-firefox23: --- → ?
Component: Untriaged → DOM
Ever confirmed: true
Flags: needinfo?(VYV03354)
Keywords: regression
Product: Firefox → Core
(Assignee)

Comment 2

5 years ago
Similar to 844007 which is supposed to be fixed in Firefox 20. However the page encoding is actually utf-16 (but big endian) here...
Flags: needinfo?(VYV03354)
(Assignee)

Updated

5 years ago
Attachment #735628 - Attachment mime type: text/plain → application/xhtml+xml
(Reporter)

Comment 3

5 years ago
30
Summary: U+30DB causes XML error in UTF-16BE → U+30D8 causes XML error in UTF-16BE
(Reporter)

Comment 4

5 years ago
Checking other characters, looks like every U+nnD8 to U+nnDF character causes the error.
Blocks: 801402
Tracking and passing this on to :emk as this is a regression from 801402. Will be very helpful to have a patch uplifted before Beta 3/Beta 4 goes to build.

Updated

5 years ago
Assignee: nobody → VYV03354
status-firefox23: --- → affected
tracking-firefox21: ? → +
tracking-firefox22: ? → +
tracking-firefox23: ? → +
(Assignee)

Comment 6

5 years ago
This is because the "UTF-16" label no longer represents a sniffing decoder. It is an alias of "UTF-16LE" now. I'm working on a fix.
(Assignee)

Comment 7

5 years ago
Created attachment 736323 [details] [diff] [review]
Prefer UTF-16BE/LE to UTF-16

This patch will basically undo bug 335531. The Encoding Standard intentionally "misuses" the UTF-16 label.
Attachment #736323 - Flags: review?(hsivonen)
(Assignee)

Comment 8

5 years ago
https://tbpl.mozilla.org/?tree=Try&rev=bca208240bda
(Assignee)

Comment 9

5 years ago
Created attachment 736501 [details] [diff] [review]
Regression tests
Attachment #736501 - Flags: review?(hsivonen)
Attachment #736323 - Flags: review?(hsivonen) → review+
(Assignee)

Comment 10

5 years ago
Created attachment 737026 [details] [diff] [review]
Regression tests, v2

Forgot to call revokeObjectURL().
Attachment #736501 - Attachment is obsolete: true
Attachment #736501 - Flags: review?(hsivonen)
Attachment #737026 - Flags: review?(hsivonen)
(Reporter)

Comment 11

5 years ago
Just a bit of a follow up to the intial report,

U+30D7,プ, when swapped to D730 is 휰, a valid code point.
U+30D8,ヘ, when swapped to D830 is an invalid code point.
U+30D7 displays properly as プ while U+30D8 errors out.
Comment on attachment 737026 [details] [diff] [review]
Regression tests, v2

Did we establish somewhere that Web compat prevents treating bogus labeling as fatal? Did that get written up in a spec somewhere?
Attachment #737026 - Flags: review?(hsivonen) → review+
(Assignee)

Comment 13

5 years ago
(In reply to Henri Sivonen (:hsivonen) from comment #12)
> Comment on attachment 737026 [details] [diff] [review]
> Regression tests, v2
> 
> Did we establish somewhere that Web compat prevents treating bogus labeling
> as fatal? Did that get written up in a spec somewhere?

We need a self-contained encoding sniffing algorithm for XML because the XML spec assumes that the "UTF-16" can represent both endians.
(Assignee)

Comment 14

5 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/e58f99cad251
https://hg.mozilla.org/integration/mozilla-inbound/rev/5d40dce142e0
(Assignee)

Updated

5 years ago
Flags: in-testsuite+
https://hg.mozilla.org/mozilla-central/rev/e58f99cad251
https://hg.mozilla.org/mozilla-central/rev/5d40dce142e0
Status: NEW → RESOLVED
Last Resolved: 5 years ago
status-firefox23: affected → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla23
Depends on: 862757
(In reply to Masatoshi Kimura [:emk] from comment #14)
> https://hg.mozilla.org/integration/mozilla-inbound/rev/e58f99cad251
> https://hg.mozilla.org/integration/mozilla-inbound/rev/5d40dce142e0

Can you please request nomination for aurora/beta here ? Beta 4 which goes to build tomorrow will be the last opportunity to land speculative low risk fixes.
(Assignee)

Comment 17

5 years ago
Comment on attachment 736323 [details] [diff] [review]
Prefer UTF-16BE/LE to UTF-16

[Approval Request Comment]
Bug caused by (feature/regressing bug #): bug 801402
User impact if declined: Some websites will not be displayed at all.
Testing completed (on m-c, etc.): on m-c in a few days.
Risk to taking this patch (and alternatives if risky): low 
String or IDL/UUID changes made by this patch: none
Attachment #736323 - Flags: approval-mozilla-beta?
Attachment #736323 - Flags: approval-mozilla-aurora?
(Assignee)

Comment 18

5 years ago
Comment on attachment 737026 [details] [diff] [review]
Regression tests, v2

[Approval Request Comment]
See above.
Attachment #737026 - Flags: approval-mozilla-beta?
Attachment #737026 - Flags: approval-mozilla-aurora?
Comment on attachment 736323 [details] [diff] [review]
Prefer UTF-16BE/LE to UTF-16

The patch is low risk and an Fx20 regression which does not display a few websites.
Please make sure to land on mozilla-beta asap, as we are going to build today with Fx21 beta 4.
Attachment #736323 - Flags: approval-mozilla-beta?
Attachment #736323 - Flags: approval-mozilla-beta+
Attachment #736323 - Flags: approval-mozilla-aurora?
Attachment #736323 - Flags: approval-mozilla-aurora+

Updated

5 years ago
Attachment #737026 - Flags: approval-mozilla-beta?
Attachment #737026 - Flags: approval-mozilla-beta+
Attachment #737026 - Flags: approval-mozilla-aurora?
Attachment #737026 - Flags: approval-mozilla-aurora+
(In reply to Ryan Richards from comment #0)
> Created attachment 735628 [details]
> U+30D8 in a minimal XML file
> 
> User Agent: Mozilla/5.0 (Windows NT 6.0; rv:20.0) Gecko/20100101 Firefox/20.0
> Build ID: 20130326150557
> 
> Steps to reproduce:
> 
> After upgrading to version 20 I navigated to http://アニメイヘム.com/
> 
> 
> Actual results:
> 
> I received an XML Parsing Error: not well-formed
> 
> 
> Expected results:
> 
> The document is well formed and should have displayed.

Thanks for the report ! Can you try our latest nightly or aurora when the patch is landed, to confirm the issue is fixed for you ? Thanks
(Assignee)

Comment 21

5 years ago
Could you also approve attachment 738720 [details] [diff] [review]? Otherwise this patch will cause permaorange on Thundirbird tinderboxes.
If not, I'll withdraw requesting beta approval.
(In reply to Masatoshi Kimura [:emk] from comment #21)
> Could you also approve attachment 738720 [details] [diff] [review]?
> Otherwise this patch will cause permaorange on Thundirbird tinderboxes.
> If not, I'll withdraw requesting beta approval.

Who would be able to review this ?
(Assignee)

Comment 23

5 years ago
(In reply to bhavana bajaj [:bajaj] from comment #22)
> (In reply to Masatoshi Kimura [:emk] from comment #21)
> > Could you also approve attachment 738720 [details] [diff] [review]?
> > Otherwise this patch will cause permaorange on Thundirbird tinderboxes.
> > If not, I'll withdraw requesting beta approval.
> 
> Who would be able to review this ?

It's already reviewed by Simon and even landed (as I wrote in the approval request).
(In reply to Masatoshi Kimura [:emk] from comment #23)
> (In reply to bhavana bajaj [:bajaj] from comment #22)
> > (In reply to Masatoshi Kimura [:emk] from comment #21)
> > > Could you also approve attachment 738720 [details] [diff] [review]?
> > > Otherwise this patch will cause permaorange on Thundirbird tinderboxes.
> > > If not, I'll withdraw requesting beta approval.
> > 
> > Who would be able to review this ?
> 
> It's already reviewed by Simon and even landed (as I wrote in the approval
> request).

I was just getting to 863025, its approved now :)
https://hg.mozilla.org/releases/mozilla-aurora/rev/e36ec18f4854
https://hg.mozilla.org/releases/mozilla-aurora/rev/940312febd4b

https://hg.mozilla.org/releases/mozilla-beta/rev/a1b7febaccf0
https://hg.mozilla.org/releases/mozilla-beta/rev/8df81abcb329
status-firefox20: affected → wontfix
status-firefox21: affected → fixed
status-firefox22: affected → fixed
(Reporter)

Comment 26

5 years ago
(In reply to bhavana bajaj [:bajaj] from comment #20)
> (In reply to Ryan Richards from comment #0)
> > Created attachment 735628 [details]
> > U+30D8 in a minimal XML file
> > 
> > User Agent: Mozilla/5.0 (Windows NT 6.0; rv:20.0) Gecko/20100101 Firefox/20.0
> > Build ID: 20130326150557
> > 
> > Steps to reproduce:
> > 
> > After upgrading to version 20 I navigated to http://アニメイヘム.com/
> > 
> > 
> > Actual results:
> > 
> > I received an XML Parsing Error: not well-formed
> > 
> > 
> > Expected results:
> > 
> > The document is well formed and should have displayed.
> 
> Thanks for the report ! Can you try our latest nightly or aurora when the
> patch is landed, to confirm the issue is fixed for you ? Thanks

Open my site in Nightly, everything is working.
Are there any user facing regression risks which QA should look at not covered by existing tests?
Flags: needinfo?(VYV03354)
(Assignee)

Comment 28

5 years ago
Unlikely, hence low risk.
Flags: needinfo?(VYV03354)
(In reply to Masatoshi Kimura [:emk] from comment #28)
> Unlikely, hence low risk.

Okay, thanks Masatoshi. I just wanted to confirm before flagging this [qa-].
Whiteboard: [qa-]
You need to log in before you can comment on or make changes to this bug.