Closed Bug 1287071 Opened 8 years ago Closed 8 years ago

DOMParser.parseFromString does not honor XML encoding

Status:

RESOLVED INVALID

People

(Reporter: yan12125, Unassigned)

Details

Chih-Hsuan Yen [:yan12125]

Reporter

Description

8 years ago

I have an XML string encoded in Big5:

var data = atob('PD94bWwgdmVyc2lvbj0iMS4wIiBlbmNvZGluZz0iYmlnNSIgPz48dGl0bGU+pKSk5TwvdGl0bGU+')

(The bytes decode to <?xml version="1.0" encoding="big5" ?><title>中文</title>, shown here in UTF-8.)

DOMParser does not give the correct result:
(new DOMParser()).parseFromString(data, 'text/xml').firstChild.textContent

In dom/base/DOMParser.cpp I see that DOMParser::ParseFromString() has a hard-coded UTF-8 encoding. Is this intentional per the W3C standards?

Loic

Comment 1

8 years ago

Boris, your thoughts?

Flags: needinfo?(bzbarsky)

Boris Zbarsky [:bzbarsky]

Comment 2

8 years ago

The spec is at http://domparsing.spec.whatwg.org/#the-domparser-interface. The input to parseFromString is a DOMString, which is (roughly) a sequence of UTF-16 code units, not a sequence of bytes (which would be a ByteString).

So the observed behavior is correct per spec.
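Since parseFromString takes a string rather than bytes, one possible workaround is to decode the Big5 bytes into a string first and only then parse. The following is a minimal sketch, not something taken from this bug: it assumes TextDecoder with the 'big5' label is available in the environment, and the helper name binaryStringToBytes is made up for illustration. It also assumes the parser ignores the encoding="big5" declaration once the input is already a string, which matches the behavior described above but is worth verifying in the target browser.

```javascript
// Base64 payload from the report: a Big5-encoded XML document.
const base64 = 'PD94bWwgdmVyc2lvbj0iMS4wIiBlbmNvZGluZz0iYmlnNSIgPz48dGl0bGU+pKSk5TwvdGl0bGU+';

// atob() returns a "binary string": one character per byte (values 0-255).
// Hypothetical helper that turns such a string back into raw bytes.
function binaryStringToBytes(binary) {
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

const bytes = binaryStringToBytes(atob(base64));

// Decode the bytes explicitly as Big5 to get a real DOMString.
const xmlText = new TextDecoder('big5').decode(bytes);

// DOMParser only exists in browsers, so guard the call for other environments.
if (typeof DOMParser !== 'undefined') {
  const doc = new DOMParser().parseFromString(xmlText, 'text/xml');
  console.log(doc.documentElement.textContent);
}
```

Passing the atob() result directly, as in the original report, hands the parser a string in which every Big5 byte has already become its own character, so there is nothing left for an encoding declaration to act on.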

Status: UNCONFIRMED → RESOLVED

Closed: 8 years ago

Flags: needinfo?(bzbarsky)

Resolution: --- → INVALID

Chih-Hsuan Yen [:yan12125]

Reporter

Comment 3

8 years ago

Thanks. I was not sure whether the standard or Firefox needs modification. Now it's clear :)


Categories

(Core :: XML, defect)
