Bug 560242: Unknown encoding in XML declaration should be a fatal error
Status: NEW (open)
Opened 15 years ago, updated 3 years ago
Categories: Core :: XML, defect, P3
Reporter: ap
Assignee: Unassigned
Attachments: 1 file (276 bytes, application/xhtml+xml)
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_5_8; en-us) AppleWebKit/531.22.7 (KHTML, like Gecko) Version/4.0.5 Safari/531.22.7
Build Identifier: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; ru; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3
From WebKit bug <https://bugs.webkit.org/show_bug.cgi?id=37629>.
There is a difference between WebKit and Firefox in that we report a fatal error for something like <?xml version="1.0" encoding="default"?>, but Firefox seems to use utf-8 whenever it gets an unknown encoding name.
My understanding is that detecting a fatal error is required, see section 4.3.3 of XML 1.0 spec: "it is a fatal error when an XML processor encounters an entity with an encoding that it is unable to process."
Reproducible: Always
Steps to Reproduce:
Open the attached test case. A parsing error should be reported.
Comment 2•15 years ago
I'm guessing the issue is this code in ParserWriteFunc:
  if (pws->mParser->DetectMetaTag(buf, theNumRead, guess, guessSource) ||
      ((count >= 4) &&
       DetectByteOrderMark((const unsigned char*)buf,
                           theNumRead, guess, guessSource))) {
    nsCOMPtr<nsICharsetAlias> alias(do_GetService(NS_CHARSETALIAS_CONTRACTID));
    result = alias->GetPreferred(guess, preferred);
    // Only continue if it's a recognized charset and not
    // one of a designated set that we ignore.
    if (NS_SUCCEEDED(result) &&
        ((kCharsetFromByteOrderMark == guessSource) ||
         (!preferred.EqualsLiteral("UTF-16") &&
          !preferred.EqualsLiteral("UTF-16BE") &&
          !preferred.EqualsLiteral("UTF-16LE") &&
          !preferred.EqualsLiteral("UTF-32") &&
          !preferred.EqualsLiteral("UTF-32BE") &&
          !preferred.EqualsLiteral("UTF-32LE")))) {
etc.
DetectByteOrderMark is what seems to deal with <?xml encoding="..."?> stuff. We probably need to make it a fatal error here in XML mode to get back an encoding we don't recognize, right?
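To make the proposal above concrete, here is a minimal standalone sketch (not Gecko code) of treating an unrecognized encoding label in the XML declaration as a fatal error. The names DeclaredEncoding, IsRecognizedLabel, CheckXmlDeclaration and EncodingCheck are hypothetical, the small label set only stands in for the charset alias lookup, and the scan assumes an ASCII-compatible byte stream.

```cpp
#include <algorithm>
#include <cctype>
#include <set>
#include <string>

// Hypothetical result type for this sketch; not a Gecko type.
enum class EncodingCheck { NoDeclaration, Recognized, FatalUnknownEncoding };

// Extracts the value of encoding="..." from a leading <?xml ...?> declaration.
// Returns false if there is no XML declaration or no encoding pseudo-attribute.
bool DeclaredEncoding(const std::string& head, std::string& label) {
  if (head.size() < 6 || head.compare(0, 5, "<?xml") != 0 ||
      !std::isspace(static_cast<unsigned char>(head[5]))) {
    return false;
  }
  const std::size_t end = head.find("?>");
  if (end == std::string::npos) return false;
  const std::string decl = head.substr(0, end);

  std::size_t pos = decl.find("encoding");
  if (pos == std::string::npos) return false;
  pos = decl.find_first_of("\"'", pos);  // opening quote of the value
  if (pos == std::string::npos) return false;
  const std::size_t close = decl.find(decl[pos], pos + 1);  // matching quote
  if (close == std::string::npos) return false;

  label = decl.substr(pos + 1, close - pos - 1);
  return true;
}

// Stand-in for a real charset alias lookup: a tiny, assumed set of labels.
bool IsRecognizedLabel(std::string label) {
  static const std::set<std::string> kKnown = {
      "utf-8", "utf-16", "iso-8859-1", "us-ascii"};
  std::transform(label.begin(), label.end(), label.begin(),
                 [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
  return kKnown.count(label) != 0;
}

// In XML mode, a declared but unrecognized encoding is reported as fatal
// instead of silently falling back to UTF-8.
EncodingCheck CheckXmlDeclaration(const std::string& head) {
  std::string label;
  if (!DeclaredEncoding(head, label)) return EncodingCheck::NoDeclaration;
  return IsRecognizedLabel(label) ? EncodingCheck::Recognized
                                  : EncodingCheck::FatalUnknownEncoding;
}
```

With this sketch, a document starting with <?xml version="1.0" encoding="default"?> yields FatalUnknownEncoding, while a document with no declaration or with a recognized label continues down the normal path.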
Status: UNCONFIRMED → NEW
Ever confirmed: true
Comment 3•15 years ago
(In reply to comment #2)
> DetectByteOrderMark is what seems to deal with <?xml encoding="..."?> stuff.
> We probably need to make it a fatal error here in XML mode to get back an
> encoding we don't recognize, right?
So it seems.
(I wonder if we should move towards a model where Gecko provides decoders to expat and expat ingests bytes instead of nsParser performing some of the duties of the XML processor and performing them incorrectly.)
Updated•8 years ago
Priority: -- → P3
Updated•3 years ago
Severity: normal → S3