Closed Bug 693806 Opened 14 years ago Closed 13 years ago

RFC2231/5987 encoding: charset information should be treated as authoritative

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla15

People

(Reporter: julian.reschke, Assigned: julian.reschke)

References

(Blocks 1 open bug,
URL
)

Details

Attachments

(1 file, 3 obsolete files)

Proposed patch, incl test cases 14 years ago Julian Reschke 12.61 KB, patch		Details \| Diff \| Splinter Review
Proposed patch, incl test case 14 years ago Julian Reschke 3.92 KB, patch		Details \| Diff \| Splinter Review
Proposed patch, incl test case 14 years ago Julian Reschke 3.92 KB, patch		Details \| Diff \| Splinter Review
Proposed patch, incl test case 14 years ago Julian Reschke 4.26 KB, patch	jduell.mcbugs : review+	Details \| Diff \| Splinter Review

Julian Reschke

Assignee

Description

•

14 years ago

Test case <http://greenbytes.de/tech/tc2231/#attwithfn2231utf8-bad>: Content-Disposition: attachment; filename*=iso-8859-1''foo-%c3%a4-%e2%82%ac.html After percent-unescaping, Firefox apparently decides that this is more likely to be UTF-8. Other UAs either ignore the parameter (good, as it contains a non-ISO-8859-1 octet) or treat it as in a superset-encoding of ISO-8859-1 (where %82 represents a valid code point). Optimally, FF would align with IE9's and Konqueror's strict behavior.

Julian Reschke

Assignee

Updated

•

14 years ago

Assignee: nobody → julian.reschke

Julian Reschke

Assignee

Comment 1

•

14 years ago

Attached patch Proposed patch, incl test cases (obsolete) — Details — Splinter Review

This patch is big considering the change of behavior, so below a few explanations. Functional changes: - DecodeParameter() always skips the check, we trust the charset as obtained from the 2231/5987 format - Before returning data from GetParameterInternal(), if we do have a charset we always check that the octet sequence decodes according to the character encoding, otherwise we ignore the parameter To get there, I made a few more changes: - rather than short-circuiting, we always collect all information and only decide at the (only) exit point which param to use - to do that, we collect param values for caseA, caseB and caseCorD separately (and for the two latter cases, their charset values) - renamed "needUnquote" to "needPercentDecoding" (to avoid confusion with quoted-string unquoting) - made the code that copies fragments into new strings more readable by calculating the length only once I hope I haven't added a memory leak, and I also think I fixed one what would occur when both caseCorD and caseB are present (in this order), and both specify the charset

Attachment #567559 - Flags: review?(bzbarsky)

Julian Reschke

Assignee

Comment 2

•

14 years ago

Comment on attachment 567559 [details] [diff] [review] Proposed patch, incl test cases Refactoring the code, and trying to fix bug 384571 first.

Attachment #567559 - Flags: review?(bzbarsky)

Julian Reschke

Assignee

Comment 3

•

14 years ago

Attached patch Proposed patch, incl test case (obsolete) — Details — Splinter Review

To be applied on top of the patch for bug 384571.

Attachment #567559 - Attachment is obsolete: true

Attachment #568455 - Flags: review?(bzbarsky)

Julian Reschke

Assignee

Comment 4

•

14 years ago

Attached patch Proposed patch, incl test case (obsolete) — Details — Splinter Review

Updated patch (applies on top of bug 384571)

Attachment #568455 - Attachment is obsolete: true

Attachment #568455 - Flags: review?(bzbarsky)

Attachment #575733 - Flags: review?(bzbarsky)

Julian Reschke

Assignee

Comment 5

•

14 years ago

Comment on attachment 575733 [details] [diff] [review] Proposed patch, incl test case (patch needs to be updated)

Attachment #575733 - Flags: review?(bzbarsky)

Julian Reschke

Assignee

Comment 6

•

14 years ago

Attached patch Proposed patch, incl test case — Details — Splinter Review

Attachment #575733 - Attachment is obsolete: true

Attachment #607983 - Flags: review?(jduell.mcbugs)

Jason Duell

Comment 7

•

13 years ago

Comment on attachment 607983 [details] [diff] [review] Proposed patch, incl test case Review of attachment 607983 [details] [diff] [review]: ----------------------------------------------------------------- Looks good. Only one question: ::: netwerk/mime/nsMIMEHeaderParamImpl.cpp @@ -721,2 @@ > return cvtUTF8->ConvertStringToUTF8(aParamValue, aCharset, > - IS_7BIT_NON_ASCII_CHARSET(aCharset), aResult); Any reason why you've scrapped the 7bit optimization?

Julian Reschke

Assignee

Comment 8

•

13 years ago

(In reply to Jason Duell (:jduell) from comment #7) > Comment on attachment 607983 [details] [diff] [review] > Proposed patch, incl test case > > Review of attachment 607983 [details] [diff] [review]: > ----------------------------------------------------------------- > > Looks good. Only one question: > > ::: netwerk/mime/nsMIMEHeaderParamImpl.cpp > @@ -721,2 @@ > > return cvtUTF8->ConvertStringToUTF8(aParamValue, aCharset, > > - IS_7BIT_NON_ASCII_CHARSET(aCharset), aResult); > > Any reason why you've scrapped the 7bit optimization? It's not an optimization; it's a heuristic that will cause the wrong thing to happen when the name encoding is declared to be ISO-8859-1, but it *looks* as if it was UTF-8. (see http://greenbytes.de/tech/tc2231/#attwithfn2231utf8-bad) This bug is about removing this heuristic (that is, to treat the declared character encoding to be authoritative, not just advisory)

Jason Duell

Comment 9

•

13 years ago

Comment on attachment 607983 [details] [diff] [review] Proposed patch, incl test case OK. I'm going to hold off on landing this until the next code fork (April 24th), since we've closed mozilla-central to all but approved patched till then.

Attachment #607983 - Flags: review?(jduell.mcbugs) → review+

Julian Reschke

Assignee

Comment 10

•

13 years ago

(In reply to Jason Duell (:jduell) from comment #9) > OK. I'm going to hold off on landing this until the next code fork (April > 24th), since we've closed mozilla-central to all but approved patched till > then. Makes perfect sense. Thank you!

Jason Duell

Comment 11

•

13 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/7902fd4ca51d

(no longer active)

Comment 12

•

13 years ago

https://hg.mozilla.org/mozilla-central/rev/7902fd4ca51d

Status: NEW → RESOLVED

Closed: 13 years ago

Resolution: --- → FIXED

Target Milestone: --- → mozilla15

You need to log in before you can comment on or make changes to this bug.

Bugzilla

RFC2231/5987 encoding: charset information should be treated as authoritative

Categories

(Core :: Networking, defect)

Tracking

()

People

(Reporter: julian.reschke, Assigned: julian.reschke)

References

(Blocks 1 open bug,
URL
)

Details

Crash Data

Security

(public)

User Story

Attachments

(1 file, 3 obsolete files)

Description

Updated

Comment 1

Comment 2

Comment 3

Comment 4

Comment 5

Comment 6

Comment 7

Comment 8

Comment 9

Comment 10

Comment 11

Comment 12

Attachment

General

Description

File Name

Content Type