Last Comment Bug 562915 - escape characters in content-type charset param not parsed properly
: escape characters in content-type charset param not parsed properly
Status: RESOLVED FIXED
:
Product: Core
Classification: Components
Component: Networking: HTTP (show other bugs)
: unspecified
: All All
: -- normal (vote)
: mozilla11
Assigned To: Julian Reschke
:
:
Mentors:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2010-04-30 08:03 PDT by Julian Reschke
Modified: 2012-02-01 13:59 PST (History)
0 users
dao+bmo: in‑testsuite+
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
Proposed patch, incl test case (2.61 KB, patch)
2011-11-09 13:55 PST, Julian Reschke
no flags Details | Diff | Splinter Review
Proposed patch, incl test case (2.75 KB, patch)
2011-11-29 06:08 PST, Julian Reschke
jduell.mcbugs: review+
Details | Diff | Splinter Review

Description Julian Reschke 2010-04-30 08:03:33 PDT
User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3 (.NET CLR 3.5.30729)
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.3) Gecko/20100401 

According to RFC 2616, Section 3.7, the following headers are equivalent:

  Content-Type: text/plain; charset="U\TF-8"
  Content-Type: text/plain; charset="UTF-8"

However, Mozilla seems no to un-escape the string value, and thus doesn't see the charset value of UTF-8.

Yes, this is an edge case. It's a bug nevertheless.
  

Reproducible: Always
Comment 1 Julian Reschke 2011-08-07 07:29:01 PDT
See test case at

  http://greenbytes.de/tech/tc/httpcontenttype/#textplainutf8qsescaped

(currently Konqueror and Safari passing).

See also

  http://trac.tools.ietf.org/wg/httpbis/trac/ticket/270
Comment 2 Julian Reschke 2011-11-09 06:52:41 PST
The problem is caused by

  net_ParseContentType

returning the unescaped charset; however this is non-trivial to change because it just returns pointers into the raw header field, and that header field is const.

The right approach seems to be to change the method signature to return a new string, however this means touching multiple other places.
Comment 3 Julian Reschke 2011-11-09 13:39:03 PST
(In reply to Julian Reschke from comment #2)
> The problem is caused by
> 
>   net_ParseContentType
> 
> returning the unescaped charset; however this is non-trivial to change
> because it just returns pointers into the raw header field, and that header
> field is const.
> ...

Turns out I was too pessimistic, please ignore above comment.
Comment 4 Julian Reschke 2011-11-09 13:55:22 PST
Created attachment 573312 [details] [diff] [review]
Proposed patch, incl test case

Applies on top of patch for bug 700589
Comment 5 Jason Duell [:jduell] (needinfo me) 2011-11-28 11:40:39 PST
Comment on attachment 573312 [details] [diff] [review]
Proposed patch, incl test case

Review of attachment 573312 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/base/src/nsURLHelper.cpp
@@ +888,4 @@
>          } while (curParamStart < flatStr.Length());
>      }
>  
> +    bool charsetNeedsUnquote = false;

From my understanding of the code, this variable isn't actually controlling whether we need to find a matching " at the end of the string, but rather just whether we need to ignore any backslashes in the charset.  Correct?  If so, maybe rename to 'charsetRemoveBackslashes'?

@@ +926,4 @@
>  
>          if ((!eq && *aHadCharset) || typeHasCharset) {
>              *aHadCharset = true;
> +            if (charsetNeedsUnquote) {

Maybe add comment here:

   // RFC 2616, Section 3.7 says we must ignore XYZ

Actually, looking at section 3.7, I don't see how it says that we need to be ignoring the backslashes.  But that probably just means I haven't chased enough RFC definitions far enough (is the behavior implied by 'parameter'?  I only see it saying the match needs to be case-insensitive).
Comment 6 Julian Reschke 2011-11-29 06:08:13 PST
Created attachment 577587 [details] [diff] [review]
Proposed patch, incl test case

Addresses review comments (made name of flag more precise, add comment why we're unescaping here)
Comment 7 Jason Duell [:jduell] (needinfo me) 2011-11-29 20:08:58 PST
Comment on attachment 577587 [details] [diff] [review]
Proposed patch, incl test case

Review of attachment 577587 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks Julian!
Comment 8 Julian Reschke 2011-11-30 04:34:42 PST
Try results: https://tbpl.mozilla.org/?tree=Try&rev=74b71078993b
Comment 10 Ed Morley [:emorley] 2011-12-04 07:20:16 PST
https://hg.mozilla.org/mozilla-central/rev/abe0e4c2c403

Note You need to log in before you can comment on or make changes to this bug.