Closed Bug 700589 Opened 13 years ago Closed 13 years ago

HTTP content type charset parameter accepts single quotes

Categories

(Core :: Networking: HTTP, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla11

People

(Reporter: julian.reschke, Assigned: julian.reschke)

References

()

Details

Attachments

(1 file, 4 obsolete files)

The parser for the HTTP content-type header field apparently accepts single quotes as delimiter. This is a bug according to RFC 2616, and also not the case in IE9 and Firefox.

See test case at:

  http://greenbytes.de/tech/tc/httpcontenttype/#textplainutf8qsa
> also not the case in IE9 and Firefox.

s/Firefox/some other browser/  ?   Otherwise we're already fixed :)
Ah, looking at the link I see it was IE9 and Opera that don't accept.  I suppose that's probably good enough to get strict about this.
(In reply to Jason Duell (:jduell) from comment #2)
> Ah, looking at the link I see it was IE9 and Opera that don't accept.  I
> suppose that's probably good enough to get strict about this.

Yep, sorry for that.

Note that the single quote is a legal character in param values, and thus any kind of special handling will cause problems with headers like:

  Content-Type: foo/bar; ext='; charset=qux

(will add a test case)
(In reply to Julian Reschke from comment #3)
> Note that the single quote is a legal character in param values, and thus
> any kind of special handling will cause problems with headers like:
> 
>   Content-Type: foo/bar; ext='; charset=qux
> 
> (will add a test case)

Test case: http://greenbytes.de/tech/tc/httpcontenttype/#textplainutf8extparamsq

And not surprisingly, Firefox fails that, too.
Seems to be quite easy to fix; now looking at the test cases which appear to check for this behavior.
Assignee: nobody → julian.reschke
Attachment #572910 - Flags: review?(jduell.mcbugs)
Comment on attachment 572910 [details] [diff] [review]
Proposed patch, incl updated test case results, plus one new test

Review of attachment 572910 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for the patch.  A couple minor questions.

::: netwerk/test/unit/test_parse_content_type.js
@@ +112,3 @@
>  
>    type = netutil.parseContentType('text/plain, TEXT/HTML; param=charset=UTF8; charset=\'ISO-8859-1\'; param2=charset=UTF16, text/html, TEXT/HTML', charset, hadCharset);
> +  check("text/html", "'ISO-8859-1'", true);  

I'm not clear on why you're adding all the single quoting to the ISO-8859-1 args.  Is that just to test that single quotes make it through the parser and are returned as part of the value, or is this what should actually show up on the web?  If we suspect that most servers are sending the value in double quotes, why not change most of the tests to pass in a dbl-quoted value instead?

@@ +135,5 @@
>    check("text/html", "", false);
>  
> +  // Bug 700589
> +  type = netutil.parseContentType("text/plain; x='; charset=\"UTF-8\"", charset, hadCharset);
> +  check("text/plain", "UTF-8", true);

Do you want to check for x="'" too?
(In reply to Jason Duell (:jduell) from comment #7)
> Comment on attachment 572910 [details] [diff] [review] [diff] [details] [review]
> Proposed patch, incl updated test case results, plus one new test
> 
> Review of attachment 572910 [details] [diff] [review] [diff] [details] [review]:
> -----------------------------------------------------------------
> 
> Thanks for the patch.  A couple minor questions.
> 
> ::: netwerk/test/unit/test_parse_content_type.js
> @@ +112,3 @@
> >  
> >    type = netutil.parseContentType('text/plain, TEXT/HTML; param=charset=UTF8; charset=\'ISO-8859-1\'; param2=charset=UTF16, text/html, TEXT/HTML', charset, hadCharset);
> > +  check("text/html", "'ISO-8859-1'", true);  
> 
> I'm not clear on why you're adding all the single quoting to the ISO-8859-1
> args.  Is that just to test that single quotes make it through the parser
> and are returned as part of the value, or is this what should actually show
> up on the web?  If we suspect that most servers are sending the value in
> double quotes, why not change most of the tests to pass in a dbl-quoted
> value instead?

Well, I just adjusted test results. Apparently the person working on this before felt it was important to test all these cases; I will review those, and also make sure that " has sufficient coverage.

> @@ +135,5 @@
> >    check("text/html", "", false);
> >  
> > +  // Bug 700589
> > +  type = netutil.parseContentType("text/plain; x='; charset=\"UTF-8\"", charset, hadCharset);
> > +  check("text/plain", "UTF-8", true);
> 
> Do you want to check for x="'" too?

Yes, I'll that for clarity.

Thanks so far.
Updated tests; I modified those that didn't seem to test ' vs " to use " instead (so the test changes, but the test result does not). Also added a specific one to the end of the test suite covering the bug for which this ticket was opened.
Attachment #572910 - Attachment is obsolete: true
Attachment #572910 - Flags: review?(jduell.mcbugs)
Attachment #573779 - Flags: review?(jduell.mcbugs)
Comment on attachment 573779 [details] [diff] [review]
Proposed patch, incl updated test case results, plus new tests

Review of attachment 573779 [details] [diff] [review]:
-----------------------------------------------------------------

OK, looks good.  Thanks Julian!
Attachment #573779 - Flags: review?(jduell.mcbugs) → review+
Keywords: checkin-needed
Comment on attachment 573779 [details] [diff] [review]
Proposed patch, incl updated test case results, plus new tests

># HG changeset patch
># Parent 1374294a61191a19ba6f52be7ee980b1f5186532
># User julian.reschke@gmx.de
>Bug 700589 - HTTP content type charset parameter accepts single quotes

Julian: It looks like this is the description of the bug rather than the description of the change.  Could you suggest a checkin message that describes the change?
"Fix Content-Type parser to treat single quote as regular character instead of a delimiter"
Target Milestone: --- → mozilla11
Pushed to try; assuming that goes well, I can push this to mozilla-inbound later today.
  https://tbpl.mozilla.org/?tree=Try&rev=143a2ba7ad7c
This makes us fail a mochitest:
{
10593 ERROR TEST-UNEXPECTED-FAIL | /tests/content/base/test/test_bug397234.html | Headers should match - got text/plain; charset=UTF-8, expected text/plain; charset='uTf-8'
}
https://tbpl.mozilla.org/php/getParsedLog.php?id=7409173&tree=Try

It may be that the test is broken; if so, the test-fixup needs to be included as part of the patch, so that our test boxes will stay green when this is pushed.
Keywords: checkin-needed
Target Milestone: mozilla11 → ---
Version: unspecified → Trunk
So what happens is that the test case checks that the case of the charset param is preserved in the request being sent *if* it does match the charset that is actually in use.

The header set by the test case uses

  charset='uTf-8'

from which we now extract "'uTf-8'" as charset, previously "UTf-8" (single quotes now preserved).

This is an illegal charset, so XMLHttpRequest defaults to UTF-8 (my theory), and thus the code that tries to re-insert the funny-cased original value doesn't get triggered anymore.

Essentially this works as intended, and we can change the test to use double quotes rather than single quotes. It would be good to know though whether it's a pure accident that the test uses single quotes, or whether this was really seen in the wild.
The conservative fix seems to add a workaround to the code in XmlHTTPRequest so that the charset *with* single quotes matches as well, such as (pseudo-code):

  if (specifiedCharset.startsWith("'") && specifiedCharset.ensWith("'")) {
    ...remove the quoting
  }

That way, the observable behavior would stay the same.

(It would be better if this hack could be removed; it might be good to check what other UAs do here, and whether the XHR spec says something about it)
Same as previous patch; except that it adds a workaround on top of another workaround, so that the logic in XmlHttpRequest that rewrites the content-type header continues to do the right thing even for charset names in single quotes.
Attachment #573779 - Attachment is obsolete: true
Attachment #574984 - Flags: review?(jduell.mcbugs)
Comment on attachment 574984 [details] [diff] [review]
Proposed patch, incl updated test case results, plus new tests

specifiedCharset could be empty there, no?  You need to check specifiedCharset.Length() >= 2.
Attached patch Proposed patch, incl test case (obsolete) — Splinter Review
Patch updated as suggested, mochitests not run yet; will try try server now.
Attachment #574984 - Attachment is obsolete: true
Attachment #574984 - Flags: review?(jduell.mcbugs)
Attachment #575957 - Flags: review?(bzbarsky)
Try run for d967b35c9d35 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=d967b35c9d35
Results (out of 13 total builds):
    exception: 12
    failure: 1
Builds available at http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/julian.reschke@gmx.de-d967b35c9d35
Comment on attachment 575957 [details] [diff] [review]
Proposed patch, incl test case

Try results: https://tbpl.mozilla.org/?tree=Try&rev=2002a0e4a1f2 (I don't believe that any problems related here have anything to do with the patch; in particular the test failure for test_bug397234.html is gone)
Try run for 2002a0e4a1f2 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=2002a0e4a1f2
Results (out of 247 total builds):
    exception: 2
    success: 206
    warnings: 35
    failure: 4
Builds available at http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/julian.reschke@gmx.de-2002a0e4a1f2
Comment on attachment 575957 [details] [diff] [review]
Proposed patch, incl test case

This doesn't seem quite right either for the case when specifiedCharset is, say, "'ut'f-8'".
Attachment #575957 - Flags: review?(bzbarsky) → review-
(addressed review comment, try results: https://tbpl.mozilla.org/?tree=Try&rev=38eaa7d2fb60)
Attachment #575957 - Attachment is obsolete: true
Attachment #577733 - Flags: review?(bzbarsky)
Comment on attachment 577733 [details] [diff] [review]
Proposed patch, incl test case

r=me on the XHR bits

Now we get to watch out for this breaking sites....
Attachment #577733 - Flags: review?(bzbarsky) → review+
Keywords: checkin-needed
(In reply to Boris Zbarsky (:bz) from comment #26)
> Now we get to watch out for this breaking sites....

Maybe. We'll see.
I removed 2 instances of end-of-line whitespace (one in nsXMLHttpRequest.cpp, one in test_parse_content_type.js) and pushed:
  https://hg.mozilla.org/integration/mozilla-inbound/rev/4b085d906272
Status: NEW → ASSIGNED
Flags: in-testsuite+
Keywords: checkin-needed
Target Milestone: --- → mozilla11
https://hg.mozilla.org/mozilla-central/rev/4b085d906272
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: