Open Bug 560388 Opened 14 years ago Updated 2 years ago

Bogus content-type headers indistinguishable from absence of content-type header

Tracking

()

Status:

NEW

People

(Reporter: zwol, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: sec-low, Whiteboard: [sg:low] data exfiltration from sites with bad HTTP content labeling[necko-would-take])

Zack Weinberg (:zwol)

Reporter

Description

•

14 years ago

nsIChannel::GetContentType appears to return the magic string "application/x-unknown-content-type" not only when there was no Content-Type header at all, but also when the header was unparseable (e.g. "Content-Type: */*", "Content-Type: bogus", or "Content-Type:") or when the server actually sent "application/x-unknown-content-type" as the header value.
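To make the ambiguity concrete, here is a rough Python model of the behavior described above. It is only a sketch, not the real Necko code: the function name `reported_content_type` and the simplified "type/subtype" check are assumptions made for illustration.

```python
import re
from typing import Optional

UNKNOWN = "application/x-unknown-content-type"

# Assumption: a deliberately simplified "type/subtype" check. The real parser
# is more involved; this only needs to reject the examples from the report.
_MEDIA_TYPE = re.compile(r"^[a-z0-9!#$&^_.+-]+/[a-z0-9!#$&^_.+-]+$")


def reported_content_type(header_value: Optional[str]) -> str:
    """Model of what a consumer sees; None means the header was absent."""
    if header_value is None:
        return UNKNOWN  # no Content-Type header at all
    media_type = header_value.split(";", 1)[0].strip().lower()
    if not _MEDIA_TYPE.match(media_type):
        return UNKNOWN  # header present but unparseable
    return media_type  # includes a server that literally sends UNKNOWN


# Absent, three kinds of nonsense, and the literal magic string.
cases = [None, "*/*", "bogus", "", UNKNOWN]
results = {reported_content_type(value) for value in cases}
```

In this model all five inputs collapse to the same string, so a consumer holding only the result cannot tell which case it is in.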

For security reasons (see bug 524223 -- the instant concern is style sheets) I need to be able to reliably distinguish the total absence of a Content-Type header (which should trigger content sniffing) from a Content-Type header that was present but gobbledygook (which should, at least for style sheets, cause the load to be discarded).

Proposal:

We have three internal-use-only MIME types: application/x-unknown-content-type, application/x-vnd.mozilla.guess-from-ext, and application/x-view-source. Move these to a new x-internal/ type group (so x-internal/unknown, x-internal/guess-from-ext, x-internal/view-source -- no need for double x- prefixes). Add another such type, x-internal/parse-error. Make the Content-Type header parser give back x-internal/parse-error whenever the Content-Type header is empty or nonsense. If we see x-internal/anything from the server, map that to x-internal/parse-error as well.

Consumers of this information should normally treat x-internal/parse-error as equivalent to application/octet-stream, but it's possible that debugging extensions or equivalent might want to distinguish them.

Consumers should also treat failure of GetContentType() as equivalent to a result of x-internal/parse-error.
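A minimal Python sketch of the proposal, under stated assumptions: the function names (`classify_content_type`, `effective_type`, `stylesheet_action`) and the simplified media-type check are hypothetical, and the style sheet policy only mirrors the sniff-versus-discard distinction asked for above, not any existing loader code.

```python
import re
from typing import Optional

UNKNOWN = "x-internal/unknown"          # header absent
PARSE_ERROR = "x-internal/parse-error"  # header present but empty or nonsense

# Assumption: simplified "type/subtype" syntax check, for illustration only.
_MEDIA_TYPE = re.compile(r"^[a-z0-9!#$&^_.+-]+/[a-z0-9!#$&^_.+-]+$")


def classify_content_type(header_value: Optional[str]) -> str:
    """Map a raw Content-Type value (None if absent) to a reported type."""
    if header_value is None:
        return UNKNOWN
    media_type = header_value.split(";", 1)[0].strip().lower()
    if not _MEDIA_TYPE.match(media_type):
        return PARSE_ERROR
    if media_type.startswith("x-internal/"):
        # Servers must not be able to inject internal-only types.
        return PARSE_ERROR
    return media_type


def effective_type(classified: str) -> str:
    """Default consumer view: a parse error behaves like octet-stream."""
    if classified == PARSE_ERROR:
        return "application/octet-stream"
    return classified


def stylesheet_action(header_value: Optional[str]) -> str:
    """Hypothetical style sheet policy built on the classification."""
    classified = classify_content_type(header_value)
    if classified == UNKNOWN:
        return "sniff"    # no header: content sniffing may run
    if classified == PARSE_ERROR:
        return "discard"  # gobbledygook header: drop the load
    if classified == "text/css":
        return "load"
    return "defer"        # other types: left to the existing rules
```

With this split, `stylesheet_action(None)` yields "sniff" while `stylesheet_action("bogus")` yields "discard", which is the distinction the current single magic string cannot express.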

This is not technically a problem with the HTTP code -- my best guess at the proper location of the fix is netwerk/base/nsURLHelper.cpp:net_ParseMediaType -- but HTTP is, as far as I can tell, the only protocol we have that believes Content-Type headers sent by the server, so I am filing it there.

Zack Weinberg (:zwol)

Reporter

Updated

•

14 years ago

Whiteboard: [sg:low] data exfiltration from sites with bad HTTP content labeling

Boris Zbarsky [:bzbarsky]

Comment 1

•

14 years ago

data: presumably has similar behavior, right?  Or does it end up falling back on text/plain or bailing out if the type is not parseable?

If all we cared about is HTTP you could look at the header value yourself, but I agree that it would be better to not create special-cases like that.

The proposal sounds fine to me, except the part about treating the parse error as octet-stream.  I'd need some data on what other UAs do for that; it could turn into a web compat issue.

Zack Weinberg (:zwol)

Reporter

Comment 2

•

14 years ago

(In reply to comment #1)
> data: presumably has similar behavior, right?  Or does it end up falling back
> on text/plain or bailing out if the type is not parseable?

Dunno, will investigate.

> The proposal sounds fine to me, except the part about treating the parse error
> as octet-stream.  I'd need some data on what other UAs do for that; it could
> turn into a web compat issue.

There's a test page that serves CSS under a variety of content types at http://crypto.stanford.edu/~collinj/research/css/ and it shouldn't be that hard to extend to other stuff.  I get the impression that (post their equivalent of bug 524223 being fixed) other browsers are pickier than we are about malformed Content-Type.

Boris Zbarsky [:bzbarsky]

Comment 3

•

14 years ago

Oh, I don't mean for CSS.  For CSS I'm happy to be picky and treat the bogus types as application/octet-stream here.  My concern is mostly full-page loads.

Zack Weinberg (:zwol)

Reporter

Comment 4

•

14 years ago

http://tools.ietf.org/html/draft-abarth-mime-sniff-04 seems to consider unparseable content-type headers as equivalent to none at all, but it doesn't consider CSS at all, and is quite eager about falling back to text/plain or application/octet-stream.

Boris Zbarsky [:bzbarsky]

Comment 5

•

14 years ago

Yeah, that's basically the algorithm for loads in <iframe>s.  For other types of loads, different sniffing rules need to apply.

Curtis Koenig [:curtisk-use curtis.koenig+bzATgmail.com]]

Updated

•

12 years ago

Keywords: sec-low

Zack Weinberg (:zwol)

Reporter

Updated

•

12 years ago

Blocks: mimesniff

Patrick McManus [:mcmanus]

Updated

•

8 years ago

Whiteboard: [sg:low] data exfiltration from sites with bad HTTP content labeling → [sg:low] data exfiltration from sites with bad HTTP content labeling[necko-would-take]

Firefox Bug Husbandry Bot

Comment 6

•

7 years ago

Bulk change to priority: https://bugzilla.mozilla.org/show_bug.cgi?id=1399258

Priority: -- → P5

BMO Automation

Updated

•

2 years ago

Severity: normal → S3


Categories

(Core :: Networking: HTTP, defect, P5)


Security

(public)
