Closed Bug 1255570 (CVE-2016-5251) Opened 8 years ago Closed 8 years ago

HTTP(S) URL spoof in location bar

Categories

(Firefox :: Address Bar, defect)

45 Branch
defect
Not set
normal

Tracking

()

VERIFIED FIXED
Firefox 48
Tracking Status
firefox48 --- verified

People

(Reporter: firace, Assigned: Gijs)

References

Details

(Keywords: csectype-spoof, sec-low, Whiteboard: [necko-backlog][adv-main48+])

Attachments

(3 files)

Attached file testcase.html
User Agent: Mozilla/5.0 (Windows NT 6.3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.87 Safari/537.36

Steps to reproduce:

Combination of data URI, Unicode characters and frames. 
Spoof is not perfect but good enough to easily fool your mom and dad. 
At least it worked with mine :)

Follow link in testcase file for a simple demo.



Actual results:

Browser navigates to arbitrary website but URL bar shows https://secure.paypal.com/   


Expected results:

At the very least misleading Unicode characters should be detected/escaped and/or a big warning should be displayed.

Note: I'm submitting a similar report to Chromium.
Component: Untriaged → Location Bar
Forgot to mention, tested with Firefox 45.0.
Flags: needinfo?(dveditz)
Flags: needinfo?(abillings)
Flags: needinfo?(abillings)
Status: UNCONFIRMED → NEW
Ever confirmed: true
Keywords: csectype-spoof
This is similar to (inspired by?) bug 1221444 fixed in Firefox 43, too lax handling of data: urls. That bug messed with the mediatype, this one messes with the optional ";base64" bit. If we encounter a ";" in the content type section and it's not followed by "base64" that should be an error. Instead it looks like (from symptoms, have not debugged code) we ignore anything not "base64" and go on our merry way. Since we display the URL that means this field can be used for spoofing arbitrary text. There are a few characters we'd percent-encode, but I bet you could get around that using Unicode look-alikes.
Group: firefox-core-security → network-core-security
Component: Location Bar → Networking
Flags: needinfo?(dveditz)
Keywords: sec-low
Product: Firefox → Core
See Also: → CVE-2015-7211
looks like we treat it as an unknown, and thus ignored, content type parameter. What breaks if we stop ignoring those and instead treat them as invalid? Probably something -- see bug 781693 where sites are putting invalid parameters after the ";base64" bit. That means, by the way, that you could definitely do the same kind of spoof by putting this text after the base64 instead of before.

URL in this bug (shortened):

data:text/html;%E2%80%83%F0%9F%94%92%E2%80%83https://paypal.com/%E2%80%83etc.%E2%80%83charset=utf-8;base64,c3Bvb2Y=

see:
https://dxr.mozilla.org/mozilla-central/rev/dd1abe874252e507b825a0a4e1063b0e13578288/netwerk/protocol/data/nsDataHandler.cpp#197

It "works" just as well (but at a different point in the code) to put the base64 first:

data:text/html;base64;%E2%80%83%F0%9F%94%92%E2%80%83https://paypal.com/%E2%80%83etc.%E2%80%83charset=utf-8,c3Bvb2Y=

see:
https://dxr.mozilla.org/mozilla-central/rev/dd1abe874252e507b825a0a4e1063b0e13578288/netwerk/protocol/data/nsDataHandler.cpp#182
Whiteboard: [necko-backlog]
Escaping non-ASCII characters where media type parameters are expected should be sufficient, no?
(In reply to Julian Reschke from comment #7)
> Escaping non-ASCII characters where media type parameters are expected
> should be sufficient, no?

nsDataHandler doesn't have a say about what characters are escaped.
The location bar is where the entire string is unescaped, and it does so regardless of what media type, etc.
We could probably start ignoring certain characters, or rejecting them, but that's likely to break some odd use case for someone.
(In reply to Valentin Gosu [:valentin] from comment #8)
> (In reply to Julian Reschke from comment #7)
> > Escaping non-ASCII characters where media type parameters are expected
> > should be sufficient, no?
> 
> nsDataHandler doesn't have a say about what characters are escaped.
> The location bar is where the entire string is unescaped, and it does so
> regardless of what media type, etc.
> We could probably start ignoring certain characters, or rejecting them, but
> that's likely to break some odd use case for someone.

The location bar could be taught not to unescape non-ascii for data: and javascript: URIs for display, without much loss of use for 99.99% of users, I think.

I haven't looked into this bug in detail, so I'm not sure, but:
1) Marco, would you agree that my suggestion is feasible in principle;
2) Dan, would that address the issue here?
Flags: needinfo?(mak77)
Flags: needinfo?(dveditz)
I moved this to Core::Networking figuring we'd fix it in the data: URL handler like bug 1221444. The only real fix there, though, is either to reject such URLs as invalid or to strip out the unknown bits and re-write the URL--either of which is fairly guaranteed to break some site some where (cf bug 781693). We could only take that approach if we added telemetry to these and measured the use of these variants. That would take a while.

Gijs' suggestion puts the ball back in the front-end court. I like it, so I'll move the bug back. I'd prefer a whitelist approach that says we WILL unescape http(s)/ftp/??? rather than a blacklist of data: and javascript:, but I'd accept either as a fix.

NB: expect complaints from a small number of developers angry their bookmarklets are suddenly filled with ugly %20. I guess it depends on whether you skip the "prettify" step entirely for those schemes or just change it to treat non-ascii differently.
Group: network-core-security → core-security-release
Component: Networking → Keyboard Navigation
Flags: needinfo?(dveditz)
Product: Core → Firefox
(moving groups because this is unfixed; dveditz, let me know if you moved it to -release over fx-frontend intentionally...)
Group: core-security-release → firefox-core-security
I think it's feasible, I don't know if the devs complains would be so many.
The bookmarks dialogs are already not unescaping the url, we have bugs filed but not so many complains.
The awesomebar popup shows unescape bookmarklets and we could and should continue to do that since it's just a temporary visualization. So when the user selects the bookmarklet, it will be pretty-printed.

The only thing that will change is the final url shown in the input field.
So I assume the broken use-case is when one selects a bookmarklet from the popup, and then wants to quickly edit it? I don't know how common is as a use-case, I could argue it's less common than editing a bookmarklet stored as a bookmark, where we don't unescape.
An intermediate solution to that use'case, could be to unescape the url only when it's being edited, but it may be more risky and regression-prone.
Flags: needinfo?(mak77)
Attached patch Patch v1.0Splinter Review
Always reassuring if there is already test coverage for this. I looked where that test came from, and it seems to have been there from the start of bug 666964, but there's not much rationale in that bug as to why/how we're unescaping even non-ascii.
Assignee: nobody → gijskruitbosch+bugs
Status: NEW → ASSIGNED
Attachment #8731360 - Flags: review?(mak77)
Component: Keyboard Navigation → Location Bar
Comment on attachment 8731360 [details] [diff] [review]
Patch v1.0

Review of attachment 8731360 [details] [diff] [review]:
-----------------------------------------------------------------

::: browser/base/content/browser.js
@@ +2374,5 @@
> +      // This only decodes ascii characters (hex) 20-7e, except 25 (%).
> +      // This avoids both cases stipulated below (%-related issues, and \r, \n
> +      // and \t, which would be %0d, %0a and %09, respectively) as well as any
> +      // non-US-ascii characters.
> +      value = value.replace(/%(2[0-4]|2[6-9a-f]|[3-6][0-9a-f]|7[0-9a-e])/g, decodeURI);

Should I assume decoding %20 is not considered a problem regarding this bug? Cause in the end one could still isolate any wanted address from the rest in the input field through spaces.
Or maybe I'm misunderstanding part of this.
(In reply to Marco Bonardo [::mak] from comment #14)
> Comment on attachment 8731360 [details] [diff] [review]
> Patch v1.0
> 
> Review of attachment 8731360 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: browser/base/content/browser.js
> @@ +2374,5 @@
> > +      // This only decodes ascii characters (hex) 20-7e, except 25 (%).
> > +      // This avoids both cases stipulated below (%-related issues, and \r, \n
> > +      // and \t, which would be %0d, %0a and %09, respectively) as well as any
> > +      // non-US-ascii characters.
> > +      value = value.replace(/%(2[0-4]|2[6-9a-f]|[3-6][0-9a-f]|7[0-9a-e])/g, decodeURI);
> 
> Should I assume decoding %20 is not considered a problem regarding this bug?
> Cause in the end one could still isolate any wanted address from the rest in
> the input field through spaces.
> Or maybe I'm misunderstanding part of this.

I think your understanding is correct. The patch assumes that US-ascii is OK.

The more I think about it though, the more this seems generically problematic - replacing the \u2003 from the testcase with ascii spaces seems to work, too? Why is ascii whitespace allowed in the media type here?
Flags: needinfo?(valentin.gosu)
(In reply to :Gijs Kruitbosch from comment #15)
> > Should I assume decoding %20 is not considered a problem regarding this bug?
> > Cause in the end one could still isolate any wanted address from the rest in
> > the input field through spaces.
> > Or maybe I'm misunderstanding part of this.
> 
> I think your understanding is correct. The patch assumes that US-ascii is OK.
> 
> The more I think about it though, the more this seems generically
> problematic - replacing the \u2003 from the testcase with ascii spaces seems
> to work, too? Why is ascii whitespace allowed in the media type here?

From what I remember whitespace is stripped from a data URI, so maybe that's why that works.
Flags: needinfo?(valentin.gosu)
Can you provide a link to the Chromium issue?
Flags: needinfo?(firace)
Sure, here's a link to the Chromium issue. But note that meanwhile, it was marked as WontFix as the LOCK Unicode character is not decoded by Chromium.

https://bugs.chromium.org/p/chromium/issues/detail?id=594057
Flags: needinfo?(firace)
(In reply to Valentin Gosu [:valentin] from comment #16)
> (In reply to :Gijs Kruitbosch from comment #15)
> > > Should I assume decoding %20 is not considered a problem regarding this bug?
> > > Cause in the end one could still isolate any wanted address from the rest in
> > > the input field through spaces.
> > > Or maybe I'm misunderstanding part of this.
> > 
> > I think your understanding is correct. The patch assumes that US-ascii is OK.
> > 
> > The more I think about it though, the more this seems generically
> > problematic - replacing the \u2003 from the testcase with ascii spaces seems
> > to work, too? Why is ascii whitespace allowed in the media type here?
> 
> From what I remember whitespace is stripped from a data URI, so maybe that's
> why that works.

Correct, ascii spaces are stripped, which is why I had to get a little creative. :)
Hm, so I can reproduce the spaces collapsing if I modify the testcase to navigate to that URI with spaces instead of the \u2003. However, if I myself type:

"data:text/html;                    
Umm, wow, bugzilla, you don't like that comment, do you? Let's try again...
----

Hm, so I can reproduce the spaces collapsing if I modify the testcase to navigate to that URI with spaces instead of the \u2003. However, if I myself type:

"data:text/html;                      [lock]                      https://secure.paypal.com/,foo"

sans quotes in the URL bar, the spaces don't get collapsed. :-\

Anyway, it sounds like a good mitigation here will be to escape non-ascii, even if we leave spaces, as far as exploiting this from the web is concerned.

One other thing that we could consider doing is mangling the data URI completely for display, but that would involve reimplementing some of the parsing stuff in frontend JS which I'm not keen on - feels like we'll just be introducing bugs there instead.
(In reply to :Gijs Kruitbosch from comment #21)
> Anyway, it sounds like a good mitigation here will be to escape non-ascii,
> even if we leave spaces, as far as exploiting this from the web is concerned.

yeah, I think we only care about web exploitability here.
Attached file test.html
to clarify, this is what I meant.
So I thought we might be able to allow "data:text/html,", but:



data:text/html,                          | PayPal Inc. (US) |  https://paypal.com/                                                                                                                                                                                                                                                          <html><body><p>Hello, this is a test</p><script>var p = document.getElementsByTagName("p")[0]; while (p.previousSibling) p.previousSibling.remove();</script></body></html>

would work just as well.

So really, there is no way we can display this with the spaces intact, and not suffer spoofing. :-\

I think that means we can't take the patch as-is (I mean, we can, but it won't address all the issues here).

(In reply to Marco Bonardo [::mak] from comment #12)
> An intermediate solution to that use'case, could be to unescape the url only
> when it's being edited, but it may be more risky and regression-prone.

Where would we do this?
Flags: needinfo?(mak77)
(In reply to :Gijs Kruitbosch (away 24-29/3, incl.) from comment #24)
> So really, there is no way we can display this with the spaces intact, and
> not suffer spoofing. :-\

Right, that's why I was asking if we should decode %20 or not... As soon as we decode them, it is spoofable. Sure we won't show the lock, that is the worst part of this bug...
We could decide to not decode %20, that is what may annoy devs using bookmarklets (comment 10).

> (In reply to Marco Bonardo [::mak] from comment #12)
> > An intermediate solution to that use'case, could be to unescape the url only
> > when it's being edited, but it may be more risky and regression-prone.
> 
> Where would we do this?

it would be somewhere in the focus/blur event handlers in urlbarBindings.xml. Basically it would work similarly to the formaValue thing that currently hilights the domain.
http://mxr.mozilla.org/mozilla-central/source/browser/base/content/urlbarBindings.xml#1015

Though, it's by far more regressions prone than the approach in the current patch.
Flags: needinfo?(mak77)
we could limit the decoding-on-edit to javascript urls though... that may be less scary.
Attachment #8731360 - Flags: review?(mak77)
(In reply to Marco Bonardo [::mak] from comment #26)
> we could limit the decoding-on-edit to javascript urls though... that may be
> less scary.

I mean, selfishly, I use data: URIs fairly frequently for testing, and having spaces would annoy me. Just sticking:

data:text/html,                                       <body>Hello this is a test</body>

in Google Chrome also leaves all the spacing alone and unescaped, even when not editing.

Dan, what's the bar for spoofing that we want to be using here?
Flags: needinfo?(dveditz)
(In reply to :Gijs Kruitbosch from comment #11)
> dveditz, let me know if you moved it to -release over fx-frontend intentionally...)

I did. It's a low-severity bug and somewhat similar to a fixed one we've announced and unhidden. I figured better internal access was a win, and we didn't need the extra security of compartmentalization--especially since this kind of straddles front-end and networking. For example, we could decide the data: protocol handler should error out on any non-ascii encountered in the media type section. The various relevant RFCs would seem to give us permission to do that although it'll probably break something somewhere.

But then, as you point out in comment 24, the spaces and spoofy stuff (like the lock icon) could be in the body and we can't reject that. All we can do is decide to unescape or not.

We might have to leave this wontfix like Chrome.
Group: firefox-core-security → core-security-release
Flags: needinfo?(dveditz)
(In reply to Daniel Veditz [:dveditz] from comment #28)
> We might have to leave this wontfix like Chrome.

not decoding the lock could still be a good thing to do regardless.
Comment on attachment 8731360 [details] [diff] [review]
Patch v1.0

Review of attachment 8731360 [details] [diff] [review]:
-----------------------------------------------------------------

(In reply to Marco Bonardo [::mak] from comment #29)
> (In reply to Daniel Veditz [:dveditz] from comment #28)
> > We might have to leave this wontfix like Chrome.
> 
> not decoding the lock could still be a good thing to do regardless.

I agree. Re-requesting review based on the last few comments. :-)
Attachment #8731360 - Flags: review?(mak77)
Comment on attachment 8731360 [details] [diff] [review]
Patch v1.0

Review of attachment 8731360 [details] [diff] [review]:
-----------------------------------------------------------------

OK, let's try this.
It's not perfect, but the important thing is that the spoofed url is not good enough to trick people, and if we can avoid graphics and (mostly) lock icons, we are close enough to that.
On the other side if we'd be more strict we'd annoy any technical user by not decoding ascii, so that part will stay a wontfix. That means things like comment 23 can still happen, but we assume the different scheme, missing colors, missing lock and the security info panel, are good enough to avoid tricking the user.
Attachment #8731360 - Flags: review?(mak77) → review+
https://hg.mozilla.org/mozilla-central/rev/995e7890dd61
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 48
Sorry, I'm not very familar with Firefox development. 
Is a nightly build with this fix already available somewhere?
(In reply to firace from comment #34)
> Sorry, I'm not very familar with Firefox development. 
> Is a nightly build with this fix already available somewhere?

Yes, https://nightly.mozilla.org .
Ah, that was easy! Thanks.
See Also: → 1268975
Verified as fixed using Firefox 48 beta 10 under Win 10 64-bit, Ubuntu 14.04 64-bit and Mac OS X 10.11.
Status: RESOLVED → VERIFIED
Alias: CVE-2016-5251
Whiteboard: [necko-backlog] → [necko-backlog][adv-main48+]
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.