Closed Bug 878485 Opened 11 years ago Closed 10 years ago

URL Parsing Error on login.persona.org (and in Persona SDK)

Categories

(Cloud Services :: Server: Identity, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: karthik.bhargavan, Assigned: francois)

Details

(Keywords: sec-moderate, wsec-authentication, Whiteboard: [site:login.persona.org][reporter-external])

Attachments

(3 files, 1 obsolete file)

Attached file persona_url.txt
User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.93 Safari/537.36
(Inlining the attached report as text)

Summary:
--------------------------

The Persona verification service at https://verifier.login.persona.org/verify incorrectly
parses the "audience" URL parameter provided to it. For some carefully crafted audience URLs, when the verifier parses the audience, it obtains a incorrect hostname, and bases
its verification result on this hostname. As a result, it returns true for an assertion
sent from one origin, even if the assertion was issued for a different origin. 

This URL parsing vulnerability is not isolated; I have investigated the Persona sources on Github and determined that the bug lies in the url.parse function included with nodejs, 
which is used (in seemingly security-criticl ways) also in other Persona files, such as http_forward. 

I am independently reporting the bug to Node developers, but recommend that Persona protects its website by pushing its own patch to fix URL parsing, at least for audience URLs. Moreover, to protect developers using its SDK (who may be using unpatched versions of Node), you may consider pushing the patch to the Persona SDK.

Vulnerability
--------------------------

The verification service at https://verifier.login.persona.org takes as a parameter an assertion and an audience URL which is supposed to be an origin, e.g. https://voo.st:443

So, suppose you send "assertion=<Assertion>&audience=https://voo.st:443"  to this URL,
it should return true if and only if the <Assertion> was issued for this audience. Otherwise,
a malicious website may be able to use an assertion given to it by a user to impersonate that user at other websites.

However, if you send "assertion=<Assertion>&audience=https://a@voo.st@google.com",
the verifier still accepts the assertion and returns true. This is because verifier.login.persona.org parses the audience as the origin https://voo.st, even though
all RFC3986-compliant browsers would in fact parse this audience as the origin https://google.com (with user a%40voo.st)

In other words, verifier.login.persona.org does not correctly parse audience URLs of the
form http[s]://a@b@c; it thinks that the host is b and the path is /@c, whereas in fact the host here is c and the path is /

This vulnerability stems from Persona's use of the NodeJS url library to parse such URLs,
which is unnecessary here because though the audience field is meant to be an HTTP[s] origin, which is much less general than full URLs and easier to parse using a tight regular expression.

Potential Attacks
--------------------------

The vulnerability in verifier.login.persona.org is useful to illustrate the problem, but perhaps not so easy to exploit. It relies on a user convincing an honest website (say google.com) into requesting assertion verification for a carefully crafted audience URL. Then even if google.com verifies that the audience URL has origin https://google.com; it would be fooled into accepting an assertion that was issued to https://evil.com

However, on inspecting the code for Persona on Github, we discovered that url.parse was being used in a wide variety of Persona files, and some of these uses are certainly security-critical (e.g. http_forward.js and wsapi.js). I did not want to try live attacks on the Persona APIs, but I am confident that a full review of all uses of url.parse is in order.
If the URL being parsed is untainted or authenticated in some way, it is safe, otherwise one must be wary of carefully crafted url's breaking URL parsing.

Countermeasure
--------------------------

For verifier.login.persona.org, we suggest replacing url.parse with something far more specific only for parsing origins in a particular format, and documenting this format in
the Persona RP documentation. For other uses of url.parse where full URL parsing may be needed, we recommend waiting for a patch to NodeJS or writing your own 
RFC3986-compliant parser in JavaScript.

About us
--------------------------

We are an academic security research group called Prosecco at INRIA in France (http://prosecco.inria.fr). Our work focuses on formal security proofs for cryptographic and web applications. We found this attack when trying to build and formally verify our own implementations of single sign-on protocols. We would welcome the opportunity to study any fixes you may propose and incorporate them into our analyses.
assigning to yvan for verification
https://wiki.mozilla.org/Security/Web_Bug_Rotation#Web_Bug_Verification
Assignee: nobody → yboily
Whiteboard: [site:login.persona.org][verif?]
Great find!  If you don't hear from the node.js folks in a timely fashion, please let us know, and watch this bugs for updates as we resolve the issue!
Status: UNCONFIRMED → NEW
Ever confirmed: true
Whiteboard: [site:login.persona.org][verif?] → [site:login.persona.org][reporter-external]
Flags: needinfo?(lhilaiel)
This might be more easily exploitable in sites that

a) blindly copy the audience from the Host: header
b) Work ok even if the Host: header looks malformed to make this exploit work

I know I do a) in development mode. There might also be persona libraries that do it.
We definitely don't blindly copy from Host: headers, as that is a problem even without this bug :)

Great find, I think more relevant to our internal uses of the system rather than the verifier, though of course we want to fix that in the verifier too. I don't think this is an exploitable bug yet, but it's very much worth fixing.

adding Francois
Flags: needinfo?(lhilaiel)
@yvan or benadida given what ben said, do we still consider this a sec-high? I have removed the sec-high for now, can one of you update the whiteboard tag with what it should be?
Keywords: sec-high
I am hesitating to try some more examples on the live persona server.

But do note that the email.js file uses url.parse to compute the RP for
sendNewUserEmail, sendForgotPasswordEmail, sendConfirmationEmail, etc.
Line 96:   site = url.parse(site).hostname;
Line 127: 'X-BrowserID-RelyingParty': site

Examples like this are (naturally) littered through the browserid code.
Is this one exploitable? I'll leave it to you to suss it out quicker than me.

We are currently in the process of testing a nodejs patch for url.js, 
but given the many pitfalls of browser-consistent URL parsing (whatever that means: https://code.google.com/p/browsersec/wiki/Part1#Uniform_Resource_Locators) I would recommend minimizing the use of generic URL parsing unless necessary.

Of course, I am plugging to retain the high-security tag :)
But I don't know your labeling process well enough.

Best,
Karthik
To document the attack scenario on the original https://a@b@c bug above,
the one I think we have all been considering is as follows.
(Note that I do not know any RP that does this, but it seems slikely someone will):

----
login.yahoo.com decides to use Persona and accepts assertions issued 
for both search.yahoo.com and mail.yahoo.com

To verify these assertions it provides a REST API at login.yahoo.com/verify that
accepts both an assertion and an audience (since the request may come from search or mail). It verifies that the audience is either mail.yahoo.com or search.yahoo.com by calling
url.parse(audience).hostname (or to be safe, by extracting the full origin.) 
It then sends on the assertion and audience to verifier.login.persona.org for verification
and says yes to mail or search, iff verifier says yes.

In the above scenario, an attacker could send login.yahoo.com/verify an audience=https://a@mail.yahoo.com@evil.com and an assertion issued to user Alice for https://evil.com
Then login.yahoo.com would accept this assertion and log the attacker into Alice's mail account.
----

Does this sound about right?
I can also confirm that the malformed URL bug works both ways on verifier.login.persona.org

(A) Verifier accepts audience=https://a@B@C for assertions issued to B (as described above),
(B) Verifier accepts audience=https://B for assertions issued to https://a@B@C

That is both the assertion and the audience are vulnerable to malicious URIs.
If either an RP or a browser implementation of Persona are willing to accept arbitrary
audience URIs and only enforce that the origin of the URI is correct (by comparing with a postMessage.origin, for example), then the URI parsing bug on persona.org becomes exploitable.

For (B) above, the attack demo is as follows.

-----
Attack scenario:
1. 
Suppose an attacker evil.com can get a user to sign an assertion for the audience
https://alice@mozilla.org@evil.com
Note that this above URL, if parsed in Chrome/Firefox, would yield the origin https://evil.com, so it is not inconceivable that such a URI may pass browser-side origin-authentication.

To generate such an assertion, you can try in the persona.login.org dialog box: BrowserID.User.getAssertion("test@mockmyid.com","https://alice@mozilla.org@evil.com",function(a) {console.log(a);})

2. 
The attacker can then use this assertion to pretend to be Alice@mockmyid.com at https://mozilla.org

I tried this assertion using 
curl -d "assertion=<Assertion>&audience=https://mozilla.org" "https://verifier.login.persona.org/verify" 

and it succeeded:
{"status":"okay","email":"test@mockmyid.com","audience":"https://google.com","expires":1370434091232,"issuer":"mockmyid.com"}
-----
As far as I can tell, exploiting this vulnerability on the verifier requires the attacker to find a site that ignores our security guidelines and derives the audience from user-controlled input instead of using a hard-coded string. We should fix it, but those RPs have bigger problems.

More worrying would be if we could trick the shim into generating an assertion for "http://A@victim.com@attacker.com" because then an attacker could use that to impersonate a user on victim.com whenever that user logs into attacker.com.

I can't see how that could be done though since the audience in getAssertion is constructed from either:

- WinChan's origin (dialog/js/modules/dialog.js:76)
- window.location.protocol + window.location.hostname (common/js/user.js:1565)
- JSChannel's origin(communication_iframe/start.js:94)
Thanks for the summary, Francois, it would have taken me forever to gather this list of origin sources. 

If we agree that getAssertion relies on these three methods to get the origin right, then the problem reduces to seeing if they could potentially return you malformed origins.

A quick look at winchan yields the following two fragments of code that parse origins,
and yes, they do it incorrectly as well. Whether this translates to an exploit remains an open question. I will report it independently to the winchan authors.
----
winchan.js: line 48
// given a URL, extract the origin
  function extractOrigin(url) {
    if (!/^https?:\/\//.test(url)) url = window.location.href;
    var m = /^(https?:\/\/[\-_a-zA-Z\.0-9:]+)/.exec(url);
    if (m) return m[1];
    return url;
  }
---
winchan.js: line 62
  if (frames[i].location.href.indexOf(origin) === 0 
----


For both these checks, given a URI = https://www.mozilla.org@evil.com, 
they will accept to match it against the origin https://www.mozilla.org
(whereas this URL is really, of course, on evil.com)

The first function seems to be used for a "sanity check" (line 106).
The second check appears in "findRelay" which is a crucial mechanism for making winchan work on IE.

Whether either of these partial URI checks results in an exploit on BrowserID or other winchan applications, I do not know. But I hope you agree that relying on other libraries (Node/WinChan)
to do URI validation for BrowserID code is a bit hazardous. 

If I had a student to spare, I could give you a firmer answer on how far these vulnerabilities could be pushed towards a real-world exploit. In any case, implementing a strong Origin URL parsing routine for audiences seems worthwhile?
Dear All,

I just submitted a separate bug report on email address parsing in Persona, this time
on regexps in Persona code (e.g. lib/primary.js), rather than in Node.
https://bugzilla.mozilla.org/show_bug.cgi?id=880660

It shows a potential "exploit" against RPs that allow a wider range of email addresses (e.g. full RFC5322) than persona.login.org. Perhaps not a very realistic attack, but still one that allows self-signed assertions for arbitrary users' email addresses. Again the culprit is inadequate validation of user-provided input.

Best,
Karthik
@francois this looks like a valid vuln to me.  Do you have any further comments?
Flags: needinfo?(francois)
I can't say anything about the bug (880660) mentioned in comment #12 because I don't have access to it. Feel free to CC me on it.

For the winchan issues mentioned in comment #11:

> winchan.js: line 48
>   // given a URL, extract the origin
>   function extractOrigin(url) {
>     if (!/^https?:\/\//.test(url)) url = window.location.href;
>     var m = /^(https?:\/\/[\-_a-zA-Z\.0-9:]+)/.exec(url);
>     if (m) return m[1];
>     return url;
>   }

While this regexp is wrong, the only place where extractOrigin() is used (lines 106-107) is where it's being fed input from the site setting up the channel to talk to Persona. So that site would have to use hostile URLs in their call to it to run into problems.

However, there is a risk that this broken function will get used in other places later. Karthik: are you aware of anything we can use that parses URLs properly? (i.e. what would be your proposed fix here?)
Attached patch winchan-878485.patch (obsolete) — Splinter Review
> winchan.js: line 62
>   if (frames[i].location.href.indexOf(origin) === 0

This is a hard one to reproduce. I'm not sure what browsers (and under what circumstances) will expose the username and password as part of location.href.

Nevertheless, it is easy to fix so we should do it. I've attached a proposed fix for it.
Assignee: yboily → francois
Status: NEW → ASSIGNED
Flags: needinfo?(francois)
> However, there is a risk that this broken function will get used in other
> places later. Karthik: are you aware of anything we can use that parses URLs
> properly? (i.e. what would be your proposed fix here?)

Parsing URLs in a browser-generic manner seems quite difficult (too many exceptions) 
without rejecting some valid URLs. The JavaScript URL parsing libraries I have encountered
are almost all buggy because they try to accept all the URLs that a browser can interpret.

However, in this case the task is easier: we need to ensure that the origin returned by "extractOrigin" is the same as the origin as interpreted by the current browser. So, for this use case, I would suggest using the browser's builtin URL parser.

My fix would be the following (tested lightly on FF, Chrome, IE, Safari; probably needs more.)

function extractOrigin(u){
  var a = document.createElement('a'); 
  a.href=u; 
  return a.protocol+"//"+a.host}
I don't have a good answer on how to repro it. I see that your patch also uses the browser's URL parser, like I suggested above for the other instance, and it looks pretty good.

My other concern in this piece of code is the loop that examines all the frames in the opener window. Why not simply search for frames[RELAY_FRAME_NAME] and set RELAY_FRAME_NAME as both "name" and "id" on the corresponding iframe?? Is there some browser-specific incompatibility that I am unaware of? 

Other than the inefficiency of this loop and the fact that it logs security exceptions in some browsers (attempted cross-origin access), the use of frames[i] also triggers a security bug I recently reported in Chrome and the Android Browser. The bug turned out to be in V8, fixed last week, and it bypasses Origin restrictions when accessing frames by index. The simplest would be to open my demo page in Chrome: http://prosecco.inria.fr/iframes.html
(Please treat as confidential till the fix is released.)

The basic problem is that Chrome and Android browser allow a window to overwrite frames[i] with an object and this object then becomes accessible to all other frames, breaking SOP. E.g. in our context, suppose the opening window overwrites frames[0]:

o = {}
o.__defineGetter__("location",function(){/*arbitrary code*/})
frames[0] = o

This seems strange yes, but the above code is allowed in Chrome and Android Browser, and actually changes frames[0]. Now, when the Persona window goes through the frames in window.opener, it in fact will trigger the arbitrary code in frames[0].location, even though it is on a different origin. As you can guess, this origin bypass can lead to XSS. 

What I've described above is a bug in Chrome/V8 and not really of direct concern to Persona; but it suggests that we should avoid iterating through unknown frames on untrusted origins in case these kinds of bugs appear in other forms. Accessing frames[RELAY_FRAME_NAME] seems safer.
Simplified this patch. It turns out that location.host includes the port number already.
Attachment #783595 - Attachment is obsolete: true
Fix for the broken extractOrigin function that's not currently used anywhere important.
> My other concern in this piece of code is the loop that examines all the frames in the
> opener window. Why not simply search for frames[RELAY_FRAME_NAME] and set
> RELAY_FRAME_NAME as both "name" and "id" on the corresponding iframe?? Is there some
> browser-specific incompatibility that I am unaware of? 

I don't know enough about window.frames to know whether we can do that in a way that's compatible with every browser we support, but MDN seems to suggest that we can only iterate through this array-like structure using numeric indices: https://developer.mozilla.org/en-US/docs/Web/API/window.frames

So I would encourage you to move your suggestion to the public bug tracker for winchan (https://github.com/mozilla/winchan/issues) where more people are likely to see it.

I'm not personally worried about the security implications of this loop though because it sounds like it's a pretty serious bug in V8 and that it will get fixed soon.
You're right Francois, the V8 bug will hopefully be pushed out soon.

In any case, I'm quite happy with both your proposed patches.
Hi Karthik,

Thanks for reporting the bug. Francois comment implies that the risk to users is low unless the RP is already configured incorrectly. We have tentatively rated this as a sec-moderate issue and decided not to pay out a bounty. However if there are other circumstances / workflows which we do not know about, we may reconsider a bounty on this bug.
Flags: sec-bounty? → sec-bounty-
Keywords: sec-moderate
This has gone to production on Persona and has been merged to dev.

I opened this pull request on the upstream winchan project: https://github.com/mozilla/winchan/pull/35
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Can this bug be made public now?
Flags: needinfo?(francois)
Yes, it looks like it's been fixed both in Persona and winchan.
Flags: needinfo?(francois)
Group: cloud-services-security
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: