1353429 - ContentPrincipal makes assumptions about escaping of '^' in URIs that don't seem to be true

Reporter

Description

•

7 years ago

Testcase

  <a href="http://test^test/foo^bar#x^y">Test</a>
  <script>
    alert(document.querySelector("a").href);
  </script>

Note that the '^' in the hostname is NOT escaped.  Same thing for .host.

As far as I can tell, ContentPrincipal::GenerateOriginNoSuffixFromURI assumes that GetAsciiHostPort() on such a URI _will_ escape '^' or something.  I suspect that this assumption is false.

(Note that Gecko does escape ^ in the path, but not in the anchor or the hostname... The good news for the former is that we strip off the anchor and query in GenerateOriginNoSuffixFromURI in the "full spec" case, so don't need to worry about those bits.)

Flags: needinfo?(bobbyholley)

Bobby Holley (:bholley)

Comment 1

•

7 years ago

On our call, bz determined that Chrome and edge seem to reject ^, and so we can probably change this spec. NI bz to fill in the details and follow up with Anne.

Flags: needinfo?(bobbyholley) → needinfo?(bzbarsky)

Anne (:annevk)

Comment 2

•

7 years ago

Note that in the standard we've been through this various times already: https://github.com/whatwg/url/issues/159 and https://github.com/whatwg/url/issues/214. Putting arbitrary restrictions on hosts is arguably not in line with the design of generic URLs and might break non-DNS systems. Chrome rejecting this particular code point makes that less of a concern (though I believe that doesn't mean they'll reject it in all code paths that involve their URL parser, so maybe it is), but what do we tell Safari about the rationale for blocking this?

I'd rather we not depend on such risky things to begin with.

Boris Zbarsky [:bzbarsky]

Reporter

Comment 3

•

7 years ago

So in terms of browser behavior, here is a testcase:

  <pre><script>
  var strs = [
    "http://test^test/foo\\bar",
    "http://test:2^3/foo\\bar",
    "http://test/foo^bar",
  ];
  for (var str of strs) {
    var a = document.createElement("a");
    a.setAttribute("href", str);
    var href;
    try {
      href = a.href;
    } catch(e) {
      href = "href getter threw";
    }
    var url;
    try {
      url = (new URL(str).href);
    } catch(e) {
      url = "constructor threw";
    }
    document.writeln(str, " -- ", href, " -- ", url);
  }
  </script>


with a live version at https://jsfiddle.net/yb02ng3r/2/ if someone wants to see it.  Browser behavior seems to be as follows:

* Safari: Allows '^' in host or path.  Does not allow it in port; fails URL parsing in
  that case (as you can tell by the URL ctor throwing on the second output line
  and the '\\' not getting changed to '/' in the href).
* Firefox: Allows '^' in host.  Escapes it in path.  As Safari for port.
* Edge: Escapes '^' in host and path.  Fails parsing for '^' in port, and also
  throws from href getter.
* Chrome: Escapes '^' in host via href, but throws from new URL().  Allows '^' in port
  via href, but throws from new URL() (!).  Escapes '^' in path.

I believe the current spec text aligns with Safari, for what it's worth.

Anne, what we need in practice is a faithful string representation of an origin, in a slightly broader meaning than the HTML spec one.  That means a URL (but without the ref and query bits) and some other additional information.  We need a way to clearly separate the additional information from the URL.  At the time we set this up, it was pretty clear from the URL RFCs that '^' was not allowed in URLs in unescaped form.  For example, it's not in either the "reserved" or the "unreserved" set in RFC 2396, same for RFC 3986.  I _think_ the whatwg URL spec at the time also forbid it in unescaped form, though I don't have a good way to check that.

If we can't use '^' (e.g. by aligning on the Edge behavior instead of the Safari one), what can we use?  I guess we can try '>'.  This is in the userinfo and path percent-encode sets in the URL spec, is not allowed in URLs by the URI RFCs (because <> is used to delimit URLs in text), and I _think_ shouldn't be allowed in hosts, for that very same reason.  That said, I think the URL spec allows it in hosts afaict, and Safari follows the URL spec, nicely breaking every single URL linkifier since 1994 that tries to be reasonable.  :(

> what do we tell Safari about the rationale for blocking this

It aligns with the URI RFCs and the Chrome/Edge behavior (modulo Chrome's "yeah, let's have two different URL parser behaviors" mess), and is thus the quickest way to interop across a broad range of things, including but not limited to web browsers?

Flags: needinfo?(bzbarsky)

Boris Zbarsky [:bzbarsky]

Reporter

Comment 4

•

7 years ago

I filed https://github.com/whatwg/url/issues/291 on the problem I see with '>' in the URL spec as it stands today.  :(

Bobby Holley (:bholley)

Comment 5

•

7 years ago

Also, it would be a PITA to change from ^ to something else, because it's already embedded in the filenames of various persistent storage bits in the file system and whatnot.

Bobby Holley (:bholley)

Comment 6

•

7 years ago

If we _were_ to change the format, mystor was proposing that we just always append the ^, rather than only in the case of having originAttributes. This has the same backwards-compat issues, but at least would remove the need to find a perfect character for this.

Nika Layzell [:nika] (ni? for response)

Comment 7

•

7 years ago

(In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #6)
> If we _were_ to change the format, mystor was proposing that we just always
> append the ^, rather than only in the case of having originAttributes. This
> has the same backwards-compat issues, but at least would remove the need to
> find a perfect character for this.

If we do have to make this change, it would be very nice to be able to unconditionally add a suffix to an origin URI. This would help us with:
a) Always being able to rfind the `^` or whatever other character and split on it,
b) Avoid errors where code assumes that an origin is a valid URI, and tries to parse it as such, which succeeds in most cases but fails when container tabs or private browsing is present (there is an example of this happening, for example, in SessionStorage.jsm which I am fixing in bug 1353844),
c) Easily identify origin strings vs URIs when seeing them in debug output or similar

This is only worthwhile to do if we already have to change origins due to security problems with the current implementation. Changing the format of origins will be a monumental effort so it would be nice to avoid changing it if at all possible (it is used as a literal string for comparison in a bunch of different persistent storage forms, such as IIRC indexeddb, sessionstore, and the permission manager)

Boris Zbarsky [:bzbarsky]

Reporter

Comment 8

•

7 years ago

> but at least would remove the need to find a perfect character for this.

Would it?  Can the originAttributes not contain '^'?

> Avoid errors where code assumes that an origin is a valid URI

Or vice versa.  I have recently had to tell someone repeatedly not to call the "from origin" principal stuff while passing it a URL instead of an origin...

Anne (:annevk)

Comment 9

•

7 years ago

U+0009 TAB, U+000A LF, or U+000C CR would work. U+0023 (#) too probably. Anything non-ASCII if that's an option. From your test results changing whether ^ escapes might be reasonable though. Hopefully Safari is okay with it.

Matt Wobensmith [:mwobensmith][:matt:]

Comment 10

•

7 years ago

Sec bug triage group: What is the security impact of our current implementation? Should this bug remain private?

Nika Layzell [:nika] (ni? for response)

Comment 11

•

7 years ago

(In reply to Anne (:annevk) from comment #9)
> U+0009 TAB, U+000A LF, or U+000C CR would work. U+0023 (#) too probably.
> Anything non-ASCII if that's an option. From your test results changing
> whether ^ escapes might be reasonable though. Hopefully Safari is okay with
> it.

I think the problem with these symbols is that we had to make sure that whatever character we used was legal to use in a filename on all of our supported platforms, as we use origins as part of paths for indexeddb IIRC.

I'd be inclined to keep it a readable character for that reason as well. # seems like a bad idea as it would be possible for a URI to look like that which could mess things up due to anchor references.

(In reply to Matt Wobensmith [:mwobensmith][:matt:] from comment #10)
> Sec bug triage group: What is the security impact of our current
> implementation? Should this bug remain private?

I'm not in said triage group but I think this bug should probably remain private at least until we're confident that we can mitigate the issues. If it was possible to get an unescaped ^ character in a website's domain it would be possible for that website to, for example, escape private browsing mode or read cookies from a different container tab.

(In reply to Boris Zbarsky [:bz] (still a bit busy) (if a patch has no decent message, automatic r-) from comment #8)
> > but at least would remove the need to find a perfect character for this.
> 
> Would it?  Can the originAttributes not contain '^'?

When we were originally making this decision I don't think we had a string origin attribute, but I think that we do (firstOriginDomain). I have no idea if that can contain ^ characters, but theoretically we could define a more strict encoding system for origin attributes then we do for URIs as they don't need to be web compatible.

Andrew McCreight [:mccr8]

Updated

•

7 years ago

Keywords: sec-other

Christoph Kerschbaumer [:ckerschb]

Assignee

Comment 12

•

4 years ago

Attached file Bug 1353429: Test Principal URI escaping.r=valentin — Details

Phabricator Automation

Updated

•

4 years ago

Assignee: nobody → ckerschb

Status: NEW → ASSIGNED

Sebastian Hengst [:aryx] (needinfo me if it's about an intermittent or backout)

Comment 13

•

4 years ago

Landed: https://hg.mozilla.org/integration/autoland/rev/21aedccf1abbfabba2989e00fb3856bcbb12b871

Backed out for eslint failures:

https://hg.mozilla.org/integration/autoland/rev/d4a9dcd3f6e4e90c4acbfa2dd017d0ddc0158ab4

Push with failure: https://treeherder.mozilla.org/#/jobs?repo=autoland&group_state=expanded&selectedJob=293479753&resultStatus=testfailed%2Cbusted%2Cexception%2Cretry%2Cusercancel%2Crunnable&revision=21aedccf1abbfabba2989e00fb3856bcbb12b871
Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=293482540&repo=autoland

Flags: needinfo?(ckerschb)

Christoph Kerschbaumer [:ckerschb]

Assignee

Comment 14

•

4 years ago

(In reply to Sebastian Hengst [:aryx] [limited availability until end of March](needinfo on intermittent or backout) from comment #13)

Backed out for eslint failures:

Arrrh - on it!

Flags: needinfo?(ckerschb)

Sebastian Hengst [:aryx] (needinfo me if it's about an intermittent or backout)

Comment 15

•

4 years ago

https://hg.mozilla.org/integration/autoland/rev/1ed0cfd336ddda0d6ba49818a36f6847569cd9bf
https://hg.mozilla.org/mozilla-central/rev/1ed0cfd336dd

Group: dom-core-security → core-security-release

Status: ASSIGNED → RESOLVED

Closed: 4 years ago

status-firefox76: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla76

Ryan VanderMeulen [:RyanVM]

Updated

•

4 years ago

status-firefox55: affected → wontfix

status-firefox74: --- → wontfix

status-firefox75: --- → wontfix

status-firefox-esr68: --- → wontfix

Flags: in-testsuite+

Sebastian Hengst [:aryx] (needinfo me if it's about an intermittent or backout)

Comment 16

•

4 years ago

Backed out for conflict with backed out bug 1620624:

https://hg.mozilla.org/integration/autoland/rev/e72622db09a1ee360b89b7f54b8e4c38a86e2d5e

Status: RESOLVED → REOPENED

status-firefox76: fixed → ---

Flags: needinfo?(sstreich)

Resolution: FIXED → ---

Target Milestone: mozilla76 → ---

Sebastian Hengst [:aryx] (needinfo me if it's about an intermittent or backout)

Comment 17

•

4 years ago

https://hg.mozilla.org/integration/autoland/rev/f2c995169b5c3337d3d73a664fce95eb0fe6b829
https://hg.mozilla.org/mozilla-central/rev/f2c995169b5c

Status: REOPENED → RESOLVED

Closed: 4 years ago → 4 years ago

status-firefox77: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla77

BugBot [:suhaib / :marco/ :calixte]

Comment 18

•

4 years ago

Since the status are different for nightly and release, what's the status for beta?
For more information, please visit auto_nag documentation.

status-firefox76: --- → ?

Ryan VanderMeulen [:RyanVM]

Updated

•

4 years ago

status-firefox76: ? → wontfix

Flags: needinfo?(sstreich)

Brindusa Tot, Desktop QA

Updated

•

4 years ago

Flags: qe-verify+

Whiteboard: [post-critsmash-triage]

Alin Ilea, Desktop QA

Comment 19

•

4 years ago

Reproduced the behavior before the fix following the test case from the description, with an old Nightly build 56.0a1 (build id: 20170731100325) using Windows 10.
Verified - Fixed in Nightly 77.0a1 (build id: 20200423095248) and latest Nightly 78.0a1 (build id: 20200505094621).

Status: RESOLVED → VERIFIED

status-firefox77: fixed → verified

Flags: qe-verify+

Frederik Braun [:freddy]

Updated

•

4 years ago

Whiteboard: [post-critsmash-triage] → [post-critsmash-triage][adv-main77-]

Daniel Veditz [:dveditz]

Updated

•

3 years ago

Group: core-security-release