Closed Bug 1863622 Opened 1 year ago Closed 1 year ago

URL Constructor Allows DOM-Based XSS via Incorrect Scheme Parsing

Categories

(Core :: Networking, defect, P3)

Firefox 119
ARM64
macOS
defect

Tracking

()

RESOLVED DUPLICATE of bug 1374505

People

(Reporter: gmishra010, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: reporter-external, Whiteboard: [reporter-external] [client-bounty-form] [verif?][necko-triaged])

Attachments

(2 files)

Attached file weak_url_parser.zip

Summary

I've identified a security concern within the SpiderMonkey engine, specifically related to the URL constructor used for parsing URLs and extracting properties such as protocol, host, port, pathname, etc. When untrusted user input is provided to this constructor, it mishandles the parsing of unknown schemes. This can lead to a DOM-based Cross-Site Scripting (XSS) vulnerability when the parsed pathname is used in functions/properties like window.location.assign(), window.open(), window.location.href, etc.

Why is it an issue?

This behavior can be misleading to developers. They might assume that the user will be redirected to a page on the same host, but in reality, an attacker could exploit the incorrect parsing to perform actions like redirecting users to arbitrary domains or executing JavaScript code.

Firefox Version: 119.0.1 (64-bit)
Operating System: macOS Ventura 13.6.1

Steps to Reproduce:

  1. Setup: Download the attached HTML file and host it on a web server.
  2. Environment: Open the file in Firefox on macOS.
  3. Query Parameter: Append a query parameter named originUrl to the end of the URL.
  4. Test Cases: Observe the behavior with the following parameter values:
  • x:javascript:alert(window.origin) - This value should be accepted, parsed, and executed in the context of the same host.
  • x://google.com - This value should be accepted, and redirection should occur to google.com domain, demonstrating the issue.

I've attached all the relevant screenshots for reference. Notably, the behavior of Chrome on Windows differs, as it categorizes an unknown scheme as a 'file' scheme and appends a forward slash for the pathname, mitigating the issue.

Flags: sec-bounty?
OS: Unspecified → macOS
Hardware: Unspecified → ARM64
Version: unspecified → Firefox 119

SpiderMonkey is not relevant here.

Group: firefox-core-security → network-core-security
Component: Security → Networking
Product: Firefox → Core
Summary: URL Constructor in SpiderMonkey Allows DOM-Based XSS via Incorrect Scheme Parsing → URL Constructor Allows DOM-Based XSS via Incorrect Scheme Parsing

Valentin: are these cases already covered in the interop-2023 effort?

Flags: needinfo?(moz.valentin)

So the issue here is that we're parsing x://google.com as protocol x:, pathname //google.com. We do this for all unknown protocols, and have done so for a long time. Chrome has a similar behaviour.
https://jsdom.github.io/whatwg-url/#url=eDovL2dvb2dsZS5jb20=&base=YWJvdXQ6Ymxhbms=

We are close to fixing it in bug 1603699, but I don't really understand how this is a security issue.
If the intention is that passing the pathname to location.assign in the example will never navigate to another origin, passing x:google.com will still make the pathname be google.com, and that is being parsed correctly.

https://jsdom.github.io/whatwg-url/#url=eDpnb29nbGUuY29t&base=YWJvdXQ6Ymxhbms=
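
For comparison, here is a minimal sketch of what a parser that follows the WHATWG URL Standard reports for the two inputs discussed in this comment. It assumes an environment with a spec-conformant URL class (a recent Node.js, for example) and does not reproduce the Firefox 119 behaviour described above, where x://google.com yields the pathname //google.com. The variable names are illustrative only.

```javascript
// Assumption: the global URL class follows the WHATWG URL Standard
// (for example in a recent Node.js). Firefox 119 differed for the
// first input, as described in the comment above.
const withSlashes = new URL("x://google.com");
const withoutSlashes = new URL("x:google.com");

// With "//" after the scheme, the spec parses an authority, so the
// host is populated and the path is empty.
console.log(withSlashes.protocol, JSON.stringify(withSlashes.host), JSON.stringify(withSlashes.pathname));

// Without "//", everything after the scheme is an opaque path and
// there is no host at all.
console.log(withoutSlashes.protocol, JSON.stringify(withoutSlashes.host), JSON.stringify(withoutSlashes.pathname));
```

In both cases the string after the scheme is attacker-chosen; only the component it ends up in changes.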

Flags: needinfo?(moz.valentin)

Hello Valentin,

I appreciate your attention to my report. On Mac and Linux versions, Chrome behaves similarly to Firefox, but there is a distinction on Windows. I've reported this issue to Google, and they are currently working on the fix. Unfortunately, I'm unable to provide more details at this time. For reference, you can view the parsing for Chrome's Windows version in the following image link:
https://i.imgur.com/zsfrRTt.png

I've developed a sample application to illustrate the impact:

Redirection to Arbitrary Origin:

DOM-based XSS:

If you require any additional information, please let me know.

--
Thanks,
Gaurav

Just wanted to add that the URL parsing on Chrome's Windows version is considered to be safe and a fix will be released for Mac and Linux versions.

--
Thanks,
Gaurav

Status: UNCONFIRMED → NEW
Ever confirmed: true

(In reply to Gaurav Mishra from comment #5)

Redirection to Arbitrary Origin:

DOM-based XSS:

If you require any additional information, please let me know.

Thank you, Gaurav.
What would you expect Firefox to do in this case?
As far as I can tell, apart from the divergence from the URL spec for non-special URLs, this is just a normal XSS that any website can be vulnerable to, the same way it would be if it called eval on a string I pass as a query parameter.

Severity: -- → S3
Flags: needinfo?(gmishra010)
Priority: -- → P3
Whiteboard: [reporter-external] [client-bounty-form] [verif?] → [reporter-external] [client-bounty-form] [verif?][necko-triaged]
Whiteboard: [reporter-external] [client-bounty-form] [verif?][necko-triaged] → [reporter-external] [client-bounty-form] [verif?][necko-triaged][necko-priority-new]

Please share the link to the Chrome bug you reported. Even if it's private, the reference will help us when we talk to them about this issue and the standard. Feel free to share the link to this bug with them, as well.

Hello Valentin,

I believe the URL parsing in the Windows version of Chrome is secure, and it would be beneficial for Firefox to adopt a similar approach. The issue lies in the potential confusion for developers when parsing a user-controlled URL. They might assume that parsed_url.pathname should only provide the path and not the hostname, as is the case here due to the weak implementation of the URL constructor. While developers are generally cautious about using the eval function, the URL constructor is not widely recognized as a potentially risky function.

Hi Daniel, here is the link to the Chromium bug: https://bugs.chromium.org/p/chromium/issues/detail?id=1500405

--
Thanks,
Gaurav

Flags: needinfo?(gmishra010)

They might assume that parsed_url.pathname should only provide the path and not the hostname, as is the case here due to the weak implementation of the URL constructor.

I think that's an incorrect assumption. As I mentioned in comment 4, when parsing x:google.com the pathname is google.com and the parsing is performed correctly.

While developers are generally cautious about using the eval function, the URL constructor is not widely recognized as a potentially risky function.

I think it's not the URL constructor that is the problem, but navigating to a user- or attacker-provided location. While the URL constructor can parse the URL, it's not suitable for sanitizing data.

One thing we might want to do here: update the MDN docs so users know not to expect URL components to be sanitized.
I will defer to the security team to determine the impact of this issue on users and its severity.

Blocks: url
Whiteboard: [reporter-external] [client-bounty-form] [verif?][necko-triaged][necko-priority-new] → [reporter-external] [client-bounty-form] [verif?][necko-triaged]

Hello Valentin,

When you compare the URL parsing of Firefox with that of Chrome's Windows version for unknown schemes, notable differences can be seen. In Chrome (Windows version), the pathname for x:https://google.com is presented as /X:/https://google.com. This approach seems secure as it addresses arbitrary redirection or JavaScript execution concerns. However, in Firefox, the pathname parsing yields https://google.com, which could potentially result in arbitrary redirection if subsequently employed in functions such as window.location.assign().

One thing we might want to do here: update the MDN docs so users know not to expect URL components to be sanitized.
I believe that input sanitization is more effective when carried out by native functions.

--
Thanks,
Gaurav

Just wanted to add that the URL parsing on Chrome's Windows version is considered to be safe and a fix will be released for Mac and Linux versions.

[...]

In Chrome (Windows version), the pathname for x:https://google.com is presented as /X:/https://google.com. This approach seems secure as it addresses arbitrary redirection or JavaScript execution concerns.

This behavior violates the URL spec, but even if they do extend it to Mac and Linux I don't see how it addresses the underlying problem. They can argue (and have) that a single-letter scheme should be seen as a drive letter and thus an implicit file:// url, but if you switch to a longer scheme like xx:javascript:alert(window.origin) you're back to the original problem.
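
A small sketch of that point, assuming a URL class that follows the WHATWG URL Standard (a recent Node.js, for example, rather than Chrome on Windows): the length of the unknown scheme makes no difference to what ends up in pathname. The payload is the one from the report; the scheme names and the helper pathnameFor are made up for illustration.

```javascript
// Assumption: a WHATWG-conformant URL class (for example in Node.js).
// The payload comes from the report; the scheme names are arbitrary.
const payload = "javascript:alert(window.origin)";

// Hypothetical helper: parse "<scheme>:<payload>" and return the pathname.
function pathnameFor(scheme) {
  return new URL(`${scheme}:${payload}`).pathname;
}

for (const scheme of ["x", "xx", "unknown-scheme"]) {
  // Everything after the first colon is kept as an opaque path.
  console.log(scheme, "->", pathnameFor(scheme));
}
```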

The live example site presents this as an XSS challenge, but if it weren't one, it would just be a bad way to do it. In the simplified testcase attached here, the guts of the code (ignoring the bits that obtain the search param) are:

    const url = new URL(originUrl);
    const relativeURL = url.pathname;
    window.location.assign(relativeURL);

No checking that the schemes involved are the same even, but fundamentally, this is mixing URL objects and strings in dangerous ways. If the code did window.location.pathname = url.pathname instead, even without more sanity checking that would be safer: you are manipulating the same kind of object (pathnames) instead of a mix.
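
A rough illustration of why assigning to a pathname is safer, using a URL object as a stand-in for window.location so the snippet runs outside a browser. That substitution is an assumption: the Location pathname setter is specified along the same lines, but it also navigates. The address https://example.com/app and the helper applyPathname are placeholders.

```javascript
// Stand-in for the current page; window.location is not used so that
// the snippet runs outside a browser. The address is a placeholder.
const currentPage = new URL("https://example.com/app");

// Hypothetical helper: apply an untrusted string as a pathname only.
function applyPathname(untrustedPath) {
  const target = new URL(currentPage.href);
  // The pathname setter can only change the path component, so the
  // scheme and host remain those of the current page.
  target.pathname = untrustedPath;
  return target;
}

for (const path of ["javascript:alert(window.origin)", "//google.com", "https://google.com"]) {
  const target = applyPathname(path);
  console.log(target.origin === currentPage.origin, target.href);
}
```

The resulting paths may well be nonsense on the current site, but none of them leaves the current origin or changes the scheme.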

The whole concept of taking a random web argument of an arbitrary URL and thinking a relative part would make sense on the current site is just so odd that it's a hard place for me to start in thinking about how this might be a real security risk and not just a site security bug. If the parameter is supposed to be a relative URL then it should be validated as one.
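
One possible shape for such a validation, as a sketch rather than a vetted sanitizer: accept only strings that look like an absolute path and that still resolve to the same origin as a throwaway base. The function name isRelativePath and the base https://example.invalid/ are assumptions introduced for illustration.

```javascript
// Hypothetical validator: returns true only for values that behave as a
// same-origin absolute path when resolved against a base URL.
function isRelativePath(value, base = "https://example.invalid/") {
  // Must look like an absolute path, not a scheme or a bare word.
  if (typeof value !== "string" || !value.startsWith("/")) return false;
  // "//host" and "/\host" are treated as a host by URL parsers for
  // http(s) bases, so reject them up front.
  if (value.startsWith("//") || value.startsWith("/\\")) return false;
  let resolved;
  try {
    resolved = new URL(value, base);
  } catch {
    return false;
  }
  // Belt and braces: parsers strip tabs and newlines, so confirm the
  // resolved URL really stayed on the base origin.
  return resolved.origin === new URL(base).origin;
}

console.log(isRelativePath("/account/settings"));               // same-origin path
console.log(isRelativePath("//google.com"));                    // scheme-relative URL
console.log(isRelativePath("javascript:alert(window.origin)")); // absolute URL
```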

Another safe way to do this would be to construct a relative URL before assigning it to location, and check that the origins still match after the supposed "relative" part was added. Something like:

    const origURL = new URL(originUrl);
    // Resolve the extracted path against the current page, as a relative URL would be.
    const relativeURL = new URL(origURL.pathname, location);
    // Navigate only if the result stays on the current origin.
    if (relativeURL.origin === location.origin) {
      location = relativeURL;
    } else {
      console.error("Invalid relative URL", relativeURL);
    }

What to do with this bug?

  • We're not going to change parsing of single-letter schemes unless the spec is updated (URL-spec interop is a 2023 goal).
  • We are already fixing the mishandling of hierarchical-looking URLs in bug 1603699. That part of this bug is a dupe of bug 1374505.
  • Neither change makes the presented testcase any less broken, and it's basically the same problem in every browser (for multi-char schemes), so it's not a Firefox security bug.

Group: network-core-security
Status: NEW → RESOLVED
Closed: 1 year ago
Duplicate of bug: 1374505
Resolution: --- → DUPLICATE
Flags: sec-bounty? → sec-bounty-