Closed Bug 1615315 (CVE-2020-6802) Opened 5 years ago Closed 5 years ago

Bleach mutation xss in <noscript> handling

Categories

(Webtools :: Bleach-security, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: u581815, Assigned: u581815)

Details

(Keywords: reporter-external, sec-moderate, wsec-xss, Whiteboard: keep hidden until 3rd party consumers are notified)

Attachments

(1 file)

From: Yaniv Nizry<Yaniv.Nizry@checkmarx.com>
To: "security@mozilla.org" <security@mozilla.org>

Hello,

My name is Yaniv Nizry and I’m a researcher in the CxSCA group at Checkmarx.

I discovered a security vulnerability (mXSS) in Mozilla-bleach python package.

Details:

The noscript tag in HTML is treated differently depending on whether JS is enabled or disabled. When JS is enabled, the data inside the tag is parsed as raw text, but when it's disabled the data is parsed as HTML.

Bleach relies on html5lib, a Python library for parsing HTML. Looking at how bleach uses html5lib, we can see that there is a variable named “scripting” whose default value is False.

Later in the parsing process, the noscript tag is parsed either as raw text or as HTML depending on the scripting value.

Because JavaScript is enabled by default in browsers but disabled in the sanitizer, the sanitizer and the browser parse the same markup differently, which causes a mutation XSS (mXSS).

PoC:

Firstly, I tried to give the sanitizer this input (a known mXSS string):

But for some reason the “less than” characters inside an attribute were converted to entities (although I saw that it added a closing tag, which means the content inside it is parsed as HTML).

To get around this sanitization, we can use a tag whose content is parsed as raw text (e.g. the “style” tag).

So what happened here?

The parser entered the noscript tag and started parsing as HTML. It then entered the style tag and switched to raw text, so nothing after that point gets sanitized. Finally, the parser closed the style and noscript tags automatically.

Now let's look at how the browser parses the output: the parser enters the noscript tag and, because scripting is enabled, parses its content as raw text (no style element is created here), and right after that exits the noscript tag.

What comes after is normal HTML, in this case an img tag with an onerror attribute.
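
The two parses described above can be illustrated with a toy tokenizer. This is only a sketch under stated assumptions: `toy_tokenize` and `RAW_TEXT_BASE` are hypothetical names, and the code is not html5lib's or any browser's implementation. The only behaviour it models is that a raw-text element swallows everything up to its own end tag, and that noscript counts as raw text only when scripting is enabled.

```python
import re

# Assumption: a reduced set of raw-text elements, enough for this illustration.
RAW_TEXT_BASE = {"style", "script", "xmp", "iframe", "noembed", "noframes"}

TAG_RE = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>")


def toy_tokenize(html, scripting):
    """Split html into ("start", name), ("end", name) and ("text", data) tokens.

    Toy model only: when scripting is True, noscript is treated as a
    raw-text element (browser view); when False, its content is parsed
    as markup (the sanitizer view described in this report).
    """
    raw_text = set(RAW_TEXT_BASE)
    if scripting:
        raw_text.add("noscript")

    tokens = []
    pos = 0
    while pos < len(html):
        match = TAG_RE.search(html, pos)
        if match is None:
            tokens.append(("text", html[pos:]))
            break
        if match.start() > pos:
            tokens.append(("text", html[pos:match.start()]))
        closing, name = match.group(1), match.group(2).lower()
        tokens.append(("end" if closing else "start", name))
        pos = match.end()
        if not closing and name in raw_text:
            # Raw text: everything up to the matching end tag (or the end
            # of input if it is never closed) is plain text, not markup.
            end = re.compile(r"</%s\s*>" % re.escape(name), re.I).search(html, pos)
            stop = end.start() if end else len(html)
            if stop > pos:
                tokens.append(("text", html[pos:stop]))
            pos = stop
    return tokens


payload = "<noscript><style></noscript><img src=x onerror=alert(1)>"
sanitizer_view = toy_tokenize(payload, scripting=False)
browser_view = toy_tokenize(payload, scripting=True)
```

In this model, `sanitizer_view` ends with a single text token containing the img markup (it sits inside the unclosed style element), while `browser_view` contains a real `("start", "img")` token after noscript is closed.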

For PoC purposes I made a server that bleaches the “payload” parameter in the URL and returns it.

For this vulnerability to be exploitable, the “noscript” tag must be whitelisted along with at least one of the tags parsed as raw text:

title
textarea
script
style
noembed
noframes
iframe
xmp

Payload example:

<noscript><style></noscript><img src=x onerror=alert(1)>
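
The whitelist condition stated above can be written as a quick check. This is a minimal sketch: `allowlist_is_exposed` and `RAW_TEXT_TAGS` are hypothetical names, not part of bleach, and the tag set is copied from the list in this report rather than from any parser's source.

```python
# Tags listed in this report as being parsed as raw text.
RAW_TEXT_TAGS = frozenset(
    ["title", "textarea", "script", "style", "noembed", "noframes", "iframe", "xmp"]
)


def allowlist_is_exposed(allowed_tags):
    """Return True if the allowlist meets the condition described in the report.

    Hypothetical helper: the configuration is considered exposed when
    noscript is allowed together with at least one raw-text tag.
    """
    allowed = {tag.lower() for tag in allowed_tags}
    return "noscript" in allowed and bool(allowed & RAW_TEXT_TAGS)


exposed = allowlist_is_exposed(["noscript", "style"])       # both conditions met
not_exposed = allowlist_is_exposed(["textarea", "iframe"])  # noscript not allowed
```

A check like this only reflects the condition as reported here; it is not a substitute for upgrading to a fixed release.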

Out in the wild example:

Here is one of the scenarios we encountered that is exploitable.

https://github.com/galaxyproject/galaxy

Their sanitizer whitelists the tags we need to exploit the vulnerability: https://github.com/galaxyproject/galaxy/blob/master/lib/galaxy/util/sanitize_html.py

In the Galaxy project there is a feature to upload various file types; one of them is HTML, which gets bleached.

Here for example the uploaded files are on the right (payload-normal.html is a simple script tag)

But when we choose to view payload.html with our exploit:

We plan on notifying the Galaxy project after you release a fix for the vulnerability.

Feel free to reply to this email for any questions.

Best regards,

Yaniv

TODO(g-k): add screenshots from email

+peterbe since MDN might be impacted. Does MDN whitelist noscript and any of the following tags

title
textarea
script
style
noembed
noframes
iframe
xmp

?

Flags: needinfo?(peterbe)

Confirmed for versions v3.1.0, v3.0.2, and v2.1.4 (getting an html5lib@1.0.1 import error on v2.1, which probably applies to other older libs). Tested with:

import bleach
from bleach import clean

print(bleach.__version__)

cleaned = clean(
    "<noscript><style></noscript><img src=x onerror=alert(1)>",
    tags=["noscript", "style"],
)
bad = "<img src=x onerror=alert(1)>"

assert bad not in cleaned, "got:\n{!r} did not escape:\n{!r}".format(cleaned, bad)
Assignee: nobody → gguthe
Status: NEW → ASSIGNED

Looks like allowing noscript isn't required and anything up to an unclosed raw tag (anything using parseRCDataRawtext) gets passed through:

Python 3.8.0 (default, Nov 21 2019, 17:41:43)
[GCC 7.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import bleach
>>> bleach.__version__
'3.1.0'
>>> bleach.clean('<xmp><script>confirm(1)</script>', tags=['xmp'])
'<xmp><script>confirm(1)</script></xmp>'
>>> bleach.clean('<xmp><script>confirm(1)</script><script>confirm(1)</script>', tags=['xmp'])
'<xmp><script>confirm(1)</script><script>confirm(1)</script></xmp>'
>>> bleach.clean('<xmp><script>confirm(1)</script><script>confirm(1)</script></xmp><script>confirm(1)</script>', tags=['xmp'])
'<xmp><script>confirm(1)</script><script>confirm(1)</script></xmp>&lt;script&gt;confirm(1)&lt;/script&gt;'
>>> 

(In reply to Greg Guthe [:g-k] [:gguthe] from comment #3)

Correct, but since the data inside those tags is parsed as raw text in the browser as well, nothing will run.
The vulnerability here is that the data inside the noscript tag is treated as HTML in bleach, but as raw text in the browser. This way we can trick bleach into believing something is safe when in reality it’s not (I used an additional raw-text tag to get around another sanitization step).

(In reply to Yaniv Nizry from comment #4)

Gotcha, that's clever.

So, setting scripting=True seems to work (bleach gives me <noscript><style></noscript>&lt;img src=x onerror=alert(1) /&gt; which doesn't pop an alert on Fx Nightly).

willkg: do you have time to review that change? I r?'d you on the private GH fork, but I can attach the patch here too.

Flags: needinfo?(willkg)

Thanks Greg,
I'm CC'ing Dan who's been looking at our bleach dependency most recently.

Dan, would you mind owning the Kuma angle on this? As you can see, Greg is about to submit a patch to willkg but it would be great to know how or if it's impacting our kuma wiki code.

Flags: needinfo?(peterbe)

Thanks Peter! Hopefully MDN doesn't need to whitelist noscript tags.

Created a draft advisory and PR on GH: https://github.com/mozilla/bleach/security/advisories/GHSA-q65m-pv3f-wr5r

Will is back on Tuesday, so assuming the patch looks good I think we can release on Tuesday or Wednesday next week. That way people aren't left vulnerable and trying to patch over the weekend, and we can notify people in advance.

addons-server uses bleach https://github.com/mozilla/addons-server/search?q=bleach&unscoped_q=bleach but I don't see a vulnerable use case so I don't think there's anything actionable on their end.

+:cgrebs and :muffinresearch on the AMO team if they want to double check.

https://github.com/mozilla/elmo/search?q=bleach&unscoped_q=bleach is using bleach.clean with the default tags arg, so it should be fine.

(In reply to Greg Guthe [:g-k] [:gguthe] from comment #1)

Does MDN whitelist noscript and any of the following tags

Yes, MDN whitelists textarea and iframe, so we'll need to coordinate to get the update rolled out on MDN. I'll own that.

Our full whitelist is at https://github.com/mdn/kuma/blob/master/kuma/wiki/constants.py

Greg, is it possible to grant my GitHub account (callahad) access to that security advisory? I can't see it at present.

Flags: needinfo?(gguthe)

(In reply to Dan Callahan [:callahad] from comment #12)

(In reply to Greg Guthe [:g-k] [:gguthe] from comment #1)

Does MDN whitelist noscript and any of the following tags

Yes, MDN whitelists textarea and iframe, so we'll need to coordinate to get the update rolled out on MDN. I'll own that.

Our full whitelist is at https://github.com/mdn/kuma/blob/master/kuma/wiki/constants.py

That looks fine because noscript isn't whitelisted.

Greg, is it possible to grant my GitHub account (callahad) access to that security advisory? I can't see it at present.

Yep, I added callahad.

From the list of bleach users that willkg sent me in an earlier email, the following look fine:

Time permitting I'll put together a CodeQL query to submit with the advisory over the weekend.

Flags: needinfo?(gguthe)
Summary: mutation xss from yaniv.Nizry@checkmarx.com → Bleach mutation xss in <noscript> handling
Whiteboard: keep hidden until 3rd party consumers are notified

I reviewed the GitHub PR.

Flags: needinfo?(willkg)

Thanks willkg!

v3.1.1 is published to pypi
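
As a rough way to tell whether an installed copy predates this release, a version string such as `bleach.__version__` can be compared against 3.1.1. This is a sketch only: `parse_version` and `has_noscript_fix` are hypothetical helpers, and the check assumes the fix exists only in 3.1.1 and later (no backports to older branches), which this bug does not confirm either way.

```python
import re

# First release containing the fix, per this bug.
FIXED_IN = (3, 1, 1)


def parse_version(version):
    """Turn a dotted version string into a tuple of leading integers.

    Parsing stops at the first component that does not start with a
    digit, so suffixes such as "rc1" are ignored (a known limitation).
    """
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def has_noscript_fix(version):
    """Return True if the version is at or above the fixed release."""
    return parse_version(version) >= FIXED_IN
```

For example, `has_noscript_fix("3.1.0")` is False and `has_noscript_fix("3.1.1")` is True; in practice the argument would be the installed `bleach.__version__`.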

Yaniv, can you let us know when galaxyproject/galaxy is updated so we can make this bug public?

Flags: needinfo?(yaniv.nizry)

(In reply to Greg Guthe [:g-k] [:gguthe] from comment #16)

Yaniv, can you let us know when galaxyproject/galaxy is updated so we can make this bug public?

Sure, I notified them and they pushed this commit to dev
https://github.com/galaxyproject/galaxy/commit/5cf82b587d26f7ecc144aa596a4885614e365d66

Flags: needinfo?(yaniv.nizry)

(In reply to Yaniv Nizry from comment #17)

(In reply to Greg Guthe [:g-k] [:gguthe] from comment #16)

Yaniv, can you let us know when galaxyproject/galaxy is updated so we can make this bug public?

Sure, I notified them and they pushed this commit to dev
https://github.com/galaxyproject/galaxy/commit/5cf82b587d26f7ecc144aa596a4885614e365d66

And this to master.
https://github.com/galaxyproject/galaxy/commit/259d51edce92031183dad86a4dc2714b6799bffa

Flags: sec-bounty?
Alias: CVE-2020-6802
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED

This isn't really part of our bug bounty program (outside of it affecting a Mozilla property), but I'd love to grant you a hall of fame entry.

How would you like to be credited on our hall of fame? If you have a link, we can do that too.

Flags: sec-bounty?
Flags: sec-bounty-hof+
Flags: sec-bounty-

(In reply to April King [:April] from comment #19)

This isn't really part of our bug bounty program (outside of it affecting a Mozilla property), but I'd love to grant you a hall of fame entry.

How would you like to be credited on our hall of fame? If you have a link, we can do that too.

My name "Yaniv Nizry" with a link to my linkedin account (https://www.linkedin.com/in/yaniv-n-8b4a76193) would be great,
Thank you :)

Per https://bugzilla.mozilla.org/show_bug.cgi?id=1615315#c18 galaxyproject/galaxy was updated, so I think it's safe to make this public.

Group: mozilla-employee-confidential, webtools-security