Bug 1615315 Comment 0 Edit History

Note: The actual edited comment in the bug view page will always show the original commenter’s name and original timestamp.

Original comment by

Greg G

on 2020-02-13 07:36:39 PST

TODO fill me in

Revision 1 by

Greg G

on 2020-02-13 07:37:46 PST

From: Yaniv Nizry <Yaniv.Nizry@checkmarx.com>
To: "security@mozilla.org" <security@mozilla.org>

TODO fill me in

Revision 2 by

Greg G

on 2020-02-13 07:39:44 PST

From: Yaniv Nizry <Yaniv.Nizry@checkmarx.com>
To: "security@mozilla.org" <security@mozilla.org>

Hello,

My name is Yaniv Nizry and I’m a researcher in the CxSCA group at Checkmarx.

I discovered a security vulnerability (mXSS) in the Mozilla bleach Python package.

**Details:**

Revision 3 by

Greg G

on 2020-02-13 07:39:59 PST

From: Yaniv Nizry <Yaniv.Nizry@checkmarx.com>
To: "security@mozilla.org" <security@mozilla.org>

Hello,

My name is Yaniv Nizry and I’m a researcher in the CxSCA group at Checkmarx.

I discovered a security vulnerability (mXSS) in the Mozilla bleach Python package.

**Details:**

The noscript tag in HTML is treated differently depending on whether JS is enabled or disabled. When JS is enabled, the data inside the tag is parsed as raw text, but when it’s disabled the data is parsed as HTML.

Bleach relies on html5lib, a Python library for parsing HTML. Looking at how bleach uses html5lib, we can see that there is a variable named “scripting” whose default value is False.

Revision 4 by

Greg G

on 2020-02-13 07:44:13 PST

From: Yaniv Nizry <Yaniv.Nizry@checkmarx.com>
To: "security@mozilla.org" <security@mozilla.org>

Hello,

My name is Yaniv Nizry and I’m a researcher in the CxSCA group at Checkmarx.

I discovered a security vulnerability (mXSS) in the Mozilla bleach Python package.

**Details:**

The noscript tag in HTML is treated differently depending on whether JS is enabled or disabled. When JS is enabled, the data inside the tag is parsed as raw text, but when it’s disabled the data is parsed as HTML.

Bleach relies on html5lib, a Python library for parsing HTML. Looking at how bleach uses html5lib, we can see that there is a variable named “scripting” whose default value is False.

Later in the parsing process, the noscript tag is parsed either as raw text or as HTML, depending on the scripting value.

Because JavaScript is enabled by default in browsers but disabled in the sanitizer, the two parse the same markup differently, which causes a mutation XSS (mXSS).

**PoC:**

First, I tried giving the sanitizer this input (a known mXSS string):

But for some reason the “less than” characters inside an attribute were converted to entities (although I saw that it added a closing tag, which means the content inside it is parsed as HTML).

To get around this sanitization, we can use a tag whose content is parsed as raw text (e.g. the “style” tag).

So what happened here?

The parser entered the noscript tag and started parsing as HTML. It then entered the style tag and switched to raw text (from this point nothing gets sanitized). Finally, the parser closed the style and noscript tags automatically.

Now let’s look at how the browser parses the output: the parser enters the noscript tag and starts parsing its content as raw text (no style tag is created here), and right after that it exits the noscript tag.

What comes after is normal HTML, in this case an img tag with an onerror attribute.

For the PoC, I made a server that bleaches the “payload” parameter in the URL and returns it.

For this vulnerability to work, the “noscript” tag must be whitelisted along with at least one of the tags whose content is parsed as raw text:

```
title
textarea
script
style
noembed
noframes
iframe
xmp
```

**Payload example:**

`<noscript><style></noscript><img src=x onerror=alert(1)>`

**Out in the wild example:**

Here is one of the scenarios we encountered that is exploitable.

https://github.com/galaxyproject/galaxy

Their sanitizer whitelists the tags we need to exploit the vulnerability: https://github.com/galaxyproject/galaxy/blob/master/lib/galaxy/util/sanitize_html.py

In the Galaxy project there is a feature for uploading various file types; one of them is HTML, which gets bleached.

Here, for example, the uploaded files are on the right (payload-normal.html is a simple script tag):

But when we choose to view payload.html with our exploit:

We plan on notifying the Galaxy project after you release a fix for the vulnerability.

Feel free to reply to this email with any questions.

Best regards,

Yaniv

TODO add screenshots

Revision 5 by

Greg G

on 2020-02-13 07:44:34 PST

From: Yaniv Nizry <Yaniv.Nizry@checkmarx.com>
To: "security@mozilla.org" <security@mozilla.org>

Hello,

My name is Yaniv Nizry and I’m a researcher in the CxSCA group at Checkmarx.

I discovered a security vulnerability (mXSS) in the Mozilla bleach Python package.

**Details:**

The noscript tag in HTML is treated differently depending on whether JS is enabled or disabled. When JS is enabled, the data inside the tag is parsed as raw text, but when it’s disabled the data is parsed as HTML.

Bleach relies on html5lib, a Python library for parsing HTML. Looking at how bleach uses html5lib, we can see that there is a variable named “scripting” whose default value is False.

Later in the parsing process, the noscript tag is parsed either as raw text or as HTML, depending on the scripting value.

Because JavaScript is enabled by default in browsers but disabled in the sanitizer, the two parse the same markup differently, which causes a mutation XSS (mXSS).

**PoC:**

First, I tried giving the sanitizer this input (a known mXSS string):

But for some reason the “less than” characters inside an attribute were converted to entities (although I saw that it added a closing tag, which means the content inside it is parsed as HTML).

To get around this sanitization, we can use a tag whose content is parsed as raw text (e.g. the “style” tag).

So what happened here?

The parser entered the noscript tag and started parsing as HTML. It then entered the style tag and switched to raw text (from this point nothing gets sanitized). Finally, the parser closed the style and noscript tags automatically.

Now let’s look at how the browser parses the output: the parser enters the noscript tag and starts parsing its content as raw text (no style tag is created here), and right after that it exits the noscript tag.

What comes after is normal HTML, in this case an img tag with an onerror attribute.

For the PoC, I made a server that bleaches the “payload” parameter in the URL and returns it.

For this vulnerability to work, the “noscript” tag must be whitelisted along with at least one of the tags whose content is parsed as raw text:

```
title
textarea
script
style
noembed
noframes
iframe
xmp
```
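
The precondition above can be checked mechanically against a sanitizer's tag allowlist. The following is a minimal sketch, not part of bleach or of the original report: the names `RAW_TEXT_TAGS`, `risky_raw_text_tags`, and `is_exposed` are hypothetical, and the tag set is copied from the list above.

```python
# Hypothetical helper: flags allowlists that match the precondition described
# in the report (noscript allowed together with at least one raw-text tag).

# Tags listed in the report as being parsed as raw text.
RAW_TEXT_TAGS = frozenset(
    {"title", "textarea", "script", "style", "noembed", "noframes", "iframe", "xmp"}
)


def risky_raw_text_tags(allowed_tags):
    """Return the allowed raw-text tags that, together with noscript, enable the bypass.

    Returns an empty set when noscript itself is not allowed.
    """
    allowed = {tag.lower() for tag in allowed_tags}
    if "noscript" not in allowed:
        return set()
    return allowed & RAW_TEXT_TAGS


def is_exposed(allowed_tags):
    """True if the allowlist satisfies the precondition from the report."""
    return bool(risky_raw_text_tags(allowed_tags))


if __name__ == "__main__":
    # Example allowlist (made up for illustration).
    tags = ["p", "b", "noscript", "style"]
    print(is_exposed(tags), sorted(risky_raw_text_tags(tags)))
```

A check like this only tests the allowlist condition stated in the report; it does not parse or sanitize any HTML itself.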

**Payload example:**

`<noscript><style></noscript><img src=x onerror=alert(1)>`

**Out in the wild example:**

Here is one of the scenarios we encountered that is exploitable.

https://github.com/galaxyproject/galaxy

Their sanitizer whitelists the tags we need to exploit the vulnerability: https://github.com/galaxyproject/galaxy/blob/master/lib/galaxy/util/sanitize_html.py

In the Galaxy project there is a feature for uploading various file types; one of them is HTML, which gets bleached.

Here, for example, the uploaded files are on the right (payload-normal.html is a simple script tag):

But when we choose to view payload.html with our exploit:

We plan on notifying the Galaxy project after you release a fix for the vulnerability.

Feel free to reply to this email with any questions.

Best regards,

Yaniv

TODO(g-k): add screenshots from email
