From the reporter, as submitted via the security@ email:
My name is Yaniv Nizry and I’m a researcher in the CxSCA group at Checkmarx.
I reported an mXSS in bleach last month, and I have found another way to exploit bleach.
Some tags are parsed differently depending on whether they are inside or outside a math/svg tag.
For example, the style tag:
Inside the svg tag it parses as XML and outside it parses as raw text.
In addition, some tags pop out of the svg/math tag when they are placed in its inner HTML:
TL;DR PoC example: <svg><style><img src=x onerror=alert(1)>
To exploit this, we need a tag whose content is not sanitized but is parsed as HTML/XML (so that the inner tag pops out and runs).
Here is a list of those tags (the condition for exploitation is that svg or math is whitelisted, along with one of the following):
The data in the style tag is usually not supposed to run, but since it is inside an svg, the data is parsed as XML/HTML. So the img tag pops out of the svg and runs.
I started investigating where in the code the problem is. It seems that the parsing is done right (it does recognize the tags inside svg->style as tags, not as raw text), but for some reason it doesn't sanitize them.
This vulnerability does not work when the strip flag of the bleach.clean function is set to True (the default is False). Here is the place where it deletes the unwanted data:
I have one guess as to where the problem is:
In bleach/_vendor/html5lib/serializer.py, line 297 ignores the namespace, so a tag inside an svg is treated the same as one outside it. I wrote the patch above (not sure if it's a good one). Also, on line 302 it looks like the namespace check is on the TODO list.
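The speculation above can be illustrated with a toy model. The sketch below is not the html5lib serializer; the token format, the serialize function, and the check_namespace parameter are all hypothetical and exist only for illustration. It assumes the sanitizer has already turned the disallowed img tag into a text token, and shows how a serializer that decides "raw text or not" from the tag name alone would emit that text unescaped inside svg->style, while a namespace-aware check would escape it.

```python
from html import escape

HTML_NS = "http://www.w3.org/1999/xhtml"
SVG_NS = "http://www.w3.org/2000/svg"

# Elements whose content is raw text in the HTML namespace.
RAW_TEXT_ELEMENTS = {"style", "script"}


def serialize(tokens, check_namespace):
    """Toy serializer (hypothetical, for illustration only).

    tokens are ("start", namespace, name), ("end", namespace, name)
    or ("text", data) tuples.
    """
    out = []
    raw_stack = []  # one flag per open element: True if treated as raw text
    for token in tokens:
        kind = token[0]
        if kind == "start":
            _, namespace, name = token
            is_raw = name in RAW_TEXT_ELEMENTS and (
                not check_namespace or namespace == HTML_NS
            )
            raw_stack.append(is_raw)
            out.append("<%s>" % name)
        elif kind == "end":
            raw_stack.pop()
            out.append("</%s>" % token[2])
        else:
            data = token[1]
            in_raw = bool(raw_stack) and raw_stack[-1]
            # Raw-text content is emitted as-is; everything else is escaped.
            out.append(data if in_raw else escape(data, quote=False))
    return "".join(out)


# The PoC after sanitizing: the disallowed <img> is now a text token.
tokens = [
    ("start", SVG_NS, "svg"),
    ("start", SVG_NS, "style"),
    ("text", "<img src=x onerror=alert(1)>"),
    ("end", SVG_NS, "style"),
    ("end", SVG_NS, "svg"),
]

name_only = serialize(tokens, check_namespace=False)
namespace_aware = serialize(tokens, check_namespace=True)
print(name_only)
print(namespace_aware)
```

In this model the name-only check leaves the img markup intact in the output, so a browser that parses svg->style content as markup would run it, while the namespace-aware check escapes it. Whether the real fix belongs in the serializer or in the sanitizer is not settled by this sketch.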