HTML injection vulnerability in Reader View
Categories
(Fenix :: General, defect)
Tracking
(firefox87 wontfix, firefox88 fixed)
People
(Reporter: jwkbugzilla, Assigned: csadilek)
Details
(Keywords: sec-low, Whiteboard: [post-critsmash-triage][adv-main88+])
Attachments
(3 files)
A similar issue has been reported in bug 1658175 before. The fix is incomplete however. While article.title
and article.byLine
are being escaped, article.dir
and other fields are not. In fact, some attributes aren’t even quoted. While CSP prevents direct code execution here, this could still be harmful.
I’ve attached two proof of concept pages, both with a malicious dir
attribute on the <html>
tag. The first one contains an <img>
tag with an unclosed src
attribute. The closing quote is located in the article “text” then. This makes sure that the entire HTML code before the closing quote is sent in a request to http://localhost/
(some data here but nothing that should be unknown to the attacking website). You can try it out by navigating to https://palant.info/temp/tP0wR6yK0uN5fN6s/reader_view.html in Fenix and switching on Reader View.
The second proof of concept page injects a <meta refresh>
tag, causing the page to navigate to Reader View of a data: URI. Normally, top-level navigation to data: URIs is prohibited. This approach allows navigation to other websites as well, the location bar keeps displaying the original website. You can try it out by navigating to https://palant.info/temp/tP0wR6yK0uN5fN6s/reader_view2.html in Fenix and switching on Reader View.
If you inspect the page source code for the second proof of concept and look at the <a class="domain">
tag, you will notice that the URL parameter is another potential HTML injection vector. Its current handling is: URL-decoding twice, then passing it through new URL(...).href
, then putting it into an unquoted attribute in the HTML code. This setup happens to be not exploitable for HTTP URLs (new URL()
removes any whitespace, and with the HTML5 parser escaping an unquoted attribute is impossible without spaces) but it is with data: URIs.
Recommendations
I generally recommend escaping everything, ideally automatically. Assumptions about some values being “safe” often break.
This means for https://github.com/mozilla-mobile/android-components/blob/4155fff604166170c4b2b3139cdc1f369ace1b8c/components/feature/readerview/src/main/assets/extensions/readerview/readerview-background.js#L16: Don’t call encodeURIComponent()
on one value and leave the rest alone. Use URLSearchParams
to compile this query, it will take care of escaping automatically.
On the receiving end (https://github.com/mozilla-mobile/android-components/blob/4155fff604166170c4b2b3139cdc1f369ace1b8c/components/feature/readerview/src/main/assets/extensions/readerview/readerview.js#L344), there is no need to call decodeURIComponent()
manually, it only causing double decoding. URLSearchParams
is already being used here, that’s sufficient.
When composing HTML code (https://github.com/mozilla-mobile/android-components/blob/4155fff604166170c4b2b3139cdc1f369ace1b8c/components/feature/readerview/src/main/assets/extensions/readerview/readerview.js#L129), quote all attributes. Unquoted attributes are a security hazard. And call escapeHTML()
on all values, right there when inserting into HTML code. It’s too easy to forget this if it is done in another function.
Ideally however, outerHTML
/innerHTML
should not be used at all. In this particular case I’d recommend adding this content to the static HTML code, with a hidden
attribute. Use regular DOM methods (setAttribute()
, textContent
) to fill in the dynamic data. These might not be quite as convenient but they are inherently safe. Once everything is filled in, the hidden
attribute can be removed.
Reporter | ||
Comment 1•4 years ago
|
||
Reporter | ||
Updated•4 years ago
|
Comment 2•4 years ago
|
||
Sebastian or Christian are probably good people to look at this.
Assignee | ||
Comment 3•4 years ago
|
||
Thanks, Wladimir.
I will assign to me to schedule a fix.
Assignee | ||
Updated•4 years ago
|
Reporter | ||
Comment 4•4 years ago
|
||
One more recommendation for https://github.com/mozilla-mobile/android-components/blob/a9bbac53fba6ebc4b1f629636ebde0c85c14da45/components/feature/readerview/src/main/java/mozilla/components/feature/readerview/ReaderViewMiddleware.kt#L79: It should be better to determine the location bar display based on what’s actually loaded (extract url
query parameter from action.url) rather than some hidden state that might be outdated.
Assignee | ||
Comment 5•4 years ago
|
||
Thanks, for this part we already have a ticket to find a better solution and remove the workaround. Let's keep this ticket confined to preventing the HTML injection cases you described, and address other refactorings separately.
Assignee | ||
Comment 6•4 years ago
|
||
Latest Fenix Nightly (210312 17:02) contains a fix for the two cases above caused by article.dir
. We also followed the recommendations to quote all attributes, switch to using URLSearchParams
instead of encodeURIComponent
/decodeURIComponent
and move all escaping logic to a single function where those strings are used to make this harder to miss in the future. Removing the dynamic HTML template for the static one we will look into in a follow-up to keep this separate from the immediate security fixes here.
Reporter | ||
Comment 7•4 years ago
|
||
article.url.href
and article.url.hostname
aren’t being escaped here: https://github.com/mozilla-mobile/android-components/blob/09f8e790a498e4ec97173ef88d4de779fe212668/components/feature/readerview/src/main/assets/extensions/readerview/readerview.js#L137. This should be fine given Gecko’s automatic URL escaping for HTTP URLs but it’s an unnecessary assumption.
There is a remaining superfluous decodeURIComponent
call here: https://github.com/mozilla-mobile/android-components/blob/09f8e790a498e4ec97173ef88d4de779fe212668/components/feature/readerview/src/main/assets/extensions/readerview/readerview.js#L358
Otherwise the changes look fine.
Assignee | ||
Comment 8•4 years ago
|
||
This should be fine given Gecko’s automatic URL escaping for HTTP URLs but it’s an unnecessary assumption.
Ah that's not an assumption we made. This is the result of new URL(url)
which validates the URL, and href
as well as hostname
already come back as encoded. That's why we didn't encode/escape again. If there's anything unsafe here please let us know.
There is a remaining superfluous decodeURIComponent call...
Thanks, removed now.
Reporter | ||
Comment 9•4 years ago
|
||
(In reply to Christian Sadilek [:csadilek] from comment #8)
Ah that's not an assumption we made. This is the result of
new URL(url)
which validates the URL, andhref
as well ashostname
already come back as encoded. That's why we didn't encode/escape again. If there's anything unsafe here please let us know.
As I said, that’s Gecko’s URL escaping for HTTP URLs that you get via the URL object. It’s an implementation detail (used to work differently in the past, changed as an XSS prevention measure) and not happening for data: URIs for example.
Reporter | ||
Comment 10•4 years ago
|
||
Just for clarification: I’m not saying that it is unsafe (it isn’t as long as you are sure that you only get HTTP URLs), I’m saying that it’s better not to make any assumptions and to always escape every value inserted into HTML code. And here we have two assumptions: that there will only be HTTP URLs and that Gecko will URL-encode all dangerous characters in HTTP URLs (something that wasn’t always a given e.g. for the anchor part of the URL).
Comment 11•4 years ago
|
||
This was fixed with https://github.com/mozilla-mobile/android-components/pull/9882 and landed about a month ago.
Updated•4 years ago
|
Updated•4 years ago
|
Updated•4 years ago
|
Comment 12•4 years ago
|
||
Updated•4 years ago
|
Updated•3 years ago
|
Updated•2 years ago
|
Description
•