Closed Bug 1155131 Opened 5 years ago Closed 4 years ago

[Tests] Discourage unescaped innerHTML through test failures

Categories

(Firefox OS Graveyard :: Gaia, defect)

x86_64
Linux
defect
Not set

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: freddyb, Assigned: freddyb)

References

()

Details

Attachments

(3 files)

In the long run, we want to discourage patterns that just set something to innerHTML that is not a string.

The long-term plan is as follows:
- Tag all current innerHTML assignments as allowed
- Disallow innerHTML = expression expression is neither: a string, a const
- allow innerHTML if the expression is using a template string with our custom escaper (shared/tagged.js), i.e.
> foo.innerHTML = t`<a href='${url}'>${text}</a>`

I'll write a longer document with more examples and the inner workings of the library as well as the whitelist on the Mozilla wiki (https://wiki.mozilla.org/User:Fbraun/Gaia/SafeinnerHTMLRoadmap)
Depends on: 1159667
Looking at this further, there's several places where we do something like:

function getView() {
  return Tagged.escapeHTML ``;
}

myEl.innerHTML = getView();


Do you think there is a way that eslint could validate this? One way might be to have a convention, where every variable or function assigned to innerHTML has the name `tagged` in it. We could also potentially validate that functions with the name `tagged` in them must return a string, escaped with the Tagged library. I'm sure that authors could still get around this if they tried to, but maybe it would catch most cases?
The goal is to make precise statements about something that can be secured and something that can not be secured to encourage safe coding practices for the future. While I'm really keen to make this more precise later, I don't want perfect to become the enemy of good :)

Also, let's try not to confuse linting with static analysis. All we can really do is look at individual expressions (i.e., source lines) and tell whether they are providing a good foundation for secure coding. What we can not do with eslint is identify vulnerabilities or complex code inter-dependencies. Identifying variable scope in a specific line of code is far from trivial.

More specifically, I'm afraid that a check for variable naming conventions may be easily tricked.
(In reply to Frederik Braun [:freddyb] from comment #2)
> More specifically, I'm afraid that a check for variable naming conventions
> may be easily tricked.

I think there's two likely "threat" models here (none of which cover bad actors):

1) Naive cases where new code using innerHTML is written largely oblivious to the risks/complexities of innerHTML.

2) Cases where the dangers of innerHTML are known but complicated control/data-flow results in a bug, possibly because of subsequent changes made to logic that's in the "danger zone" but those making the changes don't realize it and assume that the safety guarantees are ironclad and magical and downstream of the changes.

I think magic variable names cover the first case.  I don't think magical variable names cover the second case.  I think the first case is much more likely to happen, but also much more likely to be easily caught.  The second case could involve subtleties that make breakage less likely for normal testing to reveal but still easily detectable by a motivated bad actor.

I think type inference is the solution to the second case.  Semmle is quite happy to hand-out free licenses to Gaia developers (I've got one).  Also, on the open source front, tern.js (http://ternjs.net) seems to have pretty good type inference and understands require.js/AMD modules.

However, I also think static analysis is also very difficult, so it should be an idealized second step and instead we should start with a more pragmatic first step we can later upgrade to awesome static analysis.

I propose the following:
- We implement a tagged data-structure.  Something like { securityReviewWasOnBugAndAnyNewUsesRequireNewReview: "123456", safeHTML: "..." }.
- At the call-site that puts stuff into innerHTML, we have eslint allow Tagged.unwrapSafeHTML() as a call.
- The wrapped data-structure is created exactly where the safe data is initialized.  In the case of comment 1, Tagged.escapeHTML itself can produce the data-structure itself.  In the case of the e-mail app which has its cool bleach.js sanitizer, we wrap the data structure where we call bleach.js, storing that structure in the database too.

I think this satisfies all goals.  It can be implemented by eslint, covers the naive cases, and it is data-flow/control-flow safe.  Anyone manually creating the tagged structure and anyone approving the review is stepping into aggressively reckless/bad actor behaviour for which it would not be unreasonable to yell at them a lot and/or revoke commit privileges/etc.
I see now that your proposal covers an additional use case.
My suggestion mostly discusses templates that are more or less in the same source line as the innerHTML assignment (or insertAdjacentHTML call…), but the unwrapping would also allow transfer of HTML templates between multiple code paths.

The only problem I see is that ES6 template strings can not be used without the templates resolved. Once you write them, all you get is a string literal, which has lost all knowledge about the original distinction between template and inputs.
Template strings can do anything they want.  We just need to have a variant for immediate injection and a variant for deferred injection.

$ function immediateSafe(str, ...values) { return str.map(x => x.replace(/[<>&]/g, 'bad!!')).join(''); }
undefined

$ immediateSafe`<stuff>`
"bad!!stuffbad!!"

$ function taggedSafe(str, ...values) { return { blahblah: 'safe!', str: str.map(x => x.replace(/[<>&]/g, 'bad!!')).join('') }; }
undefined

$ taggedSafe`<stuff>`
Object { blahblah: "safe!", str: "bad!!stuffbad!!" }

$ function useTaggedSafe(x) { if (!x.blahblah) throw new Error('BAD!!!!'); return x.str; }
undefined

$ useTaggedSafe(taggedSafe`<stuff>`)
"bad!!stuffbad!!"


Note: There could actually be only one variant, one that implements toString() to lose the tagging info, but that doesn't work for email's use case and most sophisticated use-cases since that doesn't roundtrip through structured clone.  Also, it's too magic.
Ha! We had a brief video chat and came to the following conclusion.

The existing tagged.escapeHTML function is good for immediate creation and consumption of dynamically created HTML.

If creation and insertion of the HTML have to happen in two distinct code parts, we can add another API that allows creating the HTML source wrapped in an object. The object will have no specific power, but unwrapping the object will be a bit of code that we can detect as e.g. 'innerHTML = unwrap(template);' (the function name will change). We can use this detection in ESLint, and do not have to trace back to the creation of the object, because it will be created and sanitized by our escaping logic.

If someone creates their own objects without our wrapper, they'll have to go through manual review. (It's likely that the email app will do this for the time being, as they're currently using bleach.js)

The result will use the UMD pattern (https://github.com/umdjs/umd).
Nice. This seems like a reasonable implementation to me.
This is my first attempt. It's a bit fuzzy around the edges but I call out the problems I see within the pull request description.

Can you take a look, Andrew?
Assignee: nobody → fbraun
Status: NEW → ASSIGNED
Attachment #8615307 - Flags: feedback?(bugmail)
Comment on attachment 8615307 [details] [review]
extending the html escaper

Looks good.  UMD feedback on pull request.
Attachment #8615307 - Flags: feedback?(bugmail) → feedback+
Comment on attachment 8615307 [details] [review]
extending the html escaper

Thanks for the feedback!

Can you please review then? :-)
Attachment #8615307 - Flags: review?(kgrandon)
Attachment #8615307 - Flags: review?(bugmail)
Comment on attachment 8615307 [details] [review]
extending the html escaper

Looks good.  Note that per https://wiki.mozilla.org/Modules/FirefoxOS#Shared I am not an acceptable reviewer for code in shared, and since you aren't an owner/peer, you can't technically defer review to me.  Only :kgrandon (or :djf) can technically review or defer review to another reviewer.  (I think this is probably silly for this code, but I point this out because Thunderbird's been recently been bitten by a lack of well-defined modules and owners/peers/etc.)
Attachment #8615307 - Flags: review?(bugmail) → review+
Comment on attachment 8615307 [details] [review]
extending the html escaper

Seems fine to me. I wonder if we should rename this library? That can be done in a follow-up, but if you have any suggestions please list them here. Thanks! 

Something like `Sanitizer` might make more sense.
Attachment #8615307 - Flags: review?(kgrandon) → review+
Let's do this in a follow-up. We'd have to change a lot of files that already use this library.
Keywords: checkin-needed
Follow-up in bug 1172446
Keywords: leave-open
Let's proceed, shall we?
Attachment #8617862 - Flags: review?(felash)
Attachment #8617862 - Flags: review?(kgrandon)
Comment on attachment 8617862 [details] [review]
enable eslint disallow-unsafe-innerhtml plugin

Wow, that's a lot of files to fix :)

This approach seems ok to me, but I think you should probably get a review from either Julien or a build peer to land. Thanks!
Attachment #8617862 - Flags: review?(kgrandon) → feedback+
Can you take a look again, julien? I think filtering should work now.
Flags: needinfo?(felash)
Actually currently xfailed files are still preventing commit :) I left a comment on github.
Flags: needinfo?(felash)
Rebased on master to make sure no newly added files are messing with the xfail.list.
Newest commit also addressed your comments.

Julien, what do you think, can you look again? :)
Attached file github PR for master
I realize that it gets really hard to finish the patch due to the timezone difference now that I'm in Vancouver, so I took the liberty to polish up the patch with a last commit.

I did various basic tests with the hook itself or the Makefile and all cases seem to work for me:
* changing a xfailed file does not prevent to commit.
* I tried removing a file from the xfail list, make a change to it: it properly prevents to commit.
* same tests using "APP=sms make eslint" (I changed a file in the sms app)

(Trying to find a reviewer while Kevin and Gaye are in PTO.)
Comment on attachment 8623774 [details] [review]
github PR for master

Aus is kind enough to accept to review this :)
Attachment #8623774 - Flags: review?(aus)
Comment on attachment 8623774 [details] [review]
github PR for master

lgtm, note that I reviewed the pre-commit hook mostly since the rest was reviewed by julienw and work done by freddyb.
Attachment #8623774 - Flags: review?(aus) → review+
Comment on attachment 8617862 [details] [review]
enable eslint disallow-unsafe-innerhtml plugin

Let's finish with the other patch !
Attachment #8617862 - Flags: review?(felash)
Thanks aus !

pushed to master: f9086310722845341ea386b686488001c981abbf

(totally forgot to put the r= line, sorry)
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Removing leave-open keyword from resolved bugs, per :sylvestre.
Keywords: leave-open
You need to log in before you can comment on or make changes to this bug.