Closed Bug 633888 Opened 13 years ago Closed 13 years ago

Validator should flag assignments to innerHTML and on* attributes with anything other than a static string

Categories

(addons.mozilla.org Graveyard :: Admin/Editor Tools, defect, P2)

defect

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: kmag, Assigned: basta)

References

Details

(Whiteboard: [ReviewTeam])

Attachments

(1 file)

User-Agent:       Mozilla/5.0 (X11; Linux x86_64; rv:2.0b12pre) Gecko/20110210 Firefox/4.0b12pre
Build Identifier: 

Assignments to innerHTML and event handling attributes are, in my experience, almost always done incorrectly, but are especially so when used with anything other than a static string.

In the case of innerHTML, there are several issues. One is that I've seen several cases of authors modifying page content by modifying the innerHTML attribute usually of the body tag, but sometimes of other elements. The performance and other problems with this approach are obvious and serious enough that I think it's worth flagging. Worse, though, is that many authors who construct HTML via strings do so unsafely, and in many cases include unescaped strings from remote sources in HTML injected into their own chrome documents, the safety implications of which I need not mention.

Assignments to event handling attributes have similar issues. In the past few weeks alone I've come across over a dozen examples of authors doing things like elem.setAttribute("oncommand", "frob(\"" + foo + "\")"), almost always in a chrome context. This is problematic enough when the source is local and the worst we generally have to worry about is a syntax error due to quoting issues. Alas, however, the source is almost always remote, and therefore a serious security issue given how common this seems to be.



Reproducible: Always
Yes, this is pretty important. I'm not sure if it'll add too much noise to the validator output, but it's best to give it a try first.
Assignee: nobody → mbasta
Status: UNCONFIRMED → NEW
Ever confirmed: true
Priority: -- → P2
Whiteboard: [required amo-editors]
I can't do testing to see whether a value is definitely a string literal, but I can test whether the value being assigned is a string. How does this seem:

Fail when:
<JSWrapper>.innerHTML <assignment/augmented_assignment> <JSWrapper(string)>

A JSWrapper is the equivalent of a reference pointing to the value of a variable (regardless of whether it's derived or implied). Basically this fix would prevent any string value from being assigned to any member "innerHTML" on any object.

Let me know if this is appropriate, overboard, underboard, etc.
Is it possible for you to instead attempt to run the assigned value through the validator that's used for HTML files and throw a warning when that's impossible?
Brilliant! You, sir, are a gentleman and a scholar.

I'll look into that shortly and get back to you.
It doesn't look like that's going to work very well; if we're only getting snippets here and there, the parser is probably going to freak out, especially if it's getting loads of half-open/half-formed tags like you might expect with innerHTML. You can check out the markup parser object here:

https://github.com/mattbasta/amo-validator/blob/master/validator/testcases/markup/markuptester.py

Even if we did send it through the markup parser, I would say that it would tend to air on the false negative side. It's great for whole files, but the parser maintains a stack to define hierarchy (as you would expect for XML-esque languages) but getting only snippets would imaginably wreak havoc and cause things to not be recognized.

I'll get the "vanilla" test wired up tonight.
Hm. I don't think that strings assigned to innerHTML are generally worse quality than most ordinary HTML files (which isn't saying much, I suppose). At least, they're not document.write() calls, so they should at least be well formed in the general case. I suppose that some tests which depend on nesting might be missed, but at least it should be able to catch most script tags and onfoo attributes. And if it comes across malformed markup, well, I'd rather the validator complain so that editors don't have to.
The kind of situation that I'm concerned about is something like this:

x.innerHTML = "<" + fooElement + ">" + bar.zap() + "</" + fooElement + ">";

or

x.innerHTML = "<" + fooElement + ">" + bar.zap() + "</div>";


While the HTML parser would likely work fine if there was an actual value for fooElement, it's more than likely that the JS engine will produce a bad value or empty value instead. If anything, it would probably create a lot of false positives for situations where there probably wouldn't otherwise be an issue.
Yeah, as far as I'm concerned, any time a dynamic value is assigned it should complain that a dynamic value is being assigned which can't be validated and that developers should probably refrain from doing so.
Also, correct me if I'm wrong, but shouldn't on* attributes be flagged unless they're inline functions, not strings?
Well, only strings can be used with the attributes. I'd prefer that people used on* properties with closures (or addEventListener), but I'm not sure how well that would go over. Jorge?
While addEventListener is the recommendation, we should still allow the on* attributes, even though they can only be set with strings.
But from the JS, isn't the safest way to set them via closures?

x.onclick = function() { /* ... */ };

instead of

x.onclick = "...";

Since we don't test the content of the string, we can't test for things like:

x.onclick = "eval('');"

Shouldn't this be the case?
You're referring to the on* properties, not the attributes. Some people are not aware that they exist, nor they are aware of addEventListener. What most people do is:

x.setAttribute("onclick", "...");

I agree this isn't the safest approach, but it's one of the most known and used ones.

I would be OK with giving a recommendation (not a warning) to use addEventListener instead.
My bad, I meant properties, not attributes. We're already passing the value of on* attributes from the HTMLParser over to the scripting engine.

Just to be clear, I should be flagging the following (in addition to the innerHTML stuff):

<JSWrapper>.setAttribute("on*", <JSWrapper>)
<JSWrapper>.on* = <JSWrapper(JSLiteral)>

Let me know whether this is what you're after.
(In reply to comment #14)
> <JSWrapper>.setAttribute("on*", <JSWrapper>)
> <JSWrapper>.on* = <JSWrapper(JSLiteral)>

I don't know what that means.
The first example flags calls to the function "setAttribute" with a string matching "on*" as the first attribute on any object. E.g.:

foo.setAttribute("onclick", "bar()");

The second flags the assignment of any non-closure object to a member matching "on*" on any object. E.g.:

foo.onclick = "bar()";
I think that makes sense. I'd forgotten that the properties accepted literal strings. I think in that case it should certainly be a warning.

In the case of properties, I think a suggestion to use addEventListener or the corresponding property would be fine, along with validating the same fragment the way it would be from an XML file. Any non-literal string ("foo('" + bar + "')") should still be a stern warning.
(In reply to comment #16)
> The first example flags calls to the function "setAttribute" with a string
> matching "on*" as the first attribute on any object. E.g.:
> 
> foo.setAttribute("onclick", "bar()");
> 
> The second flags the assignment of any non-closure object to a member matching
> "on*" on any object. E.g.:
> 
> foo.onclick = "bar()";

Sounds good.
Target Milestone: --- → 6.0.2
Kris: should innerHTML assignments be flagged with a notice in all cases? I'm finishing this off and it's time to take care of that bit.
Target Milestone: 6.0.2 → 6.0.3
In lieu of a response, I built this:

https://github.com/mattbasta/amo-validator/commit/948fac82869b2af757f3a3cdd515b1ea2650652f

innerHTML is flagged if a.) It detects an event handler being set (on* attribute) within its string value or b.) It isn't being assigned a flat string.

on* properties are flagged if they are assigned a string value.

setAttribute is flagged (in the parent commit) if the first attribute begins with "on".
Sorry for the delay. Yes I think innerHTML should be flagged in all cases unless it happens to be a flat string value that gets run through the HTML validator.
I just got this:

Warning: on* property being assigned string

Event handlers in JavaScript should not be assigned by setting an on* property to a string of JS code. Rather, consider using addEventListerner.
        Tier:   3
        File:   chrome/content/numbersgame.js
        Line:   94
        Column: 2
        Context:
        >                       req.open('GET', Numbersgame.BANNER_URL, true);
        >                       req.onreadystatechange = function (aEvt) {
        >                               if (req.readyState == 4) {

I'm a bit busy now, but I'll see if I can get a testcase together later if you need one.
Attached file Troublesome XPI
I'm attaching the XPI. There seem to be quite a lot of other spurious errors as well.
Kris: Check out the latest commit. It's got the MarkupParser support in there (set to ignore markup structure and focus on attributes) and should fix the issue you saw with on* properties:

https://github.com/mattbasta/amo-validator/commit/5bcdd9c4bb9dbcca884e5b3e4a45dec5823fbe92
Yes, that takes care of it. Thanks.
Target Milestone: 6.0.3 → 6.0.4
This was pushed today:

https://github.com/mozilla/amo-validator/commit/e99145a2d02c18f7e9c522250c80e3991e0a5788
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
The test XPI from comment 24 throws no warnings during validation. Based on comment 26, I'm marking this verified.
Status: RESOLVED → VERIFIED
Reclassifying editor bugs and changing to a new whiteboard flag. Spam, spam, spam, spam...
Whiteboard: [required amo-editors] → [ReviewTeam]
Product: addons.mozilla.org → addons.mozilla.org Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: