Closed Bug 315473 Opened 19 years ago Closed 18 years ago

Investigate allowing NULs in attribute keys

Categories

(Core :: DOM: HTML Parser, defect, P2)

x86
Linux
defect

Tracking

()

VERIFIED FIXED
mozilla1.9alpha1

People

(Reporter: mrbkap, Assigned: mrbkap)

References

()

Details

(Keywords: verified1.8.0.10, verified1.8.1.2, Whiteboard: [sg:low] stepping stone to XSS; goes with bug 314980)

Attachments

(3 files, 3 obsolete files)

In bug 314980, comment 4, sicking pointed out that we should not be allowing NULs in attribute keys, since they get turned into atoms, which do not handle NULs very well at all (see bug 264999).

I took a quick look at this for bug 314980 and it seems that ReadUntil might need some small changes in order for this to work.

I'm filing this as being security sensitive, since it's possible that a buffer-overrun vulnerability lying in wait here.
It's not just buffer-overruns i'm worried about, but also stuff like

<input type0foo="file">

where some code might treat that as a file input and other code might not. Same thing with

<input0foo type="file">
The second case was handled by bug 264956. The first case still needs to be dealt with.
Whiteboard: [sg:investigate]
Flags: blocking1.8.1.2?
Flags: blocking1.8.0.10?
Whiteboard: [sg:investigate] → [sg:investigate] goes with bug 314980
Attached patch Working patch (obsolete) — Splinter Review
I haven't tested this patch yet, so I'm not requesting review yet. It deals with NUL characters in the HTML token stream when we're calling ReadUntil.
Status: NEW → ASSIGNED
Priority: -- → P2
Target Milestone: --- → mozilla1.9alpha
Flags: blocking1.8.1.2?
Flags: blocking1.8.1.2+
Flags: blocking1.8.0.10?
Flags: blocking1.8.0.10+
Whiteboard: [sg:investigate] goes with bug 314980 → [sg:low] stepping stone to XSS; goes with bug 314980
Testcase: you should go to Mozilla, not get an alert

data:text/html,<a%20href="http://www.mozilla.org"%20onclick%00foo="alert('pwned!');return%20false">Go%20to%20mozilla</a>

Flags: wanted1.8.1.x+
Flags: wanted1.8.0.x+
Flags: blocking1.8.1.2+
Flags: blocking1.8.0.10+
Why were the blocking flags cleared for this release? We will look like chumps for having finally fixed the public issue described in bug 314980 but missing this fairly obvious (and publicly known) subset of it.

The patch fixes the testcase in the bug. Does the "untested" comment indicate worry about regressions, or just hadn't tested against the original problem?
Flags: blocking1.8.1.2?
Flags: blocking1.8.0.10?
(In reply to comment #5)
> the public issue described in bug 314980

By public I mean the well-known "XSS Cheat sheet"
http://ha.ckers.org/xss.html#XSS_Non_alpha_non_digit2

as well as blogs about MySpace repeatedly falling victim to this trick:
http://ha.ckers.org/blog/20070112/myspace-0day-again6/
Replacing the blocking+, let's try to get this fixed for the upcoming release.
Flags: blocking1.8.1.2?
Flags: blocking1.8.1.2+
Flags: blocking1.8.0.10?
Flags: blocking1.8.0.10+
Comment on attachment 250376 [details] [diff] [review]
Working patch

The patch works, let's get some reviews on it.
Attachment #250376 - Attachment description: Untested patch → Working patch
Attachment #250376 - Flags: superreview?(jst)
Attachment #250376 - Flags: review?(jonas)
I don't think the patch is ready yet, sorry. The attribute handling is incorrect. I was going to work on it tomorrow morning.
Ok, sounds good. I'll review once there's a new patch.
The nsHTMLTokens changes scare me a little. Looks like we're dropping null chars in the stream on the floor which is exactly what got us here. I.e. parsing
on\0load="doEvil();" as onload="doEvil();"

Is the patch relying on the scanner stopping the token at the end of the null char such that when we create a token the null char is always the last char and so can safely be dropped?
(In reply to comment #11)
> Is the patch relying on the scanner stopping the token at the end of the null
> char such that when we create a token the null char is always the last char and
> so can safely be dropped?

My goal is that NUL implicitly ends every token (it can't be explicit in the end condition because that's a null terminated string). The trick is to ensure that all callers of ReadUntil are prepared to strip the NUL out of the stream. So given <a on\0click="evil">, we'll end up with <a on click="evil"> (which doesn't roundtrip, but fixing that seems hard).
Er, the roundtripping is fine. I didn't get enough sleep last night.
Couldn't the scanner make sure to end the token before the null char to avoid having to force all callers to do that. Or will that confuse things that assume that tokens are back-to-back in the stream or some such? Could we perhaps even create a "whitespace" token that contains the null char?
(In reply to comment #14)
> Couldn't the scanner make sure to end the token before the null char to avoid
> having to force all callers to do that.

The scanner does. My concern was to respect the addTerminal parameter. Currently, all callers pass PR_FALSE, indicating that they would like to Peek at the character that terminated the token; IMO it would be inconsistent to hide the NUL character.

So, looking back through my patch, I do end up skipping over NUL characters (pretending as if they were never there) in some cases. I'll go back over and fix those too.
Attached patch Easier patch (obsolete) — Splinter Review
This is a much easier patch -- it replaces null characters in ReadUntil strings with spaces. The idea is that since we're basically treating NUL characters as whitespace, we might as well make it real whitespace, which doesn't require changing nsHTMLTokens.cpp.
Attachment #250376 - Attachment is obsolete: true
Attachment #252666 - Flags: superreview?(jonas)
Attachment #252666 - Flags: review?(jst)
Attachment #250376 - Flags: superreview?(jst)
Attachment #250376 - Flags: review?(jonas)
Comment on attachment 252666 [details] [diff] [review]
Easier patch

Yeah, makes sense. It's not like anyone with good serious intentions is at all likely to use embedded nulls in HTML anyways. r=jst
Attachment #252666 - Flags: review?(jst) → review+
So mrbkap pointed out to me that the last patch will make us parse

<input onclick\0="doEvil()"> as <input onclick ="doEvil()"> which will execute, which is something that sites might not expect.

So I suggested that we instead convert NULLs to '-'. I.e. the above would be parsed as <input onclick-="doEvil()"> which will not execute. Same thing for <input\0 type="password"> will be treated as <input- type="password"> which will not be a password field.
Attached patch Using - instead of space (obsolete) — Splinter Review
Attachment #252666 - Attachment is obsolete: true
Attachment #252676 - Flags: superreview?(jonas)
Attachment #252676 - Flags: review?(jst)
Attachment #252666 - Flags: superreview?(jonas)
Attachment #252676 - Flags: superreview?(jonas) → superreview+
Attachment #252676 - Flags: review?(jst) → review+
Fix checked into trunk.
Status: ASSIGNED → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
er, '-' has got to be bad, right?

-moz\0binding will now be -moz-binding, and people trying to scrape it out now won't see it. Using -any- replacement character scares me, it's changing one idiosyncratic non-standard behavior for another. Whatever the character used you can probably construct some edge case where it matters.

The first patch worked on both
data:text/html,<a%20onclick%00foo="alert('pwned!');return%20false"%20href="http://www.mozilla.org">Mozilla</a>
data:text/html,<a%20on%00click="alert('pwned!');return%20false"%20href="http://www.mozilla.org">Mozilla</a>

If you missed a few places where \0 got dropped silently can we just fix those?
(In reply to comment #21)
> -moz\0binding will now be -moz-binding

Crap, I didn't think of that.

> The first patch worked on both...

The new one does too.

> If you missed a few places where \0 got dropped silently can we just fix those?

The problem that I was worried about was <input\0 ...>, where we don't want to create an "input" tag, but can't use a NUL character at the end. That suggests some sort of replacement. The previous patch's approach can't deal with the tag name case.
Here's a comparison of Opera, IE7, FF2 (pre-patch) and "trunk+original patch". I'll try the checked-in patch in a bit

testcase     FF2     trunk   Opera   IE7
1. input/    button  button  button  button
2. type/     text    text    text    text
3. submit/   text    text    text    text
4. input-    --      --      --      --
5. input.    --      --      --      --
6. type.     text    text    text    text
7. input\0   button  button  --      button
8. in\0put   --      --      --      button
9. ty\0pe    text    text    text    button
A. type\0    button  text    text    button
B. sub\0mit  text    button  text    button

The trunk+patch "B" is clearly one of the places you're just dropping the nulls in the original patch. Overall I'd say Opera's behavior is what we want. I expect that's what we'll get out of the trunk patch you just checked in, except now I'll have to come up with additional testcases to show that '-' was an unfortunate choice :-)
(In reply to comment #23)
> now I'll have to come up with additional testcases to show that '-' was an
> unfortunate choice :-)

I was thinking of using @ as a neutral character.
Is there a reason not to insert the standard UTF16 replacement character (U+FFFD)?  I'd _hope_ that we don't skip over those between an attr name and attr value (since those can just be inserted due to invalid bytes that can't be decoded in the page encoding)...  If we don't, we need to fix.

And being non-ASCII, it's not actually a delimiter in any of the standard things, so won't appear in things like -moz-binding or whatnot.
Attached patch Good pointSplinter Review
Attachment #252676 - Attachment is obsolete: true
Attachment #252722 - Flags: superreview?(jonas)
Attachment #252722 - Flags: review?(bzbarsky)
Comment on attachment 252722 [details] [diff] [review]
Good point

>Index: parser/htmlparser/src/nsScanner.cpp

>+// We replace NUL characters with this character.
>+static PRUnichar sInvalid = 0xFFFD;

How about UCS2_REPLACEMENT_CHAR (defined in nsCharTraits.h).

r=me with that.

By the way, as far as -moz-binding goes our CSS parser skips over nulls too in some cases, and possibly over \000000 as well....  Probably want a separate bug on that.
Attachment #252722 - Flags: review?(bzbarsky) → review+
Attachment #252722 - Flags: superreview?(jonas) → superreview+
The followup patch has also been checked in.
There was a trivial conflict with a CID defined on the 1.8 branch.
Attachment #252784 - Flags: approval1.8.1.2?
Attachment #252784 - Flags: approval1.8.0.10?
Comment on attachment 252784 [details] [diff] [review]
Patch for the 1.8 branches

Results with the patch are great -- thanks!

a=dveditz for 1.8/1.8.0 branches
Attachment #252784 - Flags: approval1.8.1.2?
Attachment #252784 - Flags: approval1.8.1.2+
Attachment #252784 - Flags: approval1.8.0.10?
Attachment #252784 - Flags: approval1.8.0.10+
Fixed on the 1.8 branches.
(In reply to comment #27)
> By the way, as far as -moz-binding goes our CSS parser skips over nulls too in
> some cases, and possibly over \000000 as well....  Probably want a separate bug
> on that.

Bug 228856 already exists on the CSS parser skipping NULs.

Verified fixed on trunk, using the testcase and table from comment 23.
With a 2007-01-24 build, I see the results as mentioned for FF2 in the table.
With a 2007-01-25 build, I see the same results as mentioned for Opera in the table (which is mentioned as the desired result)

Also verified that the latest branch builds have the same results as mentioned for Opera in the table.
Status: RESOLVED → VERIFIED
Group: security
Blocks: 364823
Depends on: 451557
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: