In bug 314980, comment 4, sicking pointed out that we should not be allowing NULs in attribute keys, since they get turned into atoms, which do not handle NULs very well at all (see bug 264999).
I took a quick look at this for bug 314980 and it seems that ReadUntil might need some small changes in order for this to work.
I'm filing this as being security sensitive, since it's possible that a buffer-overrun vulnerability lying in wait here.
It's not just buffer-overruns i'm worried about, but also stuff like
where some code might treat that as a file input and other code might not. Same thing with
The second case was handled by bug 264956. The first case still needs to be dealt with.
Created attachment 250376 [details] [diff] [review]
I haven't tested this patch yet, so I'm not requesting review yet. It deals with NUL characters in the HTML token stream when we're calling ReadUntil.
Testcase: you should go to Mozilla, not get an alert
Why were the blocking flags cleared for this release? We will look like chumps for having finally fixed the public issue described in bug 314980 but missing this fairly obvious (and publicly known) subset of it.
The patch fixes the testcase in the bug. Does the "untested" comment indicate worry about regressions, or just hadn't tested against the original problem?
(In reply to comment #5)
> the public issue described in bug 314980
By public I mean the well-known "XSS Cheat sheet"
as well as blogs about MySpace repeatedly falling victim to this trick:
Replacing the blocking+, let's try to get this fixed for the upcoming release.
Comment on attachment 250376 [details] [diff] [review]
The patch works, let's get some reviews on it.
I don't think the patch is ready yet, sorry. The attribute handling is incorrect. I was going to work on it tomorrow morning.
Ok, sounds good. I'll review once there's a new patch.
The nsHTMLTokens changes scare me a little. Looks like we're dropping null chars in the stream on the floor which is exactly what got us here. I.e. parsing
on\0load="doEvil();" as onload="doEvil();"
Is the patch relying on the scanner stopping the token at the end of the null char such that when we create a token the null char is always the last char and so can safely be dropped?
(In reply to comment #11)
> Is the patch relying on the scanner stopping the token at the end of the null
> char such that when we create a token the null char is always the last char and
> so can safely be dropped?
My goal is that NUL implicitly ends every token (it can't be explicit in the end condition because that's a null terminated string). The trick is to ensure that all callers of ReadUntil are prepared to strip the NUL out of the stream. So given <a on\0click="evil">, we'll end up with <a on click="evil"> (which doesn't roundtrip, but fixing that seems hard).
Er, the roundtripping is fine. I didn't get enough sleep last night.
Couldn't the scanner make sure to end the token before the null char to avoid having to force all callers to do that. Or will that confuse things that assume that tokens are back-to-back in the stream or some such? Could we perhaps even create a "whitespace" token that contains the null char?
(In reply to comment #14)
> Couldn't the scanner make sure to end the token before the null char to avoid
> having to force all callers to do that.
The scanner does. My concern was to respect the addTerminal parameter. Currently, all callers pass PR_FALSE, indicating that they would like to Peek at the character that terminated the token; IMO it would be inconsistent to hide the NUL character.
So, looking back through my patch, I do end up skipping over NUL characters (pretending as if they were never there) in some cases. I'll go back over and fix those too.
Created attachment 252666 [details] [diff] [review]
This is a much easier patch -- it replaces null characters in ReadUntil strings with spaces. The idea is that since we're basically treating NUL characters as whitespace, we might as well make it real whitespace, which doesn't require changing nsHTMLTokens.cpp.
Comment on attachment 252666 [details] [diff] [review]
Yeah, makes sense. It's not like anyone with good serious intentions is at all likely to use embedded nulls in HTML anyways. r=jst
So mrbkap pointed out to me that the last patch will make us parse
<input onclick\0="doEvil()"> as <input onclick ="doEvil()"> which will execute, which is something that sites might not expect.
So I suggested that we instead convert NULLs to '-'. I.e. the above would be parsed as <input onclick-="doEvil()"> which will not execute. Same thing for <input\0 type="password"> will be treated as <input- type="password"> which will not be a password field.
Created attachment 252676 [details] [diff] [review]
Using - instead of space
Fix checked into trunk.
er, '-' has got to be bad, right?
-moz\0binding will now be -moz-binding, and people trying to scrape it out now won't see it. Using -any- replacement character scares me, it's changing one idiosyncratic non-standard behavior for another. Whatever the character used you can probably construct some edge case where it matters.
The first patch worked on both
If you missed a few places where \0 got dropped silently can we just fix those?
(In reply to comment #21)
> -moz\0binding will now be -moz-binding
Crap, I didn't think of that.
> The first patch worked on both...
The new one does too.
> If you missed a few places where \0 got dropped silently can we just fix those?
The problem that I was worried about was <input\0 ...>, where we don't want to create an "input" tag, but can't use a NUL character at the end. That suggests some sort of replacement. The previous patch's approach can't deal with the tag name case.
Created attachment 252718 [details]
testcase: variants on <input type=submit>
Here's a comparison of Opera, IE7, FF2 (pre-patch) and "trunk+original patch". I'll try the checked-in patch in a bit
testcase FF2 trunk Opera IE7
1. input/ button button button button
2. type/ text text text text
3. submit/ text text text text
4. input- -- -- -- --
5. input. -- -- -- --
6. type. text text text text
7. input\0 button button -- button
8. in\0put -- -- -- button
9. ty\0pe text text text button
A. type\0 button text text button
B. sub\0mit text button text button
The trunk+patch "B" is clearly one of the places you're just dropping the nulls in the original patch. Overall I'd say Opera's behavior is what we want. I expect that's what we'll get out of the trunk patch you just checked in, except now I'll have to come up with additional testcases to show that '-' was an unfortunate choice :-)
(In reply to comment #23)
> now I'll have to come up with additional testcases to show that '-' was an
> unfortunate choice :-)
I was thinking of using @ as a neutral character.
Is there a reason not to insert the standard UTF16 replacement character (U+FFFD)? I'd _hope_ that we don't skip over those between an attr name and attr value (since those can just be inserted due to invalid bytes that can't be decoded in the page encoding)... If we don't, we need to fix.
And being non-ASCII, it's not actually a delimiter in any of the standard things, so won't appear in things like -moz-binding or whatnot.
Created attachment 252722 [details] [diff] [review]
Comment on attachment 252722 [details] [diff] [review]
>+// We replace NUL characters with this character.
>+static PRUnichar sInvalid = 0xFFFD;
How about UCS2_REPLACEMENT_CHAR (defined in nsCharTraits.h).
r=me with that.
By the way, as far as -moz-binding goes our CSS parser skips over nulls too in some cases, and possibly over \000000 as well.... Probably want a separate bug on that.
The followup patch has also been checked in.
Created attachment 252784 [details] [diff] [review]
Patch for the 1.8 branches
There was a trivial conflict with a CID defined on the 1.8 branch.
Comment on attachment 252784 [details] [diff] [review]
Patch for the 1.8 branches
Results with the patch are great -- thanks!
a=dveditz for 1.8/1.8.0 branches
Fixed on the 1.8 branches.
(In reply to comment #27)
> By the way, as far as -moz-binding goes our CSS parser skips over nulls too in
> some cases, and possibly over \000000 as well.... Probably want a separate bug
> on that.
Bug 228856 already exists on the CSS parser skipping NULs.
Verified fixed on trunk, using the testcase and table from comment 23.
With a 2007-01-24 build, I see the results as mentioned for FF2 in the table.
With a 2007-01-25 build, I see the same results as mentioned for Opera in the table (which is mentioned as the desired result)
Also verified that the latest branch builds have the same results as mentioned for Opera in the table.