Closed
Bug 430740
Opened 17 years ago
Closed 16 years ago
BOM characters are stripped from javascript before execution
Categories
(Core :: JavaScript Engine, defect)
Tracking
()
VERIFIED
FIXED
mozilla1.9.1a1
People
(Reporter: bugzilla, Assigned: crowderbt)
References
()
Details
(Keywords: verified1.8.1.17, verified1.9.0.2, verified1.9.1, Whiteboard: [sg:high] XSS risk to sites)
Attachments
(5 files, 4 obsolete files)
2.68 KB,
text/plain
|
Details | |
4.80 KB,
patch
|
Details | Diff | Splinter Review | |
13.92 KB,
patch
|
Details | Diff | Splinter Review | |
15.75 KB,
patch
|
samuel.sidler+old
:
approval1.9.0.2+
|
Details | Diff | Splinter Review |
15.47 KB,
patch
|
igor
:
review+
dveditz
:
approval1.8.1.17+
asac
:
approval1.8.0.next+
|
Details | Diff | Splinter Review |
User-Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; SLCC1; .NET CLR 2.0.50727; InfoPath.2; .NET CLR 3.0.04506; .NET CLR 3.5.30319; WWTClient2)
Build Identifier: 2.0.0.13
Using control format unicode characters, it is possible for code that will be executed to exist inside what would otherwise appear to be a quoted javascript string. The same technique can be used to escape out of block comments.
Reproducible: Always
Steps to Reproduce:
Execute the following javascript:
function evil() {
alert('evil');
return 'evil';
}
alert(eval("(['a\\\u200d', '+evil()])//')]"));
Actual Results:
alert 'evil' appears, then an aldert "a', evil"
Expected Results:
only one alert "a\,+evil()])//" should appear.
See url for more.
Updated•17 years ago
|
Assignee: nobody → general
Component: General → JavaScript Engine
Product: Firefox → Core
QA Contact: general → general
Comment 1•17 years ago
|
||
Fixed for Firefox 3 by patch for bug 274152 -- see bug 274152 comment 43 for the checkin date (also, not quite perfect hindsight: please search closed bugs first for format control keywords).
/be
Status: UNCONFIRMED → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
Summary: Control Format unicode characters are stipped from javascript before execution → Control Format unicode characters are stripped from javascript before execution
Assignee | ||
Comment 2•17 years ago
|
||
Is it possible that bug 268516 re-introduced this bug, for BOM characters?
Assignee | ||
Comment 3•17 years ago
|
||
js> alert(eval("(['a\\\ufffe', '+evil()])//')]"));
evil
a', evil
Search as I may, for 30 minutes, and could not find any bugs related to 'control format' characters. Hindsight only works if you have eyes.
Comment 5•17 years ago
|
||
Comment 6•17 years ago
|
||
Updated•17 years ago
|
Flags: in-testsuite+
Flags: in-litmus-
Assignee | ||
Comment 7•17 years ago
|
||
How did this bug get resolved? Are comments 2 and 3 not still issues?
Comment 8•17 years ago
|
||
I missed the second case by crowder. 1.8.1 fails the first part, 1.9.0 fails the second part.
Attachment #318432 -
Attachment is obsolete: true
Assignee | ||
Comment 9•17 years ago
|
||
Need a better fix here.
Status: RESOLVED → UNCONFIRMED
Resolution: FIXED → ---
Comment 10•17 years ago
|
||
is the 1.9.0 failure of the second part of the test due to bug 368516 ?
Assignee | ||
Comment 11•17 years ago
|
||
Yes, and that's what I -meant- to type in comment 2, but didn't.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Updated•17 years ago
|
Blocks: 368516
Flags: wanted1.8.1.x+
Flags: blocking1.9.1?
Flags: blocking1.9.0.1?
Flags: blocking1.8.1.16?
Whiteboard: [sg:high?] XSS risk to sites.
Assignee | ||
Comment 12•16 years ago
|
||
brendan: Some thoughts on this would be very helpful. What will be the fate of randomly-occurring format characters like this, in ES4?
Comment 13•16 years ago
|
||
What's the problem? ES4 does not strip format control characters; it does try to cope with misplaced BOMs. The proposal to do both has been stable for a while.
/be
Assignee | ||
Comment 14•16 years ago
|
||
I guess I'm looking for a better idea of what you mean by "cope", and I'm not sure if I'm reading ecmascript.org correctly, or even looking in the right place.
Assignee | ||
Comment 15•16 years ago
|
||
Dave Reed: If I substitute the \ufffe in my altered version of your example with just "a", the expression doesn't parse correctly. Can you show me some more examples?
Comment 16•16 years ago
|
||
The write-up in http://www.ecmascript.org/es4/spec/incompatibilities.pdf says:
ES4 therefore specifies that class Cf characters should not be stripped from source text at all, with the exception of byte order marks (BOMs), which must continue to be stripped everywhere.
This bug makes the case that BOM as well as class Cf characters should not be stripped from within string literals. I'll try to get this onto the Ecma TC39 committee's radar for the next face-to-face meeting.
/be
Comment 17•16 years ago
|
||
This has been resolved for ES4, IMO. The resolution is that Cf characters are not stripped at all, but that a BOM is interpreted in the same way as white space between tokens. (Thus a BOM in a string or regular expression literal has the value of the BOM, not a space value.) The alternative is to translate BOM to U+0020 also within string and regular expression literals, but IMO that is an inferior solution, as the former solution does not translate at all -- it just gives a compatible-with-ES3 interpretation to BOM in certain contexts.
Updated•16 years ago
|
Flags: blocking1.9.0.1? → blocking1.9.0.1-
Comment 18•16 years ago
|
||
Brendan, Brian: is this something we'd fix on the 1.8 branch, too?
Updated•16 years ago
|
Assignee: general → crowder
Assignee | ||
Comment 19•16 years ago
|
||
I'm actually not convinced anymore that this is that serious a bug, and the original poster has not provided an exploitable example, in which the enclosing code does not itself contain a syntax error. We should probably WONTFIX or morph this bug to be the "make SM compatible with ES4" bug, which means doing what Lars documents in comment #17.
Reporter | ||
Comment 20•16 years ago
|
||
My original example does not result in a syntax error. The code within the supposed string isn't valid by itself, but that would be necessary by an attacker to make the unquoted string form valid code in the context in which it is used. Hence, the string ends in // so that remaining 'actual' code is considered commented out.
Assignee | ||
Comment 21•16 years ago
|
||
Actually, yeah. You're right. Here's an attempt at a patch; it needs more testing to be ready for review, though.
Assignee | ||
Comment 22•16 years ago
|
||
Comment on attachment 329544 [details] [diff] [review]
the fix
I have -no idea- why mercurial has so much difficulty generating a minimized diff of this, I can't get it to do -b or -w. I'll have a manually-generated diff shortly.
Assignee | ||
Comment 23•16 years ago
|
||
jsscan.cpp versus the fixed jsscan.cpp generated by |diff -bpU8|
Assignee | ||
Comment 24•16 years ago
|
||
Comment on attachment 329544 [details] [diff] [review]
the fix
This passes js/tests and Mochitests.
Attachment #329544 -
Flags: review?(brendan)
Assignee | ||
Comment 25•16 years ago
|
||
Comment on attachment 329544 [details] [diff] [review]
the fix
Passing review to Igor.
Attachment #329544 -
Flags: review?(brendan) → review?(igor)
Comment 26•16 years ago
|
||
(In reply to comment #25)
> (From update of attachment 329544 [details] [diff] [review])
> Passing review to Igor.
>
From the first impression the patch replaces BOM by whytespace in regexp literals, doesn't it?
Assignee | ||
Comment 27•16 years ago
|
||
> From the first impression the patch replaces BOM by whytespace in regexp
> literals, doesn't it?
String literals, not regexp literals, unless I did something terribly wrong.
Comment 28•16 years ago
|
||
Brian, see comment 17. BOM in string or regexp must be preserved, BOM outside quotes or slashes becomes space.
/be
Comment 29•16 years ago
|
||
(In reply to comment #28)
> Brian, see comment 17. BOM in string or regexp must be preserved, BOM outside
> quotes or slashes becomes space.
I interpret that idea that BOM between tokens means whitespace as preserving BOM not only inside string and regexps but also inside numbers and names. That is, 1<bom>e7 should not mean 1 e7 but rather lead to a syntax error.
Comment 30•16 years ago
|
||
(In reply to comment #29)
> I interpret that idea that BOM between tokens means whitespace as preserving
> BOM not only inside string and regexps but also inside numbers and names. That
> is, 1<bom>e7 should not mean 1 e7 but rather lead to a syntax error.
Here is another example:
-<bom>-a
With the stripping rule that becomes "--a" while with bom-as-whitespace that would be "- -a" or just "+a". This is not good in my book.
Comment 31•16 years ago
|
||
Igor: problem is IE set a precedent. Can someone test IE8 beta to confirm it does turn BOM outside of string or regexp into a space?
/be
Assignee | ||
Comment 32•16 years ago
|
||
IE8 and IE7 seem to actually error out on BOMs anywhere except inside regexp and string literals, from my testing. Both 7 and 8 throw an "invalid character" exception if the BOM occurs anywhere but at the location of the first character in the script (string, in eval), or in a regexp or string literal. Also, IE throws an invalid character error for non-BOM format control characters (\u200d, as in the example from comment #1) occuring -anywhere- except inside of string/regexp literals.
Assignee | ||
Comment 33•16 years ago
|
||
My comment #32 is wrong. IE7/8 handle BOMs differently inside strings being eval()d than inside <script> tags or code included from <script> tags. Does eval() lose the utf-8 encoding attribute, perhaps?
Assignee | ||
Comment 34•16 years ago
|
||
At any rate, what IE does for BOMs inside <script> tags is to convert them to space (rather than eliding them as we currently do). This means that my "the fix" is relatively close to their handling, except that I must now make eval() break to match them.
Assignee | ||
Comment 35•16 years ago
|
||
Also, I need to add the same exemption for regexp literals.
Comment 36•16 years ago
|
||
(In reply to comment #33)
> My comment #32 is wrong. IE7/8 handle BOMs differently inside strings being
> eval()d than inside <script> tags or code included from <script> tags. Does
> eval() lose the utf-8 encoding attribute, perhaps?
What would MSIE alerts for the following case:
<script>
var a = 0;
-<bom>-a;
alert(a);
<script>
??
Assignee | ||
Comment 37•16 years ago
|
||
> What would MSIE alerts for the following case:
>
> <script>
> var a = 0;
> -<bom>-a;
> alert(a);
> <script>
IE alerts 0, not -1
Comment 38•16 years ago
|
||
(In reply to comment #37)
> > What would MSIE alerts for the following case:
> >
> > <script>
> > var a = 0;
> > -<bom>-a;
> > alert(a);
> > <script>
>
> IE alerts 0, not -1
On more test, what about
<script>
var a = eval('"<bom>"')
alert(a.charCodeAt(0));
</script>
That is, does that bom->whitespace replacement happens on HTML parser or script level?
Assignee | ||
Comment 39•16 years ago
|
||
For 0xfeff, this alerts 65279.
Comment 40•16 years ago
|
||
(In reply to comment #39)
> For 0xfeff, this alerts 65279.
Ok , the last curiosity attempt:
<script>
function f() {<bom>}
var s = f.toString().replace('\ufeff', 'T');
alert(s);
</script>
I.e. the question is what exactly character MSIE uses to replace the bom? Or perhaps it does not replace it with WS at all but rather treats BOM as a whitespace in the lexer outside of eval context?
Assignee | ||
Comment 41•16 years ago
|
||
function f() {T}
Comment 42•16 years ago
|
||
(In reply to comment #41)
> function f() {T}
>
So MSIE simply treats BOM as a whitespace for the purpose of parsing. Which suggests to do this in SM to fix the bug: treat BOM as one of Unicode whitespace characters in the scanner avoiding any character skipping or patching.
Comment 43•16 years ago
|
||
Is the patch only for the BOM? What about \u200d from the original example, and the other :Cf: characters listed in the URI?
Flags: blocking1.9.0.2?
Flags: blocking1.8.1.17?
Flags: blocking1.8.1.17+
Assignee | ||
Comment 44•16 years ago
|
||
Non BOM Cf characters aren't stripped anymore (note that comment 0 references an Fx2 build-id), so they get treated appropriately (typically, we'll throw an "illegal character" exception). That has been the case since mrbkap's original fix for bug 274152 landed. I excepted BOM characters in bug 368516 because of concerns over web-compability, but stripping them out wasn't the Right Thing. The Right Thing (or close to it, I think, given the situation), is to treat them as whitespace.
The 1.8 patch for this will likely be at least a little different.
Summary: Control Format unicode characters are stripped from javascript before execution → BOM characters are stripped from javascript before execution
Assignee | ||
Comment 45•16 years ago
|
||
Pratap: I'm hoping you (or another engineer) can shed some light on the difference in behavior for IE7/8's eval() here, as opposed to the handling of <script> tags.
Assignee | ||
Comment 46•16 years ago
|
||
We still don't do the same thing IE does for eval cases (we'll continue to treat BOMs as whitespace inside eval. This saves us from the exploit, but doesn't precisely emulated IE.
Attachment #329544 -
Attachment is obsolete: true
Attachment #329545 -
Attachment is obsolete: true
Attachment #329910 -
Flags: review?(igor)
Attachment #329544 -
Flags: review?(igor)
Assignee | ||
Comment 47•16 years ago
|
||
For convenience in review.
Comment 48•16 years ago
|
||
Comment on attachment 329910 [details] [diff] [review]
Here's another swing at emulating IE.
>--- mozilla.4b761ffed330/js/src/jsscan.cpp 2008-07-16 15:11:37.000000000 -0700
>+++ /Users/crowder/mozilla/js/src/jsscan.cpp 2008-07-16 14:56:37.000000000 -0700
>+static JS_INLINE JSBool
>+ScanAsSpace(jschar ch)
>+{
>+ if (JS_ISSPACE(ch) || ch == 0xfffe || ch == 0xfeff)
>+ return JS_TRUE;
>+ return JS_FALSE;
>+}
Nit: comment that 0xfeff and 0xfffe are BOMs treated as whitespace for compatibility.
Attachment #329910 -
Flags: review?(igor) → review+
Comment 49•16 years ago
|
||
Comment on attachment 329911 [details] [diff] [review]
diff -w of previous
Sorry, drive-by: IsSpace would be canonical name (someone pointed out that BOM code points are Unicode space chars going by attributes), instead of ScanAsSpace. Likewise, c not ch.
/be
Comment 50•16 years ago
|
||
Come to think of it, does JS_ISSPACE treat BOMs as space already?
/be
Comment 51•16 years ago
|
||
(In reply to comment #50)
> Come to think of it, does JS_ISSPACE treat BOMs as space already?
BOMs are zero-width space characters which is different from the space character attribute.
Assignee | ||
Comment 52•16 years ago
|
||
Includes the ch => c rewrite, but retains ScanAsSpace as the function name, per discussion with brendan on IRC.
Attachment #329910 -
Attachment is obsolete: true
Assignee | ||
Updated•16 years ago
|
Keywords: checkin-needed
Assignee | ||
Comment 53•16 years ago
|
||
Applies without fuzz on the branch, other than the module name (jsscan.cpp => jsscan.c)
Attachment #330105 -
Flags: approval1.9.0.2?
Assignee | ||
Comment 54•16 years ago
|
||
I think this bug is a legit XSS bug and rightly deserves the blocking statuses requested for it for both branches.
Whiteboard: [sg:high?] XSS risk to sites. → [sg:high] XSS risk to sites.
Comment 55•16 years ago
|
||
Pushed to mozilla-central:
http://hg.mozilla.org/mozilla-central/index.cgi/rev/ddb85d02a8f5
Status: NEW → RESOLVED
Closed: 17 years ago → 16 years ago
Keywords: checkin-needed
Resolution: --- → FIXED
Target Milestone: --- → mozilla1.9.1a1
Comment 56•16 years ago
|
||
OK, now it's really there (there were some issues before)
Comment 57•16 years ago
|
||
Is the "branch patch" for both branches, or is that only the 1.9 "branch"?
Blocks: xss
Assignee | ||
Comment 58•16 years ago
|
||
(In reply to comment #57)
> Is the "branch patch" for both branches, or is that only the 1.9 "branch"?
Only for 1.9, yeah. I'll roll a 1.8 patch next.
Assignee | ||
Updated•16 years ago
|
Attachment #330105 -
Attachment description: branch patch → 1.9 branch patch
Assignee | ||
Comment 59•16 years ago
|
||
Fx2 is vulnerable to -all- Cf characters, not just BOMs. This fixes that. Asking for re-review, since the patch didn't apply trivially. If 1.8.1.17 is not the branch I should be seeking approval for, please fix.
Attachment #330318 -
Flags: review?(igor)
Attachment #330318 -
Flags: approval1.8.1.17?
Updated•16 years ago
|
Attachment #330318 -
Flags: review?(igor) → review+
Updated•16 years ago
|
Flags: blocking1.9.0.2? → blocking1.9.0.2+
Whiteboard: [sg:high] XSS risk to sites. → [sg:high] XSS risk to sites. [needs baking]
Comment 62•16 years ago
|
||
Sorry for the trivial question but I'm new to bugzilla - is there a nightly build (Windows preferably) with the fix included that I can grab to run my test cases against?
Assignee | ||
Comment 63•16 years ago
|
||
The builds here should have the latest and greatest, including the patch landed on the 17th.
http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/latest-trunk/
Comment 64•16 years ago
|
||
Comment on attachment 330105 [details] [diff] [review]
1.9 branch patch
Approved for 1.9.0.2. Please land in CVS. a=ss
Attachment #330105 -
Flags: approval1.9.0.2? → approval1.9.0.2+
Comment 65•16 years ago
|
||
Comment on attachment 330318 [details] [diff] [review]
1.8 branch patch
Approved for 1.8.1.17, a=dveditz for release-drivers
Attachment #330318 -
Flags: approval1.8.1.17? → approval1.8.1.17+
Assignee | ||
Updated•16 years ago
|
Keywords: fixed1.9.0.2
Assignee | ||
Updated•16 years ago
|
Keywords: fixed1.8.1.17
Comment 66•16 years ago
|
||
Did this cause ecma_3/extensions/regress-368516.js to regress or is the test no longer valid?
Assignee | ||
Comment 67•16 years ago
|
||
Yes, that test is invalid now; BOMs aren't stripped anymore, but are treated as whitespace in parsing.
Comment 68•16 years ago
|
||
(In reply to comment #66)
> Did this cause ecma_3/extensions/regress-368516.js to regress or is the test no
> longer valid?
>
The test contains:
eval("hi" + bomchars[i] + "there = 'howdie';");
With the patch this equivalent to eval("hi there = 'howdie';") which is not a valid code. So the test is no longer applicable. Ideally it should be changed to verify that boms are treated precizely as a whitespace, like in
eval("var"+bom+"x;")
Comment 69•16 years ago
|
||
/cvsroot/mozilla/js/tests/ecma_3/extensions/regress-368516.js,v <-- regress-368516.js
new revision: 1.2; previous revision: 1.1
mozilla-central: 16398:4da7cad4b133
Comment 71•16 years ago
|
||
Comment on attachment 330318 [details] [diff] [review]
1.8 branch patch
a=asac for 1.8.0.15
taking for distros
Attachment #330318 -
Flags: approval1.8.0.15+
Updated•16 years ago
|
Flags: blocking1.8.0.15+
Updated•16 years ago
|
Group: core-security
Comment 72•16 years ago
|
||
/cvsroot/mozilla/js/tests/ecma_3/extensions/regress-430740.js,v <-- regress-430740.js
initial revision: 1.1
http://hg.mozilla.org/mozilla-central/rev/2d7d00ef041e
Updated•16 years ago
|
Flags: blocking1.9.1? → blocking1.9.1+
Comment 73•16 years ago
|
||
Keywords: fixed1.9.1
Whiteboard: [sg:high] XSS risk to sites. [needs baking] → [sg:high] XSS risk to sites
You need to log in
before you can comment on or make changes to this bug.
Description
•