privacy hole in JS regexps ?

VERIFIED FIXED in mozilla1.2alpha

Status

()

Core
DOM: Core & HTML
P1
critical
VERIFIED FIXED
16 years ago
15 years ago

People

(Reporter: glazou, Assigned: peterv)

Tracking

Trunk
mozilla1.2alpha
x86
All
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [adt2], URL)

Attachments

(1 attachment)

(Reporter)

Description

16 years ago
With a Win2K trunk build from 08 august 2002 around 11pm Pacific time, open the
URL http://daniel.glazman.free.fr/regexp.htm.

I don't know if the regexp is correct or not but the result of the JS shown in the
document is "(Windows;" and that does not seem very normal to me...

John Keiser is able to confirm this.

Comment 1

16 years ago
strange bug, the first time I opened the page with my debug build on solaris,
I got "1.1b".
javascript:window.alert(RegExp.$1) was empty, and I can't fiddle with the result
in regexp.htm by doing successful regexps in the js console. The js console
has an empty RegExp.$1 as well.
Now all I get are "", if when restarting the browser.

build ID 20020809, pulled Aug  9 1:38 am PDT
OS: Windows 2000 → All

Comment 2

16 years ago
build 2002072104 on NT

I got "PROXY" the first time, then only "". It looks as we're reading from some
kind of global buffer here (or unitialized memory).

Comment 3

16 years ago
I always got "" using build 2002080811 on Win2k.

Comment 4

16 years ago
"" with trunk 2002080718 (and patch from bug 114962) on WinME.

Comment 5

16 years ago
For convenience, here is the regexp Daniel is using:

            var color = "rgb(255, 0, 0)";
            var res = color.match( /rgb\((\d*),\b*(\d*),\b*(\d*)\)/ );
            var r = '"' + RegExp.$1 + '"';


Of course, the real point of this bug is that memory is being
misread or corrupted, but let me comment on why this regexp won't
produce a match. The problem is the use of the word-boundaries \b.

A string supplied to a regexp is conceived as having empty strings
before and after each character. The one before the first character
is matched by the symbol ^; the one after the last character is matched
by the symbol $. A word boundary \b matches the empty string between
a word character and a non-word character.


Here, the supplied word contains strings like

                          "255, 0"

Note the space between the comma and the zero. Because of this
space, the string will not match the following pattern, e.g.:

                          /255,\b*0/


Why? The word-boundary after the comma lies between the comma and the SPACE.
If your version of Perl accepts the -e option, you can see this:

perl -e 'if("255, 0" =~ /255,\b*0/){print("$&");} else {print("NO MATCH");}'
NO MATCH


We get a match if we remove the space between the comma and zero in the string:

perl -e 'if("255,0" =~ /255,\b*0/){print("$&");} else {print("NO MATCH");}'
255,0


Or if we keep the space in the string, but change the regexp from \b to \s:

perl -e 'if("255, 0" =~ /255,\s*0/){print("$&");} else {print("NO MATCH");}'
255, 0



MORAL:

To get the desired match, one should match for possible space \s*
after the commas in the string, not possible word boundaries \b*

Comment 6

16 years ago
Note: I think our own documentation of \b is misleading!

-------
http://devedge.netscape.com/docs/manuals/js/core/jsref15/regexp.html#1193136

\b    Matches a word boundary, such as a space. (Not to be confused with [\b].) 

For example: /\bn\w/ matches the 'no' in "noonday"
             /\wy\b/ matches the 'ly' in "possibly yesterday."  
-------


This makes the user think that \b is matches a space character,
when it does not! Compare this to the Perl documentation:


-------
http://www.mit.edu/perl/perlre.html

Perl defines the following zero-width assertions: 

\b    Match a word boundary

A word boundary (\b) is defined as a spot between two characters that has
a \w on one side of it and and a \W on the other side of it (in either order), 
counting the imaginary characters off the beginning and end of the string
as matching a \W. (Within character classes \b represents backspace rather
than a word boundary.)
-------

Comment 7

16 years ago
> The word-boundary after the comma lies between the comma and the SPACE.

To be more precise: where the regexp is LOOKING for a word-boundary
is between the comma and the space. Since both of these are non-word
characters, i.e. form the pattern \W\W, no word boundary is found.
A word-boundary demands patterns \w\W or \W\w.


Using the quantifier * after the \b doesn't help, either:


                          /255,\b*0/
Say, trying to match      "255, 0"



OK, so in the string there's no \b between the comma and the space.
That satisifes \b* (= match \b zero or more times). But then the
regexp demands a zero character directly after the comma, which fails
because the space is there...



NOTE: the Devedge site I referred to above is undergoing changes.
The link for the documentation is at developer.netscape.com instead:

http://developer.netscape.com/docs/manuals/js/core/jsref15/regexp.html#1193136

Comment 8

16 years ago
Again, the main point of this bug is memory mis-read/corruption.

Using Mozilla builds: trunk 20020805, 1.0 branch 2002-08-06 on WinNT.
I don't see the memory corruption, (i.e. correctly get no match).
But that is the nature of memory bugs: they may appear sporadically.


Related?

bug 122076 "RegExp crash when following link"; in particular
note comment http://bugzilla.mozilla.org/show_bug.cgi?id=122076#c28
(Reporter)

Comment 9

16 years ago
> Again, the main point of this bug is memory mis-read/corruption.

Absolutely. I know that my regexp does not match. But the result of that
should not be an access to a memory area we should not be reading from...

Comment 10

16 years ago
cc'ing Brendan/Shaver. I need browser embedding savvy.
When I execute (for example) 'javascript:RegExp.$1' from the URL bar I can see
that the context being used has some previous values in the RegExp statics.
Should JS_ClearRegExpStatics be being called on the way in to the engine under
these circumstances? [I'm not sure whether this is a security issue - would the
same context be used for different security environments?]
We should call JS_ClearRegExpStatics where we call JS_ClearScope, in the DOM
code.  Reassigning, hope I guessed the right assignee.

/be
Assignee: rogerl → peterv

Comment 12

16 years ago
Setting default component and QA -
Component: JavaScript Engine → DOM Level 0
QA Contact: pschwartau → desale

Updated

16 years ago
Keywords: mozilla1.0.1, nsbeta1
(Assignee)

Updated

16 years ago
Status: NEW → ASSIGNED
Priority: -- → P1
Target Milestone: --- → mozilla1.2alpha
(Assignee)

Comment 13

16 years ago
Created attachment 95780 [details] [diff] [review]
Call JS_ClearRegExpStatics

Like this?
Comment on attachment 95780 [details] [diff] [review]
Call JS_ClearRegExpStatics

Yeah, looks good to me, r=jst and I hope brendan could sr this one.
Attachment #95780 - Flags: review+
Comment on attachment 95780 [details] [diff] [review]
Call JS_ClearRegExpStatics

sr=me, sure.

/be
Attachment #95780 - Flags: superreview+
(Assignee)

Comment 16

16 years ago
Checked in on the trunk. I think we want to get this on the branch, it's a
low-risk change that fixes a potential privacy/security problem (we're accessing
random memory).
Keywords: adt1.0.1
Comment on attachment 95780 [details] [diff] [review]
Call JS_ClearRegExpStatics

a=rjesup@wgate.com for 1.0 branch checkin.  Please be careful to test after
checkin!  and change mozilla1.0.1+ to fixed1.0.1
Attachment #95780 - Flags: approval+

Updated

16 years ago
Keywords: mozilla1.0.1 → mozilla1.0.1+

Updated

16 years ago
Keywords: mozilla1.1
(Assignee)

Comment 18

16 years ago
Marking this fixed since it's checked in on the trunk. Sent mail to adt.
Status: ASSIGNED → RESOLVED
Last Resolved: 16 years ago
Resolution: --- → FIXED

Comment 19

15 years ago
mass changing adt1.0.1 to adt1.0.2 nominations.
Keywords: adt1.0.2

Updated

15 years ago
Keywords: adt1.0.1

Comment 20

15 years ago
desale: can you pls verify this as fixed on the trunk? thanks!

mozilla1.0.1 has passed, and the approval is now over a month old. nominating
for mozilla1.0.2.
Keywords: mozilla1.0.1+, nsbeta1 → approval, mozilla1.0.2, nsbeta1+
Whiteboard: [adt2]

Comment 21

15 years ago
I tested it with testcase provided at http://daniel.glazman.free.fr/regexp.htm

BUIDS TESTED ON: 2002-09-20-08-trunk
OS: WinXP
TEST RESULTS: I'm getting results ""

hence Reopening bug.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(Assignee)

Comment 22

15 years ago
The regexp is wrong and shouldn't return anything, see comment 9 (or for more
details comment 5, comment 6 and comment 7). The bug was that sometimes it
returned a value.
Status: REOPENED → RESOLVED
Last Resolved: 16 years ago15 years ago
Resolution: --- → FIXED
Wasn't stop ship in MachV, has something changed we should know about?
Keywords: adt1.0.2 → adt1.0.2-
(Assignee)

Comment 24

15 years ago
Nothing has changed that adt didn't know before. I on the other hand didn't know
this was not stop-ship in MachV. I have never heard anything back from adt
despite my mails, it wasn't even minussed. Drivers on the other hand responded
quickly.

Comment 25

15 years ago
I talked to Phil Schwartau & he showed me how to verify this one.
Finally I verified this one on WinXP & WinNT on todays trunk.
Marking it verified.
Thanks Phil Schwartau for helping me out with this verification.
Status: RESOLVED → VERIFIED
a=rjesup@wgate.com for 1.0 branch again
Renominating for ADT
When checked in, please change mozilla1.0.2+ to fixed1.0.2
Keywords: adt1.0.2-, mozilla1.0.2 → adt1.0.2, mozilla1.0.2+

Updated

15 years ago
Keywords: mozilla1.0.2+ → fixed1.0.2
You need to log in before you can comment on or make changes to this bug.