Closed Bug 161880 Opened 22 years ago Closed 22 years ago

privacy hole in JS regexps ?

Categories

(Core :: DOM: Core & HTML, defect, P1)

x86
All
defect

Tracking

()

VERIFIED FIXED
mozilla1.2alpha

People

(Reporter: glazou, Assigned: peterv)

References

()

Details

(Whiteboard: [adt2])

Attachments

(1 file)

With a Win2K trunk build from 08 august 2002 around 11pm Pacific time, open the
URL http://daniel.glazman.free.fr/regexp.htm.

I don't know if the regexp is correct or not but the result of the JS shown in the
document is "(Windows;" and that does not seem very normal to me...

John Keiser is able to confirm this.
strange bug, the first time I opened the page with my debug build on solaris,
I got "1.1b".
javascript:window.alert(RegExp.$1) was empty, and I can't fiddle with the result
in regexp.htm by doing successful regexps in the js console. The js console
has an empty RegExp.$1 as well.
Now all I get are "", if when restarting the browser.

build ID 20020809, pulled Aug  9 1:38 am PDT
OS: Windows 2000 → All
build 2002072104 on NT

I got "PROXY" the first time, then only "". It looks as we're reading from some
kind of global buffer here (or unitialized memory).
I always got "" using build 2002080811 on Win2k.
"" with trunk 2002080718 (and patch from bug 114962) on WinME.
For convenience, here is the regexp Daniel is using:

            var color = "rgb(255, 0, 0)";
            var res = color.match( /rgb\((\d*),\b*(\d*),\b*(\d*)\)/ );
            var r = '"' + RegExp.$1 + '"';


Of course, the real point of this bug is that memory is being
misread or corrupted, but let me comment on why this regexp won't
produce a match. The problem is the use of the word-boundaries \b.

A string supplied to a regexp is conceived as having empty strings
before and after each character. The one before the first character
is matched by the symbol ^; the one after the last character is matched
by the symbol $. A word boundary \b matches the empty string between
a word character and a non-word character.


Here, the supplied word contains strings like

                          "255, 0"

Note the space between the comma and the zero. Because of this
space, the string will not match the following pattern, e.g.:

                          /255,\b*0/


Why? The word-boundary after the comma lies between the comma and the SPACE.
If your version of Perl accepts the -e option, you can see this:

perl -e 'if("255, 0" =~ /255,\b*0/){print("$&");} else {print("NO MATCH");}'
NO MATCH


We get a match if we remove the space between the comma and zero in the string:

perl -e 'if("255,0" =~ /255,\b*0/){print("$&");} else {print("NO MATCH");}'
255,0


Or if we keep the space in the string, but change the regexp from \b to \s:

perl -e 'if("255, 0" =~ /255,\s*0/){print("$&");} else {print("NO MATCH");}'
255, 0



MORAL:

To get the desired match, one should match for possible space \s*
after the commas in the string, not possible word boundaries \b*
Note: I think our own documentation of \b is misleading!

-------
http://devedge.netscape.com/docs/manuals/js/core/jsref15/regexp.html#1193136

\b    Matches a word boundary, such as a space. (Not to be confused with [\b].) 

For example: /\bn\w/ matches the 'no' in "noonday"
             /\wy\b/ matches the 'ly' in "possibly yesterday."  
-------


This makes the user think that \b is matches a space character,
when it does not! Compare this to the Perl documentation:


-------
http://www.mit.edu/perl/perlre.html

Perl defines the following zero-width assertions: 

\b    Match a word boundary

A word boundary (\b) is defined as a spot between two characters that has
a \w on one side of it and and a \W on the other side of it (in either order), 
counting the imaginary characters off the beginning and end of the string
as matching a \W. (Within character classes \b represents backspace rather
than a word boundary.)
-------
> The word-boundary after the comma lies between the comma and the SPACE.

To be more precise: where the regexp is LOOKING for a word-boundary
is between the comma and the space. Since both of these are non-word
characters, i.e. form the pattern \W\W, no word boundary is found.
A word-boundary demands patterns \w\W or \W\w.


Using the quantifier * after the \b doesn't help, either:


                          /255,\b*0/
Say, trying to match      "255, 0"



OK, so in the string there's no \b between the comma and the space.
That satisifes \b* (= match \b zero or more times). But then the
regexp demands a zero character directly after the comma, which fails
because the space is there...



NOTE: the Devedge site I referred to above is undergoing changes.
The link for the documentation is at developer.netscape.com instead:

http://developer.netscape.com/docs/manuals/js/core/jsref15/regexp.html#1193136
Again, the main point of this bug is memory mis-read/corruption.

Using Mozilla builds: trunk 20020805, 1.0 branch 2002-08-06 on WinNT.
I don't see the memory corruption, (i.e. correctly get no match).
But that is the nature of memory bugs: they may appear sporadically.


Related?

bug 122076 "RegExp crash when following link"; in particular
note comment http://bugzilla.mozilla.org/show_bug.cgi?id=122076#c28
> Again, the main point of this bug is memory mis-read/corruption.

Absolutely. I know that my regexp does not match. But the result of that
should not be an access to a memory area we should not be reading from...
cc'ing Brendan/Shaver. I need browser embedding savvy.
When I execute (for example) 'javascript:RegExp.$1' from the URL bar I can see
that the context being used has some previous values in the RegExp statics.
Should JS_ClearRegExpStatics be being called on the way in to the engine under
these circumstances? [I'm not sure whether this is a security issue - would the
same context be used for different security environments?]
We should call JS_ClearRegExpStatics where we call JS_ClearScope, in the DOM
code.  Reassigning, hope I guessed the right assignee.

/be
Assignee: rogerl → peterv
Setting default component and QA -
Component: JavaScript Engine → DOM Level 0
QA Contact: pschwartau → desale
Status: NEW → ASSIGNED
Priority: -- → P1
Target Milestone: --- → mozilla1.2alpha
Like this?
Comment on attachment 95780 [details] [diff] [review]
Call JS_ClearRegExpStatics

Yeah, looks good to me, r=jst and I hope brendan could sr this one.
Attachment #95780 - Flags: review+
Comment on attachment 95780 [details] [diff] [review]
Call JS_ClearRegExpStatics

sr=me, sure.

/be
Attachment #95780 - Flags: superreview+
Checked in on the trunk. I think we want to get this on the branch, it's a
low-risk change that fixes a potential privacy/security problem (we're accessing
random memory).
Keywords: adt1.0.1
Comment on attachment 95780 [details] [diff] [review]
Call JS_ClearRegExpStatics

a=rjesup@wgate.com for 1.0 branch checkin.  Please be careful to test after
checkin!  and change mozilla1.0.1+ to fixed1.0.1
Attachment #95780 - Flags: approval+
Keywords: mozilla1.1
Marking this fixed since it's checked in on the trunk. Sent mail to adt.
Status: ASSIGNED → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
mass changing adt1.0.1 to adt1.0.2 nominations.
Keywords: adt1.0.2
Keywords: adt1.0.1
desale: can you pls verify this as fixed on the trunk? thanks!

mozilla1.0.1 has passed, and the approval is now over a month old. nominating
for mozilla1.0.2.
Whiteboard: [adt2]
I tested it with testcase provided at http://daniel.glazman.free.fr/regexp.htm

BUIDS TESTED ON: 2002-09-20-08-trunk
OS: WinXP
TEST RESULTS: I'm getting results ""

hence Reopening bug.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
The regexp is wrong and shouldn't return anything, see comment 9 (or for more
details comment 5, comment 6 and comment 7). The bug was that sometimes it
returned a value.
Status: REOPENED → RESOLVED
Closed: 22 years ago22 years ago
Resolution: --- → FIXED
Wasn't stop ship in MachV, has something changed we should know about?
Keywords: adt1.0.2adt1.0.2-
Nothing has changed that adt didn't know before. I on the other hand didn't know
this was not stop-ship in MachV. I have never heard anything back from adt
despite my mails, it wasn't even minussed. Drivers on the other hand responded
quickly.
I talked to Phil Schwartau & he showed me how to verify this one.
Finally I verified this one on WinXP & WinNT on todays trunk.
Marking it verified.
Thanks Phil Schwartau for helping me out with this verification.
Status: RESOLVED → VERIFIED
a=rjesup@wgate.com for 1.0 branch again
Renominating for ADT
When checked in, please change mozilla1.0.2+ to fixed1.0.2
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: