Closed Bug 332173 Opened 14 years ago Closed 11 years ago

Problems with regexp parsing of '~' in nsIZipReader.findEntries (and other nsWildCard uses)

Categories

(Core :: XPCOM, defect, P1, critical)

defect

Tracking

()

RESOLVED FIXED
Tracking Status
status1.9.2 --- beta1-fixed
blocking1.9.1 --- .2+
status1.9.1 --- .2-fixed

People

(Reporter: Waldo, Assigned: nelson)

References

Details

(4 keywords, Whiteboard: [sg:low])

Attachments

(4 files, 5 obsolete files)

The way the character '~' is parsed (by iterating from the end of the pattern to the front to find it) is incorrect, because it assumes that '~' always signifies a negation pattern (pat~pat2) if it's not escaped.  However, characters other than ']' in character classes (e.g., '[abc]') are *not* escaped.  Further problems also appear in invalid patterns which combine '(pat|pat2)' and 'pat~pat2' syntaxes, and in some cases it is possible to produce a crash.

To demonstrate these bugs, here's a table of results of testing various patterns against various strings.  These patterns and strings should be considered as "raw" strings, so if they were actually being used in code '\' would need to be escaped.

Pattern      String       Match?   Is Behavior Correct?
-------------------------------------------------------
a[a\~b]      aa           yes      yes
a[a\~b]      a\           yes      yes
a[a\~b]      a~           yes      yes
a[a~b]       a~           NO       NO
a[a~b]b      a~b          NO       NO
(a|b)(a|~)b  a~b          CRASH    "NO"

Note that the last behavior isn't strictly incorrect because there's no specified behavior, but ideally it should be impossible for any pattern/string combination to cause a crash regardless of pattern validity.
Component: Networking → Networking: JAR
QA Contact: networking → networking.jar
More details anon.
Assignee: jwalden+bmo → nobody
Group: core-security
Component: Networking: JAR → XPCOM
QA Contact: networking.jar → xpcom
Bug 504456 comments strongly suggest that the wildcard-matching code used by libjar at the time this bug was filed is a duplicate of the matching code in NSS; I've read that bug, and the testcase there seems un-similar enough to the CRASH listed above that this may be a separate bug that hits both code.  libjar's wildcard has since been moved into xpcom/io/nsWildCard.cpp, but my memory of this bug from the time when I investigated it is that that pattern-string combination causes a double-free.  I was diligent enough to notice this as a problem when investigating, but I looked at libjar and our codebase and determined that this...er...wasn't a problem because we never accepted user-input patterns and thus couldn't trigger this -- and then I never followed up and marked this as sensitive as a result.  Apparently, however, we're using the pattern-matching code in the file picker, so there's more attack surface than I thought (although at a glance I think filepicker filters are all from trusted input, at least for now) -- and yet again in NSS from appearances, which does actually accept untrusted input.  Yay for code triplication (now duplication, better but still bad).

This might be a dup of bug 504456, but the circumstances are sufficiently different that I don't know -- and I'd rather be wrong than *wrong*.
var lf = Components.classes["@mozilla.org/file/local;1"].createInstance(Components.interfaces.nsILocalFile);
var zr = Components.classes["@mozilla.org/libjar/zip-reader;1"].createInstance(Components.interfaces.nsIZipReader);
// create a zip containing a file named "a~b", init lf with its path
zr.open(lf);
var en = zr.findEntries("(a|b)(a|~)b");
en.getNext();

Trying that in an xpcshell gives me this right now:

js> en.getNext()
xpcshell(27742,0xa000d000) malloc: *** error for object 0x2529880: incorrect checksum for freed object - object was probably modified after being freed, break at szone_error to debug
xpcshell(27742,0xa000d000) malloc: *** set a breakpoint in szone_error to debug

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x6e727578
0x90005979 in szone_free ()

which comports with my memory, unfortunately.
Why is this bug private? Are there ways web content could cause us to hit any of this string-matching code?
One, because relying on that assumption is going to come back and bite sometime, two, and more importantly, because this bug probably exists in the NSS code as well, which is exposed to web content.

I should note for completeness that this matching code is hit by viewing a directory in a zip via the jar: protocol, but the manner in which the relevant glob is created is carefully controlled such that nothing unusual can be accepted as input -- we escape special characters manually in the right places in the constructed patterns.  See nsJARInputStream::InitDirectory for details on this.
Duplicate of this bug: 504977
This is also effectively a duplicate of bug 504456, but we're keeping them 
separate to keep the cc lists separate until after these bugs are opened.

While working on bug 504456 I came to the realization that:
(a) Nesting of "unions" [which are constructs like (a|b|c) ] does not work.
(b) Inside square brackets, escaping does not work AT ALL. 
    The expression [a\~b] matches any of the 4 characters a \ ~ b 
    The expression [a\]]  matches a] and \] but not a and not ]  
(c) Inside a union, if the characters | and ) are to be treated as ordinary 
    matching characters, they must be escaped, even when they appear inside 
    a bracketed expression inside the union.  But then the escape character 
    becomes part of the characters that match inside the brackets.  
(d) The last unescaped tilde '~' character begins an "exclusion" expression
    even when it occurs inside unions or brackets.  
(e) multiple exclusions do not work.  Only the last exclusion is honored as
    an exclusion.
(f) Escaping of tildes is flawed.  \\~ is treated as an escaped exclusion
    because it is scanned right to left.  
(g) Inside unions, exclusion expressions may not be used (crash).  This is 
    the essence of bug 504456.

More examples:

Example  Pattern      String       Match?   Is Behavior Correct?
----------------------------------------------------------------
 7      (a[.|]|b,)    a|           no      no
 8      (a[.\|]|b,)   a|           yes     yes
 9      (a[.\|]|b,)   a\           yes     no
10      (a[.\]]|b,)   a]           no      no 
11      ?~b~c         a            INVALID no
12      ?~b~c         a~b          INVALID no

I wrote a patch that corrects escaping within square brackets (item b above)
and detects and rejects exclusions within unions and between brackets, which
avoids the crashes (item g).  I also fixed escaping of tildes by scanning 
left to right (item f) .  But this does nothing to improve the other items.
i think we can live with items a and e, but I think I won't be satisfied 
while item c remains unfixed, but fixing it will be a much bigger job.  

Eliminating exclusions would greatly simplify things.  I suspect they're not
used much.  I wonder if it would be OK to just eliminate them, and cause 
tilde to stop having any special treatment at all.  Thoughts?
It just keeps getting worse.  
This expression matcher supposedly has a "case insensitive" mode. 
It doesn't work for expressions of the form [a-b].  [a-b] does not match A .
OK, I have attached some patches to bug 504456.   They include:
a) a patch to the shexp .h and .c files
b) a new command line test program
c) a test script that uses that program.

I'm MUCH happier with this set of patches.  It passes all my tests. 
I believe it implements the originally intended syntax, without introducing 
need for much much more escaping (which my previous patch did), or removing 
major functionality, like exclusions (e.g., expr1~expr2 ).  

Since this code isn't particularly crypto oriented, and is not peculiar to
NSS, I think any mozilla reviewer or super-reviewer could review it.  
I invite reviews.  Shall I attach the new patches to this bug also?
Attached patch Rollup patch from bug 504456 (obsolete) — Splinter Review
This one patch combines those attached to bug 504456.  Review invited.
Attachment #389408 - Flags: review?
One thing is clear.  If NS_WildcardMatch is called with an expression that 
doesn't pass the expression test done by NS_WildCardValid, then 
NS_WildcardMatch absolutely IS vulnerable.  For this reason, NSS always 
calls (NSS's equivalent of) NS_WildCardValid in every call to (NSS's 
equivalent of) NS_WildcardMatch, to avoid this vulnerability.  

It's not immediately obvious that the callers of NS_WildcardMatch always, 
or *ever*, call NS_WildCardValid first to ensure the expression is valid.  

I can adapt my NSS patch for these other copies.  These other copies are 
nearly identical, except that
a) one use PRUnichar* instead of char*, and 
b) some tweaking of the code for braketed expressions has been done for 
libjar's copy to try to fix escaping.  My patch will take care of this.
(Rather, I will update my patch to take care of this problem.)
Please ignore comment 12.  My existing patch already handles escaping in 
bracketed expressions correctly.
Attachment #389408 - Flags: review? → review?(alexei.volkov.bugs)
Adding NSS team members (potential patch reviewers) to CC list
blocking1.9.1: --- → .2+
Flags: wanted1.9.0.x+
Flags: wanted1.8.1.x+
Flags: blocking1.9.2?
Flags: blocking1.9.0.13+
Flags: blocking1.8.1.next+
Whiteboard: [sg:critical]
Summary: Problems with regexp parsing of '~' in nsIZipReader.findEntries → Problems with regexp parsing of '~' in nsIZipReader.findEntries (and other nsWildCard uses)
Attachment #389408 - Attachment is obsolete: true
Attachment #389408 - Flags: review?(alexei.volkov.bugs)
Comment on attachment 389408 [details] [diff] [review]
Rollup patch from bug 504456

This patch is superseded by a newer one in bug 504456. Please feel free to review that patch.
This patch is not a real Hg patch.  It's hand crafted to look as much like one
as I can without having to use Hg and download the ENTIRE mozilla 1.9.1 tree
to fix 3 files.  I hope it will apply and work like a regular Hg patch.
It does build.

This patch is for the 3 affected files in mozilla/modules/libjar
I'll work on a separate patch for the copy used in 
mozilla/toolkit/components/filepicker/src because that file is fundamentally
different in several ways.  Most notably, it operates on UTF-16 chars not
ASCII/UTF8 (like this one does).

I invite any of the cc list people on this bug to try this patch and give 
a review or at least a report of whether it worked for you.
Attachment #389619 - Flags: review?
This is the same patch as above, but created with diff -w. 
The indentation in this copy will look wrong, but it may give a better 
understanding of the real non-whitespace changes made to the files.
Does <jar:http://whereswalden.com/files/temp/antichess.jar!/> display as a directory listing with this patch?  It looks like it should at a glance, but if it doesn't, I don't think we care, given the havoc bug 369814 wreaked on displaying directory listings within zips anyway.
Whiteboard: [sg:critical] → [sg:critical][Embargoed until at least 2009-07-29 via bug 504977]
Comment on attachment 389619 [details] [diff] [review]
handcrafted patch for libjar version of nsWildCard files

Dave: can you do a review here? If not, please reassign to next most reasonable owner.
Attachment #389619 - Flags: review? → review?(dtownsend)
Dave, I think Mike meant reassign the review request, not the bug.

Having already written patches for two of firefox's 3 copies of this code,
I'm willing to fix the last one, too.  To do that, I need to know what 
platform-independent functions are recommended for use in Firefox, that 
do for arrays of UTF16 PR_UNIchar characters what the following functions 
do for arrays of ASCII char:

isalpha
isalnum
strlen
strdup
strcmp
strcpy
strchr
toupper
tolower
PL_strcasecmp
Assignee: nobody → nelson
Comment on attachment 389619 [details] [diff] [review]
handcrafted patch for libjar version of nsWildCard files

I won't really have time to do a thorough job of this before my vacation, Waldo has said he'll take a look
Attachment #389619 - Flags: review?(dtownsend) → review?(jwalden+bmo)
Comment on attachment 389620 [details] [diff] [review]
same patch ignoring whitespace changes

>-                    switch(_shexp_match(&str[x++],&expr[y], case_insensitive)) {
>+	        ret = _shexp_match(&str[x++], &expr[y], case_insensitive, 
>+				   level + 1);
[Side note: There's a way of handling wildcards without recursion, since you only ever need to backtrack to the last wildcard character e.g. if matching *x*y* if the y exists after an x then it must exist after the first x.]
I have no doubt that the algorithms can be substantially improved. 
My goals for this bug are only to remove the vulnerabilities without 
significantly affecting/reducing the grammar (the set of acceptable patterns).

BTW, further progress on the Unicode version of these functions (the copy 
found in toolkit/components/filepicker/src ) is blocked waiting for answers 
to the questions I posed in comment 20.  I'll try asking them in m.d.platform.
The patch looks broadly good so far, still need to spend more time wrapping my head around it, tho -- hope to finish tonight, might end up being tomorrow morning.  I also think I want to spend a little time hacking up a mini-fuzzer for this to try to verify non-crashingness on non-obvious inputs -- although it might be better if someone with more fuzz-fu did that instead.

One request which would ease review: can you fix the patch's indentation?  Looking at the diff view of it, I think the original file is correctly indented while the modified file includes tab characters that expand to a different number of spaces than your editor expands them, but I haven't gone to the effort of verifying this.
Jeff, 
A fuzzer would be good.  I wrote a little test program and a test script 
which tests maybe 100 patterns.  You'll find it in the "roll up" patch 
above.  It will need some very tiny tweaking (function name changes) to 
work with this latest patch.  

As for indentation, I can imagine that's different.  NSS's coding style is
tabs every 8, but indentation quantum is 4, so indentation is done using 
a combination of spaces and tabs.  My editor is configured for that, and 
does not recognize or honor "mode lines".  I took the NSS code and modified
it a little to create this patch, so this patch follows the NSS style, and
not whatever style was used before.  But that can be fixed.  

Please summarize the preferred indentation for me.  Tabs every ? characters.
Indentation quantum is ? characters.  Any other special rules?  
Does firefox prefer 
  } else { 
or 
  }
  else {
or
  }
  else
  {
or ?
}
else {

no tabs ever
indent matches existing practice in the file, usually either 2 or 4
Attached patch (wrong file) (obsolete) — Splinter Review
This should be nearly the same patch as before, but more readable.
This patch will seem bigger because the original file had tabs in it
and I got rid of them all.
Attachment #389619 - Attachment is obsolete: true
Attachment #389620 - Attachment is obsolete: true
Attachment #389619 - Flags: review?(jwalden+bmo)
Comment on attachment 390258 [details] [diff] [review]
(wrong file)

wrong file. sorry
Attachment #390258 - Attachment is obsolete: true
Attachment #390258 - Attachment description: patch v4 - using no tabs, else on separate line → (wrong file)
Comment on attachment 390260 [details] [diff] [review]
Patch v4 - no tabs, else on separate line

This patch is the same as v3 except for formatting.
Attachment #390260 - Flags: review?(jwalden+bmo)
Comment on attachment 390260 [details] [diff] [review]
Patch v4 - no tabs, else on separate line

Thinking back on fuzzing, actually, anyone who watches our commits is going to find it abundantly clear that there were issues before and we had to throw out the entire algorithm and replace it.  Given that I'd be astounded if at least *someone* non-trustworthy doesn't fuzz the life out of this to find followup issues, possibly in conjunction with extensions out there as well as with the NSS version of this.  Thinking harder, I'd like to spend time writing a fuzzer for this, but comparative advantage and time constraints dictate that I leave it to someone with more expertise at it than I currently have.


>+static int
>+_scan_and_copy(const char *expr, char stop1, char stop2, char *dest)
> {
>-    char *e2 = (char *) PR_Malloc(sizeof(char)*strlen(expr));
>-    register int t,p2,p1 = 1;
>-    int cp;
>-
>-    while(1) {
>-        for(cp=1;expr[cp] != ')';cp++)
>-            if(expr[cp] == '\\')
>-                ++cp;
>-        for(p2 = 0;(expr[p1] != '|') && (p1 != cp);p1++,p2++) {
>-            if(expr[p1] == '\\')
>-                e2[p2++] = expr[p1++];
>-            e2[p2] = expr[p1];
>+    register int sx;     /* source index */
>+    register char cc;
>+
>+    for (sx = 0; (cc = expr[sx]) && cc != stop1 && cc != stop2; sx++) {
>+        if (cc == '\\') {
>+            if (!expr[++sx])
>+                return ABORTED; /* should be impossible */

This doesn't seem to correctly handle an escaped ) or ] or | or what-have-you.  I think you need to do the != checks, excepting for '\0', inside the loop at top after unescaping as necessary.

The /* should be impossible */ and return should be augmented with a PR_ASSERT(0), assuming there's no reason a prlog.h include can't be added here.  (If there is such a reason, a #ifdef DEBUG-guarded dereference of NULL or integer division by zero would also suffice.)  If it happened to be possible to hit this, just failing is not going to be sufficient for us to discover the problem.


>         }
>-        for (t=cp+1; ((e2[p2] = expr[t]) != 0); ++t,++p2) {}
>-        if(_shexp_match(str,e2, case_insensitive) == MATCH) {
>-            PR_Free(e2);
>-            return MATCH;
>+        else if (cc == '[') {
>+            while ((cc = expr[++sx]) && cc != ']') {
>+                if(cc == '\\' && !expr[++sx])
>+                    return ABORTED;
>+            }
>+            if (!cc)
>+                return ABORTED; /* should be impossible */
>         }
>-        if(p1 == cp) {
>-            PR_Free(e2);
>-            return NOMATCH;
>+    }
>+    if (dest && sx) {
>+        /* Copy all but the closing delimiter. */
>+        memcpy(dest, expr, sx);
>+        dest[sx] = 0;
>+    }
>+    return cc ? sx : ABORTED; /* index of closing delimiter */

This doesn't seem to properly handle an escaped ~, returning ABORTED, which breaks matching such regular expressions even though they appear to be accepted as valid.


>+/* returns 1 if val is in range from start..end, case insensitive. */
>+static int
>+_is_char_in_range(int start, int end, int val)
>+{
>+    char map[256];
>+    memset(map, 0, sizeof map);
>+    while (start <= end)
>+        map[tolower(start++)] = 1;
>+    return map[tolower(val)];
>+}

So ranges are always case-insensitive then?  Was this the case before?  I wasn't aware that it was.


>+        case '[': {
>+            int start, end = 0, i;
>+            neg = ((expr[++y] == '^') && (expr[y+1] != ']'));
>+            if (neg)
>+                ++y;
>+            i = y;
>+            start = (unsigned char)(expr[i++]);
>+            if (start == '\\')
>+                start = (unsigned char)(expr[i++]);
>+            if (isalnum(start) && expr[i++] == '-') {
>+                end = (unsigned char)(expr[i++]);
>+                if (end == '\\')
>+                    end = (unsigned char)(expr[i++]);
>+            }
>+            if (isalnum(end) && expr[i] == ']') {
>+                /* This is a range form: a-b */
>+                int val   = (unsigned char)(str[x]);
>+                if (end < start)  /* swap them */
>+                    start ^= end ^= start ^= end;

Ignoring that this is basically inscrutable even with the comment, C doesn't work this way.  There are no sequence points in that expression, so you're modifying lvals at the same time they're being used as rvals, and if it works it's only an accident of the compiler.  Break it out into a temporary and spread it over three lines.  See also bug 356454, particularly bug 356454 comment 8.


>+                if (case_insensitive && isalpha(val)) {
>+                    val = _is_char_in_range(start, end, val);
>+                    if (neg == val)
>+                        return NOMATCH;
>+                }
>+                else if (neg != ((val < start) || (val > end))) {
>+                    return NOMATCH;
>+                }
>+                y+=3;

This falls over in some way I don't want to think hard to figure out on [\a-\b], can't be a fixed-number increment here.  Should this just be |y = i|?


>+        case '?':
>+            break;

This doesn't work if we're at the end of the input string, does it?  Looks like out-of-bound reads, maybe worse.


>+static int
>+ns_WildCardMatch(const char *str, const char *xp, PRBool case_insensitive)

This also doesn't seem to handle escaped ~ very well (more of what was mentioned earlier, I guess).
Attachment #390260 - Flags: review?(jwalden+bmo) → review-
CCing dkeeler in case he can help with the fuzzing here...
Jeff, 
You wrote a lot of "doesn't seem to".  Please give actual expressions 
and inputs with which it will produce the wrong results.  
Have you looked at my test script, and all the cases it tests?  
(It's in the first attachment above.)  I think you'll find that it 
tests (successfully) many of the cases about which you're concerned.
I will happily add any additional cases you write to that script.

Some other specific comments:

> So ranges are always case-insensitive then? 

That function is only called to handle case-insensitive ranges, 
so, within that function, ranges are always case-insensitive.

> There are no sequence points in that expression

I'll change it.  I've never seen it fail with any c compiler in the 25 years
I've been using it.  (Bug 356454 is about a javascript extension, and I can 
well believe that the rules are different there.)  But, a good optimizer 
should be able to generate just as good code if it is broken out, so I don't
mind changing it.

>> y += 3;
> Should this just be |y = i|?

Could be.  I thought I changed that.  I'll look at it again.  

> This doesn't work if we're at the end of the input string, does it? 

Yes.  It cannot be reached at end of string.  The if test at the beginning
of the enclosing for loop precludes it.
> anyone who watches our commits is going to find it abundantly clear that 
> there were issues before and we had to throw out the entire algorithm and 
> replace it. 

Yes.  However, the window of opportunity is small.  Moxie is going public 
with this at Black Hat on Wednesday, 6 days from now.
> This doesn't seem to properly handle an escaped ~, returning ABORTED, 
> which breaks matching such regular expressions even though they appear 
> to be accepted as valid.

Remember that the ~ that is the separator between the two subexpressions 
in an exclusion pattern (e.g. the ~ in pattern1~pattern2 ) is replaced with
a NUL ('\0') before being passed to this function.  So, I think the case 
that concerns you cannot happen.
I added a bunch more test cases with trailing special characters in patterns,
both escaped and unescaped, and I found that the function for validating a
pattern did not agree with the function for comparing a string with a pattern
about whether certain character were special or not.  Fixing that solved a 
few more corner cases.  

I also discovered that in cases that appear to be range tests, but the 
range delimiters are not alphanumeric, e.g. [A-\]]  such ranges are actually
interpreted as 3 character unions, e.g. matching any of 'A', '-', or ']' .
To whatever extent this is surprising, it results in some unexpected matches 
and non-matches.  For example [A-\]] matches - and does not match B.  

I want to solicit opinions about what (if anything) to do about it.  
Three possibilities spring to mind:

a) leave it alone, as is
b) remove the requirement that the range delimiter characters must be 
   alphanumeric characters. 
c) detect the case of an expression that looks like a range expression but
   has non-alphanumeric delimiters and report an invalid shell expression.

For now I will leave it alone.  But I solicit opinions about it.
New patch forthcoming.
Attached patch Patch v5Splinter Review
This patch addresses all the faults found with the additional test cases,
and also breaks out the variable swapping into 3 statements.
No new PR_Asserts though.   New test script to follow.
Attachment #390260 - Attachment is obsolete: true
Attachment #390412 - Flags: review?(jwalden+bmo)
(In reply to comment #33)
> Jeff, 
> You wrote a lot of "doesn't seem to".  Please give actual expressions 
> and inputs with which it will produce the wrong results.  

"Sorry", that's partly politeness and partly that my impression has always been that the burden of demonstrating patch correctness lies with the patch author, not with the reviewer, so the author is obligated to perform reasonably-directed investigation when requested.


> Have you looked at my test script, and all the cases it tests?  

No, I didn't want to know what patterns worked a priori so as to keep the review reasonably black-box.


> > So ranges are always case-insensitive then? 
> 
> That function is only called to handle case-insensitive ranges, 
> so, within that function, ranges are always case-insensitive.

To be clear, this question was to double-check my own understanding, not to suggest there was a problem.


> > There are no sequence points in that expression
> 
> I'll change it.  I've never seen it fail with any c compiler in the 25 years
> I've been using it.  (Bug 356454 is about a javascript extension, and I can 
> well believe that the rules are different there.) 

Did you read the specific comment I noted, which specifically deals with C99's semantics?  It was only happenstance that the context was the same issue in a different language.

Further, you're still doing the inscrutable xor.  Switch them via a temporary variable please!


> Yes.  It cannot be reached at end of string.  The if test at the beginning
> of the enclosing for loop precludes it.

Oh, I see, I misread the condition.

Still looking and trying to convince myself of what you've mentioned, getting closer...
> Still looking and trying to convince myself of what you've mentioned, 
> getting closer...

OK, I am encouraged by this.  Please let me know when you have completed 
your review on the correctness of the code, and then I will address any 
remaining stylistic issues.  

I am a little surprised that you object to the XOR code.  Surely its' 
correctness is now in doubt. (?) It uses no more instructions, and uses one less register than a scheme that uses a temp variable.  In a register starved CPU like the x86, in an inner loop, reducing the number of registers can be a big win.  But if you think that the XOR logic is too complicated for Firefox
developers, then I will be happy to change it to use an additional register.
er, is NOT is doubt.  :(
(In reply to comment #36)
> I also discovered that in cases that appear to be range tests, but the 
> range delimiters are not alphanumeric, e.g. [A-\]]  such ranges are actually
> interpreted as 3 character unions, e.g. matching any of 'A', '-', or ']' .
> To whatever extent this is surprising, it results in some unexpected matches 
> and non-matches.  For example [A-\]] matches - and does not match B.  
> 
> I want to solicit opinions about what (if anything) to do about it.  
> Three possibilities spring to mind:
> 
> a) leave it alone, as is
> b) remove the requirement that the range delimiter characters must be 
>    alphanumeric characters. 
> c) detect the case of an expression that looks like a range expression but
>    has non-alphanumeric delimiters and report an invalid shell expression.

a) is closest to previous behavior, so I think that's best for now.  Practically, it probably would be best if we ripped out this custom matching engine and used JS/ES5 regex processing whenever possible (this leaves NSS high and dry, but syntax requirements there are exogenous, so it can't switch anyway).  However, that magnitude of fix isn't possible now, so it seems best to me to keep it as similar as possible until we can make a clean break.
(In reply to comment #40)
> I am a little surprised that you object to the XOR code.  Surely its' 

"its", since I happened to notice it

> correctness is no[t] in doubt. (?) It uses no more instructions, and uses one
> less register than a scheme that uses a temp variable.  In a register starved
> CPU like the x86, in an inner loop, reducing the number of registers can be a
> big win.  But if you think that the XOR logic is too complicated for Firefox
> developers, then I will be happy to change it to use an additional register.

It's correct, sure, but what exactly it does is not at all clear unless you've heard of the xor-swap trick before.  I expect any compiler worth its salt will optimize out the temporary location with a register-swap instruction (or even the xor trick if it doesn't have such an instruction); if it doesn't, that's its bug, and I don't think that it's worth caring about optimizing for that CPU.  Besides -- do you actually know that compilers will optimize the xors into a swap instruction?  If I were writing a compiler I wouldn't spend time on it, given the pattern isn't easy to read and has a simple workaround with generalized utility.  Last, the xor-swap's inter-instruction dependencies aren't much going to make pipelined CPUs happy these days.
Comment on attachment 390412 [details] [diff] [review]
Patch v5

That matching is broken on regexes with escaped ~ doesn't make me happy, but pragmatically if it's non-exploitable that should be good enough, given that this code isn't used from very many places (or, probably, by very many users).
Attachment #390412 - Flags: review?(jwalden+bmo) → review+
> matching is broken on regexes with escaped ~

Please cite an example.
Oh, I see now, never mind.

Pattern matching is hard, let's go shopping.
Thanks, Jeff.

Now, I'd like to ask someone to get the necessary approvals for this, 
(whatever those are this week) and commit it to the appropriate Hg trees 
for me.  

Next I will work on a comparable patch for the copy of this code in 
toolkit/components/filepicker/src
Keywords: checkin-needed
Nelson: I've asked Jeff to land this on Wednesday, July 29 at 1:45pm PDT across all branches (a=me).

Dan's investigating if there are other places we need this fix on the 1.8 branch.
Whiteboard: [sg:critical][Embargoed until at least 2009-07-29 via bug 504977] → [sg:critical][DO NOT LAND until 7/29 at 1:45pm PDT]
++sx occurs before the expr[sx] != ')' check in _handle_union, so we go off the end if the expr/string don't match.  Example from startup:

NS_WildCardMatch (str=0x2a08470 "skin/classic/browser/urlbar/textfield-mid-focused.png", expr=0x190ea040 "(M|/M)ETA-INF/(M|m)(ANIFEST|anifest).(MF|mf)$", case_insensitive=0)

Relanding with that in a second.
Flags: blocking1.9.0.13+
Whiteboard: [sg:critical][DO NOT LAND until 7/29 at 1:45pm PDT] → [sg:critical]
Seeing some green in builds with both original fix and followup that wasn't there last time with only the former, optimistically marking as fixed now:

http://hg.mozilla.org/mozilla-central/rev/8698be604d6d

http://hg.mozilla.org/releases/mozilla-1.9.1/rev/7a9fe7dfac8a

http://bonsai.mozilla.org/cvsquery.cgi?treeid=default&branch=HEAD&sortby=Date&date=explicit&mindate=1248912378&maxdate=1248913378

http://bonsai.mozilla.org/cvsquery.cgi?treeid=default&branch=GECKO190_20090706_RELBRANCH&sortby=Date&date=explicit&mindate=1248901378&maxdate=1248913378

1.8 might not be as simple a fix because it precedes modifications made by me in bug 309296; I haven't made an effort to do the backport yet.

Nelson, regarding the filepcker/src changes in older trees, how do those go?  In retrospect those probably should have been part of the original patch, not as a followup patch.  :-\
Keywords: checkin-needed
Jeff, I'd like to see the stack trace of the crashes reported on IRC.
When I run the test program with those values for str and expr, 
I don't get a crash.  It doesn't go off the end of the string. 
It DOES return -1 instead of 1, which is not strictly correct (though both
values mean a match was not found), but not a vulnerability in itself.  

Did that return value cause the caller to crash?
The stacks are many and varied.  Several crashes in destructors (called from Release() implementations), several crashes while removing entries from hashtables, at least one crash deep in the guts of libc somewhere (called from a destructor).

The one Windows log I looked at has lots of:

firefox-bin(67464,0xa050ffa0) malloc: *** error for object 0xe2d01e0: incorrect checksum for freed object - object was probably modified after being freed.

The one Mac log I looked at has lots of:

firefox-bin(252,0xa000cfc0) malloc: ***  Deallocation of a pointer not malloced: 0x37e0024; This could be a double free(), or free() called with the middle of an allocated block; Try setting environment variable MallocHelp to see tools to help debug

That would be consistent with all the crashes inside destructors...
And all of the above would also be consistent with us just scribbling on random memory, of course...
@Boris:  I'd like to have a test case, a set of inputs to the function that
causes the function to do something catastrophic, such as an out-of-bounds
access (read or write).  The test data in comment 49 simply doesn't crash or 
cause any heap corruption or stack corruption in my tests.  I see no evidence 
that nsWildCard scribbles on random memory.

Is the crash is due to behavior of the caller when the nsWildCard function 
returns -1 (INVALID_SXP) instead of 1 (NO MATCH) ?  

@Jeff, 
In answer to your question in comment 50 about the filepicker/src sources,
the version of the code there has been changed to operate on arrays of 
PRUnichar rather than arrays of char.  It calls functions such as isalpha,
isalnum, toupper and tolower on input values that are 16-bit values.  

IIRC, certain versions of the MSVC runtime throw an exception of those 
functions are called with a value outside of the range 0..255 . 
So, I've been contemplating ways to make that code work for Unicode values.

As a very quick and expedient approach, I could write wrappers for those 
functions that behave as follows when the input is not in range 0..255:
 - toupper, tolower: return input value
 - isalpha, isalnum: return false
 - PL_strcasecmp, behave as strcmp for those values.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Nelson, the callers just compare the return value to MATCH.  So I really doubt that's where the problem lies.

But really, you have all the same information as I do here.  Load <http://tests.themasta.com/tinderboxpushlog/>.  Scroll down to the part that says "ab1686101bfb Nelson Bolyard – Bug 332173 - Problems with regexp parsing of '~' in nsIZipReader.findEntries (and other nsWildCard uses). r=jwalden, a=ss".  See all the orange letters on the right?  Pick a letter.  Click on it.  Select "View Full Log" at the bottom of the page (or copy that url and wget it, or save link as, or whatever you prefer).

Also, you could add some logging to the function in question to log the string and pattern when hitting the error condition Jeff found.  Then see which of those look like they might cause trouble.
For example, breakpointing on one of those malloc errors on mac (which happens when allocating e2; this looks like the checksum error above, which must have come from a Mac build; I got my builds confused) gives expr at that point as:

(gdb) p expr 
$6 = 0x1beb0490 "(ANIFESTETA-INF/(M|m)(ANIFEST|anifest).(MF|mf)$"
Valgrind is your friend:

==6562== Invalid write of size 1
==6562==    at 0x16296: memcpy (mc_replace_strmem.c:482)
==6562==    by 0x574CC2: nsCharTraits<char>::copy(char*, char const*, unsigned long) (nsCharTraits.h:510)
==6562==    by 0x57495E: int _handle_union<char>(char const*, char const*, int, unsigned int) (nsWildCard.cpp:238)
==6562==    by 0x5747AB: int _shexp_match<char>(char const*, char const*, int, unsigned int) (nsWildCard.cpp:352)
==6562==    by 0x574A6F: int ns_WildCardMatch<char>(char const*, char const*, int) (nsWildCard.cpp:386)
==6562==    by 0x574D7F: int NS_WildCardMatch_<char>(char const*, char const*, int) (nsWildCard.cpp:418)
==6562==    by 0x574B92: NS_WildCardMatch(char const*, char const*, int) (nsWildCard.cpp:426)
==6562==    by 0x1A9524FE: nsZipFind::FindNext(char const**) (nsZipArchive.cpp:452)
==6562==    by 0x1A956732: nsJAREnumerator::HasMore(int*) (nsJAR.cpp:912)
==6562==    by 0x1A957729: nsJAR::ParseManifest() (nsJAR.cpp:530)
==6562==    by 0x1A95809E: nsJAR::GetCertificatePrincipal(char const*, nsIPrincipal**) (nsJAR.cpp:376)
==6562==    by 0x1A95E7B6: nsJARChannel::GetOwner(nsISupports**) (nsJARChannel.cpp:506)
==6562==  Address 0x18371291 is 3 bytes after a block of size 46 alloc'd
==6562==    at 0x12516: malloc (vg_replace_malloc.c:194)
==6562==    by 0x15C0EC: PR_Malloc (prmem.c:467)
==6562==    by 0x5B3137: NS_Alloc_P (nsMemoryImpl.cpp:315)
==6562==    by 0x5748C6: int _handle_union<char>(char const*, char const*, int, unsigned int) (nsWildCard.cpp:225)
==6562==    by 0x5747AB: int _shexp_match<char>(char const*, char const*, int, unsigned int) (nsWildCard.cpp:352)
==6562==    by 0x574A6F: int ns_WildCardMatch<char>(char const*, char const*, int) (nsWildCard.cpp:386)
==6562==    by 0x574D7F: int NS_WildCardMatch_<char>(char const*, char const*, int) (nsWildCard.cpp:418)
==6562==    by 0x574B92: NS_WildCardMatch(char const*, char const*, int) (nsWildCard.cpp:426)
==6562==    by 0x1A9524FE: nsZipFind::FindNext(char const**) (nsZipArchive.cpp:452)
==6562==    by 0x1A956732: nsJAREnumerator::HasMore(int*) (nsJAR.cpp:912)
==6562==    by 0x1A957729: nsJAR::ParseManifest() (nsJAR.cpp:530)
==6562==    by 0x1A95809E: nsJAR::GetCertificatePrincipal(char const*, nsIPrincipal**) (nsJAR.cpp:376)

and so on and so on.
Boris, is that supposed to be a call stack? 
Those look like c++ templates.  The _handle_union function on which I worked 
is not a template, and does not call nsCharTraits<char>::copy.
Oh, steps to reproduce are to try starting mochitests.  Just starting the browser works fine, far as I can tell.

> Boris, is that supposed to be a call stack? 

It's two callstacks.  One to the invalid write, one to the allocation of the block we're presumably trying to write to.

> The _handle_union function on which I worked is not a template, and does not
> call nsCharTraits<char>::copy.

The function you worked on got templatized and moved to xpcom/io/nsWildCard.cpp over two months ago; presumably Jeff merged your changes to that.  See bug 487192 for the templatization and http://hg.mozilla.org/mozilla-central/log/d9eb98022817/xpcom/io/nsWildCard.cpp for your change being landed, backed out, relanded.

The logic that landed is identical to the logic in your patch, except for being templatized to work on both char and PRUnichar.
Actually, not quite.

    e2 = (T *) NS_Alloc(1 + nsCharTraits<T>::length(expr));

vs

    e2 = (char *) PR_Malloc(1 + strlen(expr));

The NS_Alloc should presumably have |* sizeof(T)|, no?
Attached patch With * sizeof(T)Splinter Review
Good catch.  Note that this is only a trunk problem, where this code has been templatized to handle both char and PRUnichar -- I had a bit of a time porting the changes and hit a number of cases like this but missed this one.  :-\

Repeat: the problem fixed by this small patch is only pertinent to mozilla-central -- branches are not at all affected in any way, because they only work on char strings.
bz pointed out another I missed, so I went back and checked; the final checkin addresses both NS_Alloc calls and eliminates the problem entirely.

http://hg.mozilla.org/mozilla-central/rev/3ec6e0fa85ec
OK, as I understand it, the "fix" for the filepicker code is no longer 
desired or needed because of the templatization work (that I have not yet 
seen).  So, I think we're done here.
The filepicker code still needs a fix -- just not in mozilla-central, because trunk has unified the two copies present in 191, 190, and so on into the single location.  I haven't looked, but it might almost be worth porting the unification+trunk changes over to branches to address that -- could be easier than the mess that merging this to the unified version was for me.
I don't oppose you back porting the templatized code.
General report on the orange, for future reference.  There were two separate issues:

1)  In the Unicode codepath (which I think is not really exercized much, if at
    all, in our unit tests) we were allocating buffers that were 2x too small.
2)  In both codepaths, the logic error in the loop in _handle_union meant that we
    could overrun the e2 buffer.  Specifically, in the following situation:

      (a|b)xxxxxxxx(a|b)

    we would enter the loop a third time, end up with count being "# of x's
    plus 2", and since in this case |cp| would be 4 the strcpy (or memcpy in
    the m-c version) that copies from expr+cp to e2+count would copy "# of x's
    plus 5" chars.  Thus we'd copy a total of "2*(# of x's) + 7" chars.  Since
    the buffer size we allocated is "# of x's + 11", we'd run into trouble any
    time there are more than 4 x's in the string.  For longer alternations,
    we'd need more padding between them, of course.  The example in comment 49
    has 8 chars between the first two alternations, so hits this condition,
    even though the first alternation is 6 chars, not 5.

So the test program mentioned in comment 51 should probably be fixed if it's to detect the buffer-overrun situation, if it didn't detect it in this case....
I think there remains a vulnerability in the templatized version of this code,
seen at 
http://mxr.mozilla.org/mozilla-central/source/xpcom/io/nsWildCard.cpp#249

When the characters are not "char", but some wider type, then the values 
of "start" and "end" passed to this function can be greater than 255, 
resulting in out-of-bounds memory accesses.  Dunno if it's exploitable beyond
mere DOS.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Er, yes.  Good catch.  That code needs to either guard on the values of everything involved or use a different algorithm.
Depends on: 507485
(In reply to comment #67)
> When the characters are not "char", but some wider type, then the values 
> of "start" and "end" passed to this function can be greater than 255, 
> resulting in out-of-bounds memory accesses.  Dunno if it's exploitable beyond
> mere DOS.

I'm not so sure; note that when called we've verified that isalnum(int(start)) and isalnum(int(end)).  What locale are we?  Whether we're fine here or not depends on that, I think.  I filed bug 507485 to address this concern.
OK, since the issue of comment 67 has been moved to bug 507485, this bug
can be resolved/fixed again.
Severity: normal → critical
Status: REOPENED → RESOLVED
Closed: 11 years ago11 years ago
Priority: -- → P1
Resolution: --- → FIXED
What is the best/simplest way for QA to verify this bug?
I don't know, because I don't know what browser code uses this function,
nor where it gets its patterns from.  
Jeff, can you answer this?
Practically, we can verify that some problems are gone, but for an effective rewrite of this scale it's not really possible to fully verify, unfortunately.


host-5-43:~ jwalden$ ~/moz/bs/obj-i386-apple-darwin8.11.1/dist/bin/xpcshell -i
js> var lf =
Components.classes["@mozilla.org/file/local;1"].createInstance(Components.interfaces.nsILocalFile);
var zr =
Components.classes["@mozilla.org/libjar/zip-reader;1"].createInstance(Components.interfaces.nsIZipReader);
js> js> 
js> lf.initWithPath("/tmp/foo.zip")
js> zr.open(lf);
var en = zr.findEntries("(a|b)(a|~)b");
en.getNext();
js> uncaught exception: [Exception... "Component returned failure code: 0x80070057 (NS_ERROR_ILLEGAL_VALUE) [nsIZipReader.findEntries]"  nsresult: "0x80070057 (NS_ERROR_ILLEGAL_VALUE)"  location: "JS frame :: typein :: <TOP_LEVEL> :: line 8"  data: no]
js> typein:9: TypeError: en is undefined

is sufficient to verify it works in mozilla-central.

host-5-43:~/moz/191/obj-i386-apple-darwin8.11.1/dist/bin jwalden$ ./xpcshell -i
js> var lf =
Components.classes["@mozilla.org/file/local;1"].createInstance(Components.interfaces.nsILocalFile);
var zr =
Components.classes["@mozilla.org/libjar/zip-reader;1"].createInstance(Components.interfaces.nsIZipReader);js> 
js> lf.initWithPath("/tmp/foo.zip")
js> zr.open(lf);
js> var en = zr.findEntries("(a|b)(a|~)b");
WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80070057: file /Users/jwalden/moz/191/modules/libjar/nsJAR.cpp, line 309
uncaught exception: [Exception... "Component returned failure code: 0x80070057 (NS_ERROR_ILLEGAL_VALUE) [nsIZipReader.findEntries]"  nsresult: "0x80070057 (NS_ERROR_ILLEGAL_VALUE)"  location: "JS frame :: typein :: <TOP_LEVEL> :: line 7"  data: no]

verifies it works in 191, although the filepicker copy remains to be fixed, and it's hit with a different code path, of course.

I don't have a 190 tree/build at the moment, so it'll take longer to run the above test there -- doing so now.
Jeff, I want to check and make sure that no-one is waiting on me to write 
any more patches for this bug.  I think the plan of record is to back-port 
the templatized version of this code from m-c to the 191 branch, and that 
you're doing that.  Am I mistaken?
And in 190:

host-5-43:~/moz/190/mozilla jwalden$ obj-i386-apple-darwin8.11.1/dist/bin/xpcshell -i
js> var lf =
Components.classes["@mozilla.org/file/local;1"].createInstance(Components.interfaces.nsILocalFile);
js> var zr =
Components.classes["@mozilla.org/libjar/zip-reader;1"].createInstance(Components.interfaces.nsIZipReader);
js> lf.initWithPath("/tmp/foo.zip")
js> zr.open(lf);
js> var en = zr.findEntries("(a|b)(a|~)b");
uncaught exception: [Exception... "Component returned failure code: 0x80070057 (NS_ERROR_ILLEGAL_VALUE) [nsIZipReader.findEntries]"  nsresult: "0x80070057 (NS_ERROR_ILLEGAL_VALUE)"  location: "JS frame :: typein :: <TOP_LEVEL> :: line 7"  data: no]
Jeff, in your opinion...is it safe to mark this verified1.9.1?
(In reply to comment #74)
> Jeff, I want to check and make sure that no-one is waiting on me to write 
> any more patches for this bug.  I think the plan of record is to back-port 
> the templatized version of this code from m-c to the 191 branch, and that 
> you're doing that.  Am I mistaken?

I looked a little further into backporting the merger, and I'm not sure it's worth the effort to do so.  Rather, I think a better course of action would be to back-copy the templatized version and simply implement the expected PRUnichar-based method as the sole specialization of it.  I'll take care of that, patch anon.
(In reply to comment #76)
> Jeff, in your opinion...is it safe to mark this verified1.9.1?

Well...technically the filepicker code still needs to be changed.  However, since that's the XUL filepicker that's not enabled by default, and further since it only uses a predefined set of filters given here, all of which are simple and don't use any of the constructs we had trouble handling:

http://mxr.mozilla.org/mozilla-central/source/toolkit/locales/en-US/chrome/global/filepicker.properties

I'll call this verified.  I'll dump the filepicker cleanup in a separate bug; this one is outliving its usefulness.
As per the last few comments and speaking with Juan, verified1.9.1.
Keywords: verified1.9.1
Adding verified1.9.0.13 keyword per comment #75.  Thanks for your help verifying this, Jeff.
Lowering the severity. The bug is real, but the regexp isn't taken from content in any way. It's theoretically possible some add-on somewhere was doing so, but we haven't found any.
Whiteboard: [sg:critical] → [sg:low]
Marking verified for 1.9.0.14 on the comments. This is as good as it is going to get.
Blocks: 507758
Group: core-security
Flags: blocking1.9.2? → blocking1.9.2+
Needs to land on mozilla-1.9.2? This blocks our ability to release a beta ...
Keywords: checkin-needed
Whiteboard: [sg:low] → [sg:low][see comment 83]
This landed before we branched, with followup bug 507485
Keywords: checkin-needed
Whiteboard: [sg:low][see comment 83] → [sg:low]
See also bug 514872.  That fix is needed wherever this bug's fix went.
Depends on: 514872
You need to log in before you can comment on or make changes to this bug.