Last Comment Bug 112626 - String.split(regexp) crashes if regexp contains ()'s
: String.split(regexp) crashes if regexp contains ()'s
Status: VERIFIED FIXED
: crash, js1.5
Product: Core
Classification: Components
Component: JavaScript Engine (show other bugs)
: Trunk
: All Windows NT
-- critical (vote)
: mozilla1.0.1
Assigned To: rogerl (gone)
: Phil Schwartau
: Jason Orendorff [:jorendorff]
Mentors:
Depends on:
Blocks: 149801
  Show dependency treegraph
 
Reported: 2001-11-29 07:55 PST by Christian Schneider
Modified: 2006-08-19 09:52 PDT (History)
5 users (show)
bob: in‑testsuite+
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
Testcase from the reporter (159 bytes, text/html)
2001-11-29 10:12 PST, Matthias Versen [:Matti]
no flags Details
Crash case for same (428 bytes, text/html)
2001-12-10 10:07 PST, Chris King
no flags Details
Set RegExp static input (483 bytes, patch)
2002-06-12 13:31 PDT, rogerl (gone)
rginda: review+
brendan: superreview+
jud: approval+
Details | Diff | Splinter Review

Description User image Christian Schneider 2001-11-29 07:55:12 PST
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)
BuildID:    2001112009

The following page causes a segmentation fault. Should be easy to find...

<HTML>
<HEAD>
<SCRIPT language="JavaScript">
	var _cs='2001-01-01';
	var curTime = _cs.split(/([- :])/);
</SCRIPT>
</HEAD>
<BODY>

</BODY>
</HTML>

Reproducible: Always
Steps to Reproduce:
1. call the webpage in description
2. boom
3.

Actual Results:  Segmentation Fault

Expected Results:  Calculate the correct result
Comment 1 User image Boris Zbarsky [:bz] (still a bit busy) 2001-11-29 08:25:03 PST
worksforme, 2001-11-28 linux build.  No crash.
Comment 2 User image Matthias Versen [:Matti] 2001-11-29 08:59:40 PST
confirmend with win2k build 20011129..

I will try to get a stack trace but my debug doesn't crash (updating now)
Comment 3 User image Matthias Versen [:Matti] 2001-11-29 10:12:10 PST
Created attachment 59708 [details]
Testcase from the reporter

it doesn't crash my debug build.
0.9.6 talkback ID: TB38670554Q
Comment 4 User image Phil Schwartau 2001-11-29 16:54:49 PST
Stack trace from Talkback incident TB38670554Q:

Build ID 2001112012
Product ID MozillaBranch
Operating System Win32
Stack Trace

str_split [d:\builds\seamonkey\mozilla\js\src\jsstr.c, line 1846] 
js_Invoke [d:\builds\seamonkey\mozilla\js\src\jsinterp.c, line 834] 
js_Interpret [d:\builds\seamonkey\mozilla\js\src\jsinterp.c, line 2792] 
js_Execute [d:\builds\seamonkey\mozilla\js\src\jsinterp.c, line 1014] 
JS_EvaluateUCScriptForPrincipals [d:\builds\seamonkey\mozilla\js\src\jsapi.c, 
                                                              line 3380] 
nsJSContext::EvaluateString 
[d:\builds\seamonkey\mozilla\dom\src\base\nsJSEnvironment.cpp, line 654] 
nsScriptLoader::EvaluateScript 
[d:\builds\seamonkey\mozilla\content\base\src\nsScriptLoader.cpp, line 576] 
nsScriptLoader::ProcessRequest 
[d:\builds\seamonkey\mozilla\content\base\src\nsScriptLoader.cpp, line 484] 
nsScriptLoader::ProcessScriptElement 
[d:\builds\seamonkey\mozilla\content\base\src\nsScriptLoader.cpp, line 428] 
nsHTMLScriptElement::SetDocument 
[d:\builds\seamonkey\mozilla\content\html\content\src\nsHTMLScriptElement.cpp, 
                                                                  line 159] 
nsGenericContainerElement::AppendChildTo 
[d:\builds\seamonkey\mozilla\content\base\src\nsGenericElement.cpp, line 3724] 
HTMLContentSink::ProcessSCRIPTTag 
[d:\builds\seamonkey\mozilla\content\html\document\src\nsHTMLContentSink.cpp, 
                                                                    line 5126] 
HTMLContentSink::AddLeaf 
[d:\builds\seamonkey\mozilla\content\html\document\src\nsHTMLContentSink.cpp, 
                                                                    line 3497] 
CNavDTD::AddLeaf [d:\builds\seamonkey\mozilla\htmlparser\src\CNavDTD.cpp, line 
                                                                         3774] 
CNavDTD::AddHeadLeaf [d:\builds\seamonkey\mozilla\htmlparser\src\CNavDTD.cpp, 
                                                                    line 3833] 
CNavDTD::HandleStartToken 
[d:\builds\seamonkey\mozilla\htmlparser\src\CNavDTD.cpp, line 1719] 
CNavDTD::HandleToken [d:\builds\seamonkey\mozilla\htmlparser\src\CNavDTD.cpp, 
                                                                     line 895] 
CNavDTD::BuildModel [d:\builds\seamonkey\mozilla\htmlparser\src\CNavDTD.cpp, 
                                                                     line 526] 
nsParser::BuildModel [d:\builds\seamonkey\mozilla\htmlparser\src\nsParser.cpp, 
                                                                    line 1983] 
nsParser::ResumeParse [d:\builds\seamonkey\mozilla\htmlparser\src\nsParser.cpp, 
                                                                    line 1847] 
nsParser::OnDataAvailable 
[d:\builds\seamonkey\mozilla\htmlparser\src\nsParser.cpp, line 2505] 
nsDocumentOpenInfo::OnDataAvailable 
[d:\builds\seamonkey\mozilla\uriloader\base\nsURILoader.cpp, line 260] 
nsFileChannel::OnDataAvailable 
[d:\builds\seamonkey\mozilla\netwerk\protocol\file\src\nsFileChannel.cpp, line 
                                                                          511] 
nsOnDataAvailableEvent::HandleEvent 
[d:\builds\seamonkey\mozilla\netwerk\base\src\nsStreamListenerProxy.cpp, line 
                                                                          203] 
PL_HandleEvent [d:\builds\seamonkey\mozilla\xpcom\threads\plevent.c, line 591] 
PL_ProcessPendingEvents [d:\builds\seamonkey\mozilla\xpcom\threads\plevent.c, 
                                                                     line 524] 
_md_EventReceiverProc [d:\builds\seamonkey\mozilla\xpcom\threads\plevent.c, line 
                                                                         1072] 
nsAppShellService::Run 
[d:\builds\seamonkey\mozilla\xpfe\appshell\src\nsAppShellService.cpp, line 303] 
main1 [d:\builds\seamonkey\mozilla\xpfe\bootstrap\nsAppRunner.cpp, line 1316] 
main [d:\builds\seamonkey\mozilla\xpfe\bootstrap\nsAppRunner.cpp, line 1633] 
WinMain [d:\builds\seamonkey\mozilla\xpfe\bootstrap\nsAppRunner.cpp, line 1651] 
WinMainCRTStartup() 
KERNEL32.DLL + 0x17d08 (0x77e87d08) 
Comment 5 User image Phil Schwartau 2001-11-29 18:00:33 PST
An even more-reduced testcase. This crashes current Mozilla trunk binaries:

                         '1,2'.split(/(,)/);
                          

The crash has something to do with the back-reference parentheses.
If I take those out, there is no crash.

As Matti said above, I cannot get any crash with my Mozilla debug
builds on WinNT, Linux. I also cannot crash in the debug or optimized
JS shell. I only crash with Mozilla binary builds.

Reassigning to Kenton; cc'ing Roger for his RegExp expertise -
Comment 6 User image Phil Schwartau 2001-11-29 18:04:41 PST
Clarification: using Mozilla binaries 2001-11-26-xx on WinNT, Linux, Mac9.1
It's only my WinNT binary that crashes on this; Linux and Mac9.1 load fine.
Comment 7 User image Phil Schwartau 2001-11-29 18:12:35 PST
To try to provoke the crash just key this in the URL bar:

                javascript: '1,2'.split(/(,)/);

My debug WinNT build prints out 1,,,2 
My WinNT binary crashes.


For prettier output, try 

                javascript: ('1,2'.split(/(,)/)).toSource();

My debug WinNT build prints out ["1", ",", "2"]
My WinNT binary crashes.
Comment 8 User image Phil Schwartau 2001-12-04 12:58:56 PST
Compare bug 104375:  "string.replace() placeholder '$1' not working"
Comment 9 User image Chris King 2001-12-10 10:07:21 PST
Created attachment 61103 [details]
Crash case for same

Test case for crash on win32 for string.split()
Comment 10 User image Chris King 2001-12-10 10:09:53 PST
Attached another test case, note on this test case that it is trying to split 
up the list by the semi-colons (;).  If you do NOT put a semi-colon in the 
textarea, it won't crash.  If you do, however, it breaks.  Tested on the 
Nightly Win32 build 2001121003.
Comment 11 User image Phil Schwartau 2001-12-10 14:29:09 PST
The heart of Chris' testcase is:

              function check_email(email)
              {
                var email_ar = new Array();
                email_ar = email.split(/(;)/);
              }


Am seeing all the same features as above: 

1. I do not crash on any platform besides Windows
2. I do not crash with a Windows debug build, only nighly binaries
3. I do not crash in the debug or optimized JS shell
4. I do not crash if there is no backreference in the regexp: 

                email.split(/(;)/)  <--- crashes 
                email.split(/;/)    <--- is fine
Comment 12 User image Kenton Hanson (gone) 2001-12-12 14:54:59 PST
It seems like a MS compiler bug.  I am moving to 0.9.8.
Comment 13 User image Kenton Hanson (gone) 2002-01-16 17:22:31 PST
Targeting 9.9.  I need help!
Comment 14 User image Brendan Eich [:brendan] 2002-02-07 09:49:49 PST
Cc'ing potential windows compiler bug buddies.

/be
Comment 15 User image timeless 2002-02-08 06:57:02 PST
hrm, can someone add a printf for sep before http://lxr.mozilla.org/mozilla/source/js/src/jsstr.c#1846
[or try sep && sep-> at that line]

sep = &tmp and sep and tmp have the same scope, so it really should never be null, but perhaps something magical is happening

my cable provider was @home so i'm w/o net access for my win box, otherwise i'd love to play :-)
Comment 16 User image rpotts (gone) 2002-02-08 10:39:03 PST
in the regexp case, i noticed that we don't initialize the lenght of the empty
tmp JSSubString...

1780 #if JS_HAS_REGEXPS
1781 if (JSVAL_IS_REGEXP(cx, argv[0])) {
1782 reobj = JSVAL_TO_OBJECT(argv[0]);
1783 re = (JSRegExp *) JS_GetPrivate(cx, reobj);
1784 sep = &tmp;
1785 1786 /* Set a magic value so we can detect a successful re match. */
1787 sep->chars = NULL;
 === AT THIS POINT tmp.length IS GARBAGE ===
1788 } else
1789 #endif

could we be ending up in a routine such as find_split(...) where the length is
important ?

it might be worthwhile to clear the length along with the buffer, ie:
    sep->length = 0;
    sep->chars  = NULL;

just a wild guess ;-)  unfortunately i can't try it cause i don't have any
release builds handy :-(

-- rick
Comment 17 User image Brendan Eich [:brendan] 2002-02-08 23:22:07 PST
If my bleary eyes don't deceive me, find_split still does not depend on
sep->length being valid in the re != NULL case.  But it's gnarly, and setting
tmp.length = 0 would remove all doubt.  I'm betting, though (or my bleary eyes
are betting) that it won't help.

/be
Comment 18 User image rpotts (gone) 2002-02-11 10:18:54 PST
hey brendan,

i think you're right... i ran the crash under purify and didn't see anything bad 
:-(  

but i always tend to look for using uninitialized locals (etc) since they tend 
to show up as optimized-only bugs...  but maybe it is codegen :-(

-- rick
Comment 19 User image rogerl (gone) 2002-02-11 11:38:41 PST
I'm looking into a crash that's occuring in a re-built regexp engine and I don't
see where the input string for the regexp statics is getting set - leaving all
of the regexp static substrings in danger of pointing into collected space.
Could something like this could be happening here?
Comment 20 User image Brendan Eich [:brendan] 2002-02-11 12:18:32 PST
rogerl: how is RegExp.input related to this bug?  I don't see any use of it in
the script, and it's not used internally (only regexp_exec_sub uses it if called
with no arguments).  It's also a GC root, so it shouldn't dangle.

/be
Comment 21 User image rogerl (gone) 2002-02-11 13:06:03 PST
The regexp static paren substrings point into the input string. If that string
is no longer in use by the user code then the regexp static input string is the
only remaining reference. But in the case I'm seeing, the input field isn't
getting set and so the paren substrings are bogus (because a gc has collected
the original). The loop in 'str_split' is using those paren substrings. It's
only a guess - I have a completely different test case that's tripping up here
and it may yet be something stupid I've done (there is precedence).
Comment 22 User image Kenton Hanson (gone) 2002-02-11 13:43:17 PST
In file jsregexp.c  the following consturct is found in line 2235

  for (num = 0; num < state.parenCount; num++) {
.
.
.
.
  }

  res->parenCount = num;  // line 2235

It is not clear what num will be outside the loop
Comment 23 User image Kenton Hanson (gone) 2002-02-12 14:07:31 PST
Ignore the last comment.  Code behaves as author intended.
Comment 24 User image Steve Adamski (gone) 2002-02-20 10:01:40 PST
Moved to Mozilla 1.2 as a temporary holding place, bugs remaining on 0.9.9 are
being scrutinized, now that the milestone has passed.

If this must be fixed for MachV, then it needs to be assigned the "nsbeta1"
keyword.
Comment 25 User image Brendan Eich [:brendan] 2002-02-20 14:32:05 PST
I think this should be fixed for 1.0, but I'm sorry I can't help diagnose it
right now.  Adding keyword.

/be
Comment 26 User image Steve Adamski (gone) 2002-02-28 17:54:22 PST
nominated with the `nsbeta1' keyword, on the grounds that this is a crasher.
Comment 27 User image Kenton Hanson (gone) 2002-04-01 12:29:55 PST
Will not make 1.0.  Important bug but not going to make 1.0
Comment 28 User image Kenton Hanson (gone) 2002-05-29 12:06:39 PDT
David Bradley was unable to reproduce this bug as of May 24, 2002.  I need to
isolate when the errant behavior disappeared, and examine what changes occurred.
Comment 29 User image Phil Schwartau 2002-06-10 13:35:03 PDT
Resolving as WORKSFORME. 

Recall how to reprocuce this bug from Comment #11. I am no longer
able to reproduce the crash. To be more precise: I have a series 
of Mozilla WinNT trunk binaries. Every one of them up to and 
including my 2002-04-22 binary crashes on the testcase. But
my next one, 2002-04-28,  does not crash, nor does any after that.

So something happened between 2002-04-22 and 2002-04-28, but 
a check in bonsai shows it had nothing to do with jsregexp.c.
Comment 30 User image Phil Schwartau 2002-06-10 13:41:36 PDT
Based on the report in Comment #28, marking Verified.

c.schneider@scram.de: if this problem still occurs for you 
with an up-to-date build, please reopen this bug; thanks -
Comment 31 User image Brendan Eich [:brendan] 2002-06-12 10:46:23 PDT
I bet this is a "dup" of bug 140070, based on Phil's build dates and a reading
of bonsai for mozilla/js/src in that period.

/be
Comment 32 User image rogerl (gone) 2002-06-12 13:30:10 PDT
I hate to be a contrarian, but I think my earlier wild guess might have been
substantive. Try this in the shell:

/b(a)/("cba")
RegExp.$1            // should be "a"
gc()
RegExp.$1            // should be "a", but often isn't.

The RegExp.input string is getting collected leaving the paren substring dangling.
I can't prove that's what happened in the original case, but it seems easiest to
re-open this than start another.
Comment 33 User image rogerl (gone) 2002-06-12 13:31:00 PDT
Created attachment 87414 [details] [diff] [review]
Set RegExp static input
Comment 34 User image Brendan Eich [:brendan] 2002-06-12 14:18:30 PDT
Comment on attachment 87414 [details] [diff] [review]
Set RegExp static input 

Gahhh.	How long has this hole been there?  sr=brendan@mozilla.org, but I hope
we don't extend the life of long strings arbitrarily, where we used to get by
with the weak refs from the parens arrays in cx->regExpStatics.

/be
Comment 35 User image Brendan Eich [:brendan] 2002-06-12 14:20:16 PDT
Let's get this into the 1.1beta trunk and go for drivers approval for the 1.0
branch and js1.5 final.  Phil, can you link this and any other 1.5 bugs not yet
tracked up to the 1.5 metabug?  Thanks,

/be
Comment 36 User image Phil Schwartau 2002-06-12 18:36:36 PDT
Have linked this bug to the JS1.5 meta bug 149801, along with other
bugs in the JS Engine component that have the "js1.5" keyword -
Comment 37 User image Phil Schwartau 2002-06-12 18:38:14 PDT
Just need r= to get this patch in 
Comment 38 User image Robert Ginda 2002-06-18 16:03:57 PDT
Comment on attachment 87414 [details] [diff] [review]
Set RegExp static input 

r=rginda
Comment 39 User image rogerl (gone) 2002-06-21 11:49:52 PDT
Fix checked in to trunk.
Comment 40 User image timeless 2002-06-28 10:43:58 PDT
i sent drivers an approval request for the 1_0 branch,
in it i noted that Comment 32 indicated a correctness issue which should 
probably have an entry in the js regression test suite.

per mozilla seamonkey practices, this bug should be resolved fixed, since it is 
fixed on trunk. however, I'm not sure how that works when there's another 
branch (?) [js1.5] that cares about a fix -- I do need to know, since I have a 
nubus bug fix that I'd like in js1.5.
Comment 41 User image Phil Schwartau 2002-06-28 10:59:05 PDT
timeless: thanks. In JS Engine the custom is to leave a bug in the
Assigned state until it is checked into both trunk and branch.
Only then do we mark it Fixed. For example, this is the pattern that
Brendan always follows -
Comment 42 User image Judson Valeski 2002-06-28 12:04:52 PDT
please checkin to the 1.0.1 branch. once there, remove the "mozilla1.0.1+"
keyword and add the "fixed1.0.1" keyword.
Comment 43 User image rogerl (gone) 2002-06-28 13:34:33 PDT
Fix checked in to 1.0.1 branch.
Comment 44 User image Bob Clary [:bc:] 2005-09-18 17:08:19 PDT
Checking in regress-112626.js;
/cvsroot/mozilla/js/tests/js1_5/String/regress-112626.js,v  <--  regress-112626.js
initial revision: 1.1
Comment 45 User image Bob Clary [:bc:] 2006-08-19 09:52:57 PDT
verified fixed 1.9 20060818

Note You need to log in before you can comment on or make changes to this bug.