Closed Bug 112626 Opened 23 years ago Closed 23 years ago

String.split(regexp) crashes if regexp contains ()'s

Categories

(Core :: JavaScript Engine, defect)

All
Windows NT
defect
Not set
critical

Tracking

()

VERIFIED FIXED
mozilla1.0.1

People

(Reporter: c.schneider, Assigned: rogerl)

References

Details

(Keywords: crash, js1.5)

Attachments

(3 files)

From Bugzilla Helper: User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0) BuildID: 2001112009 The following page causes a segmentation fault. Should be easy to find... <HTML> <HEAD> <SCRIPT language="JavaScript"> var _cs='2001-01-01'; var curTime = _cs.split(/([- :])/); </SCRIPT> </HEAD> <BODY> </BODY> </HTML> Reproducible: Always Steps to Reproduce: 1. call the webpage in description 2. boom 3. Actual Results: Segmentation Fault Expected Results: Calculate the correct result
worksforme, 2001-11-28 linux build. No crash.
confirmend with win2k build 20011129.. I will try to get a stack trace but my debug doesn't crash (updating now)
Status: UNCONFIRMED → NEW
Ever confirmed: true
Keywords: crash
it doesn't crash my debug build. 0.9.6 talkback ID: TB38670554Q
Stack trace from Talkback incident TB38670554Q: Build ID 2001112012 Product ID MozillaBranch Operating System Win32 Stack Trace str_split [d:\builds\seamonkey\mozilla\js\src\jsstr.c, line 1846] js_Invoke [d:\builds\seamonkey\mozilla\js\src\jsinterp.c, line 834] js_Interpret [d:\builds\seamonkey\mozilla\js\src\jsinterp.c, line 2792] js_Execute [d:\builds\seamonkey\mozilla\js\src\jsinterp.c, line 1014] JS_EvaluateUCScriptForPrincipals [d:\builds\seamonkey\mozilla\js\src\jsapi.c, line 3380] nsJSContext::EvaluateString [d:\builds\seamonkey\mozilla\dom\src\base\nsJSEnvironment.cpp, line 654] nsScriptLoader::EvaluateScript [d:\builds\seamonkey\mozilla\content\base\src\nsScriptLoader.cpp, line 576] nsScriptLoader::ProcessRequest [d:\builds\seamonkey\mozilla\content\base\src\nsScriptLoader.cpp, line 484] nsScriptLoader::ProcessScriptElement [d:\builds\seamonkey\mozilla\content\base\src\nsScriptLoader.cpp, line 428] nsHTMLScriptElement::SetDocument [d:\builds\seamonkey\mozilla\content\html\content\src\nsHTMLScriptElement.cpp, line 159] nsGenericContainerElement::AppendChildTo [d:\builds\seamonkey\mozilla\content\base\src\nsGenericElement.cpp, line 3724] HTMLContentSink::ProcessSCRIPTTag [d:\builds\seamonkey\mozilla\content\html\document\src\nsHTMLContentSink.cpp, line 5126] HTMLContentSink::AddLeaf [d:\builds\seamonkey\mozilla\content\html\document\src\nsHTMLContentSink.cpp, line 3497] CNavDTD::AddLeaf [d:\builds\seamonkey\mozilla\htmlparser\src\CNavDTD.cpp, line 3774] CNavDTD::AddHeadLeaf [d:\builds\seamonkey\mozilla\htmlparser\src\CNavDTD.cpp, line 3833] CNavDTD::HandleStartToken [d:\builds\seamonkey\mozilla\htmlparser\src\CNavDTD.cpp, line 1719] CNavDTD::HandleToken [d:\builds\seamonkey\mozilla\htmlparser\src\CNavDTD.cpp, line 895] CNavDTD::BuildModel [d:\builds\seamonkey\mozilla\htmlparser\src\CNavDTD.cpp, line 526] nsParser::BuildModel [d:\builds\seamonkey\mozilla\htmlparser\src\nsParser.cpp, line 1983] nsParser::ResumeParse [d:\builds\seamonkey\mozilla\htmlparser\src\nsParser.cpp, line 1847] nsParser::OnDataAvailable [d:\builds\seamonkey\mozilla\htmlparser\src\nsParser.cpp, line 2505] nsDocumentOpenInfo::OnDataAvailable [d:\builds\seamonkey\mozilla\uriloader\base\nsURILoader.cpp, line 260] nsFileChannel::OnDataAvailable [d:\builds\seamonkey\mozilla\netwerk\protocol\file\src\nsFileChannel.cpp, line 511] nsOnDataAvailableEvent::HandleEvent [d:\builds\seamonkey\mozilla\netwerk\base\src\nsStreamListenerProxy.cpp, line 203] PL_HandleEvent [d:\builds\seamonkey\mozilla\xpcom\threads\plevent.c, line 591] PL_ProcessPendingEvents [d:\builds\seamonkey\mozilla\xpcom\threads\plevent.c, line 524] _md_EventReceiverProc [d:\builds\seamonkey\mozilla\xpcom\threads\plevent.c, line 1072] nsAppShellService::Run [d:\builds\seamonkey\mozilla\xpfe\appshell\src\nsAppShellService.cpp, line 303] main1 [d:\builds\seamonkey\mozilla\xpfe\bootstrap\nsAppRunner.cpp, line 1316] main [d:\builds\seamonkey\mozilla\xpfe\bootstrap\nsAppRunner.cpp, line 1633] WinMain [d:\builds\seamonkey\mozilla\xpfe\bootstrap\nsAppRunner.cpp, line 1651] WinMainCRTStartup() KERNEL32.DLL + 0x17d08 (0x77e87d08)
An even more-reduced testcase. This crashes current Mozilla trunk binaries: '1,2'.split(/(,)/); The crash has something to do with the back-reference parentheses. If I take those out, there is no crash. As Matti said above, I cannot get any crash with my Mozilla debug builds on WinNT, Linux. I also cannot crash in the debug or optimized JS shell. I only crash with Mozilla binary builds. Reassigning to Kenton; cc'ing Roger for his RegExp expertise -
Assignee: rogerl → khanson
Summary: split function crashes mozilla → String.split(regexp) crashes if regexp contains ()'s
Clarification: using Mozilla binaries 2001-11-26-xx on WinNT, Linux, Mac9.1 It's only my WinNT binary that crashes on this; Linux and Mac9.1 load fine.
OS: All → Windows NT
To try to provoke the crash just key this in the URL bar: javascript: '1,2'.split(/(,)/); My debug WinNT build prints out 1,,,2 My WinNT binary crashes. For prettier output, try javascript: ('1,2'.split(/(,)/)).toSource(); My debug WinNT build prints out ["1", ",", "2"] My WinNT binary crashes.
Compare bug 104375: "string.replace() placeholder '$1' not working"
Attached file Crash case for same
Test case for crash on win32 for string.split()
Attached another test case, note on this test case that it is trying to split up the list by the semi-colons (;). If you do NOT put a semi-colon in the textarea, it won't crash. If you do, however, it breaks. Tested on the Nightly Win32 build 2001121003.
The heart of Chris' testcase is: function check_email(email) { var email_ar = new Array(); email_ar = email.split(/(;)/); } Am seeing all the same features as above: 1. I do not crash on any platform besides Windows 2. I do not crash with a Windows debug build, only nighly binaries 3. I do not crash in the debug or optimized JS shell 4. I do not crash if there is no backreference in the regexp: email.split(/(;)/) <--- crashes email.split(/;/) <--- is fine
It seems like a MS compiler bug. I am moving to 0.9.8.
Target Milestone: --- → mozilla0.9.8
Targeting 9.9. I need help!
Target Milestone: mozilla0.9.8 → mozilla0.9.9
Cc'ing potential windows compiler bug buddies. /be
hrm, can someone add a printf for sep before http://lxr.mozilla.org/mozilla/source/js/src/jsstr.c#1846 [or try sep && sep-> at that line] sep = &tmp and sep and tmp have the same scope, so it really should never be null, but perhaps something magical is happening my cable provider was @home so i'm w/o net access for my win box, otherwise i'd love to play :-)
in the regexp case, i noticed that we don't initialize the lenght of the empty tmp JSSubString... 1780 #if JS_HAS_REGEXPS 1781 if (JSVAL_IS_REGEXP(cx, argv[0])) { 1782 reobj = JSVAL_TO_OBJECT(argv[0]); 1783 re = (JSRegExp *) JS_GetPrivate(cx, reobj); 1784 sep = &tmp; 1785 1786 /* Set a magic value so we can detect a successful re match. */ 1787 sep->chars = NULL; === AT THIS POINT tmp.length IS GARBAGE === 1788 } else 1789 #endif could we be ending up in a routine such as find_split(...) where the length is important ? it might be worthwhile to clear the length along with the buffer, ie: sep->length = 0; sep->chars = NULL; just a wild guess ;-) unfortunately i can't try it cause i don't have any release builds handy :-( -- rick
If my bleary eyes don't deceive me, find_split still does not depend on sep->length being valid in the re != NULL case. But it's gnarly, and setting tmp.length = 0 would remove all doubt. I'm betting, though (or my bleary eyes are betting) that it won't help. /be
hey brendan, i think you're right... i ran the crash under purify and didn't see anything bad :-( but i always tend to look for using uninitialized locals (etc) since they tend to show up as optimized-only bugs... but maybe it is codegen :-( -- rick
I'm looking into a crash that's occuring in a re-built regexp engine and I don't see where the input string for the regexp statics is getting set - leaving all of the regexp static substrings in danger of pointing into collected space. Could something like this could be happening here?
rogerl: how is RegExp.input related to this bug? I don't see any use of it in the script, and it's not used internally (only regexp_exec_sub uses it if called with no arguments). It's also a GC root, so it shouldn't dangle. /be
The regexp static paren substrings point into the input string. If that string is no longer in use by the user code then the regexp static input string is the only remaining reference. But in the case I'm seeing, the input field isn't getting set and so the paren substrings are bogus (because a gc has collected the original). The loop in 'str_split' is using those paren substrings. It's only a guess - I have a completely different test case that's tripping up here and it may yet be something stupid I've done (there is precedence).
In file jsregexp.c the following consturct is found in line 2235 for (num = 0; num < state.parenCount; num++) { . . . . } res->parenCount = num; // line 2235 It is not clear what num will be outside the loop
Ignore the last comment. Code behaves as author intended.
Moved to Mozilla 1.2 as a temporary holding place, bugs remaining on 0.9.9 are being scrutinized, now that the milestone has passed. If this must be fixed for MachV, then it needs to be assigned the "nsbeta1" keyword.
Target Milestone: mozilla0.9.9 → mozilla1.2
I think this should be fixed for 1.0, but I'm sorry I can't help diagnose it right now. Adding keyword. /be
Keywords: js1.5, mozilla1.0
nominated with the `nsbeta1' keyword, on the grounds that this is a crasher.
Keywords: nsbeta1
Will not make 1.0. Important bug but not going to make 1.0
David Bradley was unable to reproduce this bug as of May 24, 2002. I need to isolate when the errant behavior disappeared, and examine what changes occurred.
Resolving as WORKSFORME. Recall how to reprocuce this bug from Comment #11. I am no longer able to reproduce the crash. To be more precise: I have a series of Mozilla WinNT trunk binaries. Every one of them up to and including my 2002-04-22 binary crashes on the testcase. But my next one, 2002-04-28, does not crash, nor does any after that. So something happened between 2002-04-22 and 2002-04-28, but a check in bonsai shows it had nothing to do with jsregexp.c.
Status: NEW → RESOLVED
Closed: 23 years ago
Resolution: --- → WORKSFORME
Based on the report in Comment #28, marking Verified. c.schneider@scram.de: if this problem still occurs for you with an up-to-date build, please reopen this bug; thanks -
Status: RESOLVED → VERIFIED
I bet this is a "dup" of bug 140070, based on Phil's build dates and a reading of bonsai for mozilla/js/src in that period. /be
I hate to be a contrarian, but I think my earlier wild guess might have been substantive. Try this in the shell: /b(a)/("cba") RegExp.$1 // should be "a" gc() RegExp.$1 // should be "a", but often isn't. The RegExp.input string is getting collected leaving the paren substring dangling. I can't prove that's what happened in the original case, but it seems easiest to re-open this than start another.
Status: VERIFIED → REOPENED
Resolution: WORKSFORME → ---
Comment on attachment 87414 [details] [diff] [review] Set RegExp static input Gahhh. How long has this hole been there? sr=brendan@mozilla.org, but I hope we don't extend the life of long strings arbitrarily, where we used to get by with the weak refs from the parens arrays in cx->regExpStatics. /be
Attachment #87414 - Flags: superreview+
Let's get this into the 1.1beta trunk and go for drivers approval for the 1.0 branch and js1.5 final. Phil, can you link this and any other 1.5 bugs not yet tracked up to the 1.5 metabug? Thanks, /be
Assignee: khanson → rogerl
Status: REOPENED → NEW
Keywords: mozilla1.0.1
Target Milestone: mozilla1.2alpha → mozilla1.0.1
Blocks: 149801
Have linked this bug to the JS1.5 meta bug 149801, along with other bugs in the JS Engine component that have the "js1.5" keyword -
Just need r= to get this patch in
Comment on attachment 87414 [details] [diff] [review] Set RegExp static input r=rginda
Attachment #87414 - Flags: review+
Fix checked in to trunk.
Status: NEW → ASSIGNED
i sent drivers an approval request for the 1_0 branch, in it i noted that Comment 32 indicated a correctness issue which should probably have an entry in the js regression test suite. per mozilla seamonkey practices, this bug should be resolved fixed, since it is fixed on trunk. however, I'm not sure how that works when there's another branch (?) [js1.5] that cares about a fix -- I do need to know, since I have a nubus bug fix that I'd like in js1.5.
timeless: thanks. In JS Engine the custom is to leave a bug in the Assigned state until it is checked into both trunk and branch. Only then do we mark it Fixed. For example, this is the pattern that Brendan always follows -
Attachment #87414 - Flags: approval+
please checkin to the 1.0.1 branch. once there, remove the "mozilla1.0.1+" keyword and add the "fixed1.0.1" keyword.
Fix checked in to 1.0.1 branch.
Status: ASSIGNED → RESOLVED
Closed: 23 years ago23 years ago
Resolution: --- → FIXED
Keywords: adt1.0.1
Flags: testcase?
Checking in regress-112626.js; /cvsroot/mozilla/js/tests/js1_5/String/regress-112626.js,v <-- regress-112626.js initial revision: 1.1
Flags: testcase? → testcase+
verified fixed 1.9 20060818
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: