Closed Bug 587525 Opened 14 years ago Closed 14 years ago

Split with regex including an alternative | incorrectly returns splitting tokens

Categories

(Core :: JavaScript Engine, defect)

x86_64
macOS
defect
Not set
normal

RESOLVED INVALID

People

(Reporter: bodi.giyomu, Unassigned)

Details

User-Agent:       Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_4; en-US) AppleWebKit/533.4 (KHTML, like Gecko) Chrome/5.0.375.126 Safari/533.4
Build Identifier: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2.8) Gecko/20100722 Firefox/3.6.8

Splitting a string on a regular expression that includes an alternation (e.g. /a|b/) returns the separator tokens as part of the result array.

Reproducible: Always

Steps to Reproduce:
1. Create a string, e.g. var string = "'test' and 'test2' or 'test3'";
2. Split it with a regex involving an alternative, e.g. string.split(/and|or/);
3. Check the returned array.

Actual Results:
 ["'test' ", "and", " 'test2' ", "or", " 'test3'"]

Expected Results:
 ["'test' ", " 'test2' ", " 'test3'"]

I can't repro this with my 3.6.8 build [Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.8) Gecko/20100723 Ubuntu/10.04 (lucid) Firefox/3.6.8] using URL:

javascript:var string = "'test' and 'test2' or 'test3'"; var result = string.split(/and|or/); document.write(result.toSource())

Can somebody with an OS X machine handy give it a shot?

Sorry, I copied the regex incorrectly: it should be /(and|or)/, not /and|or/.

Try this URL instead:
javascript:var string = "'test' and 'test2' or 'test3'"; var result = string.split(/(and|or)/); document.write(result.toSource())

(In reply to comment #2)
> Sorry, I copied the regex incorrectly: it should be /(and|or)/, not /and|or/.

We do this correctly per ECMAScript 5, section 15.5.4.14 (String.prototype.split):

"""
If separator is a regular expression that contains capturing parentheses, then each time separator is matched the results (including any undefined results) of the capturing parentheses are spliced into the output array.

For example,

"A<B>bold</B>and<CODE>coded</CODE>".split(/<(\/)?([^<>]+)>/)

evaluates to the array

["A", undefined, "B", "bold", "/", "B", "and", undefined,
"CODE", "coded", "/", "CODE", ""]
"""
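
Applied to the string from this report, the rule quoted above can be sketched as follows. This is only an illustration: the variable names are made up for the example, and the non-capturing group (?:...) is a suggested way to get the expected result, not something proposed earlier in this bug.

```javascript
// Hypothetical illustration using the string from the report.
var input = "'test' and 'test2' or 'test3'";

// Capturing parentheses: each matched separator is spliced into the output.
var withCapture = input.split(/(and|or)/);

// Non-capturing group: grouping without capturing, so separators are dropped.
var withoutCapture = input.split(/(?:and|or)/);

// No parentheses at all: same result as the non-capturing form.
var noGroup = input.split(/and|or/);

console.log(JSON.stringify(withCapture));
// ["'test' ","and"," 'test2' ","or"," 'test3'"]
console.log(JSON.stringify(withoutCapture));
// ["'test' "," 'test2' "," 'test3'"]
console.log(JSON.stringify(noGroup));
// ["'test' "," 'test2' "," 'test3'"]
```

So if grouping is needed but the separators should not appear in the result, a non-capturing group should give the array listed under Expected Results.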

Thanks for reporting, though!
Status: UNCONFIRMED → RESOLVED
Closed: 14 years ago
Resolution: --- → INVALID