Closed
Bug 272395
Opened 20 years ago
Closed 20 years ago
JavaScript regex incorrect handling of unescaped literal ] in character class : [^]] or []]
Categories
(Core :: JavaScript Engine, defect)
VERIFIED
INVALID
People
(Reporter: bugzilla, Unassigned)
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0
The ECMAScript specification (I think), or at least most regex implementations,
allows an unescaped literal ] in a character class if it directly follows the
opening [ or the negating caret [^ , as an alternative to escaping it as \]
Minimum example: [^]] and [^\]] should both match any character except a ], but
only the latter works. **In the former, a regex test fails to match any other
characters**. Additionally, something like [^]]+ actually matches the literal
string ]]
The regex behaviour is as expected if the literal ] is escaped.
Reproducible: Always
Steps to Reproduce:
1. Go to http://www.regular-expressions.info/javascriptexample.html
2. Enter [^]] as the regexp and any non-] character as the subject string
3. Click the Test Match button
Actual Results:
No match
Expected Results:
Successful match
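The same behaviour can be checked without the test page. The following is a minimal sketch for a JavaScript console, assuming an engine that accepts a lone ] outside a class as a literal (as browser engines do); the variable names are illustrative only:

```javascript
// The two patterns from the report.
const unescaped = /[^]]/;  // parsed as the class [^] followed by a literal ]
const escaped = /[^\]]/;   // a class matching any character except ]

const results = {
  // A subject string with no ] in it, as in step 2.
  unescapedOnPlain: unescaped.test("a"),
  escapedOnPlain: escaped.test("a"),
  // The second observation: [^]]+ against the literal string ]]
  unescapedOnBrackets: /^[^]]+$/.test("]]"),
  escapedOnBrackets: /^[^\]]+$/.test("]]"),
};

console.log(results);
```

With the unescaped form the plain subject does not match and the string ]] does; with the escaped form the results are reversed.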
Comment 1•20 years ago
Cite ECMA-262 Edition 3 before filing INVALID bugs. The spec clearly prohibits
] in a character class without a backslash escaping it:
15.10.2.18 ClassAtomNoDash
The production ClassAtomNoDash :: SourceCharacter but not one of \ ] - evaluates
by returning a one-element CharSet containing the character represented by
SourceCharacter.
The production ClassAtomNoDash :: \ ClassEscape evaluates by evaluating
ClassEscape to obtain a CharSet and returning that CharSet.
/be
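As an illustration of the grammar quoted above: because ] can never be a ClassAtom unless escaped, the first ] always closes the class, so [] is an empty class and [^] is its complement. A minimal sketch, assuming an engine that treats the trailing ] outside the class as a literal character (the object and property names are illustrative only):

```javascript
// With no ClassAtoms before the closing ], the class is empty.
const emptyClass = /[]/;     // matches no character at all
const negatedEmpty = /[^]/;  // complement of the empty set: any character

// So []] is "empty class, then a literal ]" and can never match,
// while the escaped form puts ] inside the class.
const checks = {
  emptyMatchesLetter: emptyClass.test("abc"),
  negatedMatchesNewline: negatedEmpty.test("\n"),
  unescapedMatchesBracket: /[]]/.test("a]"),
  escapedMatchesBracket: /[\]]/.test("a]"),
};

console.log(checks);
```

Only negatedMatchesNewline and escapedMatchesBracket come out true, which is consistent with the behaviour described in the original report.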
Status: UNCONFIRMED → RESOLVED
Closed: 20 years ago
Resolution: --- → INVALID
Comment 3•19 years ago
*** Bug 322129 has been marked as a duplicate of this bug. ***
Comment 4•19 years ago
The ECMA standard is clearly wrong. I doubt it was their intention to diverge from historical, previously established standards such as POSIX and the Single Unix Specification, which predate ECMAScript. And since JavaScript REs have their origins in Perl REs, which in turn are based on POSIX EREs, this bug should be fixed.
For example:
$ perl -n -e 'print $_ if /[]]/' <<EOT
> some text
> more [text] <---
> not this
> and ] blah <---
> foobar
> EOT
more [text] <---
and ] blah <---
$
The above example clearly shows that /[]]/ WORKS. Also performing the above with the escaped right bracket:
$ perl -n -e 'print $_ if /[\]]/' <<EOT
> some text
> more [text] <---
> not this
> and ] blah <---
> foobar
> EOT
more [text] <---
and ] blah <---
$
Also works.
Obviously the correct solution for this bug is to support BOTH forms:
/[]]/ historical ERE behaviour
/[\]]/ more recent behaviour
By supporting the historical behaviour, EREs become portable across ALL applications and usages where EREs are supported, which was the intent of POSIX and SUS in the first place.
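Since the bug was resolved INVALID, a pattern written in the historical style has to be rewritten before it is handed to a JavaScript engine. A rough sketch of such a rewrite follows; escapeLeadingBracket is a hypothetical helper, not part of any engine or library, and it assumes Perl-style backslash escapes and no POSIX named classes such as [[:alpha:]]:

```javascript
// Hypothetical helper: rewrite a Perl/POSIX-style pattern source so that an
// unescaped ] in first position of a bracket expression becomes \].
function escapeLeadingBracket(source) {
  let out = "";
  let i = 0;
  let inClass = false;
  while (i < source.length) {
    const ch = source[i];
    if (ch === "\\") {
      // Copy an escape sequence verbatim (backslash plus the next character).
      out += source.slice(i, i + 2);
      i += 2;
      continue;
    }
    if (!inClass && ch === "[") {
      inClass = true;
      out += ch;
      i += 1;
      if (source[i] === "^") {
        out += "^";
        i += 1;
      }
      if (source[i] === "]") {
        // A ] in first position is a literal in the historical syntax.
        out += "\\]";
        i += 1;
      }
      continue;
    }
    if (inClass && ch === "]") {
      inClass = false; // this ] closes the class
    }
    out += ch;
    i += 1;
  }
  return out;
}

const rewritten = escapeLeadingBracket("[^]]+");
console.log(rewritten, new RegExp("^" + rewritten + "$").test("abc"));
```

The escaped output is accepted by Perl as well, as the second transcript above shows, so writing \] from the start avoids the need for any rewrite.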