Closed
Bug 378738
Opened 18 years ago
Closed 18 years ago
\d pattern matches characters other than the decimal digits 0-9 (ecma_3/RegExp/15.10.2.12.js)
Categories
(Core :: JavaScript Engine, defect)
Core
JavaScript Engine
Tracking
()
VERIFIED
FIXED
People
(Reporter: martin.honnen, Assigned: Waldo)
References
()
Details
Attachments
(1 file)
895 bytes,
patch
|
mrbkap
:
review+
|
Details | Diff | Splinter Review |
With Spidermonkey if you evaluate
/\d/.test("\uFF11")
then it returns true but should return false as the ECMAScript specification clearly says:
"15.10.2.12 CharacterClassEscape
The production CharacterClassEscape :: d evaluates by returning the ten-element set of characters containing the
characters 0 through 9 inclusive."
This is a bug in Spidermonkey (already present in Netscape 4 builds but still there in a Firefox trunk nightly). Other engines like Rhino or MS JScript return false.
Bug was reported in mozilla.dev.tech.javascript by Igor Tandetnik.
Comment 1•18 years ago
|
||
Oops, there is code in browser.js (by me, although Mano has blame) that depends on this bug: http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/browser/base/content/browser.js&rev=1.778&mark=1434-1438#1421 checked in in bug 298004
Assignee | ||
Comment 2•18 years ago
|
||
This doesn't fix the browser dependency on the bug; that can be a separate patch.
Updated•18 years ago
|
Attachment #262776 -
Flags: review?(mrbkap) → review+
Comment 3•18 years ago
|
||
Should we fix the REOP_NONDIGIT case as well? And are we sure we shouldn't WONTFIX this? Perhaps the spec is simply wrong here?
Comment 4•18 years ago
|
||
It is worth noting probably that \d and \D in character _classes_ do the right thing. So maybe we should fix this inconsistency.
Comment 5•18 years ago
|
||
On WONTFIX: I feel it naturally follows from the principle of least surprise that, for every string s, if /^\d+$/.test(s) returns true then parseInt(s, 10) should return a value other than NaN (similarly for Number(s) ).
Currently this condition does not hold. Personally, I would be happy with either /\d/ changed to only match [0-9] (and \D to match [^0-9]), or parseInt changed to understand "fancy digits".
Comment 6•18 years ago
|
||
Please note that ChatZilla has a copy of the code in browser.js mentioned in comment 1, and other things might have copied it too.
Comment 7•18 years ago
|
||
(In reply to comment #5)
> On WONTFIX: I feel it naturally follows from the principle of least surprise
> that, for every string s, if /^\d+$/.test(s) returns true then parseInt(s, 10)
> should return a value other than NaN (similarly for Number(s) ).
I'm not sure I agree with this. Making parseInt handle digits other than in the ASCII range seems like a big can of worms. What should parseInt("१২੩") return? (that's a Devanagari 1, a Bengali 2, and a Gurmukhi 3).
Comment 8•18 years ago
|
||
I'll be happy with it returning 123. The probability of such a situation arising in practice is very near zero, I believe. Since the string is linguistically meaningless, parsing it to 123 is as good an outcome as any other.
But again, I would be equally happy with \d restricted to [0-9] and "१২੩" failing to parse as a number. What I object to is "१২੩" looking like a number lexically, but then failing to parse. Or, rather, I object to "१२३" looking like a number but failing to parse (here, all three characters are Devanagari digits).
Assignee | ||
Comment 9•18 years ago
|
||
(In reply to comment #3)
> Should we fix the REOP_NONDIGIT case as well?
Dangit, that's twice I've done this recently (not catch an obvious variant of a bug); I'll fix that on checkin (probably this evening).
> And are we sure we shouldn't WONTFIX this? Perhaps the spec is simply wrong
> here?
Everyone else gets it right. Also, this contradicts most of ES3 by taking Unicode character classes into account; I'd leave it for ES4 if people want it.
Comment 10•18 years ago
|
||
Consider also code like this:
http://lxr.mozilla.org/mozilla/source/calendar/resources/content/datetimepickers/datetimepickers.xml
It uses \d for parsing and Number for converting all over the place. It would seem the practice is quite common.
Assignee | ||
Comment 11•18 years ago
|
||
Fixed on trunk, with the browser bug filed as bug 379088.
Assignee | ||
Comment 12•18 years ago
|
||
Fixed on trunk, with the browser bug filed as bug 379088.
Status: ASSIGNED → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 13•18 years ago
|
||
(I did add the fix for \D as well, to save people the trouble of checking.)
Comment 14•18 years ago
|
||
/cvsroot/mozilla/js/tests/ecma_3/RegExp/15.10.2.12.js,v <-- 15.10.2.12.js
initial revision: 1.1
Flags: in-testsuite+
Comment 15•18 years ago
|
||
verified fixed 1.9.0 linux/mac* shell.
Status: RESOLVED → VERIFIED
Summary: \d pattern matches characters other than the decimal digits 0-9 → \d pattern matches characters other than the decimal digits 0-9 (ecma_3/RegExp/15.10.2.12.js)
Comment 16•17 years ago
|
||
Following document seems to say about JS 1.5 RegExp spec based on test result with Mozilla family on which patch for bug 298004 is applied.
http://developer.mozilla.org/en/docs/Core_JavaScript_1.5_Reference:Global_Objects:RegExp
Please add description about "JS 1.9 backed out inappropriate spec alteration by bug 298004" in "New in JavaScript 1.9" of following page, when it'll be added.
http://developer.mozilla.org/en/docs/JavaScript
Comment 17•17 years ago
|
||
I added description following documents. http://developer.mozilla.org/en/docs/Firefox_3_for_developers#Notable_bugs_fixed
http://developer.mozilla.org/en/docs/Core_JavaScript_1.5_Reference:Global_Objects:RegExp
Updated•15 years ago
|
OS: Windows XP → All
Hardware: x86 → All
Updated•15 years ago
|
You need to log in
before you can comment on or make changes to this bug.
Description
•