Last Comment Bug 514808 - Regex whitespace character class deficiencies
: Regex whitespace character class deficiencies
Status: RESOLVED FIXED
:
Product: Core
Classification: Components
Component: JavaScript Engine (show other bugs)
: Trunk
: All All
: -- normal (vote)
: mozilla1.9.3a1
Assigned To: Saint Wesonga
:
:
Mentors:
http://thinkweb2.com/projects/prototy...
Depends on: 652771
Blocks:
  Show dependency treegraph
 
Reported: 2009-09-04 20:50 PDT by Juriy "kangax" Zaytsev
Modified: 2011-08-02 08:53 PDT (History)
10 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
Patch (2.77 KB, patch)
2009-09-15 18:24 PDT, Saint Wesonga
dmandelin: review+
Details | Diff | Splinter Review
Test case (3.45 KB, patch)
2009-09-16 11:20 PDT, David Mandelin [:dmandelin]
no flags Details | Diff | Splinter Review

Description Juriy "kangax" Zaytsev 2009-09-04 20:50:00 PDT
As per ES3 (see 15.10.2.12), /\s/ should match both WhiteSpace (7.2) and LineTerminator (7.3). WhiteSpace is defined to include characters from Unicode "Zs" category (besides "standard" - x09, x0B, x0C, x20 and xA0). Firefox (contrary to, say, latest WebKit and Chrome) fails some of these characters. See <http://thinkweb2.com/projects/prototype/whitespace-deviations/> for a full results table.

A test is available online at: <http://yura.thinkweb2.com/rcornford_whitespace_test.html>

Thank you.
Comment 1 Juriy "kangax" Zaytsev 2009-09-04 20:53:59 PDT
It's also worth mentioning that `String.prototype.trim` seems to rely on the same character set that is matched by (faulty) /\s/. I suppose, fixing /\s/ will also make `trim` fully compliant.
Comment 2 Saint Wesonga 2009-09-12 15:45:11 PDT
This bug is fixed in trunk. Should it be marked as wanted in older branches?
Comment 3 Andreas Gal :gal 2009-09-12 19:15:31 PDT
Which patch fixed this?
Comment 4 Saint Wesonga 2009-09-13 15:21:37 PDT
From what I gather, it could only be the patch in bug 406271. See http://hg.mozilla.org/mozilla-central/rev/b837948c1daf

This is interesting because the WhiteSpaceRanges[] constant array at http://mxr.mozilla.org/mozilla-central/source/js/src/jsregexp.cpp?&mark=3548-3571#3552 appears to be incomplete even though all the tests at <http://yura.thinkweb2.com/rcornford_whitespace_test.html> pass.
Comment 5 Saint Wesonga 2009-09-13 17:18:40 PDT
Ignore comment 2, it's broken in trunk if javascript.options.jit.content = false
Comment 6 Saint Wesonga 2009-09-15 18:24:25 PDT
Created attachment 400921 [details] [diff] [review]
Patch
Comment 7 Brendan Eich [:brendan] 2009-09-15 19:59:50 PDT
Comment on attachment 400921 [details] [diff] [review]
Patch

I'm traveling and hoping to bounce this review to Dave or Luke. Dave, if you need bounce, then Luke is "it". Thanks,

/be
Comment 8 David Mandelin [:dmandelin] 2009-09-16 11:20:43 PDT
Created attachment 401061 [details] [diff] [review]
Test case
Comment 9 David Mandelin [:dmandelin] 2009-09-16 11:21:14 PDT
Comment on attachment 400921 [details] [diff] [review]
Patch

Looks good. r+ with the test case added that I just attached.
Comment 10 Phil Ringnalda (:philor) 2009-12-13 18:49:06 PST
One of the shorter stays in the tracemonkey checkin-needed queue at only three months, but still: land it in tracemonkey or admit you can't be bothered and wontfix it.
Comment 11 Robert Sayre 2009-12-15 16:18:16 PST
(In reply to comment #10)
> One of the shorter stays in the tracemonkey checkin-needed queue at only three
> months, but still: land it in tracemonkey or admit you can't be bothered and
> wontfix it.

tone not welcome. checkin-needed noted.
Comment 13 Robert Sayre 2009-12-16 08:15:53 PST
I backed this out to fix an orange tinderbox.
Comment 14 Saint Wesonga 2010-03-09 21:17:28 PST
(In reply to comment #13)
> I backed this out to fix an orange tinderbox.

How do I find out which tests failed with this patch?
Comment 15 Robert Sayre 2010-03-09 21:20:17 PST
Hey there, it may be too late to find the logs at this point. I can send it to the tryserver[0] tomorrow and see what we get. I'll post the logs here.

[0] https://wiki.mozilla.org/Build:TryServer
Comment 16 Saint Wesonga 2010-03-23 22:04:34 PDT
(In reply to comment #15)
> Hey there, it may be too late to find the logs at this point. I can send it to
> the tryserver[0] tomorrow and see what we get. I'll post the logs here.
> 
> [0] https://wiki.mozilla.org/Build:TryServer

Would it be possible for you to do the same for bug 529386? (not sure what the protocol is for requesting these logs seeing as I don't have commit access).
Comment 17 Saint Wesonga 2010-06-01 09:31:35 PDT
Robert, any word on this bug?
Comment 18 Jeff Walden [:Waldo] (remove +bmo to email) 2011-04-12 23:22:59 PDT
This is the cause of most of the failures on http://samples.msdn.microsoft.com/ietestcenter/Javascript/ES15.5.html (one's really a test bug, rest are about this).  The test is basically this:

assertEq(("\n\r\u2028\u2029" +
          "\t\n\v\f\r \x85\xA0\u1680\u180E\u2000\u2001\u2002" +
          "\u2003\u2004\u2005\u2006\u2007\u2008\u2009\u200A" +
          "\u2028\u2029\u202F\u205F\u3000\uFEFF").trim(),
         "");

That said, I'm not completely sure the test is right, because it seems to want U+0085 as a Unicode space, but at least the GNOME character map utility says it isn't.  There are other differences, too:

js> ("\t\n\v\f\r \x85\xA0\u1680\u180E\u2000\u2001\u2002\u2003" +
     "\u2004\u2005\u2006\u2007\u2008\u2009\u200A\u2028\u2029" +
     "\u202F\u205F\u3000\uFEFF").split("").filter(function(s) {
       return s.trim() !== "";
     })
["\x85", "\u1680", "\u180E", "\u202F", "\u205F"]

U+205F is a space per the character map, so at least some of these are probably correct even if it turns out U+0085 is not.  Someone who actually knows this Unicode stuff should (ideally) investigate.

...and yet another bug forgotten.  How do we end up doing this?  :-(
Comment 19 Juriy "kangax" Zaytsev 2011-04-13 11:05:43 PDT
U+0085 is from Cc category (Other, Control), not Zs (Separator, Space). The latter one is what ES5 considers a whitespace. 

It does look like a bug in a test.
Comment 20 AWAY Tom Schuster [:evilpie] 2011-08-01 15:40:10 PDT
http://hg.mozilla.org/integration/mozilla-inbound/rev/ae35b244a09a
Comment 21 Marco Bonardo [::mak] 2011-08-02 03:26:05 PDT
http://hg.mozilla.org/mozilla-central/rev/ae35b244a09a

Note You need to log in before you can comment on or make changes to this bug.