Closed
Bug 1453456
Opened 6 years ago
Closed 5 years ago
Add/use more ASCII alphanumeric testing/conversion functions to/from MFBT
Categories
(Core :: MFBT, enhancement)
Core
MFBT
Tracking
()
RESOLVED
FIXED
mozilla67
Tracking | Status | |
---|---|---|
firefox67 | --- | fixed |
People
(Reporter: Waldo, Assigned: Waldo)
Details
Attachments
(5 files)
10.86 KB,
patch
|
froydnj
:
review+
|
Details | Diff | Splinter Review |
33.62 KB,
patch
|
froydnj
:
review+
|
Details | Diff | Splinter Review |
21.79 KB,
patch
|
froydnj
:
review+
|
Details | Diff | Splinter Review |
4.62 KB,
patch
|
froydnj
:
review+
|
Details | Diff | Splinter Review |
15.36 KB,
patch
|
froydnj
:
review+
|
Details | Diff | Splinter Review |
No description provided.
Assignee | ||
Comment 1•6 years ago
|
||
Attachment #8967133 -
Flags: review?(nfroyd)
Assignee | ||
Comment 2•6 years ago
|
||
Attachment #8967134 -
Flags: review?(nfroyd)
Assignee | ||
Comment 3•6 years ago
|
||
Attachment #8967135 -
Flags: review?(nfroyd)
Comment 4•6 years ago
|
||
Comment on attachment 8967133 [details] [diff] [review] Add/use more ASCII alphanumeric testing/conversion functions from MFBT in SpiderMonkey code in various places Review of attachment 8967133 [details] [diff] [review]: ----------------------------------------------------------------- No tests? ::: js/src/ctypes/CTypes.cpp @@ +2942,2 @@ > else > + return false; Now that there aren't so many cases, do you just want to say: if (!IsAsciiAlphanumeric(c)) return false; uint8_t digit = AsciiAlphanumericToNumber(c); ? Seems a little clearer, and enables you to initialize `digit` when you declare it, which is always nice. ::: js/src/jsnum.cpp @@ +231,2 @@ > else > break; Could consider the same switch here, to enable initializing `digit` when it's declared. ::: mfbt/TextUtils.h @@ +35,5 @@ > }; > > } // namespace detail > > +/** Returns true iff |aChar| matches [a-z]. */ As with IsAsciiAlpha, is "what you thought islower did, but doesn't depend on the locale" worth adding here? (Though it looks like islower is required to recognize non-ascii, too...) @@ +48,5 @@ > + > +/** Returns true iff |aChar| matches [A-Z]. */ > +template<typename Char> > +constexpr bool > +IsAsciiUppercaseAlpha(Char aChar) Same as above, but with isupper. @@ +104,5 @@ > + } > + > + // The division here compile-time-enforces an IsAsciiAlphanumeric assertion > + // when possible in constexpr context and in runtime context does the > + // next-best thing. I am having a hard time deciding whether this is incredible or incredibly dangerous.
Attachment #8967133 -
Flags: review?(nfroyd) → review+
Comment 5•6 years ago
|
||
Comment on attachment 8967134 [details] [diff] [review] Replace nsCRT::IsAscii{Alpha,Digit} with mfbt/TextUtils.h versions Review of attachment 8967134 [details] [diff] [review]: ----------------------------------------------------------------- r=me, but I would like to know why the below is hanging around. ::: xpcom/ds/nsCRT.h @@ +96,5 @@ > static bool IsLower(char aChar) { return NS_IsLower(aChar); } > > static bool IsAscii(char16_t aChar) { return NS_IsAscii(aChar); } > static bool IsAscii(const char16_t* aString) { return NS_IsAscii(aString); } > + static bool IsAsciiAlpha(char16_t aChar) { return mozilla::IsAsciiAlpha(aChar); } So the patch is said to be replacing nsCRT::IsAsciiAlpha, but we're keeping the function around?
Attachment #8967134 -
Flags: review?(nfroyd) → review+
Comment 6•6 years ago
|
||
Comment on attachment 8967135 [details] [diff] [review] Replace JS7_ISDEC with mozilla::IsAsciiDigit Review of attachment 8967135 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/util/Text.h @@ -28,5 @@ > /* > * Shorthands for ASCII (7-bit) decimal and hex conversion. > * Manually inline isdigit and isxdigit for performance; MSVC doesn't do this for us. > */ > -#define JS7_ISDEC(c) ((((unsigned)(c)) - '0') <= 9) Out of curiosity, and given the above comment, have you checked that MSVC does generate the above code with our current implementation of IsAsciiDigit?
Attachment #8967135 -
Flags: review?(nfroyd) → review+
Pushed by jwalden@mit.edu: https://hg.mozilla.org/integration/mozilla-inbound/rev/fde6a66eb317 Add/use more ASCII alphanumeric testing/conversion functions from MFBT in SpiderMonkey code in various places. r=froydnj https://hg.mozilla.org/integration/mozilla-inbound/rev/6a84718ddfe1 Replace nsCRT::IsAscii{Alpha,Digit} with mfbt/TextUtils.h versions. r=froydnj https://hg.mozilla.org/integration/mozilla-inbound/rev/e0f8f325c2ce Replace JS7_ISDEC with mozilla::IsAsciiDigit. r=froydnj
Assignee | ||
Comment 8•6 years ago
|
||
(In reply to Nathan Froyd [:froydnj] from comment #4) > No tests? Whoops, forgot. Added a bunch. > Now that there aren't so many cases, do you just want to say: > > if (!IsAsciiAlphanumeric(c)) > return false; > > uint8_t digit = AsciiAlphanumericToNumber(c); Sure, yeah. I also considered making it a Maybe or a Result or something, then decided not to spend too much time on it. But really Optional<uint8_t> is really what's wanted here IMO. > As with IsAsciiAlpha, is "what you thought islower did, but doesn't depend > on the locale" worth adding here? (Though it looks like islower is required > to recognize non-ascii, too...) > Same as above, but with isupper. Sure. > I am having a hard time deciding whether this is incredible or incredibly > dangerous. Oh, that's easy to answer. Yes. (C++17 if constexpr is really what we want -- added a comment noting that.) (In reply to Nathan Froyd [:froydnj] from comment #5) > So the patch is said to be replacing nsCRT::IsAsciiAlpha, but we're keeping > the function around? Whoops, removed. (In reply to Nathan Froyd [:froydnj] from comment #6) > Out of curiosity, and given the above comment, have you checked that MSVC > does generate the above code with our current implementation of IsAsciiDigit? Not recently, but years ago I remember tracking down some documentation for this somewhere and us starting to remove the pattern generally. Just no one's going around systematically doing it. I'll probably do more along these lines, so keeping the bug open for that.
Keywords: leave-open
Comment 9•6 years ago
|
||
Backed out for build bustages at js/src/jit-test/tests/ctypes/conversion-primitive.js Push with failures: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=50cd4a60a402e9151d461651f565458c79059ec6 Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=173604865&repo=mozilla-inbound&lineNumber=9429 Backout: https://hg.mozilla.org/integration/mozilla-inbound/rev/f36928548891d9029477f0733c71bb8e093e95f6
Flags: needinfo?(jwalden+bmo)
Assignee | ||
Comment 10•6 years ago
|
||
So it occurred to me that actually, we *could* use non-constexpr code in this function -- you can have such things in constexpr functions *as long as the non-constexpr bits aren't invoked from constexpr context*. But it turns out gcc at least as of 6.4 doesn't like this. For example, this testcase ////////////////////////////////// static inline void NonConstexpr() {} constexpr unsigned char AsciiAlphanumericToNumber(unsigned char uc) { if ('0' <= uc && uc <= '9') return uc - '0'; NonConstexpr(); return 0; } int main() { static_assert(AsciiAlphanumericToNumber('0') == 0, "hi"); } ////////////////////////////////// works great with recent clang, but in gcc 6.4 I get constexpr-fail.cpp: In function ‘constexpr unsigned char AsciiAlphanumericToNumber(unsigned char)’: constexpr-fail.cpp:9:15: error: call to non-constexpr function ‘void NonConstexpr()’ NonConstexpr(); ~~~~~~~~~~~~^~ constexpr-fail.cpp: In function ‘int main()’: constexpr-fail.cpp:15:3: error: non-constant condition for static assertion static_assert(AsciiAlphanumericToNumber('0') == 0, "hi"); ^~~~~~~~~~~~~ constexpr-fail.cpp:15:42: error: ‘constexpr unsigned char AsciiAlphanumericToNumber(unsigned char)’ called in a constant expression static_assert(AsciiAlphanumericToNumber('0') == 0, "hi"); ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~ so boo to the idea of just having MOZ_ASSERT_UNREACHABLE in the function tail. Still looking into this. Goal is to land tonight or tomorrow, 'cause next week is a team meetup and I may not be able to then, but this stuff should get done and stay done so people can start using it.
Flags: needinfo?(jwalden+bmo)
Comment 11•6 years ago
|
||
Pushed by jwalden@mit.edu: https://hg.mozilla.org/integration/mozilla-inbound/rev/69fbc1090788 Add/use more ASCII alphanumeric testing/conversion functions from MFBT in SpiderMonkey code in various places. r=froydnj https://hg.mozilla.org/integration/mozilla-inbound/rev/c530019cba77 Replace nsCRT::IsAscii{Alpha,Digit} with mfbt/TextUtils.h versions. r=froydnj https://hg.mozilla.org/integration/mozilla-inbound/rev/89bb0e170cc0 Replace JS7_ISDEC with mozilla::IsAsciiDigit. r=froydnj
Comment 12•6 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/69fbc1090788 https://hg.mozilla.org/mozilla-central/rev/c530019cba77 https://hg.mozilla.org/mozilla-central/rev/89bb0e170cc0
Comment 13•5 years ago
|
||
The leave-open keyword is there and there is no activity for 6 months. :Waldo, maybe it's time to close this bug?
Flags: needinfo?(jwalden+bmo)
Assignee | ||
Comment 14•5 years ago
|
||
Okay, fine, let's do one or two more things here here about nsCRT.h and close this out.
Attachment #9046442 -
Flags: review?(nfroyd)
Assignee | ||
Comment 15•5 years ago
|
||
Attachment #9046443 -
Flags: review?(nfroyd)
Assignee | ||
Updated•5 years ago
|
Flags: needinfo?(jwalden)
Comment 16•5 years ago
|
||
Comment on attachment 9046443 [details] [diff] [review] Remove nsCRT::IsAscii(null-terminated string) in favor of mozilla::IsAsciiNullTerminated Review of attachment 9046443 [details] [diff] [review]: ----------------------------------------------------------------- ::: mfbt/TextUtils.h @@ +40,5 @@ > auto uc = static_cast<UnsignedChar>(aChar); > return uc < 0x80; > } > > +/** Returns true iff |aChar| is ASCII, i.e. in the range [0, 0x80). */ This comment should really say something more like "Returns true iff every character in the null-terminated string pointed to by |aChar| is ASCII, i.e. in the range [0, 0x80).", right?
Attachment #9046443 -
Flags: review?(nfroyd) → review+
Comment 17•5 years ago
|
||
Comment on attachment 9046442 [details] [diff] [review] Remove nsCRT::IsAscii(char16_t) in favor of mozilla::IsAscii Review of attachment 9046442 [details] [diff] [review]: ----------------------------------------------------------------- ::: netwerk/streamconv/converters/mozTXTToHTMLConv.cpp @@ +220,1 @@ > (!isEmail || aInString[uint32_t(i)] != ')'); Such conditional.
Attachment #9046442 -
Flags: review?(nfroyd) → review+
Comment 18•5 years ago
|
||
Pushed by jwalden@mit.edu: https://hg.mozilla.org/integration/mozilla-inbound/rev/e4565e7ddcb0 Remove nsCRT::IsAscii(char16_t) in favor of mozilla::IsAscii. r=froydnj https://hg.mozilla.org/integration/mozilla-inbound/rev/18e52bd7c3ad Remove nsCRT::IsAscii(null-terminated string) in favor of mozilla::IsAsciiNullTerminated. r=froydnj
Assignee | ||
Updated•5 years ago
|
Keywords: leave-open
Comment 19•5 years ago
|
||
Pushed by jwalden@mit.edu: https://hg.mozilla.org/integration/mozilla-inbound/rev/6ad7a291385a Followup for Windows-specific bustage about char16ptr_t where every other platform has a nice pointer-to-character type instead. r=bustage in a CLOSED TREE
Comment 20•5 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/e4565e7ddcb0
https://hg.mozilla.org/mozilla-central/rev/18e52bd7c3ad
https://hg.mozilla.org/mozilla-central/rev/6ad7a291385a
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
status-firefox67:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla67
You need to log in
before you can comment on or make changes to this bug.
Description
•