Closed Bug 1446237 Opened 6 years ago Closed 6 years ago

Can{Upper,Lower}Case* functions are confusingly named

Categories

(Core :: JavaScript Engine, enhancement, P2)

enhancement

Tracking

()

RESOLVED FIXED
mozilla61
Tracking Status
firefox61 --- fixed

People

(Reporter: Waldo, Assigned: Waldo)

Details

Attachments

(2 files)

Looking at the CanUpperCaseSpecialCasing function against just now, I found myself confused about its name -- couldn't match it up with semantics well.  And the comments by the function, I could *sort* of remember what they were supposed to communicate, but not completely enough to be comfortable with them.

The problem I have right now, is that "can upper case" in JS terms is a thing that's true of any character, sort of.  Just, some uppercasing does nothing.  If uppercasing does nothing, then CanUpperCase* returns false, otherwise returns true.  (With some finessing of that for the special-case form.)

It feels to me like a different name would make clearer this intent: ChangesWhen{Upper,Lower}Cased*.  That, and some doc-comment changes.

The effective patch here is mostly just alpha-renames, but the comment-change could definitely use a second set of eyes on it.  Tho to be sure, both of us may be somewhat too deep in this to bring a *truly* un-jaundiced eye to bear.  But I'm not sure how state-of-nature-compatible this comment has to be anyway, because if you're looking at this stuff, you probably have at least a *little* background in it already.
So I finished up the real patch for this, then tried rerunning the Unicode data-gen script to ensure it wouldn't change anything.  Turns out...no.  Multiple changes by multiple people mean the script is currently un-runnable.  So here's making both them pay for it :-) by having them both review it, so that they're aware the relevant files were generated and require modification and rerunning of a script, not direct changes.
Attachment #8959403 - Flags: review?(jorendorff)
Attachment #8959403 - Flags: review?(sphink)
Comment on attachment 8959403 [details] [diff] [review]
Adjust make_unicode.py for its new location in vm/ and for the introduction of js/src/tests/non262/

Review of attachment 8959403 [details] [diff] [review]:
-----------------------------------------------------------------

Excellent work. Keep it up. Couldn't have done it better myself. (Didn't do it better myself.) Nice going. Good show, mate.
Attachment #8959403 - Flags: review?(sphink) → review+
Comment on attachment 8959404 [details] [diff] [review]
Rename some String.cpp and Unicode.h functions to have better, and different, names -- for clarity particularly in scumbag unified builds with scumbag global |using namespace|

Review of attachment 8959404 [details] [diff] [review]:
-----------------------------------------------------------------

(In reply to Jeff Walden [:Waldo] from comment #2)
> Rename some String.cpp and Unicode.h functions to have better, and
> different, names -- for clarity particularly in scumbag unified builds with
> scumbag global |using namespace|

We should probably just remove "using namespace js::unicode;" from Unicode.cpp. Which makes me wonder whether or not "using namespace js::unicode;" violates the namespace rules in [1]? (And it looks like there's no static analyser to check for "using namespace" violations. -> [2])

[1] https://wiki.mozilla.org/JavaScript:SpiderMonkey:Coding_Style#Namespaces
[2] https://searchfox.org/mozilla-central/rev/3abf6fa7e2a6d9a7bfb88796141b0f012e68c2db/js/src/util/Text.cpp#16

::: js/src/builtin/String.cpp
@@ +1113,5 @@
>  
>  #endif // EXPOSE_INTL_API
>  
>  static inline bool
> +ToUpperCaseHasSpecialHandling(Latin1Char charCode)

Maybe "ToUpperCaseHasSpecialCasing" so the suffix matches "ToUpperCaseLengthSpecialCasing" and "ToUpperCaseAppendUpperCaseSpecialCasing" and to keep "special casing" in the function name?

::: js/src/util/Unicode.h
@@ +293,5 @@
>  }
>  
>  // Returns true iff ToUpperCase(ch) != ch.
>  inline bool
> +ChangesWhenUpperCased(char16_t ch)

Do you think it makes sense to stress that this function doesn't return the Changes_When_Uppercased Unicode property value?

@@ +308,5 @@
>      if (MOZ_LIKELY(ch < 128))
>          return ch >= 'a' && ch <= 'z';
>  
>      // U+00B5 and U+00E0 to U+00FF, except U+00F7, have an uppercase form.
>      bool canUpper = ch == MICRO_SIGN ||

Maybe s/canUpper/hasUpper/?

@@ +331,5 @@
>      if (MOZ_LIKELY(ch < 128))
>          return ch >= 'A' && ch <= 'Z';
>  
>      // U+00C0 to U+00DE, except U+00D7, have a lowercase form.
>      bool canLower = ((ch & ~0x1F) == LATIN_CAPITAL_LETTER_A_WITH_GRAVE) &&

Dito.

@@ +408,5 @@
> + * AppendUpperCaseSpecialCasing in the former case and ToUpperCase in the
> + * latter.
> + *
> + * NOTE: All special upper case mappings are unconditional (that is, they don't
> + *       depend on language/locale or context within the string) in Unicode 9.

Nit: Update to "Unicode 10".
Attachment #8959404 - Flags: review?(andrebargull) → review+
(In reply to André Bargull [:anba] from comment #4)
> We should probably just remove "using namespace js::unicode;" from
> Unicode.cpp. Which makes me wonder whether or not "using namespace
> js::unicode;" violates the namespace rules in [1]? (And it looks like
> there's no static analyser to check for "using namespace" violations. -> [2])

I'll look into removing that in a separate followup.

> Maybe s/canUpper/hasUpper/?
> Dito.

WFM except "ditto".  :-)
(In reply to Jeff Walden [:Waldo] from comment #5)
> > Maybe s/canUpper/hasUpper/?
> > Dito.
> 
> WFM except "ditto".  :-)

It's interesting that English kept "ditto" instead of going with the French version "dito". (Given that many English words were borrowed from French. Maybe the word "ditto" is just too new?) For example in German "dito" is nowadays the correct version (since the spelling reform from 1901, before that "ditto" was (also?) the correct version). [Sources: en.wiktionary.org, de.wiktionary.org, archive.org/details/Regeln-fuer-die-deutsche-Rechtschreibung]
Attachment #8959403 - Flags: review?(jorendorff) → review+
Priority: -- → P2
(In reply to André Bargull [:anba] from comment #6)

lol

I like how we can have fun here.  :-)
Pushed by jwalden@mit.edu:
https://hg.mozilla.org/integration/mozilla-inbound/rev/b920e7388514
Adjust make_unicode.py for its new location in vm/ and for the introduction of js/src/tests/non262/.  r=sfink, r=jorendorff
https://hg.mozilla.org/integration/mozilla-inbound/rev/a79e7f60c578
Rename some String.cpp and Unicode.h functions to have better, and different, names -- for clarity particularly in scumbag unified builds with scumbag global |using namespace|.  r=anba
https://hg.mozilla.org/mozilla-central/rev/b920e7388514
https://hg.mozilla.org/mozilla-central/rev/a79e7f60c578
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla61
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: