Closed Bug 1063962 Opened 5 years ago Closed 5 years ago

Replace jschar typedef with char16_t


(Core :: JavaScript Engine, defect)

Not set





(Reporter: cpeterson, Assigned: cpeterson)



(Keywords: addon-compat)


(2 files)

SpiderMonkey's "Long-term Plans" wiki says:

> Recent changes mean that SpiderMonkey's historical jschar type is identical to
> char16_t (an emulated typedef in older compilers). We should cut out the middle
> man and just use char16_t directly. This is mostly a large search-and-replace
> operation on Gecko and JSAPI headers. (Although as a short-term compatibility
> hack, we may still want to preserve a jschar typedef for the short run, maybe in
> SpiderMonkey 31, with full removal in 38.)

* Jan: since your recent Latin1 changes touched lots of jschar code, I thought you would be an appropriate reviewer. These patches are mostly search-and-replace plus some whitespace fixups.

The wiki suggests retaining the jschar typedef for JSAPI source compatibility in the next SpiderMonkey release (ESR 31 at the time the wiki was written; now ESR 38). We might also want to retain ctypes.jschar as an alias for the new ctypes.char16_t type. But my patch does neither.

Green Try run:
Jan: since your recent Latin1 changes touched lots of jschar code, I thought you would be an appropriate reviewer. These patches are mostly search-and-replace plus some whitespace fixups.
Attachment #8485433 - Flags: review?(jdemooij)
Part 2: Replace jschar with char16_t in Gecko
Attachment #8485434 - Flags: review?(jdemooij)
When I added JS::Latin1Char, we discussed eventually renaming jschar to JS::TwoByteChar...

Luke, Jason, Waldo: what do you think?
We got rid of jsdouble in favor of using just double.  JS::TwoByteChar smells exactly like jsdouble.  char16_t, even *more* strongly than double (which conceivably could be not-IEEE-754 double format, but which we assume is exactly that format), has exactly the two-byte-character sense we are trying to evoke.  Ergo, no JS::TwoByteChar, bring in char16_t.
char16_t is a new distinct type in C++11, so it provides stronger type checking than typedef'ing JS::TwoByteChar to uint16_t. (C11's char16_t, however, is just a typedef for uint_least16_t.)
I think the vague idea was to rename jschar to JS::TwoByteChar, not to change to uint16_t or some different underlying type from what it is now.  But that makes it window-dressing, and per the jsdouble precedent window-dressing isn't a valid reason to have/use a typedef.
I felt that TwoByteChar > uint16_t, but it seems like char16_t > TwoByteChar since it's shorter, says the same thing, and has better type rules.
Comment on attachment 8485433 [details] [diff] [review]

Review of attachment 8485433 [details] [diff] [review]:

Looks great, thanks!

::: js/public/CharacterEncoding.h
@@ +187,5 @@
>  uint32_t
>  Utf8ToOneUcs4Char(const uint8_t *utf8Buffer, int utf8Length);
>  /*
> + * Inflate bytes in UTF-8 encoding to char16_ts.

char16_ts looks a bit weird. Here we can use char16_t or char16_t's. Can you grep the patch for similar cases and fix them?

::: js/src/frontend/TokenStream.h
@@ +763,5 @@
>      unsigned            lookahead;          // count of lookahead tokens
>      unsigned            lineno;             // current line number
>      Flags               flags;              // flags -- see above
> +    const char16_t        *linebase;          // start of current line;  points into userbuf
> +    const char16_t        *prevLinebase;      // start of previous line;  nullptr if on the first line

Nit: fix spacing here

::: js/src/gdb/mozilla/
@@ +33,5 @@
>  class JSStringPtr(Common):
>      def display_hint(self):
>          return "string"
> +    def char16_ts(self):

Nit: maybe just |chars| or |twoByteChars|?

::: js/src/jsapi.h
@@ +995,5 @@
>   *   u      uint32_t        ECMA uint32_t
>   *   d      double          IEEE double
>   *   I      double          Integral IEEE double
>   *   S      JSString *      Unicode string, accessed by a JSString pointer
> + *   W      char16_t *        Unicode character vector, 0-terminated (W for wide)

Nit: fix spacing

@@ +4600,5 @@
>      unsigned        lineno;         /* source line number */
>      const char      *linebuf;       /* offending source line without final \n */
>      const char      *tokenptr;      /* pointer to error token in linebuf */
> +    const char16_t    *uclinebuf;     /* unicode (original) line buffer */
> +    const char16_t    *uctokenptr;    /* unicode (original) token pointer */

And here.

@@ +4605,4 @@
>      unsigned        flags;          /* error/warning, etc. */
>      unsigned        errorNumber;    /* the error number, e.g. see js.msg */
> +    const char16_t    *ucmessage;     /* the (default) error message */
> +    const char16_t    **messageArgs;  /* arguments for the error message */

And here.

::: js/src/jsprf.cpp
@@ +595,5 @@
>          case TYPE_UINT32:       (void) va_arg(ap, uint32_t);    break;
>          case TYPE_INT64:        (void) va_arg(ap, int64_t);     break;
>          case TYPE_UINT64:       (void) va_arg(ap, uint64_t);    break;
>          case TYPE_STRING:       (void) va_arg(ap, char*);       break;
> +        case TYPE_WSTRING:      (void) va_arg(ap, char16_t*);     break;

Nit: fix whitespace

::: js/src/jsstr.cpp
@@ +209,1 @@
>             b = chars[1],

Nit: fix indentation.

@@ +225,1 @@
>             b = chars[1];

And here.

::: js/src/vm/String.h
@@ +155,5 @@
>          union {
>              union {
>                  /* JS(Fat)InlineString */
>                  JS::Latin1Char     inlineStorageLatin1[NUM_INLINE_CHARS_LATIN1];
> +                char16_t             inlineStorageTwoByte[NUM_INLINE_CHARS_TWO_BYTE];

Nit: fix spacing

@@ +160,5 @@
>              };
>              struct {
>                  union {
>                      const JS::Latin1Char *nonInlineCharsLatin1; /* JSLinearString, except JS(Fat)InlineString */
> +                    const char16_t   *nonInlineCharsTwoByte;/* JSLinearString, except JS(Fat)InlineString */

And here.

::: js/xpconnect/src/XPCConvert.cpp
@@ +1302,5 @@
>      case nsXPTType::T_FLOAT         : POPULATE(float);          break;
>      case nsXPTType::T_DOUBLE        : POPULATE(double);         break;
>      case nsXPTType::T_BOOL          : POPULATE(bool);           break;
>      case nsXPTType::T_CHAR          : POPULATE(char);           break;
> +    case nsXPTType::T_WCHAR         : POPULATE(char16_t);         break;

Nit: fix "break;" indentation (and 3x below)

::: js/xpconnect/src/XPCJSRuntime.cpp
@@ +3055,5 @@
>  // the source for a chrome JS function. See the comment in the XPCJSRuntime
>  // constructor.
>  class XPCJSSourceHook: public js::SourceHook {
> +    bool load(JSContext *cx, const char *filename,
> +              char16_t **src, size_t *length) {

Nit: no need to break this up; XPConnect follows JS style (99 chars per line, 80 for comments).
Attachment #8485433 - Flags: review?(jdemooij) → review+
Comment on attachment 8485434 [details] [diff] [review]

Review of attachment 8485434 [details] [diff] [review]:

::: dom/wifi/WifiUtils.cpp
@@ +213,5 @@
>        for (uint32_t m = 1; m < n; m++)
>          if ((src[i + m] & 0xC0) != 0x80)
>            INVALID(ReportInvalidCharacter, i, m);
> +      // Determine the code unit's length in char16_ts and act accordingly.

Nit: same comment about char16_ts; please fix if you agree.
Attachment #8485434 - Flags: review?(jdemooij) → review+
Landed with fixes for indentation and "char16_ts":
Waldo: When should I update MDN's JSAPI docs so the JSAPI function signatures use char16_t instead of jschar? Now or after the next SpiderMonkey release (38)?

The current JSAPI release (SpiderMonkey 31) still uses jschar, which is gone. SpiderMonkey 31's jschar was a typedef for char16_t, so it should be safe for the docs to tell JSAPI users to use char16_t instead of jschar without breaking their builds or ABI.
Flags: needinfo?(jwalden+bmo)
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla35
Depends on: 1064988
Depends on: 1064935
Commit pushed to master at
Bug 1063962 - Replace jschar typedef with C++11 char16_t type. r=jandem
(In reply to Chris Peterson (:cpeterson) from comment #11)
> Waldo: When should I update MDN's JSAPI docs so the JSAPI function
> signatures use char16_t instead of jschar? Now or after the next
> SpiderMonkey release (38)?

We can start by noting the change on <>, for sure.  Also on the main reference page, if there are reasonable places needing a change there.

As for when to change all the individual pages...there's not a good answer.  Recall that we switched from JSBool to bool, uintN to unsigned, jsdouble to double, &c. a few years ago.  We still haven't updated the reference for those changes yet!  So in a real sense we have no precedent for how to do this.

It's somewhat probable that most APIs that use jschar, are documented with jschar being a link to the page for the type.  So even if we don't fix all the text/links for jschar's removal, there's at least a link that we can change to say the type is obsolete as of 38 and must be replaced with char16_t.  So I'd be inclined to note the change there, then sometime after 38 is out go and remove all the jschar mentions.
Flags: needinfo?(jwalden+bmo)
I updated the js-ctypes documentation to use ctypes.char16_t and mention that ctypes.jschar is an alias:
You need to log in before you can comment on or make changes to this bug.