Closed Bug 895047 Opened 11 years ago Closed 11 years ago

typedef char16_t PRUnichar/jschar

Categories

(Core :: General, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla27

People

(Reporter: jcranmer, Assigned: ehsan.akhgari)

References

(Blocks 1 open bug)

Details

Attachments

(1 file, 1 obsolete file)

The last type defined in prtypes.h that we still use a lot is PRUnichar. As of bug 796948, we have a definition for char16_t. Once C++11 mode is required for gcc 4.4, we should be able to use char16_t everywhere in our codebase.

However, unlike PRInt*/int*_t, PRUnichar is going to be mostly used as a pointer, which makes it hard to incrementally use char16_t instead of PRUnichar. Since we already go the route of defining the typedefs for PRUnichar in nscore.h, we could go a step further, include mozilla/Char16.h, and typedef char16_t PRUnichar.

This will very likely break anyone who includes an NSPR header before an XPCOM one, and it comes with ABI compatibility concerns, since we're calling NSPR functions with different signatures (per C++11's definition, not C11's definition) than their definitions have.

The intent would be to follow this up shortly thereafter with a mass PRUnichar->char16_t time, after the change has successfully baked for a bit.
See Also: → 523156
(In reply to Joshua Cranmer [:jcranmer] from comment #0)
> The last type defined in prtypes.h that we still use a lot is PRUnichar. As
> of bug 796948, we have a definition for char16_t. Once C++11 mode is
> required for gcc 4.4, we should be able to use char16_t everywhere in our
> codebase.
> 
> However, unlike PRInt*/int*_t, PRUnichar is going to be mostly used as a
> pointer, which makes it hard to incrementally use char16_t instead of
> PRUnichar. Since we already go the route of defining the typedefs for
> PRUnichar in nscore.h, we could go a step further, include mozilla/Char16.h,
> and typedef char16_t PRUnichar.

\o/

> This will very likely break anyone who includes an NSPR header before an
> XPCOM one, and it comes with ABI compatibility concerns, since we're calling
> NSPR functions with different signatures (per C++11's definition, not C11's
> definition) than their definitions have.

Have you verified that the signatures indeed differ on the compilers that we care about?

> The intent would be to follow this up shortly thereafter with a mass
> PRUnichar->char16_t time, after the change has successfully baked for a bit.

I would be more than happy to help with that conversion.  I still have all of my stdint infrastructure in place and modifying that to convert PRUnichar->char16_t will probably be 10 minutes of work.  :-)
char16_t, if you believe the standard, is a totally distinct type that has the same representation as uint_least16_t.  So it *must* be different with conforming compilers.  gcc and clang conform; MSVC doesn't and makes char16_t an alias of WCHAR (?).  There's definitely dragons here.
(In reply to :Ehsan Akhgari (needinfo? me!) from comment #1)
> > This will very likely break anyone who includes an NSPR header before an
> > XPCOM one, and it comes with ABI compatibility concerns, since we're calling
> > NSPR functions with different signatures (per C++11's definition, not C11's
> > definition) than their definitions have.
> 
> Have you verified that the signatures indeed differ on the compilers that we
> care about?

C++11 considers char16_t to be a distinct type that has the same size, signedness, and alignment as uint16_t. C11 considers it to be a typedef of uint16_t.

On Windows, char16_t is currently #define'd to wchar_t, while PRUnichar is a typedef of wchar_t. On everybody else, PRUnichar is a typedef of uint16_t.

> I would be more than happy to help with that conversion.  I still have all
> of my stdint infrastructure in place and modifying that to convert
> PRUnichar->char16_t will probably be 10 minutes of work.  :-)

I was trying to look for ways to start using char16_t and Char16.h without starting a mass conversion, since this does require risky-ish things (like mandatory C++11 mode), and trying to do a backout of a mass conversion is painful for everybody. Since bug 523156 looks like a simpler, less risky change, I might look into doing that first.
(In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #2)
> char16_t, if you believe the standard, is a totally distinct type that has
> the same representation as uint_least16_t.  So it *must* be different with
> conforming compilers.  gcc and clang conform; MSVC doesn't and makes
> char16_t an alias of WCHAR (?).  There's definitely dragons here.

This discussion has been belabored to death with the bug that actually added Char16.h in the first place. Definitely, before we do the mass conversion, we want to have actual uses of char16_t baking in the tree. One option is not even going as far as adding the typedef but merely including the header in nscore.h (so we can gauge how much code would break by including it) and using it instead of NS_LL in places.

Switching jschar could also get us feedback from embedders about how annoying it is or isn't.
(In reply to Joshua Cranmer [:jcranmer] from comment #0)
> This will very likely break anyone who includes an NSPR header before an
> XPCOM one, and it comes with ABI compatibility concerns, since we're calling
> NSPR functions with different signatures (per C++11's definition, not C11's
> definition) than their definitions have.

Since these are C functions their signature isn't going to change, right? I would assume that passing a char16_t* and a PRUnichar* is going to look exactly the same at the ABI level, so what is the concern here?
Assignee: nobody → ehsan
Summary: typedef char16_t PRUnichar → typedef char16_t PRUnichar/jschar
Depends on: 924011
Depends on: 924012
Depends on: 924014
Attached patch WIP (obsolete) — Splinter Review
Flagging Joshua on the general approach -- this probably won't build much.
Attachment #814020 - Flags: feedback?(Pidgeot18)
Depends on: 924019
Depends on: 924107
Depends on: 924180
So, in order to get this built on Linux, I'll have to remove the <stdint.h> include from Char16.h, and just typedef char16_t to unsigned short.  It seems like the <stdint.h> header breaks building some C libraries suck as sqlite in ways that seem very hard to fix, and I prefer to not patch our sqlite copy for this reason.

Is this fine with you, Joshua?
Flags: needinfo?(Pidgeot18)
Bleh, that's ugly, but maybe the only real option we have.  Really unfortunate C/C++ didn't have the fixed-size types from the start.  :-\
(In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #10)
> Bleh, that's ugly, but maybe the only real option we have.  Really
> unfortunate C/C++ didn't have the fixed-size types from the start.  :-\

Tell me about it...
Attached patch Patch (v1)Splinter Review
A try build looks promising: <https://tbpl.mozilla.org/?tree=Try&rev=aa908e85f526> (ignore the Linux red builds which I fixed locally in the patch, and the Android 4.0 infra bustage of today!)

Over to review from:
 * Jason on the js/ changes.
 * gps/glandium on the build system changes.
 * Joshua on the rest.

Please note that this patch may easily bitrot, so I'd rather land it as soon as I can tell.  Speedy reviews are appreciated!  :-)
Attachment #814020 - Attachment is obsolete: true
Attachment #814020 - Flags: feedback?(Pidgeot18)
Attachment #817476 - Flags: review?(mh+mozilla)
Attachment #817476 - Flags: review?(jorendorff)
Attachment #817476 - Flags: review?(gps)
Attachment #817476 - Flags: review?(Pidgeot18)
Comment on attachment 817476 [details] [diff] [review]
Patch (v1)

Review of attachment 817476 [details] [diff] [review]:
-----------------------------------------------------------------

::: mozilla-config.h.in
@@ +36,5 @@
> + */
> +#if !defined(ELFHACK_BUILD) && !defined(ELFDYNSTRGC_BUILD) && !defined(__ASSEMBLER__)
> +#define __PRUNICHAR__
> +#include "mozilla/Char16.h"
> +#endif

I can see why you need ELFDYNSTRGC_BUILD, but not ELFHACK_BUILD (except for a missing dependency between the elfhack directory and mfbt).

Anyways, in both cases, I'd rather solve this with proper dependencies and directory ordering than with define hacks. For elfhack, that should just be about adding something like build/unix/elfhack/export: mfbt/export in Makefile.in. For elf-dynstr-gc, well, i'd first look if it's actually still relevant. If not, remove it, otherwise, move it in some other directory (it's only used on components, so it doesn't really need to be built that early) and use the same kind of dependency as for elfhack.

That being said, i can understand you'd only want to do this in a followup.
Attachment #817476 - Flags: review?(mh+mozilla)
Attachment #817476 - Flags: review?(gps)
Attachment #817476 - Flags: review+
Attachment #817476 - Flags: review?(jorendorff) → review+
Comment on attachment 817476 [details] [diff] [review]
Patch (v1)

Review of attachment 817476 [details] [diff] [review]:
-----------------------------------------------------------------

::: build/unix/elfhack/Makefile.in
@@ +15,5 @@
>  WRAP_LDFLAGS=
>  
>  include $(topsrcdir)/config/rules.mk
>  
> +DEFINES += -DELFHACK_BUILD

Put this in moz.build, please.

::: build/unix/elfhack/inject/Makefile.in
@@ +29,5 @@
>  	cp $< $@
>  
>  GARBAGE += $(CSRCS)
>  
> +DEFINES += -DELFHACK_BUILD

Ditto.
(In reply to :Ms2ger from comment #14)
> Comment on attachment 817476 [details] [diff] [review]
> Patch (v1)
> 
> Review of attachment 817476 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: build/unix/elfhack/Makefile.in
> @@ +15,5 @@
> >  WRAP_LDFLAGS=
> >  
> >  include $(topsrcdir)/config/rules.mk
> >  
> > +DEFINES += -DELFHACK_BUILD
> 
> Put this in moz.build, please.
> 
> ::: build/unix/elfhack/inject/Makefile.in
> @@ +29,5 @@
> >  	cp $< $@
> >  
> >  GARBAGE += $(CSRCS)
> >  
> > +DEFINES += -DELFHACK_BUILD
> 
> Ditto.

Hmm, not sure if that's the right thing to do...

$ git grep -w DEFINES | grep moz.build
python/mozbuild/mozbuild/test/backend/data/defines/moz.build:5:DEFINES = {
python/mozbuild/mozbuild/test/backend/data/defines/moz.build:9:DEFINES['BAZ'] = '"abcd"'
python/mozbuild/mozbuild/test/backend/data/defines/moz.build:10:DEFINES.update({
python/mozbuild/mozbuild/test/frontend/data/defines/moz.build:5:DEFINES = {
python/mozbuild/mozbuild/test/frontend/data/defines/moz.build:9:DEFINES['BAZ'] = '"abcd"'
python/mozbuild/mozbuild/test/frontend/data/defines/moz.build:10:DEFINES.update({
(In reply to Mike Hommey [:glandium] from comment #13)
> Comment on attachment 817476 [details] [diff] [review]
> Patch (v1)
> 
> Review of attachment 817476 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: mozilla-config.h.in
> @@ +36,5 @@
> > + */
> > +#if !defined(ELFHACK_BUILD) && !defined(ELFDYNSTRGC_BUILD) && !defined(__ASSEMBLER__)
> > +#define __PRUNICHAR__
> > +#include "mozilla/Char16.h"
> > +#endif
> 
> I can see why you need ELFDYNSTRGC_BUILD, but not ELFHACK_BUILD (except for
> a missing dependency between the elfhack directory and mfbt).

The reason why I added them both was that both tools seem to get compiled as part of the export tier (well, I'm sure about the first one at least <http://mxr.mozilla.org/mozilla-central/source/config/Makefile.in#21>) which means that dist/include/mozilla/Char16.h may not be available when these things are built.  That being said, I can't figure out which tier exactly is elfhack built in, but removing the ELFHACK_BUILD hack definitely breaks Linux builds.

> Anyways, in both cases, I'd rather solve this with proper dependencies and
> directory ordering than with define hacks. For elfhack, that should just be
> about adding something like build/unix/elfhack/export: mfbt/export in
> Makefile.in.

Hmm, is "build/unix/elfhack/export" a make target?  I'm not 100% sure what you're suggesting here (it would be great if you could file the follow-up saying exactly what needs to happen there.)

> For elf-dynstr-gc, well, i'd first look if it's actually still
> relevant.

It is definitely *used* in our 32-bit optimized Linux builds (see my try push which does not include the ELFDYNSTRGC_BUILD hack).  I don't know what the purpose of that tool is so I can't tell whether that's relevant.

> If not, remove it, otherwise, move it in some other directory
> (it's only used on components, so it doesn't really need to be built that
> early) and use the same kind of dependency as for elfhack.

This is where it's used, right? <http://mxr.mozilla.org/mozilla-central/source/config/makefiles/target_libs.mk#26>  That's in the libs (and binaries?!) tiers, so doesn't this need to be built before that tier?  And won't that imply export?

> That being said, i can understand you'd only want to do this in a followup.

Thanks for not making this stuff block landing this patch.  :-)
Comment on attachment 817476 [details] [diff] [review]
Patch (v1)

Review of attachment 817476 [details] [diff] [review]:
-----------------------------------------------------------------

I dislike this patch a lot for a lot of evil global hackiness. I'm willing to stomach the worst of it--PRUnichar definition in Char16.h, Char16.h almost-global inclusion via global headers--if that is removed when massive sed-ification eliminates PRUnichar from the tree.

::: mfbt/Char16.h
@@ +46,5 @@
> +      * stdint.h will break building some of our C libraries, such as
> +      * sqlite.
> +      */
> +     typedef unsigned short char16_t;
> +#  endif

Notice that this doesn't defined MOZ_UTF16 for C code. I don't think it is worth blocking the patch on this, but it does break the invariant that a successful #include guarantees the static_assert's below.

@@ +52,5 @@
>  #  error "Char16.h requires C++11 (or something like it) for UTF-16 support."
>  #endif
>  
> +#define __PRUNICHAR__
> +typedef char16_t PRUnichar;

Ew, hacky hack hack hacky. Please put a comment in here that this is temporary (or, rather, all the changes in this file are temporary) until we actually do the s/PRUnichar/char16_t/ conversion.

::: mozilla-config.h.in
@@ +35,5 @@
> + * during the export tier.  Also, disable this when building assembly files too.
> + */
> +#if !defined(ELFHACK_BUILD) && !defined(ELFDYNSTRGC_BUILD) && !defined(__ASSEMBLER__)
> +#define __PRUNICHAR__
> +#include "mozilla/Char16.h"

I find force-including Char16.h here to be extremely hacky, and problematic, especially given the issues I mentioned for Char16.h
Attachment #817476 - Flags: review?(Pidgeot18) → review+
(In reply to Joshua Cranmer [:jcranmer] from comment #17)
> I dislike this patch a lot for a lot of evil global hackiness.

Makes two of us.  :-)

> I'm willing
> to stomach the worst of it--PRUnichar definition in Char16.h, Char16.h
> almost-global inclusion via global headers--if that is removed when massive
> sed-ification eliminates PRUnichar from the tree.

Yes, that is my plan.  I decided to go with this approach at first to make sure that we won't find any regressions before doing the massive PRUnichar removal.  But when that happens (and I plan to do it very shortly after the next uplift) then we will be able to simplify a lot of this.

> ::: mfbt/Char16.h
> @@ +46,5 @@
> > +      * stdint.h will break building some of our C libraries, such as
> > +      * sqlite.
> > +      */
> > +     typedef unsigned short char16_t;
> > +#  endif
> 
> Notice that this doesn't defined MOZ_UTF16 for C code. I don't think it is
> worth blocking the patch on this, but it does break the invariant that a
> successful #include guarantees the static_assert's below.

static_assert is a C++ keyword, so those assertions would not compile in C code regardless.  I've actually addressed this in the patch!

> @@ +52,5 @@
> >  #  error "Char16.h requires C++11 (or something like it) for UTF-16 support."
> >  #endif
> >  
> > +#define __PRUNICHAR__
> > +typedef char16_t PRUnichar;
> 
> Ew, hacky hack hack hacky. Please put a comment in here that this is
> temporary (or, rather, all the changes in this file are temporary) until we
> actually do the s/PRUnichar/char16_t/ conversion.

Will do.

> ::: mozilla-config.h.in
> @@ +35,5 @@
> > + * during the export tier.  Also, disable this when building assembly files too.
> > + */
> > +#if !defined(ELFHACK_BUILD) && !defined(ELFDYNSTRGC_BUILD) && !defined(__ASSEMBLER__)
> > +#define __PRUNICHAR__
> > +#include "mozilla/Char16.h"
> 
> I find force-including Char16.h here to be extremely hacky, and problematic,
> especially given the issues I mentioned for Char16.

The reason why the force-include is needed is the PRUnichar ubiquity across our code, which will be remedied after the sed-ifying patch.  But note that even after that conversion, we will go back to the existing places where we #include this (for things which need MOZ_UTF16.)
Also note that I did not invent the __PRUNICHAR__ hack, I just moved it out of nscore.h.  :-)
Blocks: 927728
Flags: needinfo?(Pidgeot18)
Gah, this requires clobber I bet :(

Relanded: https://hg.mozilla.org/integration/mozilla-inbound/rev/30d9e30f0c8c
https://hg.mozilla.org/mozilla-central/rev/30d9e30f0c8c
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla27
Depends on: 927899
Please reconsider this solution. It breaks terribly on mingw, where G++ follows standards so it has char16_t, which is different from wchar_t. It would require having tons of casts between win32 API, which uses wchar_t extensivelly, and the rest of code using char16_t.

Looking at the solution for MSVC from the patch:

#  define char16_t wchar_t

I'm pretty sure that's not a good idea to maintain. Trying the same with g++ results in having tons of redefinitions for types specialized both for wchar_t and char16_t. This will also cause similar problems on MSVC once (if) it will have proper char16_t support.
What redefinition errors do you get?  Can you please file a follow-up bug so that we can fix this for mingw?
Depends on: 928351
Depends on: 928091
You need to log in before you can comment on or make changes to this bug.