Closed Bug 1393230 Opened 7 years ago Closed 7 years ago

Convert string classes to use templates

Categories

(Core :: XPCOM, enhancement)

enhancement
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla57
Tracking Status
firefox57 --- fixed

People

(Reporter: erahm, Assigned: erahm)

References

(Blocks 2 open bugs)

Details

Attachments

(3 files, 12 obsolete files)

862 bytes, patch
n.nethercote
: review+
Details | Diff | Splinter Review
5.55 KB, patch
n.nethercote
: review+
xeonchen
: review+
Details | Diff | Splinter Review
228.01 KB, patch
erahm
: review+
Details | Diff | Splinter Review
In the past we've attempted to convert our string classes over to use proper templates and have fallen short (see bug 717356 for example). Now that we have full C++11 support I think we can make another attempt.
Assignee: nobody → erahm
Depends on: 1393235
Blocks: 1393238
WIP for converting to templates. Should compile on Linux at least, still need to deal with Windows char16ptr_t stuff.
There are some rust binding issues I'm working through now. In the meantime I thought it might be helpful to explain some of the oddities in these patches.

Odd double templates in a header file
=====================================

> template <typename T> class String {
> public:
>   int32_t Find(const T*) const; // normal version
>
>   template <typename Q = IsChar16>
>   int32_t Find(const char*) const; // odd version
> };

This is a trick used to only enable a function for a certain character class. In this case if T == char16_t we'll provide a version of `Find` that takes a `char *` in *addition to* a version that takes a `char16_t *`. Why is the template type called Q? It doesn't really matter it just needs to not be T. It came out of a few iterations at trying to make the declarations more readable, originally I had something like:

>   template <typename Q = T, typename R = std::enable_if<std::is_type<char16_t, Q>::value>::type>

For reasons I don't recall in some variations you couldn't use T in the `enable_if`, so Q is defaulted to it instead.


Odd double templates in a cpp file
==================================

> template <typename T>
> template <typename Q>
> int32_t Find(const char*) const { ... }

This is just the corollary to the previous odd item, since we made a member function of a templated class also take a template param we need to separate them. So the `T` line is for the class template and the `Q` line is for the member template. We don't actually use `Q`, it's just there for the `enable_if`-fu we did above.


Odd use of `incompatible_char_type`
==================================

There were some instances where clang got confused by things like:

>   template <typename Q = IsChar16>
>   int32_t Foo(const char*) const; // odd version

But if I switched to:

>   template <typename Q = IsChar16>
>   int32_t Foo(const incompatible_char_type*) const; // odd version

it became happy again. I think it had to do with the `T == char` version conflicting with the second declaration. Calling `nsCString::Foo(char16_t*)` will result in a compilation error, so it's not a big deal.


Odd additions of `this->`
=========================

Switching to templates + inheritance causes some weirdness where you have to disambiguate calls that are supposed to cascade to parent class. I don't recall the specific reason -- some background [1] -- but it's what clang suggested and it just works. Something similar can be accomplished by using `base_string_type::` instead, but that's longer and isn't guaranteed to actually be the next closest base type.


Some seemingly arbitrary conversions of typedefs to using statements
====================================================================

I ran into bindgen issues with some typedefs [2?], I might just convert all of them to be consistent (they're also a bit easier to read and more concise).

[1] https://stackoverflow.com/a/4643295
[2] https://github.com/rust-lang-nursery/rust-bindgen/issues/931
This is the latest patch with a few bindgen fixes, it compiles locally but fails on try.
Attachment #8900522 - Attachment is obsolete: true
Rebased
Attachment #8900521 - Attachment is obsolete: true
Attachment #8902047 - Attachment is obsolete: true
Emilio, these are the string changes I'm making that are breaking our rust bindings. The patches should apply cleanly on a recent inbound revision [1]. I've worked through a few bindgen related failures and everything compiles cleanly for me locally but my try pushes are failing in bindings.rs [2] and gecko_properties.rs [3].

My .mozconfig is pretty standard except I'm using clang 4.0 w/ ccache/sccache and lld for linking [4].

Would you mind helping me track down the issues on the bindgen side?

[1] https://hg.mozilla.org/integration/mozilla-inbound/rev/c5556637f69d48d84e51bfdb4c205a41ef2700e2
[2] https://treeherder.mozilla.org/logviewer.html#?job_id=126565850&repo=try&lineNumber=11642-11657
[3] https://treeherder.mozilla.org/logviewer.html#?job_id=126565850&repo=try&lineNumber=11658-11883
[4] # mozconfig that uses system install of clang-4.0

mk_add_options MOZ_OBJDIR=@TOPSRCDIR@/obj-@CONFIG_GUESS@-clang

# Use clang for compiling
CLANG_VERSION="4.0"
export CC="clang-${CLANG_VERSION}"
export CXX="clang++-${CLANG_VERSION}"
export LLVM_SYMBOLIZER="/usr/bin/llvm-symbolizer-${CLANG_VERSION}"
export LLVM_CONFIG="llvm-config-${CLANG_VERSION}"

# Use llvm's lld for linking
ac_add_options --enable-linker=lld

# Workaround bug 1388713 so that we can use lld
ac_add_options --disable-elf-hack

# Use ccache for c/c++
ac_add_options --with-ccache=/usr/bin/ccache

# Use sccache for rust
mk_add_options "export RUSTC_WRAPPER=/home/erahm/.cargo/bin/sccache"
Flags: needinfo?(emilio)
Comment on attachment 8902053 [details] [diff] [review]
Part 1: Convert the xpcom string classes to be templated on char type

Review of attachment 8902053 [details] [diff] [review]:
-----------------------------------------------------------------

Nick this is getting pretty close so I thought it might be worth taking a look at (there's going to be some churn so maybe just highlevel feedback and questions would make sense). I added some notes in comment 9 about why some of the changes are the way they are, hopefully that makes it easier to review. Also of course feel free to redirect, dbaron might want to be involved as well.
Attachment #8902053 - Flags: review?(n.nethercote)
I think the errors aren't totally bindgen-related, but they have to do with the rust bindings to nsString that live on xpcom/ and servo/components/style/nsstring-vendor.

Fixing this should be easy (though maybe somewhat nasty). Will look into it now.
Actually, this works for you (and for me) but fails on automation because in automation we use libclang 3.9, which doesn't include https://reviews.llvm.org/D26663.
Flags: needinfo?(emilio)
Err, keeping the ni? request for now.
Flags: needinfo?(emilio)
See Also: → 1394825
Attached patch Fix bindgen + libclang3.9 (obsolete) — Splinter Review
So, as far as I can tell, the right fix for the bindgen issue is fixing bug 1394825.

That being said, we have already a few hacks on tree to support this, so in case that doesn't happen soon, this should fix the build.
Flags: needinfo?(emilio)
(In reply to Eric Rahm [:erahm] (please no mozreview requests) from comment #19)
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=62f7e24810f8e0c6b341d28daac0c5d7fc3845c8

(In reply to Emilio Cobos Álvarez [:emilio] from comment #18)
> Created attachment 8902307 [details] [diff] [review]
> Fix bindgen + libclang3.9
> 
> So, as far as I can tell, the right fix for the bindgen issue is fixing bug
> 1394825.
> 
> That being said, we have already a few hacks on tree to support this, so in
> case that doesn't happen soon, this should fix the build.

So it still seems sad [1], I think the fixup is choking on nsTArray<nsTString<u16>>? I'll see if I can repro locally by explicitly using clang 3.9.

[1] https://treeherder.mozilla.org/logviewer.html#?job_id=126817164&repo=try&lineNumber=12606-12625
(In reply to Eric Rahm [:erahm] (please no mozreview requests) from comment #20)
> (In reply to Eric Rahm [:erahm] (please no mozreview requests) from comment
> #19)
> > https://treeherder.mozilla.org/#/
> > jobs?repo=try&revision=62f7e24810f8e0c6b341d28daac0c5d7fc3845c8
> 
> (In reply to Emilio Cobos Álvarez [:emilio] from comment #18)
> > Created attachment 8902307 [details] [diff] [review]
> > Fix bindgen + libclang3.9
> > 
> > So, as far as I can tell, the right fix for the bindgen issue is fixing bug
> > 1394825.
> > 
> > That being said, we have already a few hacks on tree to support this, so in
> > case that doesn't happen soon, this should fix the build.
> 
> So it still seems sad [1], I think the fixup is choking on
> nsTArray<nsTString<u16>>? I'll see if I can repro locally by explicitly
> using clang 3.9.
> 
> [1]
> https://treeherder.mozilla.org/logviewer.
> html#?job_id=126817164&repo=try&lineNumber=12606-12625

Okay I can repro [2], but my attempts at tweaking the fixups haven't helped. It's only failing on the `nsTArray<nsString>` declarations [1]. Stuff that's using `nsAString` (aka `nsTSubstring<char16_t>`) seems to work fine. I'm not sure if it's just that it's w/in a template param or if it's simply `nsString`

Emilio if you could take another look tomorrow that would be really helpful.

[1] http://searchfox.org/mozilla-central/rev/cd82cacec2cf734768827ff85ba2dba90a534c5e/layout/style/ServoBindings.h#424
[2] On ubuntu:
sudo apt-get install clang-3.9 llvm-3.9 # install 3.9
export LLVM_CONFIG="llvm-config-3.9" # tell bindgen to use 3.9
Flags: needinfo?(emilio)
Note that that seems in the bindings.rs file, which uses another set of fixups, at the bottom of the file.

So I'd expect a patch like:

diff --git a/layout/style/ServoBindings.toml b/layout/style/ServoBindings.toml
index ccd88dc25edb..685c4e18bb89 100644
--- a/layout/style/ServoBindings.toml
+++ b/layout/style/ServoBindings.toml
@@ -538,6 +538,7 @@ servo-borrow-types = [
     "RawGeckoStyleChildrenIterator",
 ]
 fixups = [
+    { pat = "\\bnsTString<u16>", rep = "nsString" },
     # hack for gecko-owned string
-    { pat = "<nsString", rep = "<nsStringRepr" },
+    { pat = "\\b<nsString\\b", rep = "<nsStringRepr" },
 ]

To help
Flags: needinfo?(emilio)
Comment on attachment 8902053 [details] [diff] [review]
Part 1: Convert the xpcom string classes to be templated on char type

Review of attachment 8902053 [details] [diff] [review]:
-----------------------------------------------------------------

As requested, I did a quick and dirty lookover. It generally seems reasonable, with the expected amount of template grossness. Two mains comments.

* There are numerous cases where an nsFooString.h file #includes nsTFooString.h, and likewise with .cpp files. In some of the cases, the former file doesn't contain much other than this #include. It would be nice if the latter could be merged into the former, though perhaps you haven't done this because it would be even more churn. Might be worth doing as a follow-up?

* It's unfortunate that each string class needs a ton of typedefs at the top; very boilerplate-y. Perhaps unavoidable though?

A few other nits below.

::: xpcom/string/nsStringFlags.h
@@ +4,5 @@
> + * License, v. 2.0. If a copy of the MPL was not distributed with this
> + * file, You can obtain one at http://mozilla.org/MPL/2.0/. */
> +
> +#ifndef xpcom_string_nsStringFlags_h
> +#define xpcom_string_nsStringFlags_h

The style guide says that header file guards should match the name you use when #including the file. Which means the "xpcom_string_" prefix here is unnecessary.

::: xpcom/string/nsTStringComparator.cpp
@@ +46,3 @@
>                                               const char_type* aRhs,
>                                               uint32_t aLLength,
>                                               uint32_t aRLength) const

Indentation. (There are probably many more cases like this that I've missed.)

::: xpcom/string/nsTSubstringTuple.h
@@ +83,5 @@
> +//operator+(const typename nsTSubstringTuple<T>::base_string_type& aStrA,
> +//          const typename nsTSubstringTuple<T>::base_string_type& aStrB)
> +//{
> +//  return nsTSubstringTuple<T>(&aStrA, &aStrB);
> +//}

Why is this code commented out?

::: xpcom/string/precompiled_templates.cpp
@@ +22,5 @@
> +
> +template nsTDependentSubstring<char> const Substring<char>(char const*, char const*);
> +template nsTDependentSubstring<char16_t> const Substring<char16_t>(char16_t const*, char16_t const*);
> +
> +// We need to instantiate the odd member functions as well.

Not sure which meaning of "odd" you are using here. As in "a few member functions as well"?
Attachment #8902053 - Flags: review?(n.nethercote) → feedback+
(In reply to Nicholas Nethercote [:njn] from comment #23)
> Comment on attachment 8902053 [details] [diff] [review]
> Part 1: Convert the xpcom string classes to be templated on char type
> 
> Review of attachment 8902053 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> As requested, I did a quick and dirty lookover. It generally seems
> reasonable, with the expected amount of template grossness. Two mains
> comments.
> 
> * There are numerous cases where an nsFooString.h file #includes
> nsTFooString.h, and likewise with .cpp files. In some of the cases, the
> former file doesn't contain much other than this #include. It would be nice
> if the latter could be merged into the former, though perhaps you haven't
> done this because it would be even more churn. Might be worth doing as a
> follow-up?

Yeah I agree, I think it's better in a follow up. I'm not 100% sure but we'd probably end up renaming files and it would get pretty messy. Another thing I've been thinking of is moving the templated string classes to the `mozilla` namespace and dropping the 'ns' prefix but keeping the existing include files as a stopgap.

> * It's unfortunate that each string class needs a ton of typedefs at the
> top; very boilerplate-y. Perhaps unavoidable though?

Right this was another template oddity. The subclass doesn't automatically have access to the parent's typedefs so we have to hoist them up. It's probably clearer if I change those from:

> typedef typename substring_type::char_type char_type;

to:

> using typename substring_type::char_type;

I'll go ahead and do that. It's also possible I can remove some of the typedefs that are unused in certain classes

> A few other nits below.
> 
> ::: xpcom/string/nsStringFlags.h
> @@ +4,5 @@
> > + * License, v. 2.0. If a copy of the MPL was not distributed with this
> > + * file, You can obtain one at http://mozilla.org/MPL/2.0/. */
> > +
> > +#ifndef xpcom_string_nsStringFlags_h
> > +#define xpcom_string_nsStringFlags_h
> 
> The style guide says that header file guards should match the name you use
> when #including the file. Which means the "xpcom_string_" prefix here is
> unnecessary.

Good catch, I fixed that elswhere.

> ::: xpcom/string/nsTStringComparator.cpp
> @@ +46,3 @@
> >                                               const char_type* aRhs,
> >                                               uint32_t aLLength,
> >                                               uint32_t aRLength) const
> 
> Indentation. (There are probably many more cases like this that I've missed.)

I'll clean these up, I'm sure there's a few more places.

> ::: xpcom/string/nsTSubstringTuple.h
> @@ +83,5 @@
> > +//operator+(const typename nsTSubstringTuple<T>::base_string_type& aStrA,
> > +//          const typename nsTSubstringTuple<T>::base_string_type& aStrB)
> > +//{
> > +//  return nsTSubstringTuple<T>(&aStrA, &aStrB);
> > +//}
> 
> Why is this code commented out?

It's dead code, replaced with the lines below. Clang had issues with the `const typename nsTSubstringTuple<T>::base_string_type` version so I replace it with an explicit `const mozilla::detail::nsTStringRepr<T>`. I'll delete the comment out lines.

> ::: xpcom/string/precompiled_templates.cpp
> @@ +22,5 @@
> > +
> > +template nsTDependentSubstring<char> const Substring<char>(char const*, char const*);
> > +template nsTDependentSubstring<char16_t> const Substring<char16_t>(char16_t const*, char16_t const*);
> > +
> > +// We need to instantiate the odd member functions as well.
> 
> Not sure which meaning of "odd" you are using here. As in "a few member
> functions as well"?

Odd as in weird :) I'll rephrase.
(In reply to Emilio Cobos Álvarez [:emilio] from comment #22)
> Note that that seems in the bindings.rs file, which uses another set of
> fixups, at the bottom of the file.

I totally missed that!

> So I'd expect a patch like:
> 
> diff --git a/layout/style/ServoBindings.toml
> b/layout/style/ServoBindings.toml
> index ccd88dc25edb..685c4e18bb89 100644
> --- a/layout/style/ServoBindings.toml
> +++ b/layout/style/ServoBindings.toml
> @@ -538,6 +538,7 @@ servo-borrow-types = [
>      "RawGeckoStyleChildrenIterator",
>  ]
>  fixups = [
> +    { pat = "\\bnsTString<u16>", rep = "nsString" },
>      # hack for gecko-owned string
> -    { pat = "<nsString", rep = "<nsStringRepr" },
> +    { pat = "\\b<nsString\\b", rep = "<nsStringRepr" },
>  ]
> 
> To help

\o/

Builds for me locally w/ 3.9 now.
Attachment #8903386 - Flags: review?(n.nethercote)
Attachment #8902051 - Attachment is obsolete: true
This fixes improper usages of Find where an offset was actually being use for
the boolean ignore case flag. It also fixes a few instances of passing in a
literal wchar_t to our functions where a NS_LITERAL_STRING or char16_t should
be used instead.
Attachment #8903387 - Flags: review?(n.nethercote)
This removes the double-include macro hackery that we use to define two
separate string types (nsAString and nsACString) in favor of a templated
solution.
Attachment #8902053 - Attachment is obsolete: true
Handle mapping the new templated string types to nsStringRepr.

Nick, these are based on Emilio's suggestions, they seem to work.
Attachment #8903389 - Flags: review?(nfitzgerald)
Attached patch Part 3.2: Fix V suppressions (obsolete) — Splinter Review
With the signature change of nsCString to nsTString<char> we need to update
some V suppressions.
Attachment #8903390 - Flags: review?(n.nethercote)
Attachment #8902307 - Attachment is obsolete: true
This removes the double-include macro hackery that we use to define two
separate string types (nsAString and nsACString) in favor of a templated
solution.
Attachment #8903409 - Flags: review?(n.nethercote)
Attachment #8903388 - Attachment is obsolete: true
I hate to say it but IMO it's not clear that the templated code is more readable. :-/
The "typename Q" stuff in particular is pretty intimidating.
(In reply to David Major [:dmajor] from comment #35)
> I hate to say it but IMO it's not clear that the templated code is more
> readable. :-/
> The "typename Q" stuff in particular is pretty intimidating.

The functions themselves arguably aren't that much better to read. But getting away from the fake-templates-via-#include is a big win, IMO.

Have you ever tried to find the definition of a string function, like nsString::Foo()? To find it you need to understand the fake templates and know that it's actually written in the source code as nsTString_CharT::Foo(). (This breaks tags files.) For the same reason, the fake templates complicate debugging.
(In reply to Nicholas Nethercote [:njn] from comment #36)
> (In reply to David Major [:dmajor] from comment #35)
> > I hate to say it but IMO it's not clear that the templated code is more
> > readable. :-/
> > The "typename Q" stuff in particular is pretty intimidating.
> 
> The functions themselves arguably aren't that much better to read. But
> getting away from the fake-templates-via-#include is a big win, IMO.
> 
> Have you ever tried to find the definition of a string function, like
> nsString::Foo()? To find it you need to understand the fake templates and
> know that it's actually written in the source code as
> nsTString_CharT::Foo(). (This breaks tags files.) For the same reason, the
> fake templates complicate debugging.

True, that _is_ a pain.

How would you feel about renaming Q to EnableIf or EnableForChar16, or something like that? It would be reassuring to know that it's just for SFINAE, otherwise I keep asking myself "Where did that come from? What does it mean? How come it's never used?"
Comment on attachment 8903389 [details] [diff] [review]
Part 3.1: Fix building rust bindings on libclang3.9

Review of attachment 8903389 [details] [diff] [review]:
-----------------------------------------------------------------

::: layout/style/ServoBindings.toml
@@ +543,2 @@
>      # hack for gecko-owned string
> +    { pat = "\\b<nsString\\b", rep = "<nsStringRepr" },

I wonder if this should be implicitly added by `gecko_build.rs`?
Attachment #8903389 - Flags: review?(nfitzgerald)
Attachment #8903389 - Flags: review?(emilio)
Attachment #8903389 - Flags: review+
(In reply to Nick Fitzgerald [:fitzgen] [⏰PST; UTC-8] from comment #38)
> Comment on attachment 8903389 [details] [diff] [review]
> Part 3.1: Fix building rust bindings on libclang3.9
> 
> Review of attachment 8903389 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: layout/style/ServoBindings.toml
> @@ +543,2 @@
> >      # hack for gecko-owned string
> > +    { pat = "\\b<nsString\\b", rep = "<nsStringRepr" },
> 
> I wonder if this should be implicitly added by `gecko_build.rs`?

this = the \b at the beginning and end
Comment on attachment 8903389 [details] [diff] [review]
Part 3.1: Fix building rust bindings on libclang3.9

Review of attachment 8903389 [details] [diff] [review]:
-----------------------------------------------------------------

I wrote this, so...
Attachment #8903389 - Flags: review?(emilio) → review?(xidorn+moz)
(In reply to Nick Fitzgerald [:fitzgen] [⏰PST; UTC-8] from comment #39)
> (In reply to Nick Fitzgerald [:fitzgen] [⏰PST; UTC-8] from comment #38)
> > Comment on attachment 8903389 [details] [diff] [review]
> > Part 3.1: Fix building rust bindings on libclang3.9
> > 
> > Review of attachment 8903389 [details] [diff] [review]:
> > -----------------------------------------------------------------
> > 
> > ::: layout/style/ServoBindings.toml
> > @@ +543,2 @@
> > >      # hack for gecko-owned string
> > > +    { pat = "\\b<nsString\\b", rep = "<nsStringRepr" },
> > 
> > I wonder if this should be implicitly added by `gecko_build.rs`?
> 
> this = the \b at the beginning and end

It did this until now, but turns out that \b<foo>\b doesn't match "<foo>,", for some reason...
(In reply to David Major [:dmajor] from comment #37)
> (In reply to Nicholas Nethercote [:njn] from comment #36)
> > (In reply to David Major [:dmajor] from comment #35)
> > > I hate to say it but IMO it's not clear that the templated code is more
> > > readable. :-/
> > > The "typename Q" stuff in particular is pretty intimidating.
> > 
> > The functions themselves arguably aren't that much better to read. But
> > getting away from the fake-templates-via-#include is a big win, IMO.
> > 
> > Have you ever tried to find the definition of a string function, like
> > nsString::Foo()? To find it you need to understand the fake templates and
> > know that it's actually written in the source code as
> > nsTString_CharT::Foo(). (This breaks tags files.) For the same reason, the
> > fake templates complicate debugging.
> 
> True, that _is_ a pain.
> 
> How would you feel about renaming Q to EnableIf or EnableForChar16, or
> something like that? It would be reassuring to know that it's just for
> SFINAE, otherwise I keep asking myself "Where did that come from? What does
> it mean? How come it's never used?"

I can name it whatever makes the code more readable (obviously I know what it means so it made sense to me). Are you more concerned about the definition than the declaration or both? ie

> // declaration
> class String<T> {
>   template <typename Q = IsChar16>
>   int32_T AwfulFindMethodThasOnlyForChar16(...);
> };

and

> // definition
> template <typename T>
> template <typename Q>
> int32_T AwfulFindMethodThasOnlyForChar16(...) { ... }

Instead of renaming Q in the definition I could also do something like:

> // definition
> template <typename T>
> template <typename Q /* = IsChar16 */>
> int32_T AwfulFindMethodThasOnlyForChar16(...) { ... }
Flags: needinfo?(dmajor)
(In reply to Eric Rahm [:erahm] (please no mozreview requests) from comment #42)

> Are you more concerned about the definition than the declaration or both?

Both. The /* = IsChar16 */ does help somewhat, but in my mind I still want to assign Q some meaning like we're going to use it later in the body. Is it possible to make the type not have a name?

> template <typename = IsChar16>

I've seen some code like that before but I don't know if it works in this specific context.

Alternatively, how would you feel about something like

> // declaration
> class String<T> {
>   template <typename EnableForChar16 = IsChar16>
>   int32_T AwfulFindMethodThasOnlyForChar16(...);
> };

> // definition
> template <typename T>
> template <typename EnableForChar16>
> int32_T AwfulFindMethodThasOnlyForChar16(...) { ... }

The declaration becomes a bit repetitive I admit, but at least it's clear that this is merely "enable_if" machinery with no other purpose, and also the definition is understandable on its own without a /* = */.
Flags: needinfo?(dmajor)
Comment on attachment 8903389 [details] [diff] [review]
Part 3.1: Fix building rust bindings on libclang3.9

Review of attachment 8903389 [details] [diff] [review]:
-----------------------------------------------------------------

::: servo/components/style/build_gecko.rs
@@ +291,5 @@
>                  panic!("Failed to generate bindings, flags: {:?}", command_line_opts);
>              },
>          };
>          for fixup in fixups.iter() {
> +            result = Regex::new(&fixup.pat).unwrap().replace_all(&result, &*fixup.rep)

I don't understand, what's the problem with `\b` here? Why do we have to remove it and add the same thing to toml and below?
(In reply to Xidorn Quan [:xidorn] UTC+10 from comment #44)
> I don't understand, what's the problem with `\b` here? Why do we have to
> remove it and add the same thing to toml and below?

Just opened https://github.com/rust-lang/regex/issues/397. The test-case there is pretty much self-descriptive.
This is unfortunate... And rust regexp doesn't support zero-width assertion, so we can't do something like (?<=\W)...
Comment on attachment 8903389 [details] [diff] [review]
Part 3.1: Fix building rust bindings on libclang3.9

Review of attachment 8903389 [details] [diff] [review]:
-----------------------------------------------------------------

OK, I guess I'm fine with this. A different approach would probably be checking whether the pattern starts/ends with a word character, and only add `\b` to the corresponding side when it does. This way we can avoid adding `\b`s to every pattern.
Attachment #8903389 - Flags: review?(xidorn+moz) → review+
Apologies: I didn't get around to reviewing this today due to Stylo stuff. Tomorrow is more likely.
Attachment #8903386 - Flags: review?(n.nethercote) → review+
Comment on attachment 8903387 [details] [diff] [review]
Part 2: Fix more improper string usages

Review of attachment 8903387 [details] [diff] [review]:
-----------------------------------------------------------------

xeonchen, can you please look at the first comment below?

::: toolkit/system/osxproxy/ProxyUtils.mm
@@ -157,5 @@
>        if (tokenEnd == -1) {
>          tokenEnd = overrideLength; // no '*' char, match rest of string
>        }
>        nsAutoCString token(Substring(override, tokenStart, tokenEnd - tokenStart));
> -      offset = host.Find(token, offset);

Hmm. I can see that the code as written doesn't work the way the author thinks it does, because this hits the first Find() variant in nsTString.h. (And we should file a follow-up to fix this Find() mess.)

But this change clearly changes behaviour, significantly! I'm reluctant to r+ without having an understanding of if this behaviour change is ok. Have you looked at typical inputs?

xeonchen, I believe you wrote this code. Can you please comment on this?

::: toolkit/system/windowsproxy/ProxyUtils.cpp
@@ -157,5 @@
>        if (tokenEnd == -1) {
>          tokenEnd = overrideLength; // no '*' char, match rest of string
>        }
>        nsAutoCString token(Substring(override, tokenStart, tokenEnd - tokenStart));
> -      offset = host.Find(token, offset);

Ditto. (It sucks that this duplicates the function in the above file.)

::: toolkit/xre/nsGDKErrorHandler.cpp
@@ -66,5 @@
>      if (errno)
>        NS_RUNTIMEABORT(message);
>  
>      NS_NAMED_LITERAL_CSTRING(minorCodeString, " minor_code ");
> -    start = buffer.Find(minorCodeString, endptr - buffer.BeginReading());

I'm more comfortable approving this one because I can see that currently it causes the string to be searched from the start, and case insensitively, and fixing those shouldn't change its outcome.
Attachment #8903387 - Flags: review?(xeonchen)
Attachment #8903387 - Flags: review?(n.nethercote)
Attachment #8903387 - Flags: review+
Comment on attachment 8903390 [details] [diff] [review]
Part 3.2: Fix V suppressions

Review of attachment 8903390 [details] [diff] [review]:
-----------------------------------------------------------------

Please fold this into part 3 before landing.
Attachment #8903390 - Flags: review?(n.nethercote) → review+
> > // declaration
> > class String<T> {
> >   template <typename EnableForChar16 = IsChar16>
> >   int32_T AwfulFindMethodThasOnlyForChar16(...);
> > };
> 
> > // definition
> > template <typename T>
> > template <typename EnableForChar16>
> > int32_T AwfulFindMethodThasOnlyForChar16(...) { ... }

Mmm, I like that. But please call it EnableIfChar16 instead of EnableForChar16, to match enable_if<...>.
Comment on attachment 8903409 [details] [diff] [review]
Part 3: Convert the xpcom string classes to be templated on char type

Review of attachment 8903409 [details] [diff] [review]:
-----------------------------------------------------------------

Phew! Looks pretty good. Lots of small comments and questions, but no showstoppers. Don't forget the Q change as per comment 51.

I strongly recommend an all-platforms-all-tests try run before landing this.

All those |this->| qualification sure are annoying. Shame they're unavoidable.

::: xpcom/string/nsString.h
@@ +60,5 @@
>    }
>  
>    NS_LossyConvertUTF16toASCII(const char16ptr_t aString, uint32_t aLength)
>    {
> +    // TODO(ER): Not sure why the implicit cast operator doesn't work here.

Use "TODO(erahm)". Plenty of precedent elsewhere, and much clearer who it is. Or just remove the comment? Doesn't seem that important.

::: xpcom/string/nsStringFwd.h
@@ +41,5 @@
>  // Double-byte (char16_t) string types.
> +using nsAString = nsTSubstring<char16_t>;
> +using nsSubstringTuple = nsTSubstringTuple<char16_t>;
> +using nsString = nsTString<char16_t>;
> +using nsStringRepr = mozilla::detail::nsTStringRepr<char16_t>;

Do we need nsStringRepr and nsCStringRepr? Seems like we don't. And if we do, they should be within mozilla::detail.

@@ +50,5 @@
> +using nsPromiseFlatString = nsTPromiseFlatString<char16_t>;
> +using nsStringComparator = nsTStringComparator<char16_t>;
> +using nsDefaultStringComparator = nsTDefaultStringComparator<char16_t>;
> +using nsLiteralString = nsTLiteralString<char16_t>;
> +using nsFixedString = nsTFixedString<char16_t>;

Hmm, why were nsLiteral[C]String and nsFixed[C]String declared before?

::: xpcom/string/nsTDependentString.cpp
@@ +10,5 @@
>    : string_type(const_cast<char_type*>(aStart), uint32_t(aEnd - aStart),
>                  DataFlags::TERMINATED, ClassFlags(0))
>  {
>    MOZ_RELEASE_ASSERT(aStart <= aEnd, "Overflow!");
> +  string_type::AssertValidDependentString();

this-> ?

@@ +20,5 @@
>  {
>    MOZ_ASSERT(str.GetDataFlags() & DataFlags::TERMINATED, "Unterminated flat string");
>  
>    // If we currently own a buffer, release it.
> +  base_string_type::Finalize();

this-> ?

@@ +42,3 @@
>  {
>    MOZ_RELEASE_ASSERT(aStart <= aEnd, "Overflow!");
> +  base_string_type::Rebind(aStart, uint32_t(aEnd - aStart));

this-> ?

::: xpcom/string/nsTDependentString.h
@@ +68,4 @@
>      : string_type(const_cast<char_type*>(aData), aLength,
>                    DataFlags::TERMINATED, ClassFlags(0))
>    {
> +    string_type::AssertValidDependentString();

this-> ?

::: xpcom/string/nsTDependentSubstring.h
@@ +31,5 @@
> +  typedef nsTSubstring<T>             substring_type;
> +  typedef typename substring_type::fallible_t                 fallible_t;
> +
> +  typedef typename substring_type::char_type char_type;
> +  typedef typename substring_type::char_traits             char_traits;

Inconsistent indenting in these typedefs.

@@ +68,1 @@
>                                uint32_t aLength = size_type(-1))

Indentation.

@@ +94,1 @@
>                                const const_iterator& aEnd);

Indentation.

::: xpcom/string/nsTLiteralString.h
@@ +60,2 @@
>    {
> +    return *reinterpret_cast<const nsTString<T>*>(this);

Does AsString() not work here?

@@ +76,2 @@
>    {
> +    return base_string_type::mData;

this->mData?

::: xpcom/string/nsTString.h
@@ +150,5 @@
>    /**
>     * returns the null-terminated string
>     */
>  
> +  template <typename Q> struct raw_type { typedef const Q* type; };

I'd use U instead of Q here.

@@ +160,2 @@
>    {
> +    return substring_type::mData;

Not this->mData?

@@ +171,5 @@
>  
>    char_type CharAt(index_type aIndex) const
>    {
> +    NS_ASSERTION(aIndex <= substring_type::mLength, "index exceeds allowable range");
> +    return substring_type::mData[aIndex];

Ditto? Same for this->mLength?

@@ +227,5 @@
>     *          given offset.
>     * @return  offset in string, or kNotFound
>     */
>  
> +  // Case aIgnoreCase option only with char versions

Can you please file a bug to fix that up, and mark it blocking the string cleanup bug?

@@ +383,5 @@
>  
>    size_type Right(self_type& aResult, size_type aCount) const
>    {
> +    aCount = XPCOM_MIN(substring_type::mLength, aCount);
> +    return Mid(aResult, substring_type::mLength - aCount, aCount);

this->mLength?

@@ +484,5 @@
>    void AssertValidDependentString()
>    {
> +    NS_ASSERTION(substring_type::mData, "nsTDependentString must wrap a non-NULL buffer");
> +    NS_ASSERTION(substring_type::mLength != size_type(-1), "nsTDependentString has bogus length");
> +    NS_ASSERTION(substring_type::mData[substring_type::mLength] == 0,

Again, |this->| instead of |substring_type::|?

@@ +498,1 @@
>                    ClassFlags aClassFlags)

Indentation.

@@ +770,5 @@
>    }
>  };
>  
>  /**
> +

Accidental blank line insertion?

::: xpcom/string/nsTStringObsolete.cpp
@@ +122,4 @@
>    int32_t theRadix=10; // base 10 unless base 16 detected, or overriden (aRadix != kAutoDetect)
>    int32_t result=0;
> +  bool negate=false;
> +  char_type theChar=0;

Might as well put spaces around the '=' in this function while you're futzing with whitespace.

@@ +372,5 @@
>     */
>  
> +template <typename T>
> +auto
> +nsTString<T>::Mid( self_type& aResult, index_type aStartPos, size_type aLengthToCopy ) const -> size_type

Why the |auto ... ->| here? Normal syntax is nicer, for consistency, unless it's unavoidable for some reason.

@@ +450,2 @@
>  bool
> +nsTString<T>::StripWhitespace( const fallible_t& )

Remove whitespace around param.

@@ +466,2 @@
>  void
> +nsTString<T>::ReplaceChar( char_type aOldChar, char_type aNewChar )

Remove whitespace around params.

@@ +640,2 @@
>  void
> +nsTString<T>::Trim( const char* aSet, bool aTrimLeading, bool aTrimTrailing, bool aIgnoreQuotes )

Whitespace around params.

@@ +708,2 @@
>  void
> +nsTString<T>::CompressWhitespace( bool aTrimLeading, bool aTrimTrailing )

Whitespace around params.

::: xpcom/string/nsTStringRepr.h
@@ +15,5 @@
> +template <typename T> class nsTSubstringTuple;
> +
> +/**
> + * The base for string comparators
> + */

Switch to C++ comments in places where you've moved code, such as this file?

@@ +16,5 @@
> +
> +/**
> + * The base for string comparators
> + */
> +template <typename T> class nsTStringComparator

Moving nsTStringRepr into its own file is good. But did the definitions of nsTStringComparator and nsTDefaultStringComparator need to move? They feel like they don't really fit in this file.

@@ +23,5 @@
> +  typedef T char_type;
> +
> +  nsTStringComparator()
> +  {
> +  }

Could fit this on a single line.

@@ +71,5 @@
> +{
> +public:
> +  typedef mozilla::fallible_t                 fallible_t;
> +
> +  typedef T                                   char_type;

Is char_type still necessary? Seems like using T directly is clearer. Likewise in all the other classes.

@@ +74,5 @@
> +
> +  typedef T                                   char_type;
> +
> +  typedef nsCharTraits<char_type>             char_traits;
> +  typedef typename char_traits::incompatible_char_type incompatible_char_type;

Is incompatible_char_type still needed? AFAICT it only pops up in IsChar16 functions. Would using |char| directly in those work?
(Oh, maybe that's what you were talking about in comment 9?)

@@ +80,5 @@
> +  typedef nsTStringRepr<T>                self_type;
> +  typedef self_type                           base_string_type;
> +
> +  typedef nsTSubstring<T>             substring_type;
> +  typedef nsTSubstringTuple<T>             substring_tuple_type;

Indentation in this class is screwy. I'm fine with the vertical alignment, but please make it consistent. (And make sure there are no tabs.)

Or, you could remove the vertical alignment, which would match how you've done group typedef decls in other classes.

@@ +83,5 @@
> +  typedef nsTSubstring<T>             substring_type;
> +  typedef nsTSubstringTuple<T>             substring_tuple_type;
> +
> +  typedef nsReadingIterator<char_type>        const_iterator;
> +  typedef nsWritingIterator<char_type>        iterator;

Should these have a _type suffix, for consistency?

@@ +88,5 @@
> +
> +  typedef nsTStringComparator<char_type>      comparator_type;
> +
> +  typedef char_type*                          char_iterator;
> +  typedef const char_type*                    const_char_iterator;

Ditto?

(In general, you can see I don't particularly like all these typedefs, and I want to have as few as possible :)

@@ +260,5 @@
> +  // non-constant char array variable. Use EqualsASCII for them.
> +  // The template trick to acquire the array length at compile time without
> +  // using a macro is due to Corey Kosak, with much thanks.
> +  template<int N>
> +  inline bool EqualsLiteral(const char (&aStr)[N]) const

|inline| unnecessary.

@@ +281,5 @@
> +  // explicit size.  Do not attempt to use it with a regular char*
> +  // pointer, or with a non-constant char array variable. Use
> +  // LowerCaseEqualsASCII for them.
> +  template<int N>
> +  inline bool LowerCaseEqualsLiteral(const char (&aStr)[N]) const

Ditto.

@@ +287,5 @@
> +    return LowerCaseEqualsASCII(aStr, N - 1);
> +  }
> +
> +  /**
> +   * returns true if this string overlaps with the given string fragment.

Change to "Returns", while you're here.

@@ +292,5 @@
> +   */
> +  bool IsDependentOn(const char_type* aStart, const char_type* aEnd) const
> +  {
> +    /**
> +     * if it _isn't_ the case that one fragment starts after the other ends,

"If"

@@ +307,5 @@
> +  nsTStringRepr() = delete; // Never instantiate directly
> +
> +  constexpr
> +  nsTStringRepr(char_type* aData, size_type aLength,
> +                      DataFlags aDataFlags, ClassFlags aClassFlags)

indentation

::: xpcom/string/nsTSubstring.cpp
@@ -1015,5 @@
>    }
>  }
>  
> -/* hack to make sure we define FormatWithoutTrailingZeros only once */
> -#ifdef CharT_is_PRUnichar

Hooray for not requiring silliness like this any more.

::: xpcom/string/nsTSubstring.h
@@ +60,5 @@
> +
> +  typedef typename base_string_type::comparator_type comparator_type;
> +
> +  typedef typename base_string_type::char_iterator char_iterator;
> +  typedef typename base_string_type::const_char_iterator const_char_iterator;

const_char_iterator isn't used by this class. Likewise for a number of the other classes. Might be worth minimizing all these typedefs by commenting them all out and then reinstating the ones that are required for compilation?

::: xpcom/string/nsTSubstringTuple.h
@@ +76,5 @@
>  
> +template <typename T>
> +inline const nsTSubstringTuple<T>
> +operator+(const mozilla::detail::nsTStringRepr<T>& aStrA,
> +          const mozilla::detail::nsTStringRepr<T>& aStrB)

Any reason for changing away from base_string_type here?

::: xpcom/string/precompiled_templates.cpp
@@ +7,5 @@
> +#include "nsString.h"
> +
> +/**
> + * This file provides concrete instantiations for externed string template
> + * classes and functions.

C++ comment?

@@ +12,5 @@
> + */
> +
> +// ================
> +// Template classes
> +// ================

I'd put a blank line after this heading.

@@ +28,5 @@
> +template class nsTSubstring<char16_t>;
> +template class nsTSubstringSplitter<char>;
> +template class nsTSubstringSplitter<char16_t>;
> +template class mozilla::detail::nsTStringRepr<char>;
> +template class mozilla::detail::nsTStringRepr<char16_t>;

There are lots of classes not mentioned here, such as nsTLiteralString, nsTAtomString. Why is that? Perhaps a comment about it would help.

Also, I'd order it to match the inheritance hierarchy, i.e. nsTStringRepr at the top, nsTSubstring before nsTString, etc.

@@ +32,5 @@
> +template class mozilla::detail::nsTStringRepr<char16_t>;
> +
> +// =============================
> +// Templated top-level functions
> +// =============================

Ditto.

@@ +34,5 @@
> +// =============================
> +// Templated top-level functions
> +// =============================
> +template
> +int Compare<char>(mozilla::detail::nsTStringRepr<char> const&,

Return type should be on its own line. Ditto for all the other functions in this file.

@@ +53,5 @@
> +                                                          char16_t const*);
> +
> +// =========================================================
> +// Templated member functions that are conditionally enabled
> +// =========================================================

Ditto.

@@ +56,5 @@
> +// Templated member functions that are conditionally enabled
> +// =========================================================
> +template
> +int32_t nsTString<char16_t>::Find(const self_type&, int32_t,
> +                                  int32_t aCount) const;

Remove this parameter name to match the rest of this file.

@@ +68,5 @@
> +                                   int32_t) const;
> +
> +template
> +int32_t nsTString<char16_t>::RFind(const char_type*, int32_t,
> +                                   int32_t aCount) const;

Ditto.
Attachment #8903409 - Flags: review?(n.nethercote) → review+
Comment on attachment 8903387 [details] [diff] [review]
Part 2: Fix more improper string usages

Review of attachment 8903387 [details] [diff] [review]:
-----------------------------------------------------------------

I'm not the original author, I just moved the code from somewhere else.
But it looks reasonable to me, and also passes the gtest, so I'm good with this modification.
Attachment #8903387 - Flags: review?(xeonchen) → review+
(In reply to Nicholas Nethercote [:njn] from comment #51)
> > > // declaration
> > > class String<T> {
> > >   template <typename EnableForChar16 = IsChar16>
> > >   int32_T AwfulFindMethodThasOnlyForChar16(...);
> > > };
> > 
> > > // definition
> > > template <typename T>
> > > template <typename EnableForChar16>
> > > int32_T AwfulFindMethodThasOnlyForChar16(...) { ... }
> 
> Mmm, I like that. But please call it EnableIfChar16 instead of
> EnableForChar16, to match enable_if<...>.

Sounds good I'll update all the instances to either `EnableIfChar` or `EnableIfChar16`.
(In reply to Nicholas Nethercote [:njn] from comment #52)
> Comment on attachment 8903409 [details] [diff] [review]
> Part 3: Convert the xpcom string classes to be templated on char type
> 
> Review of attachment 8903409 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Phew! Looks pretty good. Lots of small comments and questions, but no
> showstoppers. Don't forget the Q change as per comment 51.

Thanks for the mega-review! I'll take care of the Q changes.

> 
> ::: xpcom/string/nsString.h
> @@ +60,5 @@
> >    }
> >  
> >    NS_LossyConvertUTF16toASCII(const char16ptr_t aString, uint32_t aLength)
> >    {
> > +    // TODO(ER): Not sure why the implicit cast operator doesn't work here.
> 
> Use "TODO(erahm)". Plenty of precedent elsewhere, and much clearer who it
> is. Or just remove the comment? Doesn't seem that important.

I'll just remove it.

> ::: xpcom/string/nsStringFwd.h
> @@ +41,5 @@
> >  // Double-byte (char16_t) string types.
> > +using nsAString = nsTSubstring<char16_t>;
> > +using nsSubstringTuple = nsTSubstringTuple<char16_t>;
> > +using nsString = nsTString<char16_t>;
> > +using nsStringRepr = mozilla::detail::nsTStringRepr<char16_t>;
> 
> Do we need nsStringRepr and nsCStringRepr? Seems like we don't. And if we
> do, they should be within mozilla::detail.

I think our rust bindings need them, I'll move to the detail namespace.

> @@ +50,5 @@
> > +using nsPromiseFlatString = nsTPromiseFlatString<char16_t>;
> > +using nsStringComparator = nsTStringComparator<char16_t>;
> > +using nsDefaultStringComparator = nsTDefaultStringComparator<char16_t>;
> > +using nsLiteralString = nsTLiteralString<char16_t>;
> > +using nsFixedString = nsTFixedString<char16_t>;
> 
> Hmm, why were nsLiteral[C]String and nsFixed[C]String declared before?

No reason, I can swap them.

> ::: xpcom/string/nsTDependentString.cpp
> @@ +10,5 @@
> >    : string_type(const_cast<char_type*>(aStart), uint32_t(aEnd - aStart),
> >                  DataFlags::TERMINATED, ClassFlags(0))
> >  {
> >    MOZ_RELEASE_ASSERT(aStart <= aEnd, "Overflow!");
> > +  string_type::AssertValidDependentString();
> 
> this-> ?

Either way works, `this` is more consistent with the other changes so I'll update here and the ones noted below.

> @@ +20,5 @@
> >  {
> >    MOZ_ASSERT(str.GetDataFlags() & DataFlags::TERMINATED, "Unterminated flat string");
> >  
> >    // If we currently own a buffer, release it.
> > +  base_string_type::Finalize();
> 
> this-> ?
> 
> @@ +42,3 @@
> >  {
> >    MOZ_RELEASE_ASSERT(aStart <= aEnd, "Overflow!");
> > +  base_string_type::Rebind(aStart, uint32_t(aEnd - aStart));
> 
> this-> ?
> 
> ::: xpcom/string/nsTDependentString.h
> @@ +68,4 @@
> >      : string_type(const_cast<char_type*>(aData), aLength,
> >                    DataFlags::TERMINATED, ClassFlags(0))
> >    {
> > +    string_type::AssertValidDependentString();
> 
> this-> ?
> 
> ::: xpcom/string/nsTDependentSubstring.h
> @@ +31,5 @@
> > +  typedef nsTSubstring<T>             substring_type;
> > +  typedef typename substring_type::fallible_t                 fallible_t;
> > +
> > +  typedef typename substring_type::char_type char_type;
> > +  typedef typename substring_type::char_traits             char_traits;
> 
> Inconsistent indenting in these typedefs.

Will clean up.

> @@ +68,1 @@
> >                                uint32_t aLength = size_type(-1))
> 
> Indentation.

Will clean up.

> @@ +94,1 @@
> >                                const const_iterator& aEnd);
> 
> Indentation.

Will clean up.

> ::: xpcom/string/nsTLiteralString.h
> @@ +60,2 @@
> >    {
> > +    return *reinterpret_cast<const nsTString<T>*>(this);
> 
> Does AsString() not work here?

Oh right, I was going to try to get rid of `AsString` but that can be follow up.

> @@ +76,2 @@
> >    {
> > +    return base_string_type::mData;
> 
> this->mData?

Will update.

> ::: xpcom/string/nsTString.h
> @@ +150,5 @@
> >    /**
> >     * returns the null-terminated string
> >     */
> >  
> > +  template <typename Q> struct raw_type { typedef const Q* type; };
> 
> I'd use U instead of Q here.

Will update.

> @@ +160,2 @@
> >    {
> > +    return substring_type::mData;
> 
> Not this->mData?

Will update.

> @@ +171,5 @@
> >  
> >    char_type CharAt(index_type aIndex) const
> >    {
> > +    NS_ASSERTION(aIndex <= substring_type::mLength, "index exceeds allowable range");
> > +    return substring_type::mData[aIndex];
> 
> Ditto? Same for this->mLength?

Will update.

> 
> @@ +227,5 @@
> >     *          given offset.
> >     * @return  offset in string, or kNotFound
> >     */
> >  
> > +  // Case aIgnoreCase option only with char versions
> 
> Can you please file a bug to fix that up, and mark it blocking the string
> cleanup bug?

Filed bug 1397045.

> @@ +383,5 @@
> >  
> >    size_type Right(self_type& aResult, size_type aCount) const
> >    {
> > +    aCount = XPCOM_MIN(substring_type::mLength, aCount);
> > +    return Mid(aResult, substring_type::mLength - aCount, aCount);
> 
> this->mLength?

Will update.

> @@ +484,5 @@
> >    void AssertValidDependentString()
> >    {
> > +    NS_ASSERTION(substring_type::mData, "nsTDependentString must wrap a non-NULL buffer");
> > +    NS_ASSERTION(substring_type::mLength != size_type(-1), "nsTDependentString has bogus length");
> > +    NS_ASSERTION(substring_type::mData[substring_type::mLength] == 0,
> 
> Again, |this->| instead of |substring_type::|?

Will update.

> @@ +498,1 @@
> >                    ClassFlags aClassFlags)
> 
> Indentation.

Will update.

> @@ +770,5 @@
> >    }
> >  };
> >  
> >  /**
> > +
> 
> Accidental blank line insertion?

Will remove.

> ::: xpcom/string/nsTStringObsolete.cpp
> @@ +122,4 @@
> >    int32_t theRadix=10; // base 10 unless base 16 detected, or overriden (aRadix != kAutoDetect)
> >    int32_t result=0;
> > +  bool negate=false;
> > +  char_type theChar=0;
> 
> Might as well put spaces around the '=' in this function while you're
> futzing with whitespace.

Ugh more blame :) Will do.

> @@ +372,5 @@
> >     */
> >  
> > +template <typename T>
> > +auto
> > +nsTString<T>::Mid( self_type& aResult, index_type aStartPos, size_type aLengthToCopy ) const -> size_type
> 
> Why the |auto ... ->| here? Normal syntax is nicer, for consistency, unless
> it's unavoidable for some reason.

It just avoided doing a `typename nsTString<T>::size_type`. I can change it.

> @@ +450,2 @@
> >  bool
> > +nsTString<T>::StripWhitespace( const fallible_t& )
> 
> Remove whitespace around param.

Will do (plus the rest of the file).

> @@ +466,2 @@
> >  void
> > +nsTString<T>::ReplaceChar( char_type aOldChar, char_type aNewChar )
> 
> Remove whitespace around params.
> 
> @@ +640,2 @@
> >  void
> > +nsTString<T>::Trim( const char* aSet, bool aTrimLeading, bool aTrimTrailing, bool aIgnoreQuotes )
> 
> Whitespace around params.
> 
> @@ +708,2 @@
> >  void
> > +nsTString<T>::CompressWhitespace( bool aTrimLeading, bool aTrimTrailing )
> 
> Whitespace around params.
> 
> ::: xpcom/string/nsTStringRepr.h
> @@ +15,5 @@
> > +template <typename T> class nsTSubstringTuple;
> > +
> > +/**
> > + * The base for string comparators
> > + */
> 
> Switch to C++ comments in places where you've moved code, such as this file?

Sure. We don't seem to call this out in our coding style doc but it makes sense.

> @@ +16,5 @@
> > +
> > +/**
> > + * The base for string comparators
> > + */
> > +template <typename T> class nsTStringComparator
> 
> Moving nsTStringRepr into its own file is good. But did the definitions of
> nsTStringComparator and nsTDefaultStringComparator need to move? They feel
> like they don't really fit in this file.

They either need to go here or in their own file, they're used in the typedefs and the toplevel `Compare(nsTStringRepr, nsTStringRepr)` function.

> @@ +23,5 @@
> > +  typedef T char_type;
> > +
> > +  nsTStringComparator()
> > +  {
> > +  }
> 
> Could fit this on a single line.

Will update.

> @@ +71,5 @@
> > +{
> > +public:
> > +  typedef mozilla::fallible_t                 fallible_t;
> > +
> > +  typedef T                                   char_type;
> 
> Is char_type still necessary? Seems like using T directly is clearer.
> Likewise in all the other classes.

I agree, but maybe make that a follow up? It would affect nearly every file in xpcom/string.

> @@ +74,5 @@
> > +
> > +  typedef T                                   char_type;
> > +
> > +  typedef nsCharTraits<char_type>             char_traits;
> > +  typedef typename char_traits::incompatible_char_type incompatible_char_type;
> 
> Is incompatible_char_type still needed? AFAICT it only pops up in IsChar16
> functions. Would using |char| directly in those work?
> (Oh, maybe that's what you were talking about in comment 9?)

Yeah it's necessary for some of the IsChar16 shenanigans.

> @@ +80,5 @@
> > +  typedef nsTStringRepr<T>                self_type;
> > +  typedef self_type                           base_string_type;
> > +
> > +  typedef nsTSubstring<T>             substring_type;
> > +  typedef nsTSubstringTuple<T>             substring_tuple_type;
> 
> Indentation in this class is screwy. I'm fine with the vertical alignment,
> but please make it consistent. (And make sure there are no tabs.)
> 
> Or, you could remove the vertical alignment, which would match how you've
> done group typedef decls in other classes.

I'll do that.

> @@ +83,5 @@
> > +  typedef nsTSubstring<T>             substring_type;
> > +  typedef nsTSubstringTuple<T>             substring_tuple_type;
> > +
> > +  typedef nsReadingIterator<char_type>        const_iterator;
> > +  typedef nsWritingIterator<char_type>        iterator;
> 
> Should these have a _type suffix, for consistency?

Maybe? It's unchanged from before and I'd rather not spam a bunch of changes to the other files but a follow up might make sense.

> @@ +88,5 @@
> > +
> > +  typedef nsTStringComparator<char_type>      comparator_type;
> > +
> > +  typedef char_type*                          char_iterator;
> > +  typedef const char_type*                    const_char_iterator;
> 
> Ditto?

Same.

> (In general, you can see I don't particularly like all these typedefs, and I
> want to have as few as possible :)

That seems reasonable but is probably better left to a follow up, these are all the same typedefs that we had before my changes.

> @@ +260,5 @@
> > +  // non-constant char array variable. Use EqualsASCII for them.
> > +  // The template trick to acquire the array length at compile time without
> > +  // using a macro is due to Corey Kosak, with much thanks.
> > +  template<int N>
> > +  inline bool EqualsLiteral(const char (&aStr)[N]) const
> 
> |inline| unnecessary.

Will remove.

> @@ +281,5 @@
> > +  // explicit size.  Do not attempt to use it with a regular char*
> > +  // pointer, or with a non-constant char array variable. Use
> > +  // LowerCaseEqualsASCII for them.
> > +  template<int N>
> > +  inline bool LowerCaseEqualsLiteral(const char (&aStr)[N]) const
> 

Will remove for the rest of the file as well.

> 
> @@ +287,5 @@
> > +    return LowerCaseEqualsASCII(aStr, N - 1);
> > +  }
> > +
> > +  /**
> > +   * returns true if this string overlaps with the given string fragment.
> 
> Change to "Returns", while you're here.

Will update.

> @@ +292,5 @@
> > +   */
> > +  bool IsDependentOn(const char_type* aStart, const char_type* aEnd) const
> > +  {
> > +    /**
> > +     * if it _isn't_ the case that one fragment starts after the other ends,
> 
> "If"
> 
> @@ +307,5 @@
> > +  nsTStringRepr() = delete; // Never instantiate directly
> > +
> > +  constexpr
> > +  nsTStringRepr(char_type* aData, size_type aLength,
> > +                      DataFlags aDataFlags, ClassFlags aClassFlags)
> 
> indentation

Will update.

> ::: xpcom/string/nsTSubstring.cpp
> @@ -1015,5 @@
> >    }
> >  }
> >  
> > -/* hack to make sure we define FormatWithoutTrailingZeros only once */
> > -#ifdef CharT_is_PRUnichar
> 
> Hooray for not requiring silliness like this any more.

\o/

> ::: xpcom/string/nsTSubstring.h
> @@ +60,5 @@
> > +
> > +  typedef typename base_string_type::comparator_type comparator_type;
> > +
> > +  typedef typename base_string_type::char_iterator char_iterator;
> > +  typedef typename base_string_type::const_char_iterator const_char_iterator;
> 
> const_char_iterator isn't used by this class. Likewise for a number of the
> other classes. Might be worth minimizing all these typedefs by commenting
> them all out and then reinstating the ones that are required for compilation?

Classes that derive from nsTSubtring do use it. I can look into trimming down the typedefs in the derived classes though. This was less of a priority when I was just trying to get things to compile but makes sense before landing.

> ::: xpcom/string/nsTSubstringTuple.h
> @@ +76,5 @@
> >  
> > +template <typename T>
> > +inline const nsTSubstringTuple<T>
> > +operator+(const mozilla::detail::nsTStringRepr<T>& aStrA,
> > +          const mozilla::detail::nsTStringRepr<T>& aStrB)
> 
> Any reason for changing away from base_string_type here?

Don't remember which but it wouldn't compile on either msvc or clang.

> ::: xpcom/string/precompiled_templates.cpp
> @@ +7,5 @@
> > +#include "nsString.h"
> > +
> > +/**
> > + * This file provides concrete instantiations for externed string template
> > + * classes and functions.
> 
> C++ comment?

Will update.

> @@ +12,5 @@
> > + */
> > +
> > +// ================
> > +// Template classes
> > +// ================
> 
> I'd put a blank line after this heading.

Will update.

> @@ +28,5 @@
> > +template class nsTSubstring<char16_t>;
> > +template class nsTSubstringSplitter<char>;
> > +template class nsTSubstringSplitter<char16_t>;
> > +template class mozilla::detail::nsTStringRepr<char>;
> > +template class mozilla::detail::nsTStringRepr<char16_t>;
> 
> There are lots of classes not mentioned here, such as nsTLiteralString,
> nsTAtomString. Why is that? Perhaps a comment about it would help.

I don't know what nsTAtomString is, so that would be why ;) nsTLiteralString is just an omission I'll add it (and double check the rest against nsStringFwd). Basically it's anything I declared `extern`. Really it's just supposed to be an optimization for compile times so it doesn't hurt if they're missing but we should include most of them.

> Also, I'd order it to match the inheritance hierarchy, i.e. nsTStringRepr at
> the top, nsTSubstring before nsTString, etc.

I'll do my best, there's various branches of inheritance going on here.

> @@ +32,5 @@
> > +template class mozilla::detail::nsTStringRepr<char16_t>;
> > +
> > +// =============================
> > +// Templated top-level functions
> > +// =============================
> 
> Ditto.
> 
> @@ +34,5 @@
> > +// =============================
> > +// Templated top-level functions
> > +// =============================
> > +template
> > +int Compare<char>(mozilla::detail::nsTStringRepr<char> const&,
> 
> Return type should be on its own line. Ditto for all the other functions in
> this file.

Will update them.

> @@ +53,5 @@
> > +                                                          char16_t const*);
> > +
> > +// =========================================================
> > +// Templated member functions that are conditionally enabled
> > +// =========================================================
> 
> Ditto.
> 
> @@ +56,5 @@
> > +// Templated member functions that are conditionally enabled
> > +// =========================================================
> > +template
> > +int32_t nsTString<char16_t>::Find(const self_type&, int32_t,
> > +                                  int32_t aCount) const;
> 
> Remove this parameter name to match the rest of this file.

Will update.

> @@ +68,5 @@
> > +                                   int32_t) const;
> > +
> > +template
> > +int32_t nsTString<char16_t>::RFind(const char_type*, int32_t,
> > +                                   int32_t aCount) const;
> 
> Ditto.

Will update.
Attached patch Code review fixes (obsolete) — Splinter Review
Nick can you give these a once over, it's just an interdiff of the fixes for part 3.
Attachment #8905134 - Flags: review?(n.nethercote)
Attached patch Part 3.3: Fix hazard annotations (obsolete) — Splinter Review
This updates the hazard annotations to use the 'nsTSubstring<T>' pattern
instead of 'nsAC?String'.

Steve can you take a look at this? I have a try run with the changes pending.
Attachment #8905146 - Flags: review?(sphink)
Comment on attachment 8905146 [details] [diff] [review]
Part 3.3: Fix hazard annotations

Review of attachment 8905146 [details] [diff] [review]:
-----------------------------------------------------------------

This worked? \o/
Attachment #8905146 - Flags: review?(sphink) → review+
All builds are green at this point. Once I get the okay from njn on the interdiff this should be good to land.
Comment on attachment 8905134 [details] [diff] [review]
Code review fixes

Review of attachment 8905134 [details] [diff] [review]:
-----------------------------------------------------------------

::: xpcom/string/nsStringFwd.h
@@ +65,5 @@
> +namespace mozilla {
> +namespace detail {
> +using nsCStringRepr = nsTStringRepr<char>;
> +} // namespace detail
> +} // namespace mozilla

Merge this into the mozilla::detail block above?

::: xpcom/string/nsTString.h
@@ +744,5 @@
>  
>    char_type mStorage[N];
>  };
>  
> +// Externs for the most common nstAutoStringN variations.

typo: nsTAutoStringN

::: xpcom/string/nsTStringRepr.h
@@ +307,5 @@
>          const mozilla::detail::nsTStringRepr<T>& aRhs,
>          const nsTStringComparator<T>& = nsTDefaultStringComparator<T>());
>  
>  template <typename T>
> +bool

Did I ask you to remove the inline on all these functions? If so, I shouldn't have: it's unnecessary for methods defined within a class, but is reasonable for top-level functions.
Attachment #8905134 - Flags: review?(n.nethercote) → review+
This removes the double-include macro hackery that we use to define two
separate string types (nsAString and nsACString) in favor of a templated
solution.

Annotations for Valgrind and the JS hazard analysis are updated as well as
the rust binding generations for string code.
Attachment #8903409 - Attachment is obsolete: true
Attachment #8903389 - Attachment is obsolete: true
Attachment #8903390 - Attachment is obsolete: true
Comment on attachment 8905146 [details] [diff] [review]
Part 3.3: Fix hazard annotations

># HG changeset patch
># User Eric Rahm <erahm@mozilla.com>
># Date 1504719052 25200
>#      Wed Sep 06 10:30:52 2017 -0700
># Node ID c70257bf1f8d94b6edbb41ecc78084b15190b32c
># Parent  24a0295466c6c561ca4bb4538afd2affb9e79e4a
>Bug 1393230 - Part 3.3: Fix hazard annotations. r=sfink
>
>This updates the hazard annotations to use the 'nsTSubstring<T>' pattern
>instead of 'nsAC?String'.
>
>diff --git a/js/src/devtools/rootAnalysis/analyzeHeapWrites.js b/js/src/devtools/rootAnalysis/analyzeHeapWrites.js
>--- a/js/src/devtools/rootAnalysis/analyzeHeapWrites.js
>+++ b/js/src/devtools/rootAnalysis/analyzeHeapWrites.js
>@@ -439,27 +439,27 @@ function ignoreContents(entry)
>             /nsTArray_Impl.*?::AppendElement/,
>             /nsTArray_Impl.*?::RemoveElementsAt/,
>             /nsTArray_Impl.*?::ReplaceElementsAt/,
>             /nsTArray_Impl.*?::InsertElementsAt/,
>             /nsTArray_Impl.*?::SetCapacity/,
>             /nsTArray_base.*?::EnsureCapacity/,
>             /nsTArray_base.*?::ShiftData/,
>             /AutoTArray.*?::Init/,
>-            /nsAC?String::SetCapacity/,
>-            /nsAC?String::SetLength/,
>-            /nsAC?String::Assign/,
>-            /nsAC?String::Append/,
>-            /nsAC?String::Replace/,
>-            /nsAC?String::Trim/,
>-            /nsAC?String::Truncate/,
>-            /nsAString::StripTaggedASCII/,
>-            /nsAC?String::operator=/,
>-            /nsAutoString::nsAutoString/,
>-            /nsFixedCString::nsFixedCString/,
>+            /nsTSubstring<T>::SetCapacity/,
>+            /nsTSubstring<T>::SetLength/,
>+            /nsTSubstring<T>::Assign/,
>+            /nsTSubstring<T>::Append/,
>+            /nsTSubstring<T>::Replace/,
>+            /nsTSubstring<T>::Trim/,
>+            /nsTSubstring<T>::Truncate/,
>+            /nsTSubstring<T>::StripTaggedASCII/,
>+            /nsTSubstring<T>::operator=/,
>+            /nsTAutoStringN<T, N>::nsTAutoStringN/,
>+            /nsTFixedString<T>::nsTFixedString/,
> 
>             // Similar for some other data structures
>             /nsCOMArray_base::SetCapacity/,
>             /nsCOMArray_base::Clear/,
>             /nsCOMArray_base::AppendElement/,
> 
>             // UniquePtr is similar.
>             /mozilla::UniquePtr/,
Attachment #8905146 - Attachment is obsolete: true
Attachment #8905134 - Attachment is obsolete: true
Comment on attachment 8905361 [details] [diff] [review]
Part 3: Convert the xpcom string classes to be templated on char type

Carrying forward various r+'s
Attachment #8905361 - Flags: review+
I can't land this as the servo changes need to land on the servo repo, but I can't land those if I don't land the non-servo patches so I'm not sure what I'm supposed to do. I'm just going to mark it as checkin-needed and let someone who understands whatever byzantine rules we have deal with it.
Keywords: checkin-needed
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/c366bfc13e86
Part 1: Remove remaining string forward declarations. r=njn
https://hg.mozilla.org/integration/mozilla-inbound/rev/f762f605dd83
Part 2: Fix more improper string usages. r=njn
https://hg.mozilla.org/integration/mozilla-inbound/rev/9115364cd4aa
Part 3: Convert the xpcom string classes to be templated on char type. r=njn, r=fitzgen, r=sfink
Keywords: checkin-needed
This touched servo/ so needs to go through servo PRs (yeah, I know :/).

I asked RyanVM to back it out, and opened https://github.com/servo/servo/pull/18408. When that merges this can be pushed to autoland.
Backout by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/7e21f710ecec
Backed out 3 changesets because it touches the servo directory.
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/ac28b19fc283
Part 1: Remove remaining string forward declarations. r=njn
https://hg.mozilla.org/integration/autoland/rev/a1bcb487ffec
Part 2: Fix more improper string usages. r=njn
https://hg.mozilla.org/integration/autoland/rev/8b18f545fd6f
Part 3: Convert the xpcom string classes to be templated on char type. r=njn, r=fitzgen, r=sfink
Depends on: 1397942
\o/ Thank you for taking on this work, and thanks for entertaining my intrusion.

If you haven't already, please be sure to give the stability folks a heads up about the changing function signatures.
Depends on: 1399315
Depends on: 1399789
These improvements were noticed when either this bug or the servo merge with PR https://github.com/servo/servo/pull/18408 happened:

== Change summary for alert #9292 (as of September 07 2017 15:26 UTC) ==

Improvements:

 76%  Strings PerfStripCharsCRLF windows7-32 opt      1,069,484.25 -> 260,214.17
 73%  Strings PerfStripCharsCRLF windows10-64 opt     696,691.08 -> 188,524.33
 70%  Strings PerfStripCharsWhitespace windows7-32 opt 997,609.08 -> 297,478.17
 61%  Strings PerfStripCharsCRLF osx-10-10 opt        683,223.17 -> 268,667.83
 48%  Strings PerfStripCharsWhitespace osx-10-10 opt  779,182.17 -> 408,373.17
 43%  Strings PerfStripCharsWhitespace windows10-64 opt 690,325.08 -> 396,434.42
  4%  TestStandardURL NormalizePerf windows10-64 opt  56,072.92 -> 54,063.33
  3%  TestStandardURL NormalizePerf windows7-32 opt   75,958.79 -> 73,513.00

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=9292
Depends on: 1402328
Depends on: 1403083
Component: String → XPCOM
You need to log in before you can comment on or make changes to this bug.