Closed Bug 1028866 Opened 5 years ago Closed 5 years ago

String allocation functions should create Latin1 strings

Categories

(Core :: JavaScript Engine, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla33

People

(Reporter: jandem, Assigned: jandem)

References

Details

Attachments

(5 files)

No description provided.
This patch moves the string-creation functions into the js namespace, and removes the js_* prefix.
Attachment #8444387 - Flags: review?(luke)
We can just forward to NewStringCopyN, and remove some crufty code.
Attachment #8444394 - Flags: review?(luke)
NewStringCopyN seems to be the right place for this because: (1) It avoids an extra copy: the TwoByte chars can be deflated while we copy them to the new string and (2) it seems to catch almost all strings: AtomizeChars for instance calls NewStringCopyN

There's also js::NewString etc, some callers of that function will also want the deflation (and inline-string creation), but not all of them. Will address that in a later patch.
Attachment #8444426 - Flags: review?(luke)
Attachment #8444387 - Flags: review?(luke) → review+
Comment on attachment 8444394 [details] [diff] [review]
Part 2 - NewStringCopyZ cleanup

Nice
Attachment #8444394 - Flags: review?(luke) → review+
Comment on attachment 8444426 [details] [diff] [review]
Part 3 - Make NewStringCopyN create Latin1 strings

Review of attachment 8444426 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jsstr.cpp
@@ +4393,5 @@
>  js::NewStringCopyN(ThreadSafeContext *cx, const CharT *s, size_t n)
>  {
>      if (EnableLatin1Strings) {
> +        if (IsSame<CharT, jschar>::value && CanStoreCharsAsLatin1(s, n))
> +            return NewStringDeflated<allowGC>(cx, s, n);

It's good to put this test on the pinchpoint, since that way we default to always deflating if possible.  However, I wonder if we shouldn't have a path for callers that know they have non-latin1 (viz., when the source jschars come from a two-byte JSString).  Scanning the previous renaming patch for uses of NewStringCopyN:
 - string-copying in jscompartment.cpp
 - JSON-parsing (where the CharT template arg is determined by an isLatin1 branch on the parsed string)
 - ScriptSource::substring (we need jschars to pass to the parser; in the worst case, we'd ensureHasTwoByte which would re-inflate!)
OTOH, we probably would want the test for JSAPI callers of ParseJSONWithReviver, since we likely can deflate, but that could be done in jsapi.cpp.

Perhaps we should have a NewStringCopyNTwoByte(cx, const jschar*, size_t)?  We could share that code with the tail of this function.
Attachment #8444426 - Flags: review?(luke) → review+
(In reply to Luke Wagner [:luke] from comment #5)
> It's good to put this test on the pinchpoint, since that way we default to
> always deflating if possible.  However, I wonder if we shouldn't have a path
> for callers that know they have non-latin1 (viz., when the source jschars
> come from a two-byte JSString).  Scanning the previous renaming patch for
> uses of NewStringCopyN:
>  - string-copying in jscompartment.cpp

Yeah, I agree it would be nice to use this here.

>  - JSON-parsing (where the CharT template arg is determined by an isLatin1
> branch on the parsed string)

I think there we want to deflate: let's say we have a huge JSON string and only one string literal in it uses a TwoByte char, then IMO we shouldn't use TwoByte strings for all *other* strings the parser allocates.

> Perhaps we should have a NewStringCopyNTwoByte(cx, const jschar*, size_t)? 
> We could share that code with the tail of this function.

Yeah, but some callers (like the jscompartment.cpp one) want a templated function. Maybe NewStringCopyNDontDeflate? That's slightly confusing if CharT is Latin1Char though...
(In reply to Jan de Mooij [:jandem] from comment #6)
> I think there we want to deflate: let's say we have a huge JSON string and
> only one string literal in it uses a TwoByte char, then IMO we shouldn't use
> TwoByte strings for all *other* strings the parser allocates.

Ah, good point.

> Yeah, but some callers (like the jscompartment.cpp one) want a templated
> function. Maybe NewStringCopyNDontDeflate? That's slightly confusing if
> CharT is Latin1Char though...

That's a better name; doesn't seem to confusing to me.
I was wrong; this function is not actually called with CharT = Latin1Char. However, defining it this way seems simplest, because NewStringCopyN can just forward to NewStringCopyNDontDeflate in all cases, without requiring a helper function (for the common tail part) that both can call.
Attachment #8444700 - Flags: review?(luke)
Comment on attachment 8444700 [details] [diff] [review]
Part 4 - Add NewStringCopyNDontDeflate

Great!
Attachment #8444700 - Flags: review?(luke) → review+
This patch renames NewString to NewStringDontDeflate, and NewString now does the deflation.
Attachment #8445149 - Flags: review?(luke)
Comment on attachment 8445149 [details] [diff] [review]
Part 5 - NewString

Review of attachment 8445149 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jsstr.cpp
@@ +4423,5 @@
> +                return cx->staticStrings().getUnit(c);
> +            }
> +        }
> +
> +        JSFlatString *s = NewStringDeflated<allowGC>(cx, chars, length);

For symmetry, could the length == 1 check be moved into NewStringDeflated?
Attachment #8445149 - Flags: review?(luke) → review+
Comment on attachment 8444387 [details] [diff] [review]
Part 1 - Remove js_* prefix

Review of attachment 8444387 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jsstr.h
@@ -125,5 @@
> -js_NewStringCopyZ(js::ExclusiveContext *cx, const jschar *s);
> -
> -template <js::AllowGC allowGC>
> -extern JSFlatString *
> -js_NewStringCopyZ(js::ThreadSafeContext *cx, const char *s);

Since you are pretty much renaming and rewriting all of these anyway, do you suppose you could hoist all these functions into vm/String.{h,cpp}?  It has always seemed like they belonged there instead of in jsstr.cpp, which is string builtin fucntions (that one day should go into builtin/).  rs=me if you agree
(In reply to Luke Wagner [:luke] from comment #11)
> For symmetry, could the length == 1 check be moved into NewStringDeflated?

That's what I had at first, but it breaks the world: we can call NewStringCopyN, which also calls NewStringDeflated, before the static strings are initialized. Then we return a nullptr string and this is treated as an error... We could also null-check the result of StaticStrings::getUnit or add a "bool initialized_" to StaticStrings.

(In reply to Luke Wagner [:luke] from comment #12)
> Since you are pretty much renaming and rewriting all of these anyway, do you
> suppose you could hoist all these functions into vm/String.{h,cpp}?  It has
> always seemed like they belonged there instead of in jsstr.cpp, which is
> string builtin fucntions (that one day should go into builtin/).  rs=me if
> you agree

Heh, I was just thinking about it earlier today. I'll land those patches first (as I've already Try-servered them), then do this as a follow-up patch.
Sorry for that, Wes.

My patches always manage to break Windows PGO builds every few months; it's pretty fragile but somehow it only happens to me :(

I pushed this patch to Try to see if it's this patch or the one in bug 1028867 that's causing it.
Flags: needinfo?(jdemooij)
(In reply to Jan de Mooij [:jandem] from comment #20)
> I pushed this patch to Try to see if it's this patch or the one in bug
> 1028867 that's causing it.

It turns out the test is still failing intermittently after the backout; it's a problem with the test. Hopefully this can reland when bug 1028872 is fixed.
Depends on: 1028872
(In reply to Luke Wagner [:luke] from comment #12)
> Since you are pretty much renaming and rewriting all of these anyway, do you
> suppose you could hoist all these functions into vm/String.{h,cpp}?  It has
> always seemed like they belonged there instead of in jsstr.cpp, which is
> string builtin fucntions (that one day should go into builtin/).  rs=me if
> you agree

https://hg.mozilla.org/integration/mozilla-inbound/rev/e2e07dba5965
Keywords: leave-open
https://hg.mozilla.org/mozilla-central/rev/e2e07dba5965
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla33
You need to log in before you can comment on or make changes to this bug.