Closed Bug 966911 Opened 10 years ago Closed 10 years ago

Possible inconsistency in DOM/AString conversion

Categories

(Core :: XPConnect, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla30

People

(Reporter: neil, Assigned: neil)

References

Details

Attachments

(4 files, 3 obsolete files)

Having made it possible in bug 514173 for literal strings to be passed to the JS engine without being copied I started investigating what would happen if such a string was passed back into C++.

Unfortunately I very quickly got lost in the rats' nest that is JSData2Native.

In particular, the code paths with and without useAllocator appear to differ substantially. I don't know how much of this is deliberate.
Attached patch WIP (obsolete) — Splinter Review
I tried to streamline the DOM/AString conversion.

I added an early check for a void string in the AString case before falling through to the DOMString case.
I then started with a check for a null string. If we still don't have a void string then I check for an empty string.
Finally we're ready to allocate the string. We can then set the string's data.
In the case of a void string we must be in DOMString territory so we simply set the string to the literal "undefined". (Bug 514173 makes this cheap.)
In the case of an external string with a shared buffer I recover the buffer and assign it to the new string.
(Ironically the one case I haven't figured out was the external string with a literal buffer, which was the original point of the exercise.)
There was a case with the old code whereby the string was dependent on the JS string, I hope I got the conditions for that all correct.
If all else fails then I simply copy the wchars of the string as before.

I also tweaked the AUTF8/ACString conversion to avoid constructing null or empty strings. I didn't actually profile to see whether this is a win though, I only assumed it was based on the existing NewStringWrapper code preferring to avoid allocations.

I also rewrote the aforementioned NewStringWrapper code. Instead of a complex custom structure and class type I switched to a simple array of nsXPIDLString. This has the useful property of defaulting to void, which is never the type of a used string (since we special-case null strings earlier).
Attachment #8369339 - Flags: feedback?(bobbyholley)
Comment on attachment 8369339 [details] [diff] [review]
WIP

Review of attachment 8369339 [details] [diff] [review]:
-----------------------------------------------------------------

I'm really finding myself wondering whether this whole mScratchStrings stuff is worth it. It saves us a container allocation, but not the allocation of the underlying buffer, and adds its own overhead. Anyway, I guess that's a battle for another day.

Kudos for the careful work. r=bholley with comments addressed.

::: js/xpconnect/src/XPCConvert.cpp
@@ +470,5 @@
>      }
>  
>      case nsXPTType::T_ASTRING:
>      {
> +        if (JSVAL_IS_VOID(s)) {

Hm. In the JSVAL_IS_VOID case, the old code would pass the empty string for AString, not a nulled nsString. Unless I'm mistaken, we should fix that here.

@@ +510,4 @@
>              }
>          }
>  
> +        nsString* ws;

Can we get a more descriptive name here?

@@ +524,5 @@
> +                   JS_GetExternalStringFinalizer(str) ==
> +                   &XPCStringConvert::sDOMStringFinalizer) {
> +            nsStringBuffer::FromData((void *)chars)->ToString(length, *ws);
> +        // } else if (JS_IsExternalString(str)) {
> +        // Need to figure out how to recreate the original literal string

These 2 cases are optimizations that are not present on trunk, right? Assuming that's the case, I think we should separate them out into a separate patch that lands on top of this one.

@@ +525,5 @@
> +                   &XPCStringConvert::sDOMStringFinalizer) {
> +            nsStringBuffer::FromData((void *)chars)->ToString(length, *ws);
> +        // } else if (JS_IsExternalString(str)) {
> +        // Need to figure out how to recreate the original literal string
> +        } else if (useAllocator && STRING_TO_JSVAL(str) == s && !chars[length]) {

Some comments:

* This trick is super sketchy. I wish we could do something better, but that's probably out of scope. At the very least, please comment what's happening, and why it's "safe".
* Is there actually a reason to only do this in the useAllocator case, or is it only for consistency with the old behavior?
* How is the !chars[length] test valid? If we're dealing with a non-null-terminated string, that read is undefined. But moreover, this should probably jsut be an assertion, because chars comes from JS_GetStringCharsZAndLength, which guarantees null termination.

@@ +526,5 @@
> +            nsStringBuffer::FromData((void *)chars)->ToString(length, *ws);
> +        // } else if (JS_IsExternalString(str)) {
> +        // Need to figure out how to recreate the original literal string
> +        } else if (useAllocator && STRING_TO_JSVAL(str) == s && !chars[length]) {
> +            ws->Rebind(chars, length);

Please add some documentation for this method. I couldn't find it anywhere. AFAICT it effectively morphs the string into an nsDependentString?

::: js/xpconnect/src/XPCJSRuntime.cpp
@@ +1415,5 @@
>      return n;
>  }
>  
> +nsString *
> +XPCJSRuntime::NewString()

Let's call this NewShortLivedString.

@@ +1430,4 @@
>  }
>  
>  void
>  XPCJSRuntime::DeleteString(nsAString *string)

ReleaseShortLivedString.

::: js/xpconnect/src/xpcprivate.h
@@ +618,5 @@
>      mozilla::TimeStamp mSlowScriptCheckpoint;
>  
>  #define XPCCCX_STRING_CACHE_SIZE 2
>  
> +    nsXPIDLString mScratchStrings[XPCCCX_STRING_CACHE_SIZE];

Add a comment explaining that we're using nsXPIDLString for its nullability, not for its getter_Copies properties.
Attachment #8369339 - Flags: feedback?(bobbyholley) → review+
(In reply to Bobby Holley from comment #2)
> > +        if (JSVAL_IS_VOID(s)) {
> 
> Hm. In the JSVAL_IS_VOID case, the old code would pass the empty string for
> AString, not a nulled nsString. Unless I'm mistaken, we should fix that here.

Actually you get a null string. I misread that too, and a test failure in updateManagerXML.js on a try run (ecce70c392ae) showed me I was wrong.

Perhaps you do get an empty string in the useAllocator case, in which case that's an example of the inconsistency that I was worried about!

> > +        nsString* ws;
> 
> Can we get a more descriptive name here?

I just used the old name (see later on in the diff). I can't think of anything more descriptive than "result", if that's OK?

> > +        } else if (JS_IsExternalString(str) &&
> > +                   JS_GetExternalStringFinalizer(str) ==
> > +                   &XPCStringConvert::sDOMStringFinalizer) {
> > +            nsStringBuffer::FromData((void *)chars)->ToString(length, *ws);
> > +        // } else if (JS_IsExternalString(str)) {
> > +        // Need to figure out how to recreate the original literal string
> 
> These 2 cases are optimizations that are not present on trunk, right?
> Assuming that's the case, I think we should separate them out into a
> separate patch that lands on top of this one.

Fair enough.

> > +        } else if (useAllocator && STRING_TO_JSVAL(str) == s && !chars[length]) {
> 
> Some comments:
> 
> * This trick is super sketchy. I wish we could do something better, but
> that's probably out of scope. At the very least, please comment what's
> happening, and why it's "safe".

The STRING_TO_JSVAL(str) was copied from the old code.

> * Is there actually a reason to only do this in the useAllocator case, or is
> it only for consistency with the old behavior?

Just for consistency, although it seems like a return value would be a bad time to do this, which would correspond to a !useAllocator case.

> * How is the !chars[length] test valid? If we're dealing with a
> non-null-terminated string, that read is undefined. But moreover, this
> should probably jsut be an assertion, because chars comes from
> JS_GetStringCharsZAndLength, which guarantees null termination.

Ah, at one point I was using JS_GetStringCharsAndLength, so the check was necessary. (Ideally I need a function JS_AreStringCharsTerminated...)

> > +            ws->Rebind(chars, length);
> 
> Please add some documentation for this method. I couldn't find it anywhere.
> AFAICT it effectively morphs the string into an nsDependentString?

Indeed it does. Sadly when bz moved the declaration from nsTString.h to nsTSubstring.h he forgot to do anything about the comment.

> > +nsString *
> > +XPCJSRuntime::NewString()
> 
> Let's call this NewShortLivedString.

OK.

> >  void
> >  XPCJSRuntime::DeleteString(nsAString *string)
> 
> ReleaseShortLivedString.

OK.

> > +    nsXPIDLString mScratchStrings[XPCCCX_STRING_CACHE_SIZE];
> 
> Add a comment explaining that we're using nsXPIDLString for its nullability,
> not for its getter_Copies properties.

OK. (Its getter_Copies properties are inherited from nsString these days! It does however have an implicit char16_t cast operator though.)
(In reply to neil@parkwaycc.co.uk from comment #3)
> Perhaps you do get an empty string in the useAllocator case, in which case
> that's an example of the inconsistency that I was worried about!

Yeah. It seems like it depends on the value of useAllocator. In the (JSVAL_IS_VOID(s) && useAllocator && !isDOMString) case, I'm pretty sure we end up invoking:

const nsAString *rs = new nsString(EMPTY_STRING, 0);

I'm happy to unify the useAllocator/!useAllocator behavior if we think it's likely that people aren't depending on it, but we should do that in a separate patch for easy regression detection.
 
> I just used the old name (see later on in the diff). I can't think of
> anything more descriptive than "result", if that's OK?

Eh, just leasve it as |ws| for now then.

> > * This trick is super sketchy. I wish we could do something better, but
> > that's probably out of scope. At the very least, please comment what's
> > happening, and why it's "safe".
> 
> The STRING_TO_JSVAL(str) was copied from the old code.

Yeah I know. I'm just saying that we should document the very sketchy reason why this is GC-safe (i.e. these things are actually stack-scoped, and will be cleaned up before the XPCWN method call returns, but their scope is wider than JSData2Native's.

> > * Is there actually a reason to only do this in the useAllocator case, or is
> > it only for consistency with the old behavior?
> 
> Just for consistency, although it seems like a return value would be a bad
> time to do this, which would correspond to a !useAllocator case.

OK. Maybe comment this?

> > Please add some documentation for this method. I couldn't find it anywhere.
> > AFAICT it effectively morphs the string into an nsDependentString?
> 
> Indeed it does. Sadly when bz moved the declaration from nsTString.h to
> nsTSubstring.h he forgot to do anything about the comment.

Can you add it while you're here?

> > > +    nsXPIDLString mScratchStrings[XPCCCX_STRING_CACHE_SIZE];
> > 
> > Add a comment explaining that we're using nsXPIDLString for its nullability,
> > not for its getter_Copies properties.
> 
> OK. (Its getter_Copies properties are inherited from nsString these days! It
> does however have an implicit char16_t cast operator though.)

Hm. Maybe we should rename nsXPIDLString to something that better-reflects its new properties? At the very least, we should update the internal string guide to explain that.
(In reply to Bobby Holley from comment #4)
> (In reply to comment #3)
> > Perhaps you do get an empty string in the useAllocator case, in which case
> > that's an example of the inconsistency that I was worried about!
> 
> Yeah. It seems like it depends on the value of useAllocator. In the
> (JSVAL_IS_VOID(s) && useAllocator && !isDOMString) case, I'm pretty sure we
> end up invoking:
> 
> const nsAString *rs = new nsString(EMPTY_STRING, 0);
> 
> I'm happy to unify the useAllocator/!useAllocator behavior if we think it's
> likely that people aren't depending on it, but we should do that in a
> separate patch for easy regression detection.

Fair enough, but it did pass Try.
(In reply to neil@parkwaycc.co.uk from comment #5)
> Fair enough, but it did pass Try.

Yeah. But this is the kind of thing that breaks some extension somewhere down the line, and it's great to let the regression-finders pinpoint it down to a small well-understood change rather than a megapatch.
Attached patch Proposed patch (obsolete) — Splinter Review
Assignee: nobody → neil
Attachment #8369339 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #8372833 - Flags: review?(bobbyholley)
(In reply to Bobby Holley from comment #4)
> I'm happy to unify the useAllocator/!useAllocator behavior if we think it's
> likely that people aren't depending on it, but we should do that in a
> separate patch for easy regression detection.

Here's an example of where the useAllocator behaviour is the oddball:

> var ss = Components.classes['@mozilla.org/supports-string;1']
>                    .createInstance(Components.interfaces.nsISupportsString);
> scs.data = undefined;

This results in scs.data === ""

> var scs = Components.classes['@mozilla.org/supports-cstring;1']
>                     .createInstance(Components.interfaces.nsISupportsCString);
> scs.data = undefined;

This results in scs.data === null

This also applies to AUTF8String, string or wstring parameters.

This also applies to returning values or out parameters from JS to C++ of all the above string types, including AString, because useAllocator is false in this case.
https://tbpl.mozilla.org/?tree=Try&rev=8aaa9f743854

I don't seem to be having much luck with those Linux debug browser chrome tests but everything else seems to have greened up.
Why the switch to IsEmpty as sentinel? It seems like that could be a problem if a caller ends up storing what turns out to be am empty string in the returned buffer, and then it gets reused, right?

It seems like we should be really explicit about nullability. To be honest, I think I'd prefer an array of Maybe<nsString> instances instead, since that would also handle the case in which the caller does SetIsVoid() without realizing that doing so marks the string as available in the cache.
(In reply to Bobby Holley from comment #10)
> Why the switch to IsEmpty as sentinel?
Because nsXPIDLString was too magic, and I realised that I didn't need it.

> It seems like that could be a problem if a caller ends up storing what turns out
> to be am empty string in the returned buffer, and then it gets reused, right?
No, these are all const nsAString& parameters, so they can't be changed.
By "callers" I meant XPCConvert and company. I think we should have explicit state tracking whether the string is in-use. Either a struct {nsString; bool;} or Maybe<nsString>.
(In reply to Bobby Holley from comment #12)
> By "callers" I meant XPCConvert and company. I think we should have explicit
> state tracking whether the string is in-use. Either a struct {nsString;
> bool;} or Maybe<nsString>.

There is only one caller, and it always sets it to a non-empty string...
It still feels like too much of a foot-gun to me. I think we need a sentinel mechanism that is not itself a valid string.

What is the overhead of:

str.Truncate();
str.Assign("foo");

Compared to destruction + construction?
Comment on attachment 8372833 [details] [diff] [review]
Proposed patch

>+    if (string == &EmptyString() || string == &NullString())
>+        return;
>+
>+    if (string >= mScratchStrings &&
>+        string < mScratchStrings + XPCCCX_STRING_CACHE_SIZE) {
>+        // One of our internal strings is no longer in use, mark
>+        // it as such and free its data.
>+        string->Truncate();
>+        return;
>     }
> 
>     // We're done with a string that's not one of our internal
>     // strings, delete it.
>     delete string;
How about I add an assertion that (once the known cases are excluded) the string is non-empty?
(In reply to neil@parkwaycc.co.uk from comment #15)
> How about I add an assertion that (once the known cases are excluded) the
> string is non-empty?

It'd certainly be something, but there's no guarantee we'd hit it. My fear is that it would be very easy to write code with the possibility of filling such a string with the empty string, even if it rarely happens in practice.

Is there a compelling reason not to use an {nsString, boolean} pair here? It's a bit of extra book-keeping, but the abstraction lives entirely within {Allocate,Delete}ShortLivedString, and doesn't impose any tricky rules on the caller.
So, I wrote some extra assertions to prove me correct, and found a "bug" in XPConnect. Apparently for dipper params we manually allocate an nsAutoString and then expect DeleteString to be able to delete it. (I don't know why we allocate an nsAutoString for AString but an nsCString for ACString parameters. Also prior to DeleteString we used delete on an nsString* to delete it, which is almost as bad.)
(In reply to neil@parkwaycc.co.uk from comment #17)
> So, I wrote some extra assertions to prove me correct, and found a "bug" in
> XPConnect. Apparently for dipper params we manually allocate an nsAutoString
> and then expect DeleteString to be able to delete it.

Yikes. Can we fix that (in a separate patch)?
(In reply to Bobby Holley from comment #18)
> (In reply to comment #17)
> > So, I wrote some extra assertions to prove me correct, and found a "bug" in
> > XPConnect. Apparently for dipper params we manually allocate an nsAutoString
> > and then expect DeleteString to be able to delete it.
> 
> Yikes. Can we fix that

Sure, now that we have the extra in-use check from Maybe<nsString> we can call NewShortLivedString for dippers too.

> (in a separate patch)?

By your command...
Attachment #8372833 - Attachment is obsolete: true
Attachment #8372833 - Flags: review?(bobbyholley)
Attachment #8374929 - Flags: review?(bobbyholley)
Comment on attachment 8374929 [details] [diff] [review]
Addressed review comments

Review of attachment 8374929 [details] [diff] [review]:
-----------------------------------------------------------------

r=me with comment addressed.

::: js/xpconnect/src/XPCWrappedNative.cpp
@@ +2369,3 @@
>      // UTF8_STRING and CSTRING are also quite similar, and both use nsCString.
>      if (type_tag == nsXPTType::T_ASTRING || type_tag == nsXPTType::T_DOMSTRING)
> +        dp->val.p = nsXPConnect::GetRuntimeInstance()->NewShortLivedString();

Separate this out into a separate patch per the above?
Attachment #8374929 - Flags: review?(bobbyholley) → review+
Really in a separate patch this time.
Attachment #8375089 - Flags: review?(bobbyholley)
Attached patch Fix undefined AString params (obsolete) — Splinter Review
Attachment #8375091 - Flags: review?(bobbyholley)
Comment on attachment 8375091 [details] [diff] [review]
Fix undefined AString params

Review of attachment 8375091 [details] [diff] [review]:
-----------------------------------------------------------------

This is the same patch as the other, and the incorrect one given the title I think.
Attachment #8375091 - Flags: review?(bobbyholley) → review-
Attachment #8375089 - Flags: review?(bobbyholley) → review+
Comment on attachment 8374929 [details] [diff] [review]
Addressed review comments

Bah, I'm not having much luck today am I...
Attachment #8374929 - Attachment description: Addressed review commants → Addressed review comments
Attachment #8375091 - Attachment is obsolete: true
Attachment #8375182 - Flags: review?(bobbyholley)
Attachment #8375182 - Flags: review?(bobbyholley) → review+
Attachment #8375905 - Flags: review?(bobbyholley)
Comment on attachment 8375905 [details] [diff] [review]
Recover external strings

Review of attachment 8375905 [details] [diff] [review]:
-----------------------------------------------------------------

r=me with those.

::: js/xpconnect/src/XPCConvert.cpp
@@ +521,5 @@
>  
>          if (!str) {
>              ws->AssignLiteral(MOZ_UTF16("undefined"));
> +        } else if (XPCStringConvert::IsDOMString(str)) {
> +            nsStringBuffer::FromData((void *)chars)->ToString(length, *ws);

Please add a comment here explaining what this does (i.e. compute the nsStringBuffer of the origin DOMString that was passed into JS, and then assigning that to the nsString we've created, which should cause the underlying buffer to be shared rather than copied).

@@ +523,5 @@
>              ws->AssignLiteral(MOZ_UTF16("undefined"));
> +        } else if (XPCStringConvert::IsDOMString(str)) {
> +            nsStringBuffer::FromData((void *)chars)->ToString(length, *ws);
> +        } else if (XPCStringConvert::IsLiteral(str)) {
> +            ws->AssignLiteral(chars, length);

Please add a comment explaining that literals are effectively interned, so we don't have to worry about the lifetime of their buffer.
Attachment #8375905 - Flags: review?(bobbyholley) → review+
https://hg.mozilla.org/mozilla-central/rev/bb5c1b2dc7e4
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla30
Filed bug 1330671 on adding the non-copying fast-paths to bindings too.
Blocks: 1330671
You need to log in before you can comment on or make changes to this bug.