Closed Bug 830500 Opened 12 years ago Closed 12 years ago

Need way to convert two-byte characters to jsid outside the engine

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla21

People

(Reporter: bholley, Assigned: bholley)

References

Details

Attachments

(1 file, 1 obsolete file)

In the XBL code, I have a field name as a PRUnichar*, and I need it as an id. I can make it a JSString, but there's no way to convert that to a jsid right now without interning it. This would leak the string for the lifetime of the runtime, which we definitely don't want.

Patch coming up.
Waldo points out that you can do PRUnichar* -> JSString -> JSVal -> JS_ValueToId. But I think a direct API is probably better.
Indeed.  And PRUnichar* -> JSString probably gives you a flat or linear string, which would make JS_ValueToId slower than it ideally would be.  jschar* -> jsid directly means we can internally do jschar* -> JSAtom* -> jsid, which is much more efficient (only one string created, not two).

Watch out for jschar* that contain an integer-valued jsid.  I'm not sure offhand what's the best sequence of intermediate calls to handle that possibility, but it does have to be addressed right now.  :-\  Or we could assert that the jschar* never would convert to an integer-valued jsid; I dunno if your use case permits that sort of assertion or not.
(In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #2)
> Watch out for jschar* that contain an integer-valued jsid.  I'm not sure
> offhand what's the best sequence of intermediate calls to handle that
> possibility, but it does have to be addressed right now.  :-\  Or we could
> assert that the jschar* never would convert to an integer-valued jsid; I
> dunno if your use case permits that sort of assertion or not.

I don't know what that's about. Can you explain? I'm just trying to convert a regular old UTF16 string to a jsid.
In the general run of affairs there are two different kinds of jsid: one that contains certain kinds of integers, and one that contains atomized strings.  (There are a few more for E4X now that we might use for ES6 stuff eventually.)  If your characters don't spell out a number, then the correct jsid representation is through an atomized string.  If they do spell out a certain kind of number, then the correct jsid representation is the integer form.  You can determine if an integer matches the integer form using INT_FITS_IN_JSID(int32_t).

This means the first thing you'd want to do is test the characters to see if they contain such an integer.  If they do, you return the integer-id kind.  Otherwise you create a PropertyName* and return NameToId(propname).  But if you can require that your chars never contain such an integer, you could straightaway create the PropertyName* and jsid for it.  I don't know if your use case would let you say that no such integers would ever flow through or not.
Attached patch Implement JS_StringToId. v1 (obsolete) — Splinter Review
Hm. Well, <field>s probably don't have integer names (though I'd need to
double-check the XBL spec), but it seems like if this is an API method it should
handle the general case, right? So where do I check this? Before the call to
AtomizeChars? And how do I do the requisite atoi?
If I am interpreting Mr. Waldo's comments correctly, I believe it would look something like

  JS_FunkyStringToId(jschar *chars, uint32_t len) {
    if (len == 0)
      return JSID_EMPTY;

    double d;
    int32_t i;
    if (js_strtod(chars, chars + len, &end, &d) && end == chars + len && MOZ_DOUBLE_IS_INT32(d, &i))
     return INT_TO_JSID(i);

    return NON_INTEGER_ATOM_TO_JSID(AtomizeChars(chars, len));
  }

Maybe. r?Waldo
Comment on attachment 702002 [details] [diff] [review]
Implement JS_StringToId. v1

Review of attachment 702002 [details] [diff] [review]:
-----------------------------------------------------------------

The general case is whatever the API defines it as.  Because we're moving toward bifurcating uint32_t and non-uint32_t accesses, it probably makes sense to require that the string not contain a uint32_t, and not the slightly-different case of not containing an integer id.  Thus in terms of the created-property space this covers, it covers the complement of JS_IndexToId.

But I guess if it's covering non-uint32_t, then trying to optimize this into always creating an atom-valued jsid is impossible and/or misguided.  :-(  So I think something like this is probably the most reasonable thing to do:

JS_PUBLIC_API(JSBool)
JS_CharsToId(JSContext* cx, JS::TwoByteChars chars, jsid *idp)
{
    RootedAtom atom(cx, AtomizeChars(cx, chars.start(), chars.length()));
    if (!atom)
        return false;
#ifdef DEBUG
    uint32_t dummy;
    MOZ_ASSERT(!atom->isIndex(&dummy));
#endif
    *idp = AtomToId(atom);
    return true;
}

::: js/src/jsapi.cpp
@@ +7079,5 @@
>      return IndexToId(cx, index, id);
>  }
>  
>  JS_PUBLIC_API(JSBool)
> +JS_StringToId(JSContext* cx, const jschar *str, jsid *idp)

I'd name this JS_CharsToId, as String would suggest JSString* to me.  Also note that the prototype in jsapi.h doesn't match this!  And now that bug 811060 has landed, this should take a JS::TwoByteChars.
We used to have a CheckForStringIndex thing... is that dead already?
Yes, bhackett removed the need for it.  Now if you have a jsid, it's appropriately canonicalized.  Which, hm, come to think of it, might have meant there wasn't anything to worry about re getting the right jsid kind.  :-\

Sigh, jsid is a mess.
Comment on attachment 702002 [details] [diff] [review]
Implement JS_StringToId. v1

Flagging for review given comment 9.
Attachment #702002 - Flags: review?(jwalden+bmo)
Comment on attachment 702002 [details] [diff] [review]
Implement JS_StringToId. v1

Review of attachment 702002 [details] [diff] [review]:
-----------------------------------------------------------------

Comment 7.  :-)
Attachment #702002 - Flags: review?(jwalden+bmo) → review-
(In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #9)
> Which, hm, come to think of it, might have
> meant there wasn't anything to worry about re getting the right jsid kind. 

I took this to mean that comment 7 was obsolete. If that's not true, I can implement comment 7.
Attachment #702002 - Attachment is obsolete: true
Comment on attachment 705836 [details] [diff] [review]
Implement JS_CharsToId. v2

Sorry for the lag, here.
Attachment #705836 - Flags: review?(jwalden+bmo)
Comment on attachment 705836 [details] [diff] [review]
Implement JS_CharsToId. v2

Review of attachment 705836 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jsapi.h
@@ +6162,5 @@
>  extern JS_PUBLIC_API(JSBool)
>  JS_IndexToId(JSContext *cx, uint32_t index, jsid *id);
>  
>  /*
> + * Convert chars into a jsid.

Please add a note that the chars can't be an index.
Comment on attachment 705836 [details] [diff] [review]
Implement JS_CharsToId. v2

Review of attachment 705836 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jsapi.cpp
@@ +7082,5 @@
>      return true;
>  }
>  
>  JS_PUBLIC_API(JSBool)
> +JS_CharsToId(JSContext* cx, JS::TwoByteChars chars, jsid *idp)

JS::MutableHandleId for the last parameter, these days.

@@ +7089,5 @@
> +    if (!atom)
> +        return false;
> +#ifdef DEBUG
> +    uint32_t dummy;
> +    MOZ_ASSERT(!atom->isIndex(&dummy));

Add a second argument "API misuse: |chars| must not encode an index" to explain exactly what's wrong for anyone who hits this.
Attachment #705836 - Flags: review?(jwalden+bmo) → review+
(In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #16)
> >  JS_PUBLIC_API(JSBool)
> > +JS_CharsToId(JSContext* cx, JS::TwoByteChars chars, jsid *idp)
> 
> JS::MutableHandleId for the last parameter, these days.

I was under the impression per bug 781070 comment 5 and bug 781070 comment 7 that we were explicitly not using the Handle API in JSAPI except for callbacks. Can you clarify whether that has changed?
Flags: needinfo?(jwalden+bmo)
Flags: needinfo?(terrence)
Summary: Need way to convert JSString to jsid outside the engine → Need way to convert two-byte characters to jsid outside the engine
Yeah, we don't want that to be a handle right now.
Flags: needinfo?(terrence)
Flags: needinfo?(jwalden+bmo)
Oh, for crying out loud - the signature of AtomizeChars changed a few days ago to accept a new template parameter, which is why this bustage didn't show up for me locally or on try. :-(
Fixed and pushed:

https://hg.mozilla.org/integration/mozilla-inbound/rev/2f9cc81bbc67

The build bustage was uniform (a missing template parameter for a function whose call signature changed since my last rebase), that appeared locally for me when I pulled and rebased. So I just made sure it built locally and pushed. Hopefully this is enough.
https://hg.mozilla.org/mozilla-central/rev/2f9cc81bbc67
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla21
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: