Closed Bug 607695 Opened 9 years ago Closed 9 years ago

Avoid unnecessary JS_GetStringBytes

Categories

(Core :: JavaScript Engine, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
Tracking Status
blocking2.0 --- betaN+

People

(Reporter: igor, Assigned: igor)

References

Details

(Whiteboard: fixed-in-tracemonkey)

Attachments

(1 file, 3 obsolete files)

Currently in many cases JS_GetStringBytes is used to convert jsid, JSString* and similar into char* array just to pass the array into JS_NeFunction and other JSAPI that take char* parameter where the array is converted back into the string.

Irrespective of the fate JS_GetStringBytes in bug 607292 we should avoid this conversations. They not only adds complexity but may also introduce subtle bugs as JS_GetStringBytes strips high byte from jschar. So JS_GetStringBytes -> StringConstructionFromBytes does not produce the original string.
Attached patch untested patch (obsolete) — Splinter Review
The patch adds JS_NewFunctionById, JS_DefineFunctionById so the code can use id directly with these JS API. Another addition is JS_MatchStringAndAscii that compares the string against a pure ASCII C-string (the restriction allows not to deal with encoding issues when comparing strings). That allowed to eliminate a few GetStringBytes/strcmp calls.

The patch also exposes JS_PutEscapedString and JS_FileEscapedString so various debug and dumping functions can use them avoiding GetStringBytes while properly dealing with non-ascii strings.

In few places the patch redactors code to avoid returning JS_GetStringBytes moving that into the caller. This way it should be easier to replace JS_GetStringBytes with a RAII counterpart.
Attached patch v1 (obsolete) — Splinter Review
Here is a rebased and tested patch. For details see comments above.
Attachment #486652 - Attachment is obsolete: true
Attachment #487185 - Flags: review?(mrbkap)
Attached patch v2 (obsolete) — Splinter Review
The new patch replaces few more JS_GetStringBytes cases with JS_MatchAsciiBytes and JS_FileEscapedString.
Attachment #487185 - Attachment is obsolete: true
Attachment #487257 - Flags: review?(mrbkap)
Attachment #487185 - Flags: review?(mrbkap)
Comment on attachment 487257 [details] [diff] [review]
v2

mrbkap is doing b7 blockers, I will take care of this
Attachment #487257 - Flags: review?(mrbkap) → review?(gal)
Attached patch v3Splinter Review
Here is yet another update with more replacements JS_GetStringBytes() by JS_MatchStringAndAscii()
Attachment #487257 - Attachment is obsolete: true
Attachment #487429 - Flags: review?(gal)
Attachment #487257 - Flags: review?(gal)
Comment on attachment 487429 [details] [diff] [review]
v3

we started adding C++ JS API code (search for namespace JS). That seems ideal here. Just do str->compare(const char *) or if you insist on the JS namespace, maybe Compare(JSString *, const char *). What do you think?
(In reply to comment #6)
> Comment on attachment 487429 [details] [diff] [review]
> v3
> 
> we started adding C++ JS API code (search for namespace JS).

All the extra API in the patch should work with a C embedding. But I will use in the bug 607292 JS::AutoStringBytes instead of JSAutoStringBytes for C++-only class.
I've said this before: don't add namespace JS C++ API if the new APIs are just flat functions, because that locks out C embeddings for no gain to anyone.

The purpose of C++ APIs is to do things C cannot do, as Igor's RAII invocation ("Auto" as prefix on helper class name) suggests in comment 7.

/be
(In reply to comment #8)
> The purpose of C++ APIs is to do things C cannot do, as Igor's RAII invocation
> ("Auto" as prefix on helper class name) suggests in comment 7.

I am not even sure that JS:: is a good think for classes unless we renames all C++ helpers from jsapi.h to put them under the namespace. We should not change API just for aesthetic reasons so we do not have option of gradually dropping the JS prefix as things change. So the JS prefix will continue to exist and mixing that with JS:: looks ugly.
blocking2.0: ? → betaN+
I agree with comment 9, we can take this to the wiki or the forthcoming internals list (whatever it's called), or even m.d.t.js-engine if you can cope with spam (by reading in gmail -- ironic). Just hacking JS::Foo into jsapi.h is easy to do in haste and you will repent at leisure.

/be
We are about to replace JSD with JSD2, after that there will be no more C embeddings. Its 2010. Its really time to let go of that one.
(In reply to comment #11)
> We are about to replace JSD with JSD2, after that there will be no more C
> embeddings. Its 2010. Its really time to let go of that one.

This is not the place to foist C++ on all embedders, take it to m.d.t.js-engine!

Also, this in no way justifies piecemeal introduction of JS:: into jsapi.h !!!

:-(

If we think there's a better direction for a C++ API, let's agree on the design and we can certainly incrementally implement it. JS::Foo instead of JS_Foo is *not* a better direction. I'm calling halt. The way this exchange is going, I'm not even willing to trust piecemeal implementation will produce a good API.

/be
Alright. This discussion is for a different bug then though. If you are happy with the "JS_MatchAsciiBytes" name (I just find it terribly ugly), lets stick with igor's patch as is.
Which is why I've been begging Luke to be design-lead for a better C++ API but he is just favorite C++ maven. A troika of Igor, jorendorff (who did a lot to doc and fix the API over the years), and Luke would be fine. Or Igor, jorendorff, Waldo. With a dash of mrbkap on the side. But not here, in a wiki. If we do not have a design, we should make minimal improvements in the C-or-C++ way to jsapi.h.

JSD2 is not "about" to happen, compared to what is going in right now, before Mozilla 2 / Firefox 4. Whatever happens for JSD2, and we should compare notes with jimb and the jsapi.h troika/whatever, it needs designing before random csets show up in jsapi.h or any other header.

This is not a waterfall process. We need only enough design to agree on scope, do's and don'ts, patterns, conventions, namespacing. We can and will iterate. But what is happening now is not coherent, it's not designed except by swapping :: for _.

And we need postingts to m.d.t.js-engine, which is the (spammy if you read via the web) way to reach embedders.

/be
Ascii beats ASCII. Is it really 7-bit ASCII, though? Isn't it really 8-bit ISO-Latin-1, the chopped low byte of our UTF16 pretend-UCS Unicode?

/be
I agree completely with comment 14.
(In reply to comment #15)
> Ascii beats ASCII. Is it really 7-bit ASCII, though?

For a simplest possible interface I need an infallible function. Supporting js_CStringsAreUTF8 would require to deal in one way or another with non-UTF16 JS strings which I wanted to avoid. Hence the restriction to match against pure ascii strings enforced via an assert.
(In reply to comment #17)
> (In reply to comment #15)
> > Ascii beats ASCII. Is it really 7-bit ASCII, though?
> 
> For a simplest possible interface I need an infallible function. Supporting
> js_CStringsAreUTF8 would require to deal in one way or another with non-UTF16
> JS strings which I wanted to avoid. Hence the restriction to match against pure
> ascii strings enforced via an assert.

Another possibility is to add something like JS_MatchStringAndChars(JSString *str, jschar *, size_t length), but that would require to use ugly jschar array initializers.
Igor, the ASCII restriction makes sense.

/be
Comment on attachment 487429 [details] [diff] [review]
v3

>-        JSString* name = ATOM_TO_STRING(fun->atom);
>+        char name[40];

Why 40? Please explain. This should be at least some sort of MAX #define.

>-  const char* name = JS_GetStringBytes(JSID_TO_STRING(aId));
>+  JSString *str = JSID_TO_STRING(aId);
> 
>   // Figure out which listener we're setting.
>   SetListenerFunc func;
>-  if (!strcmp(name, "onmessage")) {
>+  if (JS_MatchStringAndAscii(str, "onmessage")) {
>     func = &nsDOMWorkerScope::SetOnmessage;
>   }
>-  else if (!strcmp(name, "onerror")) {
>+  else if (JS_MatchStringAndAscii(str, "onerror")) {
>     func = &nsDOMWorkerScope::SetOnerror;

I think the right fix here is to atomize onerror and onmessage and then just compare. That's faster, too.

>@@ -2827,20 +2823,19 @@ split_addProperty(JSContext *cx, JSObjec
> static JSBool
> split_getProperty(JSContext *cx, JSObject *obj, jsid id, jsval *vp)
> {
>     ComplexObject *cpx;
> 
>     cpx = split_get_private(cx, obj);
>     if (!cpx)
>         return JS_TRUE;
> 
>-    if (JSID_IS_ATOM(id) &&
>-        !strcmp(JS_GetStringBytes(JSID_TO_STRING(id)), "isInner")) {
>+    if (JSID_IS_ATOM(id) && JS_MatchStringAndAscii(JSID_TO_STRING(id), "isInner")) {

again atom comparison would be faster here
> 
>-    if (JSID_IS_ATOM(id) &&
>-        !strcmp(JS_GetStringBytes(JSID_TO_STRING(id)), "isInner")) {
>+    if (JSID_IS_ATOM(id) && JS_MatchStringAndAscii(JSID_TO_STRING(id), "isInner")) {

and here

I wonder whether JS_AsciiToId() is the better API here. And then just do a comparison. We have the auto id vectors to keep ids alive in heap structures. Up to you.
Attachment #487429 - Flags: review?(gal) → review+
(In reply to comment #20)
> >+  JSString *str = JSID_TO_STRING(aId);
...
> >+  if (JS_MatchStringAndAscii(str, "onmessage")) {
...
> >+  else if (JS_MatchStringAndAscii(str, "onerror")) {
> >     func = &nsDOMWorkerScope::SetOnerror;

> I think the right fix here is to atomize onerror and onmessage and then just
> compare. That's faster, too.

That requires more code to atomize the ids lazily and manage their lifetime even with a help of autoclasses. So unless some benchmark would require a faster implementation I would prefer to keep a simple solution. And even if faster code would be necessary, a version of JS_MatchStringAndAscii that takes the length of the string may be sufficient. The latter would be a compile-time constant so a quick compare against the string length may provide a sufficient speedup.

> >+    if (JSID_IS_ATOM(id) && JS_MatchStringAndAscii(JSID_TO_STRING(id), "isInner")) {
> 
> and here
> 
> I wonder whether JS_AsciiToId() is the better API here. And then just do a
> comparison.

From code simplicity point of view something like JS_MatchIdAndAscii would be a winner here. But the conclusion may change if one take into account that currently these id/ascii matching sequences do not properly account for Object.getOwnPropertyNames that is called before the property is resolved. That would definitely require atomization and a common code to pre-atomize the id may win. In any case, this is for another bug.

> We have the auto id vectors to keep ids alive in heap structures.

Those are useful for temporary rooting, not for a permanent storage of ids in heap structures.
http://hg.mozilla.org/tracemonkey/rev/d1c6cef6da3a
Whiteboard: fixed-in-tracemonkey
Uh...  This can't be right:


    30.9 +            JSString* str = JS_ValueToString(cx, s);
   30.10  
   30.11 -            if(!(str = JS_ValueToString(cx, s))||
   30.12 -               !(bytes = JS_GetStringBytes(str)))
   30.13 +            if(str)
   30.14              {
   30.15                  return JS_FALSE;

Do we seriously have no tests for this codepath?
> +    /* Only one memberId or memberName should be given. */

"one of"?
http://hg.mozilla.org/tracemonkey/rev/a4ed852d402a - fix for comment 23 and comment 24. Thanks for finding this!
Igor, can you please file a bug on adding a test for that codepath?  Thanks!
Depends on: greenpocalypse
http://hg.mozilla.org/mozilla-central/rev/d1c6cef6da3a
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.