Last Comment Bug 737624 - Remove support for non-memory XDR operations
: Remove support for non-memory XDR operations
Status: RESOLVED FIXED
:
Product: Core
Classification: Components
Component: JavaScript Engine (show other bugs)
: unspecified
: x86_64 Linux
: -- normal (vote)
: mozilla14
Assigned To: Igor Bukanov
:
: Jason Orendorff [:jorendorff]
Mentors:
Depends on: 730221
Blocks:
  Show dependency treegraph
 
Reported: 2012-03-20 13:45 PDT by Igor Bukanov
Modified: 2012-03-25 06:12 PDT (History)
9 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
v1 (95.32 KB, patch)
2012-03-22 08:11 PDT, Igor Bukanov
no flags Details | Diff | Splinter Review
v2 (117.43 KB, patch)
2012-03-23 02:56 PDT, Igor Bukanov
luke: review+
Details | Diff | Splinter Review

Description Igor Bukanov 2012-03-20 13:45:09 PDT
+++ This bug was initially created as a clone of Bug #730221 +++

Currently XDR API provides infrastructure to transcode scripts and functions into a generic storage. However, this support is rather costly is it means that most XDR operations including simple one like reading 4 bytes goes via a indirect call. In addition it requires to allocate XDR objects on the heap rather than on the stack.

For this reason I suggest to make XDR memory-only with most methods inlined to transcode straight to a byte array. Another idea is to use template<XDRMode mode> to eliminate runtime checks for XDR encoding/decoding mode.
Comment 1 Igor Bukanov 2012-03-22 08:11:21 PDT
Created attachment 608335 [details] [diff] [review]
v1

The patch allows to encode/decode scripts and functions only to/from memory. In addition the XDR implementation uses template<XDRMode> to avoid runtime checks for xdr encoding/decoding modes. See comments at the start of the patch for details.
Comment 2 Luke Wagner [:luke] 2012-03-22 10:22:35 PDT
Comment on attachment 608335 [details] [diff] [review]
v1

Review of attachment 608335 [details] [diff] [review]:
-----------------------------------------------------------------

Wowzers.  Overall, this looks great.  Any chance you could instrument a browser and see how much it saves on total startup time?  (Snappy would probably like to hear about it if it's substantial.)

I have a bit more to review, but I'll post what I have now.

In accordance with the Great Incremental Renaming, how about renaming:
  js/src/jsxdrapi.h ==> js/public/XdrApi.h
  js/src/jsxdr.h    ==> js/src/vm/Xdr.h
and split js/src/jsxdrapi.cpp into
  js/src/vm/Xdr.cpp (for vm/Xdr.h XDRState::* definitions)
  js/src/XdrApi.cpp (for XdrApi.h JS_* definitions)

::: js/src/js.msg
@@ -149,5 @@
> -MSG_DEF(JSMSG_END_OF_DATA,             63, 0, JSEXN_INTERNALERR, "unexpected end of data")
> -MSG_DEF(JSMSG_SEEK_BEYOND_START,       64, 0, JSEXN_INTERNALERR, "illegal seek beyond start")
> -MSG_DEF(JSMSG_SEEK_BEYOND_END,         65, 0, JSEXN_INTERNALERR, "illegal seek beyond end")
> -MSG_DEF(JSMSG_END_SEEK,                66, 0, JSEXN_INTERNALERR, "illegal end-based seek")
> -MSG_DEF(JSMSG_WHITHER_WHENCE,          67, 1, JSEXN_INTERNALERR, "unknown seek whence: {0}")

heh

::: js/src/jsatom.cpp
@@ +666,5 @@
> +#if JS_HAS_XDR
> +
> +template<XDRMode mode>
> +bool
> +XDRAtom(XDRState<mode> *xdr, JSAtom **atomp)

The style we agree on earlier is js::XDRAtom, instead of a surrounding namespace js {}.

@@ +674,5 @@
> +        return xdr->codeString(&str);
> +    }
> +
> +    /*
> +     * Inline JS_XDRString when decoding to avoid JSString allocation

XDRState::codeString

::: js/src/jsscript.cpp
@@ +665,3 @@
>              return false;
> +        if (mode == XDR_DECODE)
> +            script->filename = SaveScriptFilename(cx, filename);

I like what you did with codeCString.  It seems like SaveScriptFilename can still fail and requires a check, though.

@@ +681,5 @@
> +         * compilation errors when instantiating when mode is XDR_ENCODE.
> +         */
> +        XDRDecoder *decoder =
> +            static_cast<XDRDecoder *>(reinterpret_cast<XDRState<XDR_DECODE> *>(xdr));
> +        decoder->initScriptPrincipals(script);

Instead of this monkey business, what about putting initScriptPrincipals on XDRState and then just putting JS_ASSERT(mode == XDR_DECODE).

::: js/src/vm/RegExpObject.cpp
@@ +705,4 @@
>  /* Functions */
>  
>  JSObject *
> +CloneRegExpObject(JSContext *cx, JSObject *obj, JSObject *proto)

wrong direction

@@ +757,2 @@
>  bool
> +XDRScriptRegExpObject(XDRState<mode> *xdr, HeapPtrObject *objp)

Again, js::

::: js/src/vm/ScopeObject.cpp
@@ +935,5 @@
>      JS_ResolveStub,
>      JS_ConvertStub
>  };
>  
> +namespace js {

js::
Comment 3 Igor Bukanov 2012-03-22 11:34:02 PDT
(In reply to Luke Wagner [:luke] from comment #2)
> Wowzers.  Overall, this looks great.  Any chance you could instrument a
> browser and see how much it saves on total startup time?

It is not straightforward due to lazy loading of the cache so there is no single entry that loads everything. In addition, loading into an empty page does not load all cache entries.

> In accordance with the Great Incremental Renaming, how about renaming:
>   js/src/jsxdrapi.h ==> js/public/XdrApi.h
>   js/src/jsxdr.h    ==> js/src/vm/Xdr.h
> and split js/src/jsxdrapi.cpp into
>   js/src/vm/Xdr.cpp (for vm/Xdr.h XDRState::* definitions)
>   js/src/XdrApi.cpp (for XdrApi.h JS_* definitions)

With the patch the XDR API set is reduced to 4 entries/ So what about simply moving the API into jsapi.(h|cpp) and keeping just vm/xdr.(h|cpp_

> > +        XDRDecoder *decoder =
> > +            static_cast<XDRDecoder *>(reinterpret_cast<XDRState<XDR_DECODE> *>(xdr));
> > +        decoder->initScriptPrincipals(script);
> 
> Instead of this monkey business, what about putting initScriptPrincipals on
> XDRState and then just putting JS_ASSERT(mode == XDR_DECODE).

The reason I put initScriptPrincipals to the encoder is to emphasize that the principals are decoder-only feature. But I agree that the double cast looks ugly. So I guess I will move the principals and the init method to the XDRState.
Comment 4 Luke Wagner [:luke] 2012-03-22 11:48:12 PDT
(In reply to Igor Bukanov from comment #3)
> It is not straightforward due to lazy loading of the cache so there is no
> single entry that loads everything.

What about simply putting 'rdtsc' at the beginning end of all public XDR APIs?  The overhead of rdtsc should be amortized enough that the sum gives you an accurate idea over total time in XDR before/after.

> With the patch the XDR API set is reduced to 4 entries/ So what about simply
> moving the API into jsapi.(h|cpp) and keeping just vm/xdr.(h|cpp_

Ah, yes, sounds good.
Comment 5 Igor Bukanov 2012-03-23 02:56:11 PDT
Created attachment 608639 [details] [diff] [review]
v2

The patch addresses the comments. I also removed JS_HAS_XDR. As CloneScript depends on XDR !JS_HAS_XDR has been broken for quite some time.
Comment 6 Luke Wagner [:luke] 2012-03-23 09:05:01 PDT
Comment on attachment 608639 [details] [diff] [review]
v2

Review of attachment 608639 [details] [diff] [review]:
-----------------------------------------------------------------

Nicely done

::: js/src/jsapi.cpp
@@ +105,5 @@
>  #include "vm/RegExpObject-inl.h"
>  #include "vm/RegExpStatics-inl.h"
>  #include "vm/Stack-inl.h"
>  #include "vm/String-inl.h"
> +#include "vm/Xdr.h"

This #include should be below vm/StringBuffer.h

Here is the general style:
  https://wiki.mozilla.org/JavaScript:SpiderMonkey:C++_Coding_Style#Includes
There are 4 or so other #includes that need reordering; could you grep the patch for '#include' and fix the rest up?

::: js/src/jsatom.cpp
@@ +681,5 @@
> +#if IS_LITTLE_ENDIAN
> +    /* Directly access the little endian chars in the XDR buffer. */
> +    const jschar *chars = reinterpret_cast<const jschar *>(xdr->buf.read(nchars * sizeof(jschar)));
> +    atom = js_AtomizeChars(cx, chars, nchars);
> +#else

Nice

::: js/src/jsxdrapi.cpp
@@ +89,5 @@
> +        js_ReportOutOfMemory(cx());
> +        return false;
> +    }
> +    if (data != base) {
> +        base = static_cast<uint8_t *>(data);

Seems like the branch isn't really necessary.

::: js/src/vm/Xdr.h
@@ +59,5 @@
> +
> +class XDRBuffer {
> +  public:
> +    XDRBuffer(JSContext *cx)
> +        : context(cx), base(NULL), cursor(NULL), limit(NULL) { }

Two space indent for : instead of four.

@@ +93,5 @@
> +        return ptr;
> +    }
> +
> +    uint8_t *write(size_t n) {
> +        if (n > size_t(limit - cursor)) {

Might be worth storing 'length' directly to avoid subtraction on every write.

@@ +122,5 @@
> +
> +#if defined IS_LITTLE_ENDIAN
> +
> +inline uint32_t
> +SwabXDR32(uint32_t x)

Is Swab standard lingo?  Since there are only a few uses, how about we go long and say NormalizeByteOrder(16|32)?

@@ +188,5 @@
> +            uint8_t *ptr = buf.write(sizeof tmp);
> +            if (!ptr)
> +                return false;
> +            tmp = SwabXDR16(*n);
> +            memcpy(ptr, &tmp, sizeof tmp);

It seems like memcpy would, in the best case, generate the same code as
  *(uint16_t *)ptr = tmp;
and I would prefer to read this.  If there is not another reason, could you turn all the scalar memcpy's into assignments?
Comment 7 Igor Bukanov 2012-03-23 13:52:53 PDT
(In reply to Luke Wagner [:luke] from comment #6)
> > +            tmp = SwabXDR16(*n);
> > +            memcpy(ptr, &tmp, sizeof tmp);
> 
> It seems like memcpy would, in the best case, generate the same code as
>   *(uint16_t *)ptr = tmp;
> and I would prefer to read this.

The patch removes 4-byte padding that the current XDR code has. As such the reads can be misaligned. For such reads in general one cannot use assigns and memcpy usage is a must. For x86/x64 CPUs that allow misaligned memory access GCC correctly transforms those 4-bytes memcpy back into assignments.
Comment 8 Luke Wagner [:luke] 2012-03-23 13:54:18 PDT
(In reply to Igor Bukanov from comment #7)
Ah, that's the reason.  Thanks for explaining.
Comment 10 :Ms2ger (⌚ UTC+1/+2) 2012-03-24 03:06:16 PDT
Went all red, so backed out

https://hg.mozilla.org/integration/mozilla-inbound/rev/7a39ee24bd89
Comment 11 Igor Bukanov 2012-03-24 04:21:24 PDT
relanded with a fix for improper inbound import - https://hg.mozilla.org/integration/mozilla-inbound/rev/30798fdc5bad
Comment 12 Dão Gottwald [:dao] 2012-03-24 04:35:33 PDT
All red and backed out again.
Comment 13 Ryan VanderMeulen [:RyanVM] 2012-03-24 05:00:58 PDT
http://30.media.tumblr.com/tumblr_m11ykqEl0P1rrf1eeo1_500.jpg
Comment 14 Igor Bukanov 2012-03-24 09:42:11 PDT
(In reply to Dão Gottwald [:dao] from comment #12)
> All red and backed out again.

Sorry about that - I forgot to commit the merge fix before pushing again.

https://hg.mozilla.org/integration/mozilla-inbound/rev/6b6084350c40
Comment 15 Igor Bukanov 2012-03-24 10:22:28 PDT
To measure the effect of the patch I measured using rdtsc number of CPU cycles spent between JS_XDRNewMem/JS_XDRDestroy before the patch and between XDRState constructor/destructor after the patch. When browser starts to show 2 tabs, one with gmail, and one with about:memory (this way most of XDR cache is exercised) I have:

Before the the patch I got 47.4 millions of cycle, the patch reduced that to 33.2, which translates into a win of 11.1 millions of cycles or 3.7 ms of start up time on 3.0 GHz desktop.
Comment 16 Ed Morley [:emorley] 2012-03-25 06:12:43 PDT
https://hg.mozilla.org/mozilla-central/rev/6b6084350c40

Note You need to log in before you can comment on or make changes to this bug.