Closed Bug 1316081 Opened 3 years ago Closed 3 years ago

Encode XDR after the bytecode generator.

Categories

(Core :: JavaScript Engine, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla54
Tracking Status
firefox54 --- fixed

People

(Reporter: nbp, Assigned: nbp)

References

Details

Attachments

(3 files, 2 obsolete files)

+++ This bug was initially created as a clone of Bug #1311020 +++

Currently XDR encode is able to encode script after the execution.  Unfortunately this has multiple drawbacks which are that we cannot do it in parallel of the execution, and we have to disable a memory optimization for large object literals (JSOP_OBJECT).

Jon suggested in Bug 1311020 comment 2 that instead of encoding the JSScript after the execution, we could encode them as we parse them.

Doing it this way would distribute the cost of the JSScript encoding across the execution, without blocking the main thread in a non-interruptible manner.

Moreover, this would allow us to keep the memory optimization of JSOP_OBJECT, as we would encode the JSObject before they got a chance to be mutated.

This raise additional difficulties which is that XDR is currently creating JSScript by doing a top-down traversal of the functions/scripts which are encoded.  If we serialize the code as we go, we would have to reconstruct the format expected by XDR buffers, or create a new format for XDR content.
No longer depends on: 1311020
See Also: → 1311020
Ok, here is the idea that I will implement:

To encode each JSScript as soon as they are being compiled into bytecode, we have to be able to find the associated XDREncoder to use.  Thus, I will add a Incremental-XDREncoder owned by the ScriptSourceObject.

At one point the Incremental-XDREncoder would be moved out of the ScriptSourceObject in order to be written to disk by another thread.

the Incremental-XDREncoder would be similar to the current XDREncoder with the exception that we will record the associated locations of XDRInterpretedFunction calls in terms of XDR buffer ranges.  These ranges would be indexed by some JSFunction properties, such as begin & end, which uniquely identify a JSFunction from a ScriptSourceObject.

We should use this Incremental-XDREncoder to encode the top-level JSScript, and use it again to encode each function which is being delazified.

When we encode a delazyfied function, we would redo the XDRInterpretedFunction call, and record with the begin&end-key the range of the delazyfied XDR content.

When we serialize, we would start from the top-level, and instead of encoding the ranges of the JSFunctions with LazyScript, we copy the delazyfied content.  This implies that we basically reconstruct the sequence ranges which have to be memcpy in the final buffer that we save in the cache.

LazyScript encoding should be small enough to not be a big concerned in terms of wasted space.

As a similar concern, we would have to wait for the source compression to finish, such that we do not encode the de-compressed source in the XDR buffer, but hold until the end of compression.
Assignee: nobody → nicolas.b.pierron
Status: NEW → ASSIGNED
This patch adds 3 structures:
 - XDRReplaceablePart & XDRReplaceableHook
 - XDRIncrementalEncoder

* XDRReplaceablePart & XDRReplaceableHook

These are made to scope content written by XDR, which can later be replaced
by another version with the same key.

This is based on the assumption that begin & end of a JSScript / LazyScript
are enough to uniquely identify the bytecode associated to the function for
a given ScriptSource (part 2 adds the XDRIncrementalEncoder on the
ScriptSource)

The XDRRepleaceableHook is used as a simple way to avoid repeating the
template argument of XDRState within the XDRRepleaceablePart structure.

* XDRIncrementalEncoder

This is an XDREncoder which is made to be used multiple times, and to be
linearized before getting the final buffer filled with the encoded data.

The way it is implemented, is by always appending bytes to a temporary
slices_ buffer instead of the targeted transcode buffer given as argument to
the constructor.

When new parts are encoded, with XDRReplaceablePart scopes, we register new
slices in a tree-like structure stored in a HashMap.

When old parts are re-encoded, we replace the previous tree-node, indexed by
the XDRReplaceablePart::Key by a new one.  We do not atempt to manage the
memory more than only appending data into the vector, and leaving old slices
dead within the slices_ vector.

We we linearize the tree, we do a depth-first traversal of the tree
structure stored in the HashMap.

The nodes of the tree are represented by a vector of structures which
reference an index and the length of the slices_ buffer, and the child-node
(JSFunction part) which is interleaved in-between.
Attachment #8826173 - Flags: review?(luke)
This patch makes use of the XDRIncrementalEncoder added in part 1, and adds
an instance of the ScriptSource object.

This is made to find the associated encoder when we delazify a JSFunction,
if the script is currently waiting to be encoded before being cached.
Attachment #8826175 - Flags: review?(luke)
Add new JSAPI to start using the incremental encoder on a given script, and
to later serialize everything.

This patch also instrument the evaluate & evalWithCache functions to add
test cases for the incremental encoder.
Attachment #8826176 - Flags: review?(luke)
This is directly tied to the BCE and other details that I'm no longer familiar with, so I think shu would be the better reviewer here.
Comment on attachment 8826173 [details] [diff] [review]
part 1 - Add XDRIncrementalEncoder to replace delazified LazyScript in the encoded XDR buffer.

(see comment 2)
Attachment #8826173 - Flags: review?(luke) → review?(shu)
Comment on attachment 8826175 [details] [diff] [review]
part 2 - Add an XDRIncrementalEncoder instance on the ScriptSource.

(see comment 3)
Attachment #8826175 - Flags: review?(luke) → review?(shu)
Comment on attachment 8826176 [details] [diff] [review]
part 3 - Expose a new JSAPI to incrementally encode bytecode when it is generated.

(see comment 4)
Attachment #8826176 - Flags: review?(luke) → review?(shu)
Comment on attachment 8826173 [details] [diff] [review]
part 1 - Add XDRIncrementalEncoder to replace delazified LazyScript in the encoded XDR buffer.

Review of attachment 8826173 [details] [diff] [review]:
-----------------------------------------------------------------

I had trouble fully understanding the incremental encoding algorithm. Many type and variable names should be clarified and renamed. The setReplaceablePart function should also be split up instead of taking the bool parameter.

::: js/src/vm/Xdr.cpp
@@ +209,5 @@
> +
> +    if (fun->isInterpretedLazy()) {
> +        static_assert(sizeof(fun->lazyScript()->begin()) == 4 ||
> +                      sizeof(fun->lazyScript()->end()) == 4,
> +                      "Break XDRReplaceablePart key invarriant");

Please make the assert message direct. How about "XDRReplaceablePart key requires LazyScripts to be uint32"

Also typo: invariant

@@ +214,5 @@
> +        key_ = uint64_t(fun->lazyScript()->begin()) << 32 | fun->lazyScript()->end();
> +    } else if (fun->isInterpreted()) {
> +        // We cannot make a static assert with |sourceStart| and |sourceEnd|
> +        // because the accessors return a size_t instead of the uint32_t
> +        // storage.

Add MOZ_ASSERTS that sourceStart and sourceEnd <= UINT32_MAX?

@@ +247,5 @@
> +    return true;
> +}
> +
> +void
> +XDRIncrementalEncoder::setReplacablePart(XDRReplaceablePart* part, bool shouldReplace)

This function is really confusing and needs to be split up into two functions, one for starting a new subtree (shouldReplace = true) and adding a level to the tree and one for going back up a level in the tree (shouldReplace = false). A block comment describing the algorithm would be good.

@@ +252,5 @@
> +{
> +    if (oom_)
> +        return;
> +
> +    // Skip replacable part which have no way to be identified uniquely.

Typo: replaceable

@@ +284,5 @@
> +                oom_ = true;
> +                return;
> +            }
> +        } else {
> +            p->value() = mozilla::Move(tmp);

What's the situation where there's a part with shouldReplace = true that's already in the tree?

@@ +290,5 @@
> +    }
> +    MOZ_ASSERT(p, "The entry should exists by now");
> +    currentParts_ = &p->value();
> +
> +    // Add the first or next entry in the new repleacable part.

Typo: replaceable

::: js/src/vm/Xdr.h
@@ +61,5 @@
>      size_t cursor_;
>  };
>  
> +// This class is used by XDRIncrementalEncoder for detecting sections which can
> +// be substituted when writting to disk.

Typo: writing

@@ +63,5 @@
>  
> +// This class is used by XDRIncrementalEncoder for detecting sections which can
> +// be substituted when writting to disk.
> +class XDRReplaceablePart;
> +class XDRReplaceableHook {

XDRReplaceableHook is a confusing name, since it's acting as a base class and isn't a hook itself.

I'd prefer a more obvious name, like XDRCoderBase.

@@ +66,5 @@
> +class XDRReplaceablePart;
> +class XDRReplaceableHook {
> +  protected:
> +    XDRReplaceableHook() {}
> +    ~XDRReplaceableHook() {}

Is the destructor made virtual in a later patch? Not sure why the empty destructor is typed out here.

@@ +70,5 @@
> +    ~XDRReplaceableHook() {}
> +
> +  public:
> +    virtual bool isIncrementalEncoder() const { return false; }
> +    virtual void setReplacablePart(XDRReplaceablePart*, bool shouldReplace) {}

Typo here and elsewhere: Replacable -> Replaceable

@@ +291,5 @@
>  };
>  
> +class XDRIncrementalEncoder;
> +
> +class MOZ_RAII XDRReplaceablePart

This needs a block comment that should cover:
  - What a "replaceable part" during XDR encoding is -- sub scripts?
  - Why we need these -- inner lazy scripts don't have bytecode immediately, anything else?
  - Invariants -- are these replaced only once? Can they get replaced over and over?
  - That it's only usable with an incremental encoder.

@@ +295,5 @@
> +class MOZ_RAII XDRReplaceablePart
> +{
> +  public:
> +    // For a JSFunction, a replaceable-part key is defined as being:
> +    //     script()->begin << 32 | script()->end

This comment makes it seem like XDRReplaceablePart could be used for non-inner functions. If there's no plan to, it'd be easier to understand if it were called Function instead of Part.

@@ +324,5 @@
> +    using Part = XDRReplaceablePart;
> +
> +    // The incremental encoder encodes the content of scripts and functions in
> +    // the XDRBuffer. The slices from the XDRBuffer are interleaved with
> +    // repleacable parts, which can be substituted by some other sequence of

typo: replaceable

@@ +325,5 @@
> +
> +    // The incremental encoder encodes the content of scripts and functions in
> +    // the XDRBuffer. The slices from the XDRBuffer are interleaved with
> +    // repleacable parts, which can be substituted by some other sequence of
> +    // bits later on.

This is pretty vague. What does "later on" mean?

@@ +327,5 @@
> +    // the XDRBuffer. The slices from the XDRBuffer are interleaved with
> +    // repleacable parts, which can be substituted by some other sequence of
> +    // bits later on.
> +    struct Range {
> +        size_t sliceBegin;

Rename this struct to Slice or rename these fields to rangeBegin and rangeEnd?

@@ +329,5 @@
> +    // bits later on.
> +    struct Range {
> +        size_t sliceBegin;
> +        size_t sliceLength;
> +        Part::Key nextSubPart;

Is nextSubPart a child or a sibling in the tree? It's a child, right? The "next" makes it confusing.

@@ +335,5 @@
> +
> +    // This hashmap encode the tree of replaceable parts. This tree is rooted at
> +    // the |topLevel| key, and one sub-trees can replace another if it has the same
> +    // replacable-part key.
> +    using Parts = Vector<Range, 1, SystemAllocPolicy>;

"Parts" is asking for typos and is also not what it does: it doesn't hold Parts, which is the RAII thing, but Ranges. RangeVector please.

@@ +336,5 @@
> +    // This hashmap encode the tree of replaceable parts. This tree is rooted at
> +    // the |topLevel| key, and one sub-trees can replace another if it has the same
> +    // replacable-part key.
> +    using Parts = Vector<Range, 1, SystemAllocPolicy>;
> +    using TreeParts = HashMap<Part::Key, Parts, DefaultHasher<Part::Key>, SystemAllocPolicy>;

RangeTree preferred.

@@ +341,5 @@
> +
> +    Part* current_;
> +    Parts* currentParts_;
> +
> +    TreeParts parts_;

These 3 names, current_, currentParts_, and parts_, really confused me when reading the code. current_ is some RAII on-stack Part. currentParts_ doesn't hold Part instances, but instead Ranges. parts_ is a hash map of those Ranges.

What is the depth of this tree? The nesting depth of inner functions?

@@ +352,5 @@
> +      : XDREncoder(cx, slices_, 0),
> +        current_(nullptr),
> +        currentParts_(nullptr),
> +        parts_(),
> +        slices_(),

No need for these empty constructor calls.
Attachment #8826173 - Flags: review?(shu)
Comment on attachment 8826173 [details] [diff] [review]
part 1 - Add XDRIncrementalEncoder to replace delazified LazyScript in the encoded XDR buffer.

Review of attachment 8826173 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jsfun.cpp
@@ +596,5 @@
>      }
>  
> +    // Scope everything added below, such that we can substitute it by the
> +    // non-lazy-script version of this function later.
> +    js::XDRReplaceablePart funPart(xdr, fun);

It'd be nice if this had symmetry with the topScriptPart in the next patch and were in the encoder instead of in the otherwise bidirectional XDRInterpretedFunction function. It'd be easier to read and hopefully remove the weird isIncrementalEncoder checks.
Comment on attachment 8826175 [details] [diff] [review]
part 2 - Add an XDRIncrementalEncoder instance on the ScriptSource.

Review of attachment 8826175 [details] [diff] [review]:
-----------------------------------------------------------------

This part looks fine to me.

::: js/src/jsscript.cpp
@@ +1926,5 @@
> +        return false;
> +    }
> +
> +    RootedScript s(cx, script);
> +    XDRReplaceablePart topScriptPart(xdrEncoder_.get(), XDRReplaceablePart::topLevel);

See note for previous patch about seeing if it's possible to make this XDRReplaceablePart symmetric between the top-level script and the function case. If not, xdrFunction needs a comment about where its XDRReplaceablePart is.

::: js/src/jsscript.h
@@ +590,5 @@
> +    // Create a new XDR encoder, and encode the top-level JSScript. The result
> +    // of the encoding would be available in the |buffer| provided as argument,
> +    // as soon as |xdrFinalize| is called and all xdr function calls returned
> +    // successfully.
> +    bool xdrTopLevel(ExclusiveContext* cx, JS::TranscodeBuffer& buffer, HandleScript script);

Since this function only encodes, please rename to xdrEncodeTopLevel. Most XDR functions are bidirectional.

@@ +595,5 @@
> +
> +    // Encode a delazified JSFunction.  In case of errors, the XDR encoder is
> +    // freed and the |buffer| provided as argument to |xdrTopLevel| is
> +    // considered undefined.
> +    bool xdrFunction(ExclusiveContext* cx, HandleFunction fun);

xdrEncodeFunction

@@ +600,5 @@
> +
> +    // Linearize the encoded content in the |buffer| provided as argument to
> +    // |xdrTopLevel|, and free the XDR encoder.  In case of errors, the |buffer|
> +    // is considered undefined.
> +    bool xdrFinalize();

xdrFinalizeEncode
Attachment #8826175 - Flags: review?(shu) → review+
Comment on attachment 8826176 [details] [diff] [review]
part 3 - Expose a new JSAPI to incrementally encode bytecode when it is generated.

Review of attachment 8826176 [details] [diff] [review]:
-----------------------------------------------------------------

Some big picture questions:

1. Is the unit of incrementality LazyScripts? When encoding the top-level scripts, we'll encode all inner JSScripts eagerly. LazyScripts are encoded as LazyScripts and replaced by encoded JSScripts when they're delazified.

2. If a function doesn't get delazified by the time FinishIncrementalEncoding is called, does it get written out to disk as an encoded LazyScript?

3. Related to 2, what does the bytecode cache do about decoded scripts with inner LazyScripts?
Thanks shu for the reviews, the following reply is about the design questions of these patches.

(In reply to Shu-yu Guo [:shu] from comment #9)
> @@ +284,5 @@
> > +                oom_ = true;
> > +                return;
> > +            }
> > +        } else {
> > +            p->value() = mozilla::Move(tmp);
> 
> What's the situation where there's a part with shouldReplace = true that's
> already in the tree?

Then it is being replaced. This is expected, as Lazy functions are registered first and then replaced by their delazified versions.
The test case already check that this behaves as expected.

> @@ +295,5 @@
> > +class MOZ_RAII XDRReplaceablePart
> > +{
> > +  public:
> > +    // For a JSFunction, a replaceable-part key is defined as being:
> > +    //     script()->begin << 32 | script()->end
> 
> This comment makes it seem like XDRReplaceablePart could be used for
> non-inner functions. If there's no plan to, it'd be easier to understand if
> it were called Function instead of Part.

This was the design goal, to have this for more than just functions. On of the idea was to have a place-holder for the compressed source before encoding.

Apparently, this would not yet needed because our top-level parser is waiting for the source compression to finish before being able to run any code. Still, I think this might be something we might want to relax in the future, in which case it would make sense to encode a dummy place-holder for the source, and wait for the compressed source to be encoded in XDR before writing the cache.

> @@ +325,5 @@
> > +
> > +    // The incremental encoder encodes the content of scripts and functions in
> > +    // the XDRBuffer. The slices from the XDRBuffer are interleaved with
> > +    // repleacable parts, which can be substituted by some other sequence of
> > +    // bits later on.
> 
> This is pretty vague. What does "later on" mean?

This is means that as we delazify function, we are going to re-encode a sub-tree that we already have, and replace it by this newly encoded sub-tree.

> @@ +329,5 @@
> > +    // bits later on.
> > +    struct Range {
> > +        size_t sliceBegin;
> > +        size_t sliceLength;
> > +        Part::Key nextSubPart;
> 
> Is nextSubPart a child or a sibling in the tree? It's a child, right? The
> "next" makes it confusing.

Yes, this is a child-tree.

> @@ +335,5 @@
> > +
> > +    // This hashmap encode the tree of replaceable parts. This tree is rooted at
> > +    // the |topLevel| key, and one sub-trees can replace another if it has the same
> > +    // replacable-part key.
> > +    using Parts = Vector<Range, 1, SystemAllocPolicy>;
> 
> "Parts" is asking for typos and is also not what it does: it doesn't hold
> Parts, which is the RAII thing, but Ranges. RangeVector please.

Based on RangeTree name, I think it might makes more sense to name it RangeNode then.

> What is the depth of this tree? The nesting depth of inner functions?

Yes, the depth of the tree is the inner functions depth.

(In reply to Shu-yu Guo [:shu] from comment #12)
> Some big picture questions:
> 
> 1. Is the unit of incrementality LazyScripts? When encoding the top-level
> scripts, we'll encode all inner JSScripts eagerly. LazyScripts are encoded
> as LazyScripts and replaced by encoded JSScripts when they're delazified.

Exactly, we will encode all LazyScript, which are low volume compared to their counter part of delazified functions.

> 2. If a function doesn't get delazified by the time
> FinishIncrementalEncoding is called, does it get written out to disk as an
> encoded LazyScript?

Yes.
This is why this is a JS Startup Bytecode cache.

> 3. Related to 2, what does the bytecode cache do about decoded scripts with
> inner LazyScripts?

IT decodes then as LazyScripts, and when they would be used (after the start-up, or after a divergent start-up), they would be delazified with the compressed sources which is encoded as part of XDR.
(In reply to Shu-yu Guo [:shu] from comment #10)
> Comment on attachment 8826173 [details] [diff] [review]
> part 1 - Add XDRIncrementalEncoder to replace delazified LazyScript in the
> ::: js/src/jsfun.cpp
> @@ +596,5 @@
> >      }
> >  
> > +    // Scope everything added below, such that we can substitute it by the
> > +    // non-lazy-script version of this function later.
> > +    js::XDRReplaceablePart funPart(xdr, fun);
> 
> It'd be nice if this had symmetry with the topScriptPart in the next patch
> and were in the encoder instead of in the otherwise bidirectional
> XDRInterpretedFunction function. It'd be easier to read and hopefully remove
> the weird isIncrementalEncoder checks.

Unfortunately, this is not possible because when we encode the top-level, we have to record within the first codeScript call, all the XDRReplaceablePart (renamed to XDRScope in future patches) which are corresponding to the code of a JSFunction.

If we were to move the XDRReplaceablePart from XDRInterpretedFunction to the xdrFunction  fucntion, then we would not be able to replace it, because we would not know what can be replaced.

I still found a way to remove the isIncrementalEncoder function, by adding a function dedicated for computing the key of the XDRScope.  I will also move the current XDRReplaceablePart from the xdrTopLevel (part 2) to codeScript (part 1).
Delta:
 - a Bunch of renames:
     XDRReplaceableHook -> XDRCoderBase
     XDRReplaceablePart -> XDRScope
     Range -> Slice
     Part -> SlicesNode
     TreeParts -> SlicesTree

 - Add a comment describing graphically the tree representation.

 - Split the fact of opening a child node (createOrReplaceSubTree) & closing
   a child node (endSubTree).

 - Replace isIncrementalEncoder by getScopeKey functions, which by default
   return a noKey, and are overriden by XDRIncrementalEncoder.  This
   simplifies the code for the XDRScope constructor and destructor.

 - Removes the assertion of XDRObjectLiteral, which was made to ensure that
   no caller attempt to encode an object which has already been mutated, by
   checking that we have the getSingletonAsTemplates flag.

   In the case of the incremental encoder, we do not want to keep the
   singleton-as-templates flag, as it would consume extra memory.  As we
   encode scripts before their first execution, there is no risk of encoding
   a mutated singleton object (JSOP_OBJECT).

   Fortunately, this runOnce property is checked by the caller of
   XDRObjectLiteral, in XDRScript.

 - Moved XDRScope from the ScriptSource::xdrEncodeTopLevel to
   XDRState<mode>::codeScript, to feel "symetrical" with the XDRScope under
   XDRInterpretedFunction.
Attachment #8829503 - Flags: review?(shu)
Attachment #8826173 - Attachment is obsolete: true
I am trying this set of patches with the rebased prototype of the bytecode cache (Bug 900784), and I am facing a weird issue which I can summarize with the following stack trace:

#2  0x00007fa3d7cd08e7 in js::ScriptSource::performXDR<(js::XDRMode)0>
#3  0x00007fa3d7cd398c in js::XDRScript<(js::XDRMode)0>
#4  0x00007fa3d7c9cc0a in js::XDRInterpretedFunction<(js::XDRMode)0>
#5  0x00007fa3d7ea16c7 in js::XDRState<(js::XDRMode)0>::codeFunction
#6  0x00007fa3d7ca8131 in js::ScriptSource::xdrEncodeFunction
#7  0x00007fa3d7d72b6e in js::frontend::CompileLazyFunction

This stack trace tells us that a LazyScript can have the OwnSource flag. Strangely, I thought that only the top-level script, which has no JSFunction associated with, could be the owner of the source, but none of the functions below.

The ScriptSource pointer is the same between the frame 7, and the frame 2.
On the other hand, what sounds weird is the fact that the sourceStart_ and sourceEnd_ are respectively 0 and 811, where the source length is actually 640047 long (2 bytes chars).

The file in question is https://webmaker.org/js/lib/react.js and the display name of the function is named "[88]<" (I am not kidding).

This cause the source to be encoded 215 times.
(In reply to Nicolas B. Pierron [:nbp] from comment #16)
> #2  0x00007fa3d7cd08e7 in js::ScriptSource::performXDR<(js::XDRMode)0>
> #3  0x00007fa3d7cd398c in js::XDRScript<(js::XDRMode)0>
> #4  0x00007fa3d7c9cc0a in js::XDRInterpretedFunction<(js::XDRMode)0>
> #5  0x00007fa3d7ea16c7 in js::XDRState<(js::XDRMode)0>::codeFunction

The problem comes from the fact that codeFunction provide no enclosingScript argument, which causes the source to be encoded so many times.
(In reply to Nicolas B. Pierron [:nbp] from comment #16)
> […] the sourceStart_ and sourceEnd_ are respectively 0 and 811, […]

Ok, this is not true, the sourceStart_ and sourceEnd_ are 495139 and 512277.  The problem came from me doing a bad reinterpret case of a Rooted in gdb.  So there is no issue to be fixed with these locations.
Depends on: 1334091
Delta:
 - Add Encode as part of the names of the functions. (comment 11)
 - Remove the XDRScope (XDRreplaceablePart) from the xdrEncodeTopLevel and
   moved it to codeScript function in part 1.
 - Follows the modifications made in Bug 1334091, and provide a sourceObject
   argument to the codeFunction function used in xdrEncodeFunction.
Attachment #8830727 - Flags: review?(shu)
Attachment #8826175 - Attachment is obsolete: true
Comment on attachment 8829503 [details] [diff] [review]
part 1 - Add XDRIncrementalEncoder to replace delazified LazyScript in the encoded XDR buffer.

Review of attachment 8829503 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for the refactoring. The add/removing of subtrees is more understandable now.

r=me with XDRScope renamed to AutoXDRIncrementalTree or something similar. We can bikeshed in the comments, I just need something that's not "Scope".

::: js/src/vm/Xdr.cpp
@@ +133,5 @@
>      RootedScope scope(cx(), &cx()->global()->emptyGlobalScope());
> +    if (mode == XDR_DECODE)
> +        funp.set(nullptr);
> +    else if (getScopeKey(funp) != XDRScope::noKey)
> +        scope = funp->nonLazyScript()->enclosingScope();

This line is confusing due to the ScopeKey and XDRScope naming. The XDRScope and the ScopeKey in the condition has nothing to do with the scope and enclosingScope in the assignment. See comment below.

@@ +260,5 @@
> +        return;
> +
> +    size_t cursor = buf.cursor();
> +
> +    // End the parent slice here, and a key to the child.

and a key to the child -> set the key to the child

@@ +264,5 @@
> +    // End the parent slice here, and a key to the child.
> +    if (parent) {
> +        Slice& last = node_->back();
> +        last.sliceLength = cursor - last.sliceBegin;
> +        last.child = child->key_;

Can we check invariants of the parent and child keys here? I assume it is an invariant that the parent key's (start, end) range subsumes the child key's range.

::: js/src/vm/Xdr.h
@@ +74,5 @@
> +// encoded by an XDRIncrementalEncoder.
> +//
> +// Sections can be encoded any number of times in an XDRIncrementalEncoder, and
> +// the latest encoded version would replace all the previous one.
> +class MOZ_RAII XDRScope

I'm afraid XDRScope is a confusing name, since Scopes (of binding names) are also XDR'd, "XDRScope" makes it sound like it's related to that. Since this is for 1) incremental encoding, 2) for RAII management of nodes in a tree, how about AutoXDRIncrementalTreeNode or something like that, following the "Auto" tradition.

@@ +87,5 @@
> +
> +    XDRScope(XDRCoderBase* xdr, Key key);
> +    ~XDRScope();
> +
> +    // Used to end the slices when there is no children.

Wrong comment about noKey.

@@ +104,5 @@
> +    XDRScope* parent_;
> +    XDRCoderBase* xdr_;
> +};
> +
> +class XDRCoderBase {

Style nit: { on newline

@@ +329,5 @@
>          return sourceObjectOut_;
>      }
>  };
>  
> +class XDRIncrementalEncoder : public XDREncoder {

Style nit: { on newline

@@ +368,5 @@
> +    //      | . |
> +    //      +---+
> +    //
> +    //
> +    // The scope key is used to idenitfy the child nodes, and to make them

Typo: identify

@@ +374,5 @@
> +    //
> +    // The tree is rooted at the |topLevel| key.
> +    //
> +
> +    struct Slice {

Style nit: { on newline

@@ +387,5 @@
> +
> +    // Last opened scope on the stack.
> +    XDRScope* scope_;
> +    // Node corresponding to the opened scope.
> +    SlicesNode* node_;

Could keep node_ on the RAII struct instead of keeping them in sync manually.

@@ +408,5 @@
> +
> +    virtual ~XDRIncrementalEncoder() {}
> +
> +    XDRScope::Key getTopLevelScopeKey() const override;
> +    XDRScope::Key getScopeKey(JSFunction* fun) const override;

Cool, these are much better than the isIncrementalEncoder virtual.

@@ +413,5 @@
> +
> +    MOZ_MUST_USE bool init();
> +
> +    void createOrReplaceSubTree(XDRScope* child) override;
> +    void endSubTree() override;

Thanks for splitting up these methods, this is a lot more understandable for me.
Attachment #8829503 - Flags: review?(shu) → review+
Attachment #8830727 - Flags: review?(shu) → review+
Comment on attachment 8826176 [details] [diff] [review]
part 3 - Expose a new JSAPI to incrementally encode bytecode when it is generated.

Review of attachment 8826176 [details] [diff] [review]:
-----------------------------------------------------------------

Tests look fine.

One more big picture question: is the intention that the startup cache is re-encoded anew each time the process shuts down? That is, we make an in-memory tree either from newly parsed scripts or XDR decoded scripts when we start the process?
Attachment #8826176 - Flags: review?(shu) → review+
Comment on attachment 8830727 [details] [diff] [review]
part 2 - Add an XDRIncrementalEncoder instance on the ScriptSource.

Review of attachment 8830727 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for the cleanup.

One thing I'm unclear on is how JS::EncodeInterpretedFunction is used. What's being passed to it from XPConnect? What if I pass a delazified inner function to it? That'll end up being considered OwnSource, right?
(In reply to Shu-yu Guo [:shu] from comment #22)
> Comment on attachment 8830727 [details] [diff] [review]
> part 2 - Add an XDRIncrementalEncoder instance on the ScriptSource.
> 
> Review of attachment 8830727 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Thanks for the cleanup.
> 
> One thing I'm unclear on is how JS::EncodeInterpretedFunction is used.
> What's being passed to it from XPConnect? What if I pass a delazified inner
> function to it? That'll end up being considered OwnSource, right?

Ignore this comment, I meant that as review to bug 1334091 and typed it in the wrong tab.
(In reply to Shu-yu Guo [:shu] from comment #21)
> One more big picture question: is the intention that the startup cache is
> re-encoded anew each time the process shuts down? That is, we make an
> in-memory tree either from newly parsed scripts or XDR decoded scripts when
> we start the process?

The intent of the JS start-up cache is to persist across processes as part of the alternate-data API provided by the Necko cache.  The idea being to save the bytecode once the page is considered ""idle"" the first time it is used.

The last discussion I had with :bkelly was to use the loadEnd event on the global, as the ""idle"" trigger.  Any request to execute the same script after should pull from the bytecode cache, and no attempt to re-encode anything.

For the moment I only intent to save the bytecode cache on the first visit. The bytecode cache would be discarded as soon as the cache entry is invalidated, either by the remote server, or by the user, or by another version of Firefox (build-id).

The test case does re-encode after the decoding, but this is a sanity check which ensures that we decode properly by verifying that if we encode again we obtain the same bytes that we encoded the first time.
(In reply to Shu-yu Guo [:shu] from comment #20)
> Comment on attachment 8829503 [details] [diff] [review]
> part 1 - Add XDRIncrementalEncoder to replace delazified LazyScript in the
> encoded XDR buffer.
> 
> Review of attachment 8829503 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> r=me with XDRScope renamed to AutoXDRIncrementalTree or something similar.
> We can bikeshed in the comments, I just need something that's not "Scope".

I will name it AutoXDRTree, as the class is still present in both encoding and decoding, even if this is a no-op.

> @@ +387,5 @@
> > +
> > +    // Last opened scope on the stack.
> > +    XDRScope* scope_;
> > +    // Node corresponding to the opened scope.
> > +    SlicesNode* node_;
> 
> Could keep node_ on the RAII struct instead of keeping them in sync manually.

We could, but doing implies moving the Slice and SlicesNode out of the XDRIncrementalEncoder, and also to keep a pointer to the internal data of the XDRIncrementalEncoder on the stack of the XDR function.

I am not sure if this would be more desirable than syncing this pointer manually.
Pushed by npierron@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/05ab647f6e78
part 1 - Add XDRIncrementalEncoder to replace delazified LazyScript in the encoded XDR buffer. r=shu
https://hg.mozilla.org/integration/mozilla-inbound/rev/1dfb4475f41e
part 2 - Add an XDRIncrementalEncoder instance on the ScriptSource. r=shu
https://hg.mozilla.org/integration/mozilla-inbound/rev/c261e950629f
part 3 - Expose a new JSAPI to incrementally encode bytecode when it is generated. r=shu
https://hg.mozilla.org/mozilla-central/rev/05ab647f6e78
https://hg.mozilla.org/mozilla-central/rev/1dfb4475f41e
https://hg.mozilla.org/mozilla-central/rev/c261e950629f
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla54
You need to log in before you can comment on or make changes to this bug.