Closed Bug 959268 Opened 7 years ago Closed 4 years ago

Add support for sequentially loading AsmJS modules in XDRScript.

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

RESOLVED WONTFIX

People

(Reporter: nbp, Assigned: nbp)

Attachments

(1 file)

The goal would be to create a JSFunction from the beginChars of the AsmJSModule, which is stored in the extended field of the JSFunction.
This patch encodes the AsmJSModules and decodes them by calling the parser again.
As we do not yet recover an AsmJSModule from the AsmJS cache, we cannot assert that we obtain the same bytecode yet.

I will make another patch to make CompileFunctionBody recover an AsmJS module.

I made a simple test case which should work with the instrumentation added in Bug 900789.
Attachment #8361197 - Flags: review?(luke)
Comment on attachment 8361197 [details] [diff] [review]
XDR AsmJS Modules.

I don't think this is the right way to go about this.  What you seem to do is save just enough of an AsmJSModule to, on decoding, create a phony AsmJSModule, and then call the link-failure path which recompiles from source with asm.js *explicitly disabled*.  Instead, just save the offsets in the source so that you can recompile from source.  Looking again, you probably don't want CompileFunctionBody, since then you have to deal with formal arguments yourself.  Instead, you probably want to write a variation of CompileLazyFunction that is specialized to asm.js and doesn't require a LazyScript.
Attachment #8361197 - Flags: review?(luke)
Summary of a discussion with luke over IRC:

Currently the parser does not finish until we finish the AsmJS compilations or load the AsmJSModules from the cache.

XDR'ed buffers are valid until we change the XDR version number.
AsmJSCache is valid until we change buildid / cpuid.

The XDR decoder tries to emulate what could have been a valid state of the parser. Thus it makes sense to expect AsmJS modules to be present at the end of XDR decoding.  The issue caused by the different cache lifetimes is that if the AsmJS cache is evicted or invalid, loading from XDR implies falling back on reading the source to parse the AsmJS bits.

The solution suggested by luke would be to encode the AsmJS bits within the XDR buffer.  This has one big issue: the AsmJS bits are much larger, and we cannot safely make multiple copies of them.  What is currently done for the AsmJS cache is that we compute the size ahead of time, and we mmap a file of the given size where the AsmJS bits are written.

At the moment XDR does not feature any way to extract the size ahead of time, which might make this approach a bit difficult.

-- end of the summary.

I think one option would be to XDR everything but encode an index which corresponds to the AsmJSModule entry in an array (attached next to the XDR buffer?).  This way, we can encode the XDR buffer ahead of time, and find its size while we encode it.  This means that we do not have to add an extra function for computing the size of the XDR content ahead of time, and that we do not have to keep additional copies of AsmJS buffers in memory.

  .------------.-------------------------.-------------------------.
  | XDR buffer |     Asm JS buffer 1     |     Asm JS buffer 2     |
  '------------'-------------------------'-------------------------'

This way, we can request a precise file size which corresponds to the XDR buffer size plus the size of the AsmJSModules that are indexed, and only serialize the AsmJS buffers into the mmaped file.

For decoding, we would have to allocate the AsmJSModule from the XDR function which is indexing these, and they would be filled at the end of the decoding phase, by reading the rest of the file.
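
As a rough illustration of the out-of-line layout described above, here is a minimal sketch of how the exact file size and the per-module offsets could be derived once the XDR buffer has been encoded. `FileLayout` and `computeLayout` are hypothetical names invented for this example; they are not SpiderMonkey APIs, and the sketch assumes the size of each serialized module is already known.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of the file: the XDR buffer comes first,
// followed by each serialized asm.js module, back to back.
struct FileLayout {
    size_t xdrSize = 0;                 // bytes used by the XDR buffer
    std::vector<size_t> moduleOffsets;  // file offset of module i (the index stored in XDR)
    size_t totalSize = 0;               // exact size to request for the file
};

// Compute where each module lands and how large the whole file must be.
FileLayout computeLayout(size_t xdrSize, const std::vector<size_t>& moduleSizes) {
    FileLayout layout;
    layout.xdrSize = xdrSize;
    size_t offset = xdrSize;
    for (size_t size : moduleSizes) {
        layout.moduleOffsets.push_back(offset);
        offset += size;
    }
    layout.totalSize = offset;
    assert(layout.moduleOffsets.size() == moduleSizes.size());
    return layout;
}
```

With this shape, the XDR buffer only needs to store the module index; the decoder would look up `moduleOffsets[index]` to find the bytes to read after the XDR part has been decoded.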
I don't think we need the exact size of the XDR buffer ahead of time and I also don't think we need to put the asm.js modules out-of-line. Rather:
 - When writing:
   + write the serializedSize
   + write the asm.js module in-line (it'll use exactly serializedSize bytes)
 - When reading
   + read the serializedSize
   + buffer.read(serializedSize) to get a pointer to contiguous memory, then deserialize
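
The write/read steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: `ByteBuffer` and `FakeModule` are hypothetical stand-ins invented here for the XDR buffer and for an asm.js module, and a 32-bit length prefix is assumed. The point it shows is that the module serializes directly into a range of the buffer, so no separate copy of the serialized bytes is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the XDR buffer: growable bytes plus a read cursor.
struct ByteBuffer {
    std::vector<uint8_t> bytes;
    size_t cursor = 0;

    void write(const void* src, size_t n) {
        const uint8_t* p = static_cast<const uint8_t*>(src);
        bytes.insert(bytes.end(), p, p + n);
    }

    // Hand out a pointer to n contiguous bytes and advance the cursor.
    bool read(size_t n, const uint8_t** out) {
        if (n > bytes.size() - cursor) return false;
        *out = bytes.data() + cursor;
        cursor += n;
        return true;
    }
};

// Hypothetical stand-in for a module that can serialize into any memory range.
struct FakeModule {
    std::vector<uint8_t> code;
    size_t serializedSize() const { return code.size(); }
    void serialize(uint8_t* dst) const {
        if (!code.empty()) std::memcpy(dst, code.data(), code.size());
    }
};

// Writing: the size first, then the module in-line (exactly serializedSize bytes).
void encodeModule(ByteBuffer& buf, const FakeModule& m) {
    assert(m.serializedSize() <= UINT32_MAX);
    uint32_t size = static_cast<uint32_t>(m.serializedSize());
    buf.write(&size, sizeof(size));
    size_t start = buf.bytes.size();
    buf.bytes.resize(start + size);
    m.serialize(buf.bytes.data() + start);  // serialize straight into the buffer
}

// Reading: the size first, then a pointer to contiguous memory to deserialize from.
bool decodeModule(ByteBuffer& buf, FakeModule* out) {
    const uint8_t* p = nullptr;
    if (!buf.read(sizeof(uint32_t), &p)) return false;
    uint32_t size;
    std::memcpy(&size, p, sizeof(size));
    if (!buf.read(size, &p)) return false;
    out->code.assign(p, p + size);
    return true;
}
```

Other XDR data can be written before and after each `encodeModule` call; the decoder only needs to call `decodeModule` at the same positions in the stream.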

Anyhow, as we also discussed on IRC, it makes sense to just fail serialization if there is an asm.js module and, once the general caching mechanism works, I'll add the asm.js support.
(In reply to Luke Wagner [:luke] from comment #4)
> I don't think we need the exact size of the XDR buffer ahead of time and I
> also don't think we need to put the asm.js modules out-of-line. Rather:
> 
>  - When writing:
>    + write the serializedSize
>    + write the asm.js module in-line (it'll use exactly serializedSize bytes)
>  - When reading
>    + read the serializedSize
>    + buffer.read(serializedSize) to get a pointer to contiguous memory, then
> deserialize

The only way I see to write the AsmJS module inline would be to make holes of AsmJSModule size which are not allocated but serialized later.

  .-----.---------.-----.---------.-----.
  | XDR | … (1) … | XDR | … (2) … | XDR |
  '-----'---------'-----'---------'-----'

Where (1) & (2) are not allocated but are redirecting to the corresponding AsmJSModules?

Is that what you are suggesting here?
Your diagram looks right, assuming that (1) and (2) contain serialized asm.js modules.  I don't know what you mean by "not allocated"; in your diagram (1) and (2) take up space in the buffer so it seems like they are allocated.
(In reply to Luke Wagner [:luke] from comment #6)
> Your diagram looks right, assuming that (1) and (2) contain serialized
> asm.js modules.  I don't know what you mean "not allocated"; in your diagram
> (1) and (2) take up space in the buffer so it seems like they are allocated.

If we serialize while doing the XDR encoding, then we would have the content of the AsmJSModules duplicated 3 times (for the running code, for the XDR buffer, and for the mmap-ed file).  I do not think this is something that we can afford on mobile, can we?

This is why I suggested the out-of-line layout or the non-allocated holes, so that the encoding costs us only 2 copies instead of 3.  This would save both time and memory.
(In reply to Nicolas B. Pierron [:nbp] from comment #7)
> If we serialize while doing the XDR encoding, then we would have content of
> AsmJSModules duplicated 3 times (for the running code, for the XDR buffer,
> for the mmap-ed file).  I do not think this is something that we can afford
> on mobile, can we?

There is no mmaped file here; asm.js serialization doesn't care about mmap'd files; it'll read/write to any memory range you give it.
Ok, coming back to this issue.  It seems to me that the conclusion would be different today, as we probably don't want to support asm.js code when we XDREncode / XDRDecode.

The reason is that we are going to have WebAssembly soonish, as a substitute for asm.js, and WebAssembly lives in its own file and is not inlined in JS code.  Thus, if we are going to run asm.js code, this would probably be code which came from a short period of time, and which was not updated to use WebAssembly.

Thus I think we should resolve this issue as a non-issue; the safest and fastest way forward is just to ignore any XDREncode with asm.js code inside.
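
For illustration only, the "refuse to encode when asm.js is present" behaviour could look roughly like the sketch below. `FunctionInfo` and `encodeScript` are hypothetical names made up for this example and do not reflect the actual XDR code; the byte format (a 4-byte little-endian length followed by the bytecode) is likewise an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified description of one function in a script.
struct FunctionInfo {
    bool isAsmJSModule;
    std::vector<uint8_t> bytecode;
};

// Returns false and leaves `out` empty if any function is an asm.js module;
// otherwise appends each function as a 4-byte little-endian length plus bytes.
bool encodeScript(const std::vector<FunctionInfo>& functions, std::vector<uint8_t>* out) {
    assert(out != nullptr);
    out->clear();

    // Check up front so that a failed encode never leaves a partial buffer.
    for (const FunctionInfo& f : functions) {
        if (f.isAsmJSModule) return false;
    }

    for (const FunctionInfo& f : functions) {
        uint32_t n = static_cast<uint32_t>(f.bytecode.size());
        for (int i = 0; i < 4; i++) {
            out->push_back(static_cast<uint8_t>(n >> (8 * i)));
        }
        out->insert(out->end(), f.bytecode.begin(), f.bytecode.end());
    }
    return true;
}
```

A caller would treat a `false` result as "this script is not cacheable" and fall back to the normal parse path.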
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → WONTFIX