Closed Bug 878399 Opened 7 years ago Closed 6 years ago

"use asm" affects function toString/toSource

Categories

(Core :: JavaScript Engine, defect)

x86_64
macOS
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla31

People

(Reporter: jruderman, Assigned: bbouvier)

References

Details

(Keywords: testcase)

Attachments

(4 files, 5 obsolete files)

(function() { "use asm"; function f(x) { x = x|0; return x|0; } return f; })()
warning: successfully compiled asm.js code
function f() {
    [native code]
}

This is likely to confuse jsfunfuzz & compareJIT, making it harder to find real "asm compilation changes the result" bugs.  I'd like a fix at least for --enable-more-deterministic shells.
Duplicate of this bug: 983840
We should do this in two parts: 1) modules (i.e. the outer closure), 2) functions (i.e. what should appear in this example).

I have a WIP patch that prints sources of modules. The tricky part is to correctly get the directive and all whitespaces which could precede it, as the --no-asmjs mode is able to recover them (for instance, function f() {       \n"use asm"  \n\t  function g(){}return g}).

My patch works fine for functions not defined with the Function ctor. I'll post a functional version for all cases soon.
So when we have

function<WS1>NAME<WS2>(<WS3>)<WS4>{...}

where WSx is a set of white spaces, printing source (no asmjs mode) will show

function NAME(<WS3>)<WS4>{...}

which means we'd have to keep white spaces inside the parenthesis and after the parenthesis, but not before. Huh.
Attached patch rename-chars-begin (obsolete) — Splinter Review
Coincidentally, I was working on a patch for this on the plane ride recently.  I didn't get far though so I'm happy if you want to take.  A few notes if you want to take this one:
 - toSource() wraps asm.js functions created with 'new Function' in ( ), but not toString
 - I found it much easier to just split out an AsmJSNativeToString() that was called by js::FunctionToString if js:IsAsmJSNative(fun) than trying to hack on js::FunctionToString.
 - Function constructor is a special case for asm.js modules because the ScriptSource doesn't include the "function" keyword or the args.
 - an AsmJSModule now needs to safe two offsets into the ScriptSource: the offset to right after "use asm" (renamed to a better name in this patch), and the offset to the beginning of the module.
Attachment #8393883 - Flags: review?(benj)
Comment on attachment 8393883 [details] [diff] [review]
rename-chars-begin

This patch doesn't contian the AsmJSNativeToString function you're talking about, so resetting review flag (mistake?). I pretty have the same modifications in my local patch, so I'll upload it, unless you have a complete version of the patch?
Attachment #8393883 - Flags: review?(benj)
Flags: needinfo?(luke)
Right, this patch was just a simple preparatory renaming patch that I was suggesting.
Flags: needinfo?(luke)
Attached patch 1 - CleanupsSplinter Review
Cleanups: coding style and typos.
Assignee: general → benj
Status: NEW → ASSIGNED
Attachment #8396486 - Flags: review?(luke)
Attached patch 2 - AsmJSModule to string (obsolete) — Splinter Review
This works for normal functions (thus functions created in eval sections too) and functions created with the Function ctor.

I am kind of worried that the 'funcStart' variable seems to take random fields from the parser but that's the best way to emulate what the interpreter does (which is, taking all text after the opening parenthesis, so that we have the arguments and the body). This way, it keeps spaces in the arguments list and afterwards. The only spaces that aren't kept are the ones after the last statement but it might not really matter.
Attachment #8393883 - Attachment is obsolete: true
Attachment #8396492 - Flags: review?(luke)
Attachment #8396486 - Flags: review?(luke) → review+
Comment on attachment 8396492 [details] [diff] [review]
2 - AsmJSModule to string

Review of attachment 8396492 [details] [diff] [review]:
-----------------------------------------------------------------

Great!  I assume asm.js inner functions are in the next patch?  Also: could you add some test cases for the cases {non-Function, Function with 0,1,2,3 arguments}.

::: js/src/jit/AsmJSLink.cpp
@@ +711,5 @@
>  
>  bool
> +js::IsAsmJSModule(const Value &funValue)
> +{
> +    return IsMaybeWrappedNativeFunction(funValue, LinkAsmJS);

I think you don't want to check IsMaybeWrappedNativeFunction here: if we have a wrapped object, then fun is a wrapper object, not a real function we can call getExtendedSlot() on.  (If it is a wrapper, we'll unwrap and call toString/toSource on the unwrapped function, so no loss of generality.)  Also, I think this function can take a HandleFunction so that the test is simply fun->isNative() && fun->native() == LinkAsmJS.

@@ +718,5 @@
> +JSString *
> +js::AsmJSModuleToString(JSContext *cx, HandleFunction fun, bool lambdaParen)
> +{
> +    JSObject &moduleObj = fun->getExtendedSlot(MODULE_FUN_SLOT).toObject();
> +    AsmJSModule &module = moduleObj.as<AsmJSModuleObject>().module();

Now that we'd have a 3rd use of MODULE_FUN_SLOT, can you factor out a
  AsmJSModuleObject &ModuleFunctionToModuleObject(JSFunction *fun);
?

@@ +728,5 @@
> +
> +    // Whether the function has been created with a Function ctor
> +    bool funCtor = begin == 0 && end == source->length() && source->argumentsNotIncluded();
> +
> +    if (funCtor && !lambdaParen && !out.append("("))

Pre-existing but: 'lambdaParen' is the worst name (and backwards, I think); at least in this function, can you name it addParenToLambda?

::: js/src/jit/AsmJSModule.h
@@ +402,5 @@
>      ProfiledBlocksFunctionVector          perfProfiledBlocksFunctions_;
>  #endif
>  
>      struct Pod {
> +        uint32_t                          numCharsAfterUseAsm_;

I was thinking we could name this funcLength_ and base it off of funcStart_ instead of offsetToEndOfUseAsm_, which is rather odd.

@@ +422,5 @@
>      bool                                  loadedFromCache_;
>      HeapPtr<ArrayBufferObject>            maybeHeap_;
>  
> +    uint32_t                              funcStart_;
> +    uint32_t                              offsetToEndOfUseAsm_;

Could you add a comment before these two fields explaining that they depend on the position of the module within the ScriptSource, which isn't invariant with caching, so they are kept outside of Pod?
Attachment #8396492 - Flags: review?(luke) → review+
Attached patch 2 - AsmJSModule to string (obsolete) — Splinter Review
Finally got my way into saving spaces after the last statement. This was actually not superfluous, as the code checking whether a function has been created with the Function constructor needs to know about the spaces between the last statement and the closing curly brace. It adds one attribute to the Module pod (funcEnd, offset from the start to the end of the module), so I figured out it re-needs review (and you can also give a look at the tests this way if you want).
Attachment #8396492 - Attachment is obsolete: true
Attachment #8398599 - Flags: review?(luke)
Attached patch 3 - AsmJS Functions (obsolete) — Splinter Review
Wonder whether we could find back the line and column number by reusing the source offsets? That would eliminate the lineno and columnNumber fields in the ExportedFunction pod.
Attachment #8398600 - Flags: review?(luke)
Comment on attachment 8398599 [details] [diff] [review]
2 - AsmJSModule to string

Review of attachment 8398599 [details] [diff] [review]:
-----------------------------------------------------------------

Very nice!

::: js/src/jit/AsmJSLink.cpp
@@ +756,5 @@
> +        if (!out.append("("))
> +            return nullptr;
> +
> +        PropertyName *argName = nullptr;
> +        if ((argName = module.globalArgumentName())) {

You could avoid the assigment-expression-used-as-condition (which we try to avoid in SM) via:
  if (PropertyName *argName = module.globalArgumentName()) {
(You can repeat this for all three, since the declarations are local to the 'if'.)

::: js/src/jit/AsmJSModule.cpp
@@ +1299,5 @@
>      cursor = moduleChars.deserialize(cx, cursor);
>      if (!moduleChars.match(parser))
>          return true;
>  
> +    uint32_t funcStart = parser.pc->maybeFunction->pn_body->pn_pos.begin;

If you passed parser.pc->maybeFunction->pn_pos.begin, then, for non-Function-constructor'd functions, the range of chars would include "function name", I believe.  This would allow pushing more into the Function-constructor path in the ToString function.

::: js/src/jit/AsmJSModule.h
@@ +475,5 @@
>          JS_ASSERT(scriptSource_ != nullptr);
>          return scriptSource_;
>      }
> +    uint32_t funcStart() const {
> +        return funcStart_;

It'd be good to add a comment here explaining exactly what we're defining "start" as (naming both the Function-constructor and non cases).

@@ +484,5 @@
>      void initCharsEnd(uint32_t charsEnd) {
> +        JS_ASSERT(charsEnd >= offsetToEndOfUseAsm_);
> +        pod.funcLength_ = charsEnd - funcStart_;
> +    }
> +    void initFuncEnd(uint32_t funcEnd) {

How about merging these two into one function:
  void initFuncEnd(uint32_t endBeforeCurly, uint32_t endAfterCurly)
?

@@ +491,5 @@
>      }
>      uint32_t charsEnd() const {
> +        return funcStart_ + pod.funcLength_;
> +    }
> +    uint32_t funcEnd() const {

Similarly, how about renaming these two funcEndBeforeCurly() and funcEndAfterCurly()?
Attachment #8398599 - Flags: review?(luke) → review+
Comment on attachment 8398600 [details] [diff] [review]
3 - AsmJS Functions

Review of attachment 8398600 [details] [diff] [review]:
-----------------------------------------------------------------

Very nice patch, just one (I think) bug.  I'll have a second quick look after it's fixed.

::: js/src/jit/AsmJSLink.cpp
@@ +830,5 @@
> +    AsmJSModule &module = fun->getExtendedSlot(ASM_MODULE_SLOT).toObject().as<AsmJSModuleObject>().module();
> +
> +    Value v = fun->getExtendedSlot(ASM_EXPORT_INDEX_SLOT);
> +    JS_ASSERT(v.isInt32());
> +    unsigned funIndex = v.toInt32();

As with ModuleFunctionToModuleObject, could you add an
  int32_t FunctionToExportedFunction(HandleFunction fun)
helper to factor out uses of ASM_EXPORT_INDEX_SLOT?

@@ +843,5 @@
> +    // asm.js functions cannot have been created with a Function constructor
> +    // as they belong within a module.
> +    JS_ASSERT(!(begin == 0 && end == source->length() && source->argumentsNotIncluded()));
> +
> +    if (!out.append("function "))

Ah, so maybe I'm wrong in my previous review comment: for a PNK_FUNCTION node 'fn', does 'fn->pn_pos.begin' point to the beginning of the arguments, not including the 'function' keyword?

::: js/src/jit/AsmJSModule.h
@@ +222,5 @@
>              ReturnType returnType_;
>              uint32_t codeOffset_;
>              uint32_t line_;
>              uint32_t column_;
> +

Can you remove this \n?

@@ +224,5 @@
>              uint32_t line_;
>              uint32_t column_;
> +
> +            uint32_t srcStart_;
> +            uint32_t srcEnd_;

This is something a bit tricky here: these are offsets in the ScriptSource and, on cache hit, the ScriptSource (and thus offsets into it) could be different.  (Same situation as AsmJSModule::charsBegin_.)  One solution is to make these offsets relative to the start of the asm.js module (so that they were invariant under different ScriptSources).

(Can you also add a test-case for this different-ScriptSource situation for asm.js functions/modules?  It should be as easy as just prepending a character in the cache-hit case which would throw off all the offsets.)
Attachment #8398600 - Flags: review?(luke)
Carrying forward r+ from luke.

(In reply to Luke Wagner [:luke] from comment #12)
> Comment on attachment 8398599 [details] [diff] [review]
> 2 - AsmJSModule to string
> You could avoid the assigment-expression-used-as-condition (which we try to
> avoid in SM) via:
>   if (PropertyName *argName = module.globalArgumentName()) {
> (You can repeat this for all three, since the declarations are local to the
> 'if'.)
Ha, thanks. I was wondering why gcc asked me to add parenthesis in the expression. That's the reason: the type qualifier has to be in the parenthesis too.

> 
> If you passed parser.pc->maybeFunction->pn_pos.begin, then, for
> non-Function-constructor'd functions, the range of chars would include
> "function name", I believe.  This would allow pushing more into the
> Function-constructor path in the ToString function.

I have tried this and it was working well in this patch. However, I figured out that I missed modules which also are lambdas.
parser.pc->maybeFunction->pn_pos.begin points to the first character of the function for normal functions; for lambdas, it points to the "function" keyword :/ See next patch that adds support for lambdas modules.

> It'd be good to add a comment here explaining exactly what we're defining
> "start" as (naming both the Function-constructor and non cases).
I've added this comment, feel free to suggest better phrasing if you want to.
Attachment #8398599 - Attachment is obsolete: true
Attachment #8400680 - Flags: review+
Just a little addendum to 2, which didn't take care of the fact that the module can also be a lambda. Fortunately, all these weird cases can't happen with the AsmJSFunctions.

The only trick here is that we need to forward the LAMBDA flag from the original function when constructing the module function.
Attachment #8400682 - Flags: review?(luke)
Comment on attachment 8400682 [details] [diff] [review]
4 - AsmJSModule as lambda support

Review of attachment 8400682 [details] [diff] [review]:
-----------------------------------------------------------------

Good catch!

::: js/src/jit-test/tests/asm.js/testSource.js
@@ +22,5 @@
>  
>  assertEq(f0.toString(), funcBody);
>  assertEq(f0.toSource(), funcBody);
>  
> +var f0 = function() {

Could you also add a named lambda test case?

::: js/src/jit/AsmJSLink.cpp
@@ +745,5 @@
>  
>      // Whether the function has been created with a Function ctor
>      bool funCtor = begin == 0 && end == source->length() && source->argumentsNotIncluded();
>  
> +    if (addParenToLambda && (funCtor || fun->isLambda()) && !out.append("("))

If I read FunctionConstructor() correctly, functions constructed by 'Function' are INTERPRETED_LAMBDA, so would fun->isLambda() suffice?
Attachment #8400682 - Flags: review?(luke) → review+
(In reply to Luke Wagner [:luke] from comment #13)
> Comment on attachment 8398600 [details] [diff] [review]
> 3 - AsmJS Functions
> 
> Review of attachment 8398600 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> This is something a bit tricky here: these are offsets in the ScriptSource
> and, on cache hit, the ScriptSource (and thus offsets into it) could be
> different.  (Same situation as AsmJSModule::charsBegin_.)  One solution is
> to make these offsets relative to the start of the asm.js module (so that
> they were invariant under different ScriptSources).
> 
> (Can you also add a test-case for this different-ScriptSource situation for
> asm.js functions/modules?  It should be as easy as just prepending a
> character in the cache-hit case which would throw off all the offsets.)

Good catch! I kinda remember I wanted to write a test for it when I was writing the function but then forgot :( It was indeed wrong, as showed me the test case with eval I added in this patch. Storing offsets to the beginning of the module sounds good.
Attachment #8398600 - Attachment is obsolete: true
Attachment #8400704 - Flags: review?(luke)
True, we can just use the isLambda() call also for functions created with the Function ctor.
Attachment #8400682 - Attachment is obsolete: true
Attachment #8400712 - Flags: review+
Comment on attachment 8400704 [details] [diff] [review]
3 - AsmJS Functions

Review of attachment 8400704 [details] [diff] [review]:
-----------------------------------------------------------------

Beautiful, thanks!  Nice tests.

::: js/src/jit/AsmJS.cpp
@@ +5440,5 @@
> +    JS_ASSERT(funcBegin > m.moduleStart());
> +    JS_ASSERT(funcEnd > m.moduleStart());
> +    funcBegin -= m.moduleStart();
> +    funcEnd -= m.moduleStart();
> +    func->define(funcBegin, funcEnd);

Perhaps put a short comment in front of this block saying "The begin/end char range is relative to the beginning of the module."  Preexisting, but I wonder if "define" isn't a very good name for this method; perhaps finish()?

::: js/src/jit/AsmJSLink.cpp
@@ +327,5 @@
> +static unsigned
> +FunctionToExportedFunctionIndex(HandleFunction fun)
> +{
> +    Value v = fun->getExtendedSlot(ASM_EXPORT_INDEX_SLOT);
> +    JS_ASSERT(v.isInt32());

You can remove this assertion because we can trust toInt32() to do it for us.  Alternatively, you could:
  JS_ASSERT(v.toInt32() >= 0);

@@ +851,5 @@
> +js::AsmJSFunctionToString(JSContext *cx, HandleFunction fun)
> +{
> +    AsmJSModule &module = FunctionToEnclosingModule(fun);
> +
> +    const AsmJSModule::ExportedFunction &f = FunctionToExportedFunction(fun, module);

I'd remove the preceding \n

::: js/src/jit/AsmJSModule.h
@@ +223,5 @@
>              uint32_t codeOffset_;
>              uint32_t line_;
>              uint32_t column_;
> +            // These two fields are offsets to the beginning of the ScriptSource
> +            // of the module

Perhaps you could expand this with ", and thus invariant under serialization (unlike absolute offsets into ScriptSource).".

I'd also rename these fields and the associated parameters and public getters to something like (start,end)OffsetInModule

@@ +599,5 @@
>                               PropertyName *maybeFieldName,
>                               ArgCoercionVector &&argCoercions,
>                               ReturnType returnType)
>      {
> +        ExportedFunction func(name, line, column, srcStart, srcEnd, maybeFieldName,

This would be a good pinch point to assert that (start,end)OffsetInModule are < funcLength_.  With the assertions checked here, you could remove them from CheckFunction() which would allow you to make it a one-liner.
Attachment #8400704 - Flags: review?(luke) → review+
Depends on: 991966
You need to log in before you can comment on or make changes to this bug.