Closed
Bug 225831
Opened 22 years ago
Closed 22 years ago
Byte code generation: using dups instead of temporaries
Categories
(Rhino Graveyard :: Core, defect)
Rhino Graveyard
Core
Tracking
(Not tracked)
VERIFIED
FIXED
1.5R5
People
(Reporter: igor, Assigned: igor)
References
Details
Attachments
(15 files, 1 obsolete file)
7.36 KB,
patch
|
Details | Diff | Splinter Review | |
2.16 KB,
patch
|
Details | Diff | Splinter Review | |
8.49 KB,
patch
|
Details | Diff | Splinter Review | |
37.18 KB,
patch
|
Details | Diff | Splinter Review | |
24.72 KB,
patch
|
Details | Diff | Splinter Review | |
10.52 KB,
patch
|
Details | Diff | Splinter Review | |
3.50 KB,
patch
|
Details | Diff | Splinter Review | |
30.05 KB,
patch
|
Details | Diff | Splinter Review | |
36.42 KB,
patch
|
Details | Diff | Splinter Review | |
14.41 KB,
patch
|
Details | Diff | Splinter Review | |
20.36 KB,
patch
|
Details | Diff | Splinter Review | |
47.27 KB,
patch
|
Details | Diff | Splinter Review | |
13.38 KB,
patch
|
Details | Diff | Splinter Review | |
38.10 KB,
patch
|
Details | Diff | Splinter Review | |
206.64 KB,
patch
|
Details | Diff | Splinter Review |
Currently to implement operations such as x += y, ++x and object and array
literals, {}, [], parser in Rhino generates nodes to hold temporary values.
Interpreter in Rhino does not support more then 256 temporaries and silently
recycles them if their number exceeds 256. The following test case demonstrates
this interpreter limitation.
In principle by changing space taken by temporary index that can be resolved but
it seems that eliminating such temporary nodes and using various forms of dup
operations on stack would make generated byte code smaller and faster both for
interpreter and optimizer.
Assignee | ||
Comment 2•22 years ago
|
||
The test evaluates string of the form ++(a[++f().x + ++f().x + ... + ++f().x])
that will expose recycling of temporaries in interpreter mode when there is
more then 255 ++f terms.
Assignee | ||
Comment 3•22 years ago
|
||
This is trivial part since the patch changes IR-Factory to always creates
Token.INC/Token.DEC nodes for post ++ and --. Previously it happens only for
child nodes of ++ or -- where IRFactory.hasSideEffects() returns true but the
rest of code in Rhino both in interpreter and optimizer modes contained full
support for INC/DEC applied for arbitrary complex lvalue expression. Or did I
miss something?
Comment 4•22 years ago
|
||
Igor: thank you for this testcase! I've added it to the JS testsuite:
mozilla/js/tests/js1_5/Regress/regress-225831.js
As Igor indicates, it currently passes in Rhino compiled mode,
but fails in interpreted mode:
*-* Testcase js1_5/Regress/regress-225831.js failed:
Bug Number 225831
STATUS: Stressing the byte code generator
Failure messages were:
FAILED!: [reported from test()] Section 1 of test -
FAILED!: [reported from test()] Expected value '11', Actual value 'NaN'
Assignee | ||
Comment 5•22 years ago
|
||
The patch ads 3 more sections to the test case to cover post --, {} and []
operations. Without the first fix all sections fails, but with the fix applied
the session 2 testing post-- passes.
Attachment #135601 -
Attachment is obsolete: true
Comment 6•22 years ago
|
||
Igor: thanks, I checked in the extension to the test. But note that
SpiderMonkey doesn't pass Section 4. The failure looks like this:
Expected value = '({1:({a:1}), 2:({a:1}), 3:({a:1}), 4:({a:1}), etc.
Actual value = '({1:{a:1}, 2:{a:1}, 3:{a:1}, 4:{a:1}, etc.
Rhino and SpiderMonkey seem to differ on how toSource() and uneval()
treat objects nested inside objects:
js> var obj = { prop:({a:1}) };
--------------------------------- RHINO -----------------------------------
js> obj.toSource();
({prop:({a:1})})
js> uneval(obj);
({prop:({a:1})})
------------------------------- SPIDERMONKEY -------------------------------
js> obj.toSource();
({prop:{a:1}})
js> uneval(obj);
({prop:{a:1}})
That is, SpiderMonkey doesn't put the parens around the inner object.
What should we do? Is this something that SpiderMonkey should change,
or Rhino?
Assignee | ||
Comment 7•22 years ago
|
||
I will change Rhino to much SM, so the test should use the version without inner
(). So the test would be double-edge sword to check both bytecode generation and
toSource().
Comment 8•22 years ago
|
||
Igor: thanks. I have checked in v 1.3 of the test, without the inner ()'s:
RCS file: /cvsroot/mozilla/js/tests/js1_5/Regress/regress-225831.js,v
retrieving revision 1.2
diff -r1.2 regress-225831.js
106c106
< // build string of the form ({1:({a:1}), 2:({a:1}), ... N:({a:1})})
---
> // build string of the form ({1:{a:1}, 2:{a:1}, ... N:{a:1}})
111c111
< arr[i] = i+":({a:1}), ";
---
> arr[i] = i+":{a:1}, ";
113c113
< arr[N] = N+":({a:1})})";
---
> arr[N] = N+":{a:1}})";
[//d/JS_TRUNK/mozilla/js/tests/js1_5/Regress] cvs ci regress-225831.js
Checking in regress-225831.js;
/cvsroot/mozilla/js/tests/js1_5/Regress/regress-225831.js,v <-- regress-225831.js
new revision: 1.3; previous revision: 1.2
done
Assignee | ||
Comment 9•22 years ago
|
||
I created a branch in CVS templess_bytecode_225831 to put the code for the bug
to merge it later when ready and committed the above patch there.
Status: NEW → ASSIGNED
Assignee | ||
Comment 10•22 years ago
|
||
It turned out that usage of dups or similar stack-based constructions require
proper maintenance of in Interpreter.itsStackDept which is not the case now. It
is not visible since the only consequence of it is wasted stack space but that
will be exposed through stack-based references.
Assignee | ||
Comment 11•22 years ago
|
||
One of the reasons of stack miscalculations is that current IRFactory reuse the
code for "if" statement when generation tree for "?". Since then/else parts of
"?" places an object to stack while then/else of "if" suppose to keep stack the
same, so effectively byte code generation end up assuming 2 additional objects
on stack, not one, for each "?".
The problem presents in the optimizer as well, so class files claims more stack
space then necessary there also.
Patch handles that via using Token.HOOK for parse subtree representing "?" and
adding code to interpreter/optimizer to process that properly.
Patch also adds missed stack adjustments for &&, ||, __proto__ and __parent__
in Interpreter.
I will committ the patch to the branch after regression testing.
Assignee | ||
Comment 12•22 years ago
|
||
Another source for stack depth miscalculations is switch code, where stack was
not updated to account for IF jump. The patch resolves via using 2 new byte
codes, Icode_DUPSECOND duplicating stack element one below top and
Icode_IFEQ_POP which would pop additional stack element in case of successful
jump. In this way switch condition will be popped from the stack when done.
To prevent stack miscalculations, the patch adds explicit checks for expected
stack change during code generation. The overhead of this self checking should
be very low.
The patch also adds one line fix to optimizer/Codegen to release switch
register after code for switches is generated.
Assignee | ||
Comment 13•22 years ago
|
||
The patch changes generation of code for various op= to use dup commands
instead of saving lvalue into temporaries and then restoring it from
temporaries to store assignment result. In this way more compact code is
generated both for interpreter and optimizer and IRFactory code for op= is
simplified since it is no longer necessary to apply optimization of loading
numbers and strings twice instead of using temporaries as DUP is the fastest
way to duplicate even constant values.
The price for this is the need to support 2 additional node types, SETELEM_OP
and SETPROP_OP, but most of the code for them is shared with SETELEM and
SETPROP so the total code bloat is not that big.
Assignee | ||
Comment 14•22 years ago
|
||
The patch changes code to generate [] and {} to use dups instead of temporaries
to get shorter byte code both for interpreter and optimizer. With the patch the
test case now passes in the interpreted mode.
Assignee | ||
Comment 15•22 years ago
|
||
The previous patches removes the need to to have Node.cloneNode so the patch
removes it.
Assignee | ||
Comment 16•22 years ago
|
||
The patch changes interpreter not to use temporaries to calculate both function
reference and function this for function calls. The patch moved away the code
to transform parsing tree in NodeTrandformer and instead adds necessary logic
to interpreter. The code is moved to the OptTransformer so optimizer still uses
it.
The patch changes few special interpreter byte codes that supported function
this calculations to place on the stack both function reference and its this.
In this way byte code for calls became more compact and faster.
It seems that direct code generation from non-transformed call subtree would
shrink generated byte code in the optimizer as well but since various
optimizations there depends on particular transformed tree structure I left it
as is since the optimizer handles references properly
Assignee | ||
Comment 17•22 years ago
|
||
Patch adds new parse tree node, Token.ENUM_ID, to get id string from
enumeration object that is used to implement "for (x in y)". In this way code
to generate byte code to store the result of enumNext to compare it with null
and assign to enum counter becomes unnecessary.
Assignee | ||
Comment 18•22 years ago
|
||
Currently handling of the catch statement involves creation of 2 locals per
catch block, one to store catch object and another to hold catch scope during
its setup. The second usage should use temporary instead since additional
catch, finally or with blocks can not appear between the initial creation of
scope and creation of with object that uses it. So effectively it is a
temporary and as previous patches shows it could be eliminated.
The patch just does that via changing Token.NEWSCOPE nodes into
Token.CATCH_SCOPE since the former was used only to support catch nodes. The
new node is responsible for populating properly the scope objects based on the
exception thus eliminating the last temporary usage in the Interpreter and
simplifying catch parsing trees substantially.
Assignee | ||
Comment 19•22 years ago
|
||
The patch changes omj/optimizer/Block.java to use temporaries instead of locals
to hold results of common this subexpressions since in the optimizer
temporaries are reused while locals are not unless special measures are taken.
Since the previous patches changes moved creation of temporaries to static
method in OptTransformer, the change removed the need to pass IRFactory
instance around Block and FatBlock and the patch removes it.
Assignee | ||
Comment 20•22 years ago
|
||
To allow reuse of additional locals that interpreter/optimizer has to allocate
to support try/catch/finally and for (in) patch adds new parse node node,
Token.LOCAL_BLOCK to indicate sequence of statements that needs new local. In
this way simple stack-like allocation of local slots in the interpreter
provides locals reuse and in the optimizer the locals can be treated in the
same way as temporaries and does not require initial preallocation.
Assignee | ||
Comment 21•22 years ago
|
||
The patch adds checks in Interpreter to throw an exception if number of locals
exceeds 256 and switch to use activation if number of variables exceeds 256. In
this way silent reuse of locals would be avoided and interpreter limitations
are exposed earlier.
Without the the patch js1_5/Regress/regress-226507.js would go to infinite loop
when N there is increased to 257.
Assignee | ||
Comment 22•22 years ago
|
||
The patch adds new parse tree node type, Token.RETURN_POPV, to return the value
stored by the last Token.POPV and updates interpreter/optimizer to support POPV
for functions and the new nodes. With this it in place NodeTransformer changes
code to return from finally to first store via POPV the result of the original
return, then execute finally and then return the stored value fixing bug
225831.
The patch does not depend on changes to avoid using temporaries during
expression evaluation as I initially expected but it does depend on the change
to use recursion of NodeTransformer which is committed on the bug branch so log
it here.
Assignee | ||
Comment 23•22 years ago
|
||
Cumulative patch to merger to HEAD
Assignee | ||
Comment 24•22 years ago
|
||
I committed the change
Status: ASSIGNED → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
Comment 25•22 years ago
|
||
Verified FIXED.
The above testcase used to fail in both the compiled and interpreted
modes of Rhino. Now it passes in both -
Status: RESOLVED → VERIFIED
Assignee | ||
Comment 26•22 years ago
|
||
*** Bug 228446 has been marked as a duplicate of this bug. ***
Assignee | ||
Comment 28•21 years ago
|
||
*** Bug 273745 has been marked as a duplicate of this bug. ***
Comment 29•21 years ago
|
||
I don't really understand in which version it is fixed/will be fixed. Target
milestone indicates 1.5R5 but I have the problem with 1.6R1.
Assignee | ||
Comment 30•21 years ago
|
||
(In reply to comment #29)
> I don't really understand in which version it is fixed/will be fixed. Target
> milestone indicates 1.5R5 but I have the problem with 1.6R1.
That was fixed in 1.5R5. What is your test case? For example, the following case
passes for me in either 1.5R5 or 1.6R1:
var N = 600;
var array = new Array(N);
for (var i = 0; i != N; ++i) {
array[i] = '{index: '+i+', name: "foo'+i+'"}';
}
var script_text = "["+array.join(',\n')+"]";
var script = Script(script_text);
var tab = script();
java.lang.System.out.println(uneval(tab[N - 1]));
Comment 31•21 years ago
|
||
I think that you're right. I'm using js inside htmlunit and I thought that I had
correctly changed my classpath to use version 1.6R1. Testing again it seems to
work fine. Sorry for trouble.
You need to log in
before you can comment on or make changes to this bug.
Description
•