Closed
Bug 437136
Opened 16 years ago
Closed 15 years ago
Reducing memory operations from prologue & epilogue of JITed code
Categories
(Tamarin Graveyard :: Tracing Virtual Machine, defect)
Tracking
(Not tracked)
VERIFIED
WONTFIX
People
(Reporter: habals, Unassigned)
Details
Attachments
(1 file)
631 bytes,
patch
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30; InfoPath.1; MS-RTC LM 8; .NET CLR 3.0.04506.648; .NET CLR 3.5.21022)
Build Identifier:

To align the stack, the prologue pushes EBP twice and the epilogue pops it twice. It is possible to replace the first PUSH EBP with SUB ESP, 4 and the matching POP with ADD ESP, 4. This will increase the prologue instruction size by 3 bytes on x86, but it saves one memory operation. I ran it on Mac OS X on a 2.8 GHz Core 2 Duo and got a slight performance improvement. Here are the patch and the performance results.

---
diff -r b7fa522d969b nanojit/Nativei386.cpp
--- a/nanojit/Nativei386.cpp Tue Jun 03 08:54:09 2008 -0700
+++ b/nanojit/Nativei386.cpp Tue Jun 03 15:38:32 2008 -0700
@@ -90,7 +90,8 @@
     NIns *patchEntry = _nIns;
     MR(FP, SP);
     PUSHr(FP); // push ebp twice to align frame on 8bytes
-    PUSHr(FP);
+    //PUSHr(FP);
+    SUBi(SP, 4);

     for(Register i=FirstReg; i <= LastReg; i = nextreg(i))
         if (needSaving&rmask(i))
@@ -175,7 +176,8 @@
     for (Register i=UnknownReg; i >= FirstReg; i = prevreg(i))
         if (restore&rmask(i)) { POP(i); }
-    POP(FP);
+    //POP(FP);
+    ADDi(SP,4);
     POP(FP);
     return _nIns;
 }
---
./runtests.py -i 50 sunspider
Executing tests at 2008-05-31 12:36:50.245578
avm: /Users/habals/tamarin-tracing-unmod/dist/shell/avmshell
avm2: /Users/habals/tamarin-tracing/dist/shell/avmshell

test                                       avm    avm2    %sp
sunspider/access-binary-trees.as          84.0    84.0    0.0
sunspider/access-fannkuch.as             138.0   136.0    1.4
sunspider/access-nbody.as                160.0   160.0    0.0
sunspider/access-nsieve.as                60.0    60.0    0.0
sunspider/bitops-3bit-bits-in-byte.as     14.0    14.0    0.0
sunspider/bitops-bits-in-byte.as          40.0    40.0    0.0
sunspider/bitops-bitwise-and.as          206.0   201.0    2.4
sunspider/bitops-nsieve-bits.as           52.0    52.0    0.0
sunspider/controlflow-recursive.as        30.0    29.0    3.3
sunspider/crypto-aes.as                  169.0   169.0    0.0
sunspider/crypto-sha1.as                  39.0    39.0    0.0
sunspider/math-cordic.as                  52.0    52.0    0.0
sunspider/math-partial-sums.as           196.0   194.0    1.0
sunspider/math-spectral-norm.as           33.0    33.0    0.0
sunspider/s3d-cube.as                    155.0   154.0    0.6
sunspider/s3d-morph.as                    77.0    75.0    2.6
sunspider/string-fasta.as                159.0   155.0    2.5
---

I found that after these pushes in the prologue, SUBi SP,40 is executed. I think that removing these pushes and combining the stack adjustments into a single SUB instruction would give a better performance improvement. However, one of the EBP values on the stack is used, so I'm not sure how to get rid of it. Any comments on whether this is possible, or whether it is the right way to go?

Reproducible: Always

Steps to Reproduce:
1.
2.
3.
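The trade-off described in the report can be illustrated with a toy model. This is only a sketch: `FrameCost` and `simulateFrame` are hypothetical names invented here, nothing in it is nanojit code, and the exact order in which the real JIT emits the instructions is not modelled. It only shows that both variants leave ESP in the same place while the patched variant performs one store and one load fewer.

```cpp
#include <cassert>
#include <cstdint>

// Toy model only: every name here is hypothetical and none of it is nanojit code.
struct FrameCost {
    uint32_t espAfterPrologue;  // ESP once the saved EBP and the alignment slot are in place
    uint32_t espAfterEpilogue;  // ESP once both slots have been released again
    int stores;                 // memory writes made by the two prologue instructions
    int loads;                  // memory reads made by the two epilogue instructions
};

// Simulates the two-slot part of the prologue and epilogue.
// patched == false: PUSH EBP; PUSH EBP ... POP EBP; POP EBP
// patched == true : one PUSH/POP pair replaced by SUB ESP,4 / ADD ESP,4
inline FrameCost simulateFrame(bool patched, uint32_t startEsp) {
    FrameCost c{startEsp, startEsp, 0, 0};
    uint32_t esp = startEsp;

    // Prologue: the real save of EBP always stores 4 bytes.
    esp -= 4;
    ++c.stores;
    // Alignment slot: PUSH writes memory, SUB only adjusts the register.
    esp -= 4;
    if (!patched) ++c.stores;
    c.espAfterPrologue = esp;

    // Epilogue: release the alignment slot, then restore EBP.
    esp += 4;
    if (!patched) ++c.loads;
    esp += 4;
    ++c.loads;
    c.espAfterEpilogue = esp;
    return c;
}
```

Calling `simulateFrame(false, 0x1000)` and `simulateFrame(true, 0x1000)` gives the same stack pointer after the prologue and after the epilogue; only the `stores` and `loads` counters differ.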
Reporter | ||
Comment 1•16 years ago
replacing push & pop to sub & add in prologue
Comment 2•16 years ago
the prologue on windows is aligning esp to 8 bytes, and making sure ebp is also aligned. because of how the code is organized, the prologue is messy and larger than it should be.

further directions that expand the scope but may have more benefit:

- how about an optimized prolog for windows that does the 8-aligning of esp integrated with everything else, rather than two mini-prologs?

- should the pushes of esi, etc. occur after saving ebp, to make the prologue "standard" (aids in debugging)?

- the prolog is only executed when transitioning between interpreter and traces, but not when jumping from one trace to another. it's exactly the same prolog for every trace. we could handcode the prolog once and jump directly to a no-prologue trace. this would mean having 1 prologue, period, vs 1 per trace like now. code size then would not matter.

- related: when calling a helper function that takes a floating point value (e.g. fmod) we do PUSH(ECX) twice, then a store to store the fp value. should we instead do sub esp,8? what's the size/speed tradeoff? are there any issues with esp folding?

Ed
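The 8-byte alignment and the idea of folding several stack adjustments into one can be sketched as plain arithmetic. This is a minimal illustration assuming a power-of-two alignment; `alignDown`, `paddingBytes` and `combinedAdjust` are hypothetical helpers, not nanojit functions. In emitted code the alignment would have to be computed at run time, since the incoming ESP is not known when the trace is compiled.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration; not taken from nanojit.

// Round a stack address down to a power-of-two alignment
// (the x86 stack grows toward lower addresses).
constexpr uint32_t alignDown(uint32_t addr, uint32_t alignment) {
    return addr & ~(alignment - 1u);
}

// Bytes that must be subtracted from ESP to reach that alignment.
constexpr uint32_t paddingBytes(uint32_t esp, uint32_t alignment) {
    return esp - alignDown(esp, alignment);
}

// One combined adjustment: alignment padding plus a fixed frame size.
// The result stays aligned only if frameBytes is a multiple of alignment.
constexpr uint32_t combinedAdjust(uint32_t esp, uint32_t alignment, uint32_t frameBytes) {
    return paddingBytes(esp, alignment) + frameBytes;
}
```

For example, with ESP at 0x100C, an 8-byte alignment and a 40-byte frame, `combinedAdjust` yields 44, and subtracting 44 from 0x100C leaves an 8-byte-aligned address.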
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → WONTFIX
Updated•15 years ago
Status: RESOLVED → VERIFIED