Closed
Bug 469843
Opened 16 years ago
Closed 16 years ago
Memory bloat because code is both translated and jitted
Categories
(Tamarin Graveyard :: Virtual Machine, defect, P1)
Tracking
(Not tracked)
VERIFIED
FIXED
People
(Reporter: lhansen, Assigned: jodyer)
References
Details
Attachments
(1 file, 2 obsolete files)
658 bytes, patch | lhansen: review+
Presently ABC code is both translated to wordcode and jitted if the jit is enabled. The reason is that if the jit fails, the translation process can't be restarted because some of the ABC structures have been destructively updated. The dual translation causes memory bloat (and right now that memory stays around, which is also not necessary) and incurs a performance cost.
This can be fixed in various ways.
* The jit could be guaranteed never to fail.
* Perhaps better, the ABC structure should not be destructively updated in place, allowing a translation pass to be run multiple times. This would also enable adaptive translation policies where code is initially translated, then jitted, then falls back to translated again later because it's no longer hot, etc.
As a temporary workaround the translated word code should be evicted if the JIT succeeds.
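The workaround can be sketched as follows. This is a minimal illustration, not the code from the attached patch: `MethodCode` and `finishTranslation` are hypothetical names, and the two vectors merely stand in for the word code and the jitted machine code of one method.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-in for the per-method code; not a Tamarin type.
struct MethodCode {
    std::unique_ptr<std::vector<uint32_t>> wordcode; // interpreter form
    std::unique_ptr<std::vector<uint8_t>>  native;   // jitted form
};

// Called once both passes have run over the method.
// jitOk reports whether the JIT pass completed.
// Returns the number of bytes of word code that were released.
inline std::size_t finishTranslation(MethodCode& m, bool jitOk) {
    assert(m.wordcode && "word code is produced before the JIT result is known");
    if (!jitOk) {
        m.native.reset();  // JIT failed: keep the word code as the fallback
        return 0;
    }
    std::size_t freed = m.wordcode->size() * sizeof(uint32_t);
    m.wordcode.reset();    // JIT succeeded: the word code is dead weight
    return freed;
}
```

Calling `finishTranslation(m, true)` on a method with ten words of word code releases 40 bytes and leaves only the native code; with `false` the word code is kept and the native buffer is dropped.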
Comment 1•16 years ago
|
||
non-destructive ABC will be (most likely) required if we wish to explore dynamic disposal of jit'd code.
Comment 2•16 years ago
|
||
What destructive update is occurring in ABC? I thought it could be treated as read-only.
Reporter | ||
Comment 3•16 years ago
|
||
I believe the ABC itself is const but would have to be re-parsed, and that it's the parsed ABC structures that are updated.
Comment 4•16 years ago
|
||
ok good, there used to be an optimization Tommy added ages ago that patched the abc to jump to some generated code that did field initialization. It helped with startup times. But I can no longer find it, it's probably since been removed.
Comment 5•16 years ago
|
||
almost, we generated abc dynamically to initialize fields and then as a last step we jumped to the real ctor. Then we cooked the traits init MethodInfo to point to this stub. The salient point was to JIT the init code instead of having a loop over the traits written in C. Today I suspect we'd generate WORDCODE to do something similar?
Comment 6•16 years ago
|
||
one of the OOM things I was planning on implementing was to kill nanojit and free all machine data pages when we hit the reserve. Would it make sense/be possible to also free all WORDCODE and go back to the abc interpreter?
Comment 7•16 years ago
|
||
we only have one interpreter compiled in at a time. it might make sense to blow away all the generated wordcode, but continuing execution would need to generate more wordcode.
or... we could compile in two interpreters. but i'm not sure that complexity pays for itself.
Reporter | ||
Comment 8•16 years ago
|
||
(In reply to comment #7)
> we only have one interpreter compiled in at a time. it might make sense to
> blow away all the generated wordcode, but continuing execution would need to
> generate more wordcode.
>
> or... we could compile in two interpreters. but i'm not sure that complexity
> pays for itself.
I think we wish to pay for complexity elsewhere: we want to discard as much of the ABC structure as we can, notably the ABC code.
It is however possible that we should consider something that might at first blush seem counterintuitive: using word code only on platforms where the interpreter is likely to be used quite a bit, and interpreting ABC code on platforms where the JIT is likely to dominate (notably desktop). The performance gain from translating to word code on desktop is likely to be slight or nonexistent because only init code will be interpreted, and it may not contain many loops, hence translation costs will cancel execution gains from the faster interpreter.
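The trade-off in the last sentence can be made concrete with a break-even estimate. The cost model below is an assumption made for illustration (the names and units are invented, and no figures in this bug back the example numbers): translation pays off only if the per-execution saving, multiplied by how often each instruction runs, exceeds the one-time translation cost.

```cpp
#include <cassert>

// Illustrative cost model; all fields are assumptions, in arbitrary time units.
struct TranslationCost {
    double translatePerInstr; // one-time cost to translate one ABC instruction
    double abcPerExec;        // cost to interpret one ABC instruction once
    double wordPerExec;       // cost to interpret the equivalent word code once
};

// avgExecs: average number of times each instruction is executed
// by the interpreter before the method is done (or jitted).
inline bool translationPaysOff(const TranslationCost& c, double avgExecs) {
    assert(avgExecs >= 0.0);
    double saved = (c.abcPerExec - c.wordPerExec) * avgExecs;
    return saved > c.translatePerInstr;
}
```

With a hypothetical `TranslationCost{10, 3, 2}`, straight-line init code (`avgExecs` near 1) does not recover the translation cost, while a loop that runs 50 times does.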
Reporter | ||
Comment 9•16 years ago
|
||
Marking this as a blocker for the dec/jan release because it's a regression that showed up in the nov release, technically. A minimum workaround is that we should discard the translated code if the jit succeeds.
Blocks: 469836
Updated•16 years ago
|
Flags: flashplayer-triage+
Flags: flashplayer-qrb?
Reporter | ||
Updated•16 years ago
|
OS: Mac OS X → All
Priority: -- → P1
Hardware: x86 → All
Reporter | ||
Updated•16 years ago
|
Assignee: nobody → jodyer
Assignee | ||
Comment 10•16 years ago
|
||
This last suggestion seems most attractive at the moment, combined with making the jit able to compile more programs successfully. I'm still getting oriented; does anyone know if the old bytecode interpreter still works? I haven't found the right combination of #defines to get it to build, yet.
Jd
Comment 11•16 years ago
|
||
see comments in core/avmbuild.h; defining AVMPLUS_ABC_INTERPRETER should do it (caveat: i haven't tried it)
Reporter | ||
Comment 12•16 years ago
|
||
(In reply to comment #11)
> see comments in core/avmbuild.h; defining AVMPLUS_ABC_INTERPRETER should do it
> (caveat: i haven't tried it)
I believe I at least tested it :-)
Reporter | ||
Comment 13•16 years ago
|
||
(In reply to comment #10)
> This last suggestion seems most attractive at the moment, combined with making
> the jit able to compile more programs successfully. I'm still getting oriented;
> does anyone know if the old bytecode interpreter still works? I haven't found
> the right combination of #defines to get it to build, yet.
I view that proposal (throw away the translated code) as strictly a stopgap measure, as it makes us use more memory and CPU than we need to and effectively reduces the performance and applicability of the JIT. Also, it doesn't speak to the need for being able to discard jitted code and fall back to the interpreter, and then perhaps rejit again in the future, all of which require the abc updating to work differently than now.
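A minimal sketch of what "updating differently" could look like, assuming the parsed ABC is kept read-only and every executable form is derived from it. All types here are hypothetical and the "translation" and "codegen" steps are placeholders that just copy bytes; the point is only that both directions (jit, then fall back) start from the same untouched input.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical types for illustration only.
struct ParsedBody { std::vector<uint8_t> abc; }; // never modified after parsing

enum class Tier { Interpreted, Jitted };

class Method {
public:
    explicit Method(std::shared_ptr<const ParsedBody> body) : body_(std::move(body)) {
        assert(body_);
        toWordcode();
    }
    // Method became hot: compile from the pristine ABC and release the word code.
    void promote() {
        native_.assign(body_->abc.begin(), body_->abc.end()); // placeholder for codegen
        std::vector<uint32_t>().swap(wordcode_);
        tier_ = Tier::Jitted;
    }
    // Method went cold or memory is low: release native code and re-translate.
    void demote() {
        std::vector<uint8_t>().swap(native_);
        toWordcode();
    }
    Tier tier() const { return tier_; }
    std::size_t wordcodeSize() const { return wordcode_.size(); }
    std::size_t nativeSize() const { return native_.size(); }

private:
    void toWordcode() {
        wordcode_.assign(body_->abc.begin(), body_->abc.end()); // placeholder: one word per byte
        tier_ = Tier::Interpreted;
    }
    std::shared_ptr<const ParsedBody> body_;
    std::vector<uint32_t> wordcode_;
    std::vector<uint8_t> native_;
    Tier tier_ = Tier::Interpreted;
};
```

Because `body_` points to a `const` structure, `promote()` and `demote()` can be called in any order and any number of times.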
Comment 14•16 years ago
|
||
Yes, we need to be able to retain the abc for both re-JIT and possibly, although arguably less likely, re-word coding.
The other facet, which Tommy brought up for low-memory situations, is that simply ditching the JIT'd code memory and/or word code is not going to work without some sort of on-stack replacement strategy. Or maybe he was suggesting a more passive approach, such as letting the stack unwind.
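The passive approach could be sketched like this, assuming each block of generated code tracks how many frames are currently executing it. `CodeBlock` and its members are hypothetical; this is not on-stack replacement, only a deferred free that waits for the stack to unwind.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical bookkeeping for one block of generated code.
struct CodeBlock {
    std::vector<uint8_t> bytes; // stands in for machine code or word code
    int activeFrames = 0;       // frames currently executing this block
    bool doomed = false;        // discard requested, free when safe

    void enter() { ++activeFrames; }

    // Returns true if the memory was released on this exit.
    bool leave() {
        assert(activeFrames > 0);
        --activeFrames;
        return tryFree();
    }

    // Request disposal; frees immediately only if no frame is using the code.
    bool discard() {
        doomed = true;
        return tryFree();
    }

private:
    bool tryFree() {
        if (!doomed || activeFrames != 0 || bytes.empty()) return false;
        std::vector<uint8_t>().swap(bytes); // actually release the storage
        return true;
    }
};
```

If `discard()` is called while a frame is still inside the block, the memory stays until the matching `leave()`, which is where it is released.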
Assignee | ||
Comment 15•16 years ago
|
||
Below are some preliminary memory and time numbers for byte code interpreted (word code translator turned off) vs. jitted execution. From this it appears that skipping word code translation on platforms that JIT is a viable option. Anyway, this fix is probably good enough to turn this blocking bug into a non-blocking performance bug.
Jd
Executing 111 tests against vm: ../../../tamarin-redux/platform/mac/shell/build/Release/shell
Executing tests at 2009-01-25 14:54:44.160511
avm: ../../../tamarin-redux/platform/mac/shell/build/Release/shell
avm2: ../../platform/mac/shell/build/Release/shell
iterations: 1
test avm avm2 %sp metric
canaries/simpleflexapp.as 7.6M 6.9M -9.2 memory
jsbench/Crypt.as 51.8M 54.5M 5.2 memory
jsbench/Euler.as 6.9M 6.7M -2.9 memory
jsbench/FFT.as 46.9M 47.5M 1.3 memory
jsbench/HeapSort.as 8.1M 7.9M -2.5 memory
jsbench/LUFact.as 129.7M 128.4M -1.0 memory
jsbench/Moldyn.as 2.0M 1.7M -15.0 memory
jsbench/RayTracer.as 2.2M 1.6M -27.3 memory
jsbench/Series.as 32.1M 32.3M 0.6 memory
jsbench/SOR.as 59.8M 57.6M -3.7 memory
jsbench/SparseMatmult.as 16.9M 16.9M 0.0 memory
jsbench/typed/Crypt.as 0K 0K 9999.0 memory
jsbench/typed/Euler.as 3.3M 3.0M -9.1 memory
jsbench/typed/FFT.as 9.8M 9.7M -1.0 memory
jsbench/typed/HeapSort.as 0K 0K 9999.0 memory
jsbench/typed/LUFact.as 3.7M 3.7M 0.0 memory
jsbench/typed/Moldyn.as 1.7M 1.6M -5.9 memory
jsbench/typed/RayTracer.as 1.6M 1.5M -6.3 memory
jsbench/typed/Series.as 1.9M 1.9M 0.0 memory
jsbench/typed/SOR.as 9.7M 9.7M 0.0 memory
jsbench/typed/SparseMatmult.as 6.3M 6.2M -1.6 memory
misc/boids.as 1.5M 1.5M 0.0 memory
misc/boidshack.as 7.8M 7.6M -2.6 memory
misc/gameoflife.as 2.0M 1.9M -5.0 memory
misc/primes.as 0K 0K 9999.0 memory
scimark/FFT.as 1.7M 1.6M -5.9 memory
scimark/LU.as 3.3M 3.1M -6.1 memory
scimark/MonteCarlo.as 1.7M 1.7M 0.0 memory
scimark/SOR.as 3.3M 3.1M -6.1 memory
scimark/SparseCompRow.as 1.8M 1.7M -5.6 memory
sunspider/access-binary-trees.as 1.6M 1.6M 0.0 memory
sunspider/access-fannkuch.as 0K 0K 9999.0 memory
sunspider/access-nbody.as 1.7M 1.6M -5.9 memory
sunspider/access-nsieve.as 3.5M 3.4M -2.9 memory
sunspider/bitops-3bit-bits-in-byte.as 0K 0K 9999.0 memory
sunspider/bitops-bits-in-byte.as 0K 0K 9999.0 memory
sunspider/bitops-bitwise-and.as 0K 0K 9999.0 memory
sunspider/bitops-nsieve-bits.as 2.0M 1.7M -15.0 memory
sunspider/controlflow-recursive.as 0K 0K 9999.0 memory
sunspider/crypto-aes.as 1.3M 1.2M -7.7 memory
sunspider/crypto-md5.as 1.6M 1.6M 0.0 memory
sunspider/crypto-sha1.as 1.6M 1.6M 0.0 memory
sunspider/date-format-tofte.as 1.6M 1.6M 0.0 memory
sunspider/math-cordic.as 1.6M 1.6M 0.0 memory
sunspider/math-partial-sums.as 1.6M 1.5M -6.3 memory
sunspider/math-spectral-norm.as 1.6M 1.5M -6.3 memory
sunspider/s3d-cube.as 2.0M 1.9M -5.0 memory
sunspider/s3d-morph.as 1.7M 1.6M -5.9 memory
sunspider/s3d-raytrace.as 2.7M 2.4M -11.1 memory
sunspider/string-fasta.as 1.6M 1.5M -6.3 memory
sunspider/string-unpack-code.as 2.6M 2.6M 0.0 memory
sunspider/string-validate-input.as 2.4M 2.6M 8.3 memory
sunspider/as3/access-binary-trees.as 1.6M 1.6M 0.0 memory
sunspider/as3/access-fannkuch.as 0K 0K 9999.0 memory
sunspider/as3/access-nbody.as 0K 0K 9999.0 memory
sunspider/as3/access-nsieve.as 3.1M 3.4M 9.7 memory
sunspider/as3/bitops-3bit-bits-in-byte.as 0K 0K 9999.0 memory
sunspider/as3/bitops-bits-in-byte.as 0K 0K 9999.0 memory
sunspider/as3/bitops-bitwise-and.as 0K 0K 9999.0 memory
sunspider/as3/bitops-nsieve-bits.as 2.0M 2.0M 0.0 memory
sunspider/as3/controlflow-recursive.as 0K 0K 9999.0 memory
sunspider/as3/crypto-aes.as 1.3M 1.2M -7.7 memory
sunspider/as3/crypto-md5.as 1.6M 1.6M 0.0 memory
sunspider/as3/crypto-sha1.as 1.6M 1.6M 0.0 memory
sunspider/as3/date-format-tofte.as 1.6M 1.6M 0.0 memory
sunspider/as3/math-cordic.as 1.7M 1.5M -11.8 memory
sunspider/as3/math-partial-sums.as 1.5M 1.5M 0.0 memory
sunspider/as3/math-spectral-norm.as 0K 0K 9999.0 memory
sunspider/as3/s3d-cube.as 1.6M 1.5M -6.3 memory
sunspider/as3/s3d-morph.as 1.7M 1.6M -5.9 memory
sunspider/as3/s3d-raytrace.as 2.4M 2.0M -16.7 memory
sunspider/as3/string-fasta.as 1.7M 1.6M -5.9 memory
sunspider/as3/string-unpack-code.as 2.7M 2.7M 0.0 memory
sunspider/as3/string-validate-input.as 2.6M 2.5M -3.8 memory
sunspider/as3vector/access-fannkuch.as 0K 0K 9999.0 memory
sunspider/as3vector/access-nbody.as 0K 0K 9999.0 memory
sunspider/as3vector/access-nsieve.as 1.6M 1.6M 0.0 memory
sunspider/as3vector/bitops-nsieve-bits.as 0K 0K 9999.0 memory
sunspider/as3vector/math-cordic.as 1.6M 1.5M -6.3 memory
sunspider/as3vector/math-spectral-norm.as 1.6M 1.6M 0.0 memory
sunspider/as3vector/s3d-cube.as 1.4M 1.4M 0.0 memory
sunspider/as3vector/s3d-morph.as 1.9M 1.9M 0.0 memory
sunspider/as3vector/string-fasta.as 1.6M 1.5M -6.3 memory
sunspider/as3vector/string-validate-input.as 2.3M 2.5M 8.7 memory
v8/crypto.as 2.1M 1.8M -14.3 memory
v8/deltablue.as 1.8M 1.6M -11.1 memory
v8/raytrace.as 1.6M 1.6M 0.0 memory
v8/richards.as 1.3M 1.2M -7.7 memory
v8/typed/crypto.as 2.3M 2.0M -13.0 memory
v8/typed/deltablue.as 1.8M 1.7M -5.6 memory
v8/typed/raytrace.as 1.6M 1.6M 0.0 memory
v8/typed/richards.as 1.6M 1.5M -6.3 memory
Tamarin tests started: 2009-01-25 15:17:38.537061
Executing 111 tests against vm: ../../../tamarin-redux/platform/mac/shell/build/Release/shell
Executing tests at 2009-01-25 15:17:38.587246
avm: ../../../tamarin-redux/platform/mac/shell/build/Release/shell
avm2: ../../platform/mac/shell/build/Release/shell
iterations: 1
test avm avm2 %sp metric
canaries/simpleflexapp.as 424 403 5.2 time
jsbench/Crypt.as 7235 7169 0.9 time
jsbench/Euler.as 24045 23881 0.7 time
jsbench/FFT.as 15630 15993 -2.3 time
jsbench/HeapSort.as 8610 8642 -0.4 time
jsbench/LUFact.as 17835 17370 2.7 time
jsbench/Moldyn.as 21705 21835 -0.6 time
jsbench/RayTracer.as 30431 30389 0.1 time
jsbench/Series.as 18933 19368 -2.2 time
jsbench/SOR.as 66071 66115 -0.1 time
jsbench/SparseMatmult.as 25107 25345 -0.9 time
jsbench/typed/Crypt.as 1223 1150 6.3 time
jsbench/typed/Euler.as 23706 23869 -0.7 time
jsbench/typed/FFT.as 8086 7981 1.3 time
jsbench/typed/HeapSort.as 2805 2891 -3.0 time
jsbench/typed/LUFact.as 17231 16497 4.4 time
jsbench/typed/Moldyn.as 9864 10140 -2.7 time
jsbench/typed/RayTracer.as 3366 3380 -0.4 time
jsbench/typed/Series.as 17234 16919 1.9 time
jsbench/typed/SOR.as 49839 48611 2.5 time
jsbench/typed/SparseMatmult.as 3096 3205 -3.4 time
misc/boids.as 5451 5404 0.9 time
misc/boidshack.as 1162 1185 -1.9 time
misc/gameoflife.as 4052 4103 -1.2 time
misc/primes.as 5750 5772 -0.4 time
scimark/FFT.as 5870 5881 -0.2 time
scimark/LU.as 5807 5826 -0.3 time
scimark/MonteCarlo.as 4372 4271 2.4 time
scimark/SOR.as 5116 5107 0.2 time
scimark/SparseCompRow.as 4299 4377 -1.8 time
sunspider/access-binary-trees.as 80 78 2.6 time
sunspider/access-fannkuch.as 154 155 -0.6 time
sunspider/access-nbody.as 217 214 1.4 time
sunspider/access-nsieve.as 81 83 -2.4 time
sunspider/bitops-3bit-bits-in-byte.as 28 28 0.0 time
sunspider/bitops-bits-in-byte.as 42 43 -2.3 time
sunspider/bitops-bitwise-and.as 287 289 -0.7 time
sunspider/bitops-nsieve-bits.as 73 72 1.4 time
sunspider/controlflow-recursive.as 58 56 3.6 time
sunspider/crypto-aes.as 99 98 1.0 time
sunspider/crypto-md5.as 51 49 4.1 time
sunspider/crypto-sha1.as 50 48 4.2 time
sunspider/date-format-tofte.as 287 277 3.6 time
sunspider/math-cordic.as 118 116 1.7 time
sunspider/math-partial-sums.as 276 267 3.4 time
sunspider/math-spectral-norm.as 105 104 1.0 time
sunspider/s3d-cube.as 140 138 1.4 time
sunspider/s3d-morph.as 85 83 2.4 time
sunspider/s3d-raytrace.as 169 167 1.2 time
sunspider/string-fasta.as 140 137 2.2 time
sunspider/string-unpack-code.as 298 291 2.4 time
sunspider/string-validate-input.as 130 131 -0.8 time
sunspider/as3/access-binary-trees.as 22 22 0.0 time
sunspider/as3/access-fannkuch.as 79 79 0.0 time
sunspider/as3/access-nbody.as 11 10 10.0 time
sunspider/as3/access-nsieve.as 45 43 4.7 time
sunspider/as3/bitops-3bit-bits-in-byte.as 13 13 0.0 time
sunspider/as3/bitops-bits-in-byte.as 15 15 0.0 time
sunspider/as3/bitops-bitwise-and.as 2 2 0.0 time
sunspider/as3/bitops-nsieve-bits.as 41 40 2.5 time
sunspider/as3/controlflow-recursive.as 10 11 -9.1 time
sunspider/as3/crypto-aes.as 72 71 1.4 time
sunspider/as3/crypto-md5.as 59 57 3.5 time
sunspider/as3/crypto-sha1.as 49 46 6.5 time
sunspider/as3/date-format-tofte.as 251 240 4.6 time
sunspider/as3/math-cordic.as 41 40 2.5 time
sunspider/as3/math-partial-sums.as 97 97 0.0 time
sunspider/as3/math-spectral-norm.as 15 16 -6.2 time
sunspider/as3/s3d-cube.as 38 36 5.6 time
sunspider/as3/s3d-morph.as 56 56 0.0 time
sunspider/as3/s3d-raytrace.as 72 69 4.3 time
sunspider/as3/string-fasta.as 96 94 2.1 time
sunspider/as3/string-unpack-code.as 294 286 2.8 time
sunspider/as3/string-validate-input.as 113 114 -0.9 time
sunspider/as3vector/access-fannkuch.as 50 48 4.2 time
sunspider/as3vector/access-nbody.as 11 11 0.0 time
sunspider/as3vector/access-nsieve.as 21 20 5.0 time
sunspider/as3vector/bitops-nsieve-bits.as 9 9 0.0 time
sunspider/as3vector/math-cordic.as 31 31 0.0 time
sunspider/as3vector/math-spectral-norm.as 45 43 4.7 time
sunspider/as3vector/s3d-cube.as 36 36 0.0 time
sunspider/as3vector/s3d-morph.as 52 52 0.0 time
sunspider/as3vector/string-fasta.as 106 103 2.9 time
sunspider/as3vector/string-validate-input.as 118 116 1.7 time
v8/crypto.as 204 205 -0.5 v8
v8/deltablue.as 324 323 0.3 v8
v8/raytrace.as 1013 1034 -2.0 v8
v8/richards.as 316 310 1.9 v8
v8/typed/crypto.as 214 212 0.9 v8
v8/typed/deltablue.as 675 683 -1.2 v8
v8/typed/raytrace.as 2186 2225 -1.8 v8
v8/typed/richards.as 1022 1033 -1.1 v8
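The %sp column is not defined anywhere in this comment. The formulas below are inferred from the printed rows, not taken from the test harness, so treat them as an assumption: the memory rows are consistent with a change relative to the first column, and the time and v8 rows with a difference relative to the second column. The 9999.0 entries on the 0K rows appear to be placeholders where no memory figure was recorded and are not modelled here.

```cpp
#include <cassert>
#include <cmath>

// Inferred from the tables above; not the harness's own code.

// Memory rows: negative means avm2 used less memory than avm.
inline double spMemory(double avm, double avm2) {
    assert(avm != 0.0);
    return (avm2 - avm) / avm * 100.0;
}

// Time and v8 rows: difference relative to the avm2 column.
inline double spTime(double avm, double avm2) {
    assert(avm2 != 0.0);
    return (avm - avm2) / avm2 * 100.0;
}
```

For example, `spMemory(7.6, 6.9)` is about -9.2 (canaries/simpleflexapp.as, memory) and `spTime(424, 403)` is about 5.2 (the same test, time).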
Assignee | ||
Comment 16•16 years ago
|
||
this fix includes some clean up to compile with the word code translator turned off and the use of an instance of NullWriter in the case that the jit is enabled but not used (e.g. init code).
Jd
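The idea of a do-nothing writer can be illustrated with the null object pattern. Only the name NullWriter comes from the comment above; the `Writer` interface, its `write` signature, `CountingWriter`, and `verifyAndEmit` are invented for this sketch and do not reflect the actual Tamarin classes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical output interface fed by a single pass over the ABC.
struct Writer {
    virtual ~Writer() = default;
    virtual void write(uint8_t opcode) = 0;
};

// Discards everything: no word code is produced and no memory is used.
struct NullWriter final : Writer {
    void write(uint8_t) override {}
};

// Stand-in for a real translator; it only counts what it would emit.
struct CountingWriter final : Writer {
    std::size_t emitted = 0;
    void write(uint8_t) override { ++emitted; }
};

// The pass itself does not care which writer it is given.
inline std::size_t verifyAndEmit(const std::vector<uint8_t>& abc, Writer& out) {
    for (uint8_t op : abc) out.write(op);
    return abc.size();
}
```

Passing a `NullWriter` lets the same pass run for code that will only be interpreted from ABC, without a special case inside the pass.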
Attachment #358773 -
Attachment is patch: true
Attachment #358773 -
Attachment mime type: application/octet-stream → text/plain
Assignee | ||
Comment 17•16 years ago
|
||
delete word code if jit is successful.
Jd
Attachment #359811 -
Flags: review?(edwsmith)
Assignee | ||
Comment 18•16 years ago
|
||
Attachment #358773 -
Attachment is obsolete: true
Attachment #359811 -
Attachment is obsolete: true
Attachment #360117 -
Flags: review?(lhansen)
Attachment #359811 -
Flags: review?(edwsmith)
Reporter | ||
Updated•16 years ago
|
Attachment #360117 -
Flags: review?(lhansen) → review+
Updated•16 years ago
|
Status: NEW → ASSIGNED
Assignee | ||
Comment 19•16 years ago
|
||
fix pushed: d6ef62566f1b
Status: ASSIGNED → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Comment 20•15 years ago
|
||
Resolved fixed engineering / work item that has been pushed. Setting status to verified.
Status: RESOLVED → VERIFIED