Closed
Bug 688486
Opened 14 years ago
Closed 13 years ago
[pacman] Speculatively inline Array::getUintProperty in the JIT
Categories
(Tamarin Graveyard :: Library, defect)
Tamarin Graveyard
Library
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: pnkfelix, Assigned: pnkfelix)
References
Details
(Whiteboard: has-patch, PACMAN)
Attachments
(6 files, 6 obsolete files)
3.86 KB,
patch
|
treilly
:
review+
lhansen
:
superreview+
lhansen
:
feedback+
|
Details | Diff | Splinter Review |
41.97 KB,
text/plain
|
Details | |
41.89 KB,
text/plain
|
Details | |
4.48 KB,
patch
|
treilly
:
review+
lhansen
:
superreview+
|
Details | Diff | Splinter Review |
28.70 KB,
patch
|
lhansen
:
review+
|
Details | Diff | Splinter Review |
8.27 KB,
patch
|
lhansen
:
review+
lhansen
:
superreview+
|
Details | Diff | Splinter Review |
For Array fetches on arrays that (1) are dense, (2) are not subclassed [*], (3) have no holes and (4) start at 0, we can use a similar path to the inline code that recently landed for Vector (Bug 599099).
Doing this properly might require adding a new member that tracks whether an array is "simple" (ie conforms to the above constraints) to the Array representation; that is probably acceptable, especially since simple-ness is likely to be required to do a similar treatment for Array::setUintProperty.
[*] Regarding the "not subclassed" constraint: Maybe this constraint is unnecessary? Sealed-ness should not matter for reads (only for writes, i.e. setUintProperty). Do overrides of the length getter matter for the semantics of Array referencing?
Assignee | ||
Updated•14 years ago
|
Blocks: array-tracker
Assignee | ||
Updated•14 years ago
|
Assignee: nobody → fklockii
Assignee | ||
Comment 1•14 years ago
|
||
(FYI: it isn't out of the question to remove other constraints, e.g. (3) and (4) may not be necessary, depending on how much code generation we're willing to pay for in the jit. But the constraints as written provide a simple start that is likely to match common use cases.)
Assignee | ||
Comment 2•14 years ago
|
||
These inline routines are not be called from outside ArrayObject.cpp, at least within AVM.
When I started, I thought it would ease development (mostly by reducing compile times) to have them in the .cpp rather than the -inlines.h header.
In hindsight these routines were not touched by my subsequent work on this bug. So it probably does not matter for now if this patch lands.
Assignee | ||
Comment 3•14 years ago
|
||
I haven't come up with a more clever representation than a separate field in the ArrayObject, and arguably its not worth trying to do so.
This patch isn't 100% of the story; it gets things started so that code like:
var a = []
produces a simple-marked array. But that's not good enough, since it doesn't cover:
var b = new Array(0);
Follow-on patch C will show one way to cover the second case.
Attachment #562049 -
Flags: feedback?(lhansen)
Assignee | ||
Comment 4•14 years ago
|
||
I admit up front that this patch smells bad.
The main goal is to identify which constructor calls are direct calls to new Array(..) itself, versus calls to the Array constructor from one of its subclasses.
The approach is modeled after what Steven did for Bug 654807.
I briefly tried to take a more disciplined approach by directly changing ArrayClass::createInstanceProc but quickly realized that I don't know enough about nativegen to do that.
(Also, the description for this bug mentions that it may not be necessary to keep the not-subclassed constraint. Even so, its probably better to track m_isSimple in a manner analogous to what this patch does, to enable speculative inlining of setUintProperty.)
Attachment #562055 -
Flags: feedback?(lhansen)
Assignee | ||
Comment 5•14 years ago
|
||
The payoff! :)
This is my first foray into CodegenLIR.
My use of suspendCSE and resumeCSE is based on a conversation with wmaddox about how to generate code with a diamond control flow.
Here are some preliminary performance results; note that the read-{C,Number,int}-2 cases are all sparse arrays, so its not surprising that they regressed. The degree of slow-down in the init's is a little more troubling:
avm avm2
test best avg best avg %dBst %dAvg
Metric: iterations/second
Dir: asmicro/
array-1 1706.3 1702.9 3802.2 3800.6 122.8 123.2 ++
array-2 553.4 550.8 542.9 542.4 -1.9 -1.5 -
array-init-1 398.2 398.0 372.6 371.7 -6.4 -6.6 --
array-init-2 418.7 417.2 395.6 392.1 -5.5 -6.0 --
array-init-3 266.2 265.0 257.2 256.1 -3.4 -3.4 -
array-init-4 253.0 252.9 247.5 247.2 -2.2 -2.3 -
array-pop-1 383.2 380.1 350.3 346.5 -8.6 -8.8 --
array-push-1 240.3 239.6 257.7 257.4 7.3 7.4 ++
array-read-C-1 176.8 176.7 393.8 393.7 122.7 122.8 ++
array-read-C-2 26.5 26.5 26.1 26.0 -1.7 -1.5 -
array-read-Number-1 174.8 174.7 397.2 393.1 127.2 125.1 ++
array-read-Number-2 26.1 26.0 26.1 25.9 -0.0 -0.6
array-read-int-1 171.8 171.7 396.2 394.2 130.6 129.6 ++
array-read-int-2 26.3 26.2 26.1 26.0 -0.8 -0.5
array-read-int-5 161.5 161.1 276.7 276.5 71.3 71.6 ++
array-read-int-6 158.9 158.0 271.5 271.1 70.8 71.6 ++
array-read-int-7 158.2 157.6 268.9 268.4 70.0 70.3 ++
array-read-length-1 69.7 69.6 69.6 69.5 -0.1 -0.2
Attachment #562059 -
Flags: feedback?(lhansen)
Assignee | ||
Comment 6•14 years ago
|
||
(In reply to Felix S Klock II from comment #0)
> [*] Regarding the "not subclassed" constraint: Maybe this constraint is
> unnecessary? Sealed-ness should not matter for reads (only for writes, i.e.
> setUintProperty). Do overrides of the length getter matter for the
> semantics of Array referencing?
(In reply to Felix S Klock II from comment #4)
> (Also, the description for this bug mentions that it may not be necessary to
> keep the not-subclassed constraint. Even so, its probably better to track
> m_isSimple in a manner analogous to what this patch does, to enable
> speculative inlining of setUintProperty.)
Note also that getUintProperty is virtual on the C++ side; this is not just about preserving behavior for AS3 subclasses of Array, but also for C++ subclasses of Array. (On the other hand, its conceivable that SemiSealedArray is the only such subclass...)
Anyway, the simple thing for the short term is to use the definition of simple-ness that I originally wrote in this description; that is, make subclassing (from either C++ or AS3) immediately imply non-simpleness.
Assignee | ||
Updated•14 years ago
|
Whiteboard: has-patch
Comment 7•14 years ago
|
||
F+ on all three with the following comments.
- We need some stats on whether our "simple" heuristic is really good enough
in real content. It may well be that it isn't, and that we need to make
allowances for some arrays that have a few holes, notably at the beginning
and end, and possibly temporarily - ie arrays that are dense but not simple
by this definition. Good start though - and it might be enough! Indeed if
it is enough then type-specialized arrays will be a little easier to deal with.
- We need to look carefully at the generated code on x86 and ARM. My gut feeling
is that the m_simple field is not the way to go and that an m_simpleLength field
would be better; it would be nonzero only if the array is simple. Thus
a single load and compare and branch will be necessary, while the code as it
is here now has two loads and will generate an intermediate boolean result
(sometimes expensive - cmov, setcc, whatever) and there is then additional
manipulation of that.
That setup will also come in handy if we think about type-specializing simple
arrays - one length field per type we care about.
- For perf results, please always provide info about the machine they were
obtained on... Anyway in this case I think that you should try a second
architecture. The slowdown on the init benchmarks is probably caused by
cache effects or microarchitecture.
Memo to self: I need to look into whether the Vector optimizations should worry about disabling CSE in the manner you do here.
Updated•14 years ago
|
Attachment #562049 -
Flags: feedback?(lhansen) → feedback+
Updated•14 years ago
|
Attachment #562055 -
Flags: feedback?(lhansen) → feedback+
Updated•14 years ago
|
Attachment #562059 -
Flags: feedback?(lhansen) → feedback+
Assignee | ||
Comment 8•14 years ago
|
||
(In reply to Lars T Hansen from comment #7)
> - For perf results, please always provide info about the machine they were
> obtained on... Anyway in this case I think that you should try a second
> architecture. The slowdown on the init benchmarks is probably caused by
> cache effects or microarchitecture.
My bad, I'm usually better about that.
The results in comment 5 were from Release 32-bit avmshell builds on my MacBook Pro (a 2.8 Ghz Intel Core 2 Duo).
Comment 9•14 years ago
|
||
> Memo to self: I need to look into whether the Vector optimizations should
> worry about disabling CSE in the manner you do here.
As a rule of thumb, if you're in CodegenLIR and generating diamond-shaped code,
then you need to disable/enable CSE around the fork/join flow. Assume that
a label clears the CSE state. Whether or not clearing CSE state really matters
depends more on the surrounding code than on the inlined path itself, so the
effects of resetting CSE will likely be smaller on microbenchmarks than real code.
Assignee | ||
Updated•14 years ago
|
Whiteboard: has-patch → has-patch, PACMAN
Assignee | ||
Comment 10•13 years ago
|
||
(we still need a boolean internally because we need to distinguish 0-length arrays that will remain simple when populated from 0-length arrays that cannot be simple when populated, namely the ones that have been subclassed. But now the jit can just load the simple length flag on its own.)
Attachment #562049 -
Attachment is obsolete: true
Assignee | ||
Comment 11•13 years ago
|
||
(In case its not clear, patch B v2 and patch D v2 are incorporating Lars' suggestion from comment 7 to use a m_simpleLength field.)
I'll post results in a little bit.
Attachment #562059 -
Attachment is obsolete: true
Assignee | ||
Comment 12•13 years ago
|
||
Nexus One results
Executive summary (filtering out results where |%dAvg| < 4%, which I arbitrarily chose, and may distort true nature of results, etc etc):
Tamarin tests started: 2011-10-25 08:26:33.847245
Executing 588 test(s)
avm: ../../objdir-reldroid/shell/android_shell.baseline-r6676.py version: 6679:42790e63d9bc
avm2: ../../objdir-reldroid/shell/android_shell.simple-lenflag.py version: 6685:57759fa81991
iterations: 5
avm avm2
test best avg best avg %dBst %dAvg
Metric: iterations/second
Dir: asmicro/
alloc-2 3.1 2.6 3.0 3.0 -1.1 16.1
array-1 200.8 200.1 462.5 462.3 130.3 131.0 ++
array-read-C-1 22.4 22.3 74.9 74.9 235.2 236.4 ++
array-read-Number-1 23.3 23.3 75.0 74.9 222.0 222.1 ++
array-read-int-1 23.3 23.3 74.9 74.9 221.9 222.2 ++
array-read-int-5 20.0 19.8 46.8 46.7 134.1 135.7 ++
array-read-int-6 19.5 19.1 49.0 49.0 151.7 156.8 ++
array-read-int-7 20.0 19.7 46.8 46.7 134.3 136.9 ++
array-shift-1 16.0 15.8 15.3 15.2 -4.0 -4.2 -
parseInt-1 19.2 19.2 17.7 17.5 -7.9 -8.7 --
string-indexOf-3 19.7 19.7 21.5 21.4 8.7 8.5 ++
vector-nested-read-Number-2 13.9 13.1 13.9 13.9 0.1 5.6
vector-write-C-1 11.9 11.9 10.8 10.8 -9.5 -9.5 --
Metric: iterations/second
Dir: jsmicro/
arguments-3 2.1 2.1 2.2 2.2 4.3 4.2 +
array-shift-1 5.2 5.2 5.5 5.5 4.3 4.8 +
array-sort-1 4.0 3.4 4.0 4.0 0.7 18.4
parseInt-1 13.4 13.4 12.6 12.6 -6.3 -6.5 --
string-indexOf-3 6.5 6.5 6.9 6.9 6.6 6.5 ++
Metric: time
Dir: language/string/
static_ascii_array_50 1914.0 1919.8 1950.0 1998.0 -1.9 -4.1
Dir: language/string/typed/
indexOf 1458.0 1458.6 1518.0 1518.8 -4.1 -4.1 -
search 839.0 889.2 790.0 830.8 5.8 6.6
Metric: time
Dir: scimark/
MonteCarlo 22607.0 22650.8 20918.0 20951.4 7.5 7.5 ++
SparseCompRow 746.0 750.8 597.0 604.4 20.0 19.5 ++
Dir: sunspider/
bitops-nsieve-bits 268.0 269.6 251.0 251.4 6.3 6.8 ++
Dir: sunspider/as3/
access-fannkuch 257.0 258.2 222.0 222.2 13.6 13.9 ++
access-nbody 55.0 55.0 51.0 51.4 7.3 6.5 ++
bitops-bitwise-and 12.0 12.6 12.0 12.0 0.0 4.8
bitops-nsieve-bits 181.0 181.4 166.0 167.2 8.3 7.8 ++
crypto-aes 228.0 228.2 210.0 210.8 7.9 7.6 ++
crypto-md5 196.0 196.6 211.0 211.4 -7.7 -7.5 --
math-cordic 155.0 155.2 149.0 149.0 3.9 4.0 +
Dir: sunspider-0.9.1/typed/
access-fannkuch 327.0 327.8 294.0 294.8 10.1 10.1 ++
access-nbody 39.0 39.0 35.0 35.4 10.3 9.2 ++
bitops-nsieve-bits 185.0 185.2 167.0 168.0 9.7 9.3 ++
crypto-aes 194.0 195.0 175.0 175.8 9.8 9.8 ++
math-cordic 165.0 165.2 155.0 155.8 6.1 5.7 ++
s3d-raytrace 77.0 77.2 73.0 73.2 5.2 5.2 ++
(My caveat at the top was half tongue-in-cheek; but note that there *are* a large number of additional regressions in the transcript; they just all had |%dAvg| < 4.)
Anyway, there are also the significant regressions listed above, which I don't have any explanation for. (Why on earth would vector-write-C-1 regress by 10%?)
Assignee | ||
Comment 13•13 years ago
|
||
This is essentially the same as patch D v2 except that I inverted the direction of the comparison and the corresponding branch in the hopes that it would improve branch prediction.
From what I can tell, the effect of this is either in the noise, or merely dampens the effect of the change (so that most of the regressions I saw seem to boil away to close to nothing, and the improvements may be lessened but only slightly so on the microbenchmarks).
Attachment #569289 -
Attachment is obsolete: true
Assignee | ||
Comment 14•13 years ago
|
||
fklockii-MacBookPro (32-bit release shell, 2.8 Ghz Core 2 Duo) results
Executive summary (filtering out results where |%dAvg| < 4%):
Tamarin tests started: 2011-10-25 13:57:46.694479
Executing 588 test(s)
avm: ../../objdir-rel32/shell/avmshell.baseline-r6676 version: 6679:42790e63d9bc
avm2: ../../objdir-rel32/shell/avmshell.simple-lenflag version: 6683:86dd6b710726
iterations: 5
avm avm2
test best avg best avg %dBst %dAvg
Metric: iterations/second
Dir: asmicro/
alloc-11 13.3 13.2 12.5 12.4 -5.8 -6.0 --
alloc-12 13.2 13.1 12.3 12.3 -6.7 -6.4 --
alloc-4 45.3 45.3 42.6 42.6 -6.0 -6.0 --
alloc-8 8.9 8.3 9.0 8.7 0.4 4.9
arguments-1 724.3 723.3 677.3 676.8 -6.5 -6.4 --
array-1 1707.3 1706.9 4127.9 4124.9 141.8 141.7 ++
array-init-1 395.2 393.8 374.9 374.7 -5.1 -4.9 --
array-init-2 426.1 425.1 396.2 395.8 -7.0 -6.9 --
array-init-3 273.2 272.5 249.8 249.4 -8.6 -8.5 --
array-init-4 259.7 259.5 240.5 240.4 -7.4 -7.4 --
array-read-C-1 174.1 173.7 441.1 440.4 153.3 153.5 ++
array-read-Number-1 177.5 177.3 448.1 447.7 152.5 152.5 ++
array-read-int-1 170.5 170.3 448.6 448.4 163.1 163.4 ++
array-read-int-5 163.3 163.3 321.7 321.5 96.9 96.9 ++
array-read-int-6 163.5 163.2 321.0 320.8 96.3 96.5 ++
array-read-int-7 159.8 159.8 318.0 317.8 99.0 98.9 ++
array-sort-1 25.2 24.9 25.1 23.6 -0.5 -5.2
array-sort-4 8.8 8.7 8.7 8.3 -1.0 -4.7
array-unshift-1 151.2 151.0 149.4 138.9 -1.2 -8.0
array-write-int-1 50.4 47.4 51.6 50.7 2.4 7.0
array-write-int-2 27.5 27.4 27.6 26.3 0.5 -4.2
bytearray-read-byte-3 64.2 56.1 63.9 53.1 -0.5 -5.2
bytearray-read-double-2 22.8 21.4 22.6 22.4 -0.9 4.8
bytearray-read-int-1 19.9 19.8 21.0 20.9 5.5 5.4 ++
bytearray-read-int-2 25.6 25.5 24.0 24.0 -6.1 -6.0 --
bytearray-read-ubyte-1 28.2 28.1 26.3 26.3 -6.7 -6.6 --
bytearray-read-utfbytes-1 27.1 27.1 26.0 26.0 -4.0 -4.0 -
bytearray-write-byte-1 13.4 12.8 13.5 13.4 1.1 4.9
bytearray-write-utf-1 36.0 35.9 30.9 30.9 -14.0 -13.9 --
bytearray-write-utf-2 22.3 22.2 20.2 20.1 -9.6 -9.6 --
bytearray-write-utfbytes-1 47.2 47.2 36.3 36.3 -23.1 -23.1 --
bytearray-write-utfbytes-2 24.9 24.8 22.7 22.6 -8.8 -8.7 --
closedvar-write-1 3115.9 3109.9 3430.6 3394.4 10.1 9.1 ++
funcall-4 107.6 106.9 103.3 102.4 -4.0 -4.2 -
restarg-1 723.3 721.5 675.6 673.9 -6.6 -6.6 --
string-indexOf-1 189.4 189.2 199.8 199.2 5.5 5.3 ++
vector-init-1 213.4 212.2 194.8 194.8 -8.7 -8.2 --
vector-init-2 236.1 235.6 220.3 219.1 -6.7 -7.0 --
vector-init-3 146.4 145.9 130.5 130.3 -10.9 -10.7 --
Dir: jsmicro/
alloc-1 37.2 37.0 35.6 35.6 -4.3 -4.0 -
alloc-10 8.4 8.1 7.9 7.7 -5.5 -4.2
alloc-7 34.0 33.4 33.5 31.8 -1.6 -4.7
arguments-3 15.6 15.4 16.6 16.3 6.7 6.2 ++
array-2 262.2 261.1 283.2 278.5 8.0 6.7 ++
Metric: time
Dir: language/string/
replace2 1105.0 1106.4 1244.0 1245.2 -12.6 -12.5 --
Dir: language/string/typed/
replace2 1101.0 1101.8 1246.0 1247.8 -13.2 -13.3 --
Metric: iterations/second
Dir: misc/
bytearray-to-vector-be 569.9 565.4 526.9 522.0 -7.5 -7.7 --
bytearray-to-vector-le 656.3 654.6 597.4 585.2 -9.0 -10.6 --
bytearray-to-vector-worst 573.4 562.7 523.5 522.6 -8.7 -7.1 --
Metric: time
Dir: scimark/
MonteCarlo 2884.0 2912.6 2702.0 2720.6 6.3 6.6 ++
SparseCompRow 99.0 100.2 80.0 80.0 19.2 20.2 ++
Dir: sunspider/
bitops-bits-in-byte 27.0 27.6 28.0 28.8 -3.7 -4.3 -
Dir: sunspider/as3/
access-fannkuch 42.0 43.2 39.0 39.4 7.1 8.8 +
access-nsieve 14.0 14.0 14.0 14.8 0.0 -5.7
bitops-3bit-bits-in-byte 5.0 5.8 5.0 5.4 0.0 6.9
bitops-bitwise-and 1.0 1.6 1.0 1.2 0.0 25.0
bitops-nsieve-bits 25.0 25.8 24.0 24.2 4.0 6.2 +
controlflow-recursive 4.0 4.4 4.0 4.2 0.0 4.5
crypto-aes 30.0 31.0 28.0 28.8 6.7 7.1 +
crypto-md5 28.0 28.0 29.0 29.8 -3.6 -6.4 -
math-cordic 14.0 15.0 13.0 13.4 7.1 10.7
math-spectral-norm 6.0 6.0 5.0 5.0 16.7 16.7 ++
s3d-morph 29.0 31.4 29.0 29.8 0.0 5.1
Dir: sunspider/as3vector/
access-nbody 4.0 4.0 5.0 5.0 -25.0 -25.0 --
s3d-cube 23.0 23.0 24.0 24.2 -4.3 -5.2 -
Dir: sunspider-0.9.1/typed/
access-fannkuch 57.0 58.0 54.0 54.6 5.3 5.9 +
access-nbody 3.0 4.2 3.0 3.6 0.0 14.3
bitops-nsieve-bits 24.0 24.8 22.0 22.8 8.3 8.1 ++
controlflow-recursive 3.0 3.6 4.0 4.0 -33.3 -11.1 --
crypto-aes 25.0 25.4 23.0 23.8 8.0 6.3 +
crypto-md5 3.0 3.4 4.0 4.0 -33.3 -17.6 --
math-cordic 16.0 16.0 14.0 14.6 12.5 8.8 ++
math-spectral-norm 7.0 7.8 7.0 8.2 0.0 -5.1
s3d-cube 16.0 16.4 15.0 15.4 6.2 6.1 +
Metric: v8 custom v8 normalized metric (hardcoded in the test)
Dir: v8.5/js/
regexp 69.7 69.3 64.8 63.3 -7.0 -8.6 -
richards 259.0 242.4 258.0 254.8 -0.4 5.1
Dir: v8.5/optimized/
regexp 71.8 71.4 66.1 65.8 -7.9 -7.9 --
splay 5197.0 5177.6 4991.0 4939.6 -4.0 -4.6 -
Dir: v8.5/typed/
regexp 71.9 71.6 66.4 66.2 -7.6 -7.5 --
Dir: v8.5/untyped/
regexp 71.1 70.0 65.4 65.0 -8.0 -7.2 --
This was intended as a sanity check. But some of the regressions above are pretty troubling, e.g. bytearray-write-utfbytes-1. (OTOH, there are no regressions where %dAvg<=-10 on the nexus one results).
I'm going to put up a review request just to keep the patch rolling through the pipeline (and maybe get outside insight), but I hope to have an answer for the 9 cases here where %dAvg<=-10 before atttempting to land this.
Assignee | ||
Comment 15•13 years ago
|
||
Comment on attachment 562044 [details] [diff] [review]
patch A v1: drive-by refactor, moving some inlines to .cpp
(if you veto this, that is fine, I'll just rebase the queue. Also, I find it a little strange that ArrayObject::getLength is inline and virtual and that everything still builds without an out-of-line definition for it.)
Attachment #562044 -
Flags: review?(lhansen)
Assignee | ||
Updated•13 years ago
|
Attachment #569288 -
Flags: review?(lhansen)
Assignee | ||
Comment 16•13 years ago
|
||
Comment on attachment 562055 [details] [diff] [review]
patch C: mark Array() constructed arrays as simple.
(I already noted my own displeasure with this in comment 4; but I also don't know a better way to do it yet.)
Attachment #562055 -
Flags: review?(lhansen)
Assignee | ||
Comment 17•13 years ago
|
||
Comment on attachment 569363 [details] [diff] [review]
patch D v3: jit simple lenflag
(I think there's a bug fix for patch B in the portion of the patch to ArrayObject.cpp; I will probably rearrange things so that that part is associated with patch B before landing)
Nanojit experts: Is there a bias in branch-instructions like LIR_jt in that I should choose one direction over the other in terms of branch-prediction?
Attachment #569363 -
Flags: review?(lhansen)
Comment 18•13 years ago
|
||
(In reply to Felix S Klock II from comment #17)
> Comment on attachment 569363 [details] [diff] [review] [diff] [details] [review]
> patch D v3: jit simple lenflag
>
> (I think there's a bug fix for patch B in the portion of the patch to
> ArrayObject.cpp; I will probably rearrange things so that that part is
> associated with patch B before landing)
>
> Nanojit experts: Is there a bias in branch-instructions like LIR_jt in that
> I should choose one direction over the other in terms of branch-prediction?
not really. There are two effects at play that might bias your choice:
1. the cpu's we care about will be happier with the fast-path being the fall-through path, at least for forward branches.
2. nanojit allocates registers bottom up, so register decisions in the else-block of an if/then/else code fragment will tend to take precedence over register decisions in the then-block. The then-block is more likely to spill after the branch position, if there are conflicts.
Which of these two effects is dominant is anyone's guess; my guess is the second one dominates, if either.
Assignee | ||
Comment 19•13 years ago
|
||
Comment on attachment 569288 [details] [diff] [review]
patch B v2: separate simple len and boolean
retracting review request; somethings not quite right with this patch queue.
Attachment #569288 -
Flags: review?(lhansen)
Assignee | ||
Updated•13 years ago
|
Attachment #562055 -
Flags: review?(lhansen)
Assignee | ||
Updated•13 years ago
|
Attachment #569363 -
Flags: review?(lhansen)
Assignee | ||
Updated•13 years ago
|
Attachment #562044 -
Flags: review?(lhansen)
Assignee | ||
Comment 20•13 years ago
|
||
Attachment #562044 -
Attachment is obsolete: true
Assignee | ||
Comment 21•13 years ago
|
||
cleanup, fixed some bugs, added a few more assertions to internal verify mode.
Attachment #569288 -
Attachment is obsolete: true
Assignee | ||
Updated•13 years ago
|
Attachment #570686 -
Attachment description: patch B v2: state to track array simpleness. → patch B v3: state to track array simpleness.
Assignee | ||
Updated•13 years ago
|
Attachment #570683 -
Flags: review?(treilly)
Assignee | ||
Updated•13 years ago
|
Attachment #570686 -
Flags: review?(treilly)
Assignee | ||
Updated•13 years ago
|
Attachment #562055 -
Flags: review?(treilly)
Assignee | ||
Updated•13 years ago
|
Attachment #569363 -
Flags: review?(treilly)
Assignee | ||
Updated•13 years ago
|
Attachment #570683 -
Flags: superreview?(lhansen)
Assignee | ||
Updated•13 years ago
|
Attachment #570686 -
Flags: superreview?(lhansen)
Assignee | ||
Updated•13 years ago
|
Attachment #562055 -
Flags: superreview?(lhansen)
Assignee | ||
Updated•13 years ago
|
Attachment #569363 -
Flags: superreview?(lhansen)
Assignee | ||
Comment 22•13 years ago
|
||
I'm finally restarting the review request cycle (I had aborted the first last week after I discovered bugs in the code).
This code has been tested against the array white box tests in Bug 693431 (which I am planning to put up for review as well, as soon as they go through a buildbot pass).
The one remaining to-do item is to investigate the more troubling regressions that I observed on my MacBookPro, logged in comment 14, namely the following cases:
avm avm2
test best avg best avg %dBst %dAvg
Metric: iterations/second
Dir: asmicro/
bytearray-write-utf-1 36.0 35.9 30.9 30.9 -14.0 -13.9 --
bytearray-write-utfbytes-1 47.2 47.2 36.3 36.3 -23.1 -23.1 --
vector-init-3 146.4 145.9 130.5 130.3 -10.9 -10.7 --
Metric: time
Dir: language/string/
replace2 1105.0 1106.4 1244.0 1245.2 -12.6 -12.5 --
Dir: language/string/typed/
replace2 1101.0 1101.8 1246.0 1247.8 -13.2 -13.3 --
Dir: sunspider/as3vector/
access-nbody 4.0 4.0 5.0 5.0 -25.0 -25.0 --
Dir: sunspider-0.9.1/typed/
controlflow-recursive 3.0 3.6 4.0 4.0 -33.3 -11.1 --
crypto-md5 3.0 3.4 4.0 4.0 -33.3 -17.6 --
I am planning to fire up shark to do this next. But I did not want that to hold up initiating the review process.
Assignee | ||
Comment 23•13 years ago
|
||
FYI: not all of the regressions noted in comment 22 reproduce on fklockii-iMac (2.66 Ghz Intel Core i5):
avm avm2
test best avg best avg %dBst %dAvg
Metric: iterations/second
Dir: asmicro/
bytearray-write-utf-1 36.6 36.4 36.9 36.4 1.0 0.1
bytearray-write-utf-2 24.1 24.1 24.2 24.0 0.3 -0.3
bytearray-write-utfbytes-1 48.0 47.7 47.7 47.2 -0.7 -1.0
bytearray-write-utfbytes-2 27.0 26.6 27.1 26.8 0.6 0.7
Metric: time
Dir: language/string/
replace 506.0 509.2 500.0 505.2 1.2 0.8 +
replace2 853.0 857.4 855.0 861.8 -0.2 -0.5
Dir: language/string/typed/
replace 508.0 512.4 493.0 500.4 3.0 2.3 +
replace2 846.0 852.0 854.0 858.0 -0.9 -0.7
Dir: sunspider/as3vector/
access-nbody 4.0 4.0 4.0 4.0 0.0 0.0
Dir: sunspider-0.9.1/typed/
controlflow-recursive 3.0 3.2 3.0 3.6 0.0 -12.5
crypto-md5 3.0 3.0 3.0 3.4 0.0 -13.3
I haven't done a full run of the performance benchmarks in this context; I just wanted to see if they reproduced somewhere other than my MacBookPro (a 2.8 Ghz Core 2 Duo), because it would probably be nicer to use Shark on the iMac.
Assignee | ||
Comment 24•13 years ago
|
||
trivial rebase of the patch D that I missed yesterday. (there was a bugfix that needed to be shifted backward into patch B to get it to pass my tests on its own.)
Attachment #569363 -
Attachment is obsolete: true
Attachment #569363 -
Flags: superreview?(lhansen)
Attachment #569363 -
Flags: review?(treilly)
Attachment #570950 -
Flags: superreview?(lhansen)
Attachment #570950 -
Flags: review?(treilly)
Assignee | ||
Comment 25•13 years ago
|
||
(In reply to Felix S Klock II from comment #24)
> (there was a bugfix that needed to be shifted backward into patch B ...)
Should have said "needed to be and has been shifted as part of patch B v3"; the rebase of patch D was to accomodate that shift.
Updated•13 years ago
|
Attachment #570683 -
Flags: superreview?(lhansen) → superreview+
Comment 26•13 years ago
|
||
Comment on attachment 570686 [details] [diff] [review]
patch B v3: state to track array simpleness.
I have to read this more carefully (it's early and Starbucks is noisy) but this approach is indeed what I had in mind and you should consider this response an SR+ for practical purposes (ie landing, if it comes to that).
Re the comment in the patch header: I believe we go to various lengths to obey the "length" getter/setters in Array subclasses and we've had bugs in the past when we didn't do that right, so assume that we have to continue to worry about that until proven otherwise.
It might be useful to poll Lego or the forums to ask how many have subclassed Array in real code. Usually people do this to add non-prototype methods, not to muck with the length property. So a more lenient test is whether the instance is Array /or/ a subclass with the original length getter/setter pair.
Comment 27•13 years ago
|
||
Comment on attachment 570950 [details] [diff] [review]
patch D v4: jit simple lenflag
Looks reasonable. I wonder whether it would be more correct to widen before scaling rather than vice versa but with a < 4GB object size limitation it should not matter.
Attachment #570950 -
Flags: superreview?(lhansen) → superreview+
Comment 28•13 years ago
|
||
Comment on attachment 562055 [details] [diff] [review]
patch C: mark Array() constructed arrays as simple.
Effective SR+ based on earlier F+.
Updated•13 years ago
|
Attachment #570683 -
Flags: review?(treilly) → review+
Comment 29•13 years ago
|
||
Comment on attachment 570686 [details] [diff] [review]
patch B v3: state to track array simpleness.
why do we have m_isSimple? Does m_lengthIfSimple != 0 not work? Is there some corner case where a zero length array is simple?
Updated•13 years ago
|
Attachment #562055 -
Flags: review?(treilly) → review+
Assignee | ||
Comment 30•13 years ago
|
||
(In reply to Tommy Reilly from comment #29)
> Comment on attachment 570686 [details] [diff] [review] [diff] [details] [review]
> patch B v3: state to track array simpleness.
>
> why do we have m_isSimple? Does m_lengthIfSimple != 0 not work? Is there
> some corner case where a zero length array is simple?
m_isSimple tracks we are ever allowed to have (m_lengthIfSimple != 0). Under the current design, once an array has been marked as non-simple (i.e. !m_isSimple), it will never become simple. (Perhaps I should name the member 'm_canBeSimple' rather than 'm_isSimple'; the current name is an artifact of a previous variant.)
The main case I am concerned about is array subclasses. Subclassed arrays are never simple (see comment 0 and comment 26). When we construct an array subclass, its m_isSimple flag is set to false. Then when the array subclas goes from empty to non-empty, its m_lengthIfSimple flag is kept at zero.
A bit more explanation:
Most arrays start off empty (even if their length is zero, their denseArray holds no elements). You want to allow such arrays to treated as simple, obviously. If I just used (m_lengthIfSimple != 0), then I would have to handle arrays transitioning from non-simple (when they have m_length == m_lengthIfSimple == 0) to simple.
If I did not have to worry about array subclasees, then I might be able to get rid of the m_isSimple flag, though with that would come a slight policy change, because arrays would then be able to go from simple -> non-simple -> simple again, which they currently cannot do (see my second paragraph above).
(But I think future PACMAN bugs for optimizing Array .length access are going to rely on the simplicity definition used here, or something very close to it. So I'm not yet too interested in trying to revise the representation here until I prove to myself that it doesn't pay off.)
Assignee | ||
Updated•13 years ago
|
Attachment #570686 -
Flags: superreview?(lhansen)
Attachment #570686 -
Flags: review?(treilly)
Attachment #570686 -
Flags: review?(lhansen)
Assignee | ||
Comment 31•13 years ago
|
||
Comment on attachment 570950 [details] [diff] [review]
patch D v4: jit simple lenflag
(Moving treilly R? to lhansen, and then treating lhansen's SR+ as an R+.)
Lars: Obviously I can't push this until patch B lands, so you've got a shot at re-review if you want it. :)
Attachment #570950 -
Flags: review?(treilly)
Comment 32•13 years ago
|
||
Comment on attachment 570686 [details] [diff] [review]
patch B v3: state to track array simpleness.
Nice.
Perhaps file a followup bug to address the expedient solution used in try_splice (not sure that it's worth it).
Attachment #570686 -
Flags: review?(lhansen) → review+
Updated•13 years ago
|
Attachment #562055 -
Flags: superreview?(lhansen) → superreview+
Comment 33•13 years ago
|
||
Comment on attachment 570950 [details] [diff] [review]
patch D v4: jit simple lenflag
This comment in ArrayObject.cpp is a little curious:
// Bugzilla 688486: Should recheck above assertion after
// jit speculative fast-path lands.
Otherwise nothing. R+.
Attachment #570950 -
Flags: review+
Comment 34•13 years ago
|
||
changeset: 6716:2aa4ce80999a
user: Felix S Klock II <fklockii@adobe.com>
summary: Bug 688486: move some inlines to .cpp, ease toggling Array verify mode (r=treilly, sr=lhansen).
http://hg.mozilla.org/tamarin-redux/rev/2aa4ce80999a
Comment 35•13 years ago
|
||
changeset: 6717:2a58c781d85d
user: Felix S Klock II <fklockii@adobe.com>
summary: Bug 688486: state tracking array simpleness (r=lhansen).
http://hg.mozilla.org/tamarin-redux/rev/2a58c781d85d
Comment 36•13 years ago
|
||
changeset: 6718:ef9e442690e0
user: Felix S Klock II <fklockii@adobe.com>
summary: Bug 688486: mark arrays constructed via 'Array()' as simple (r=treilly, sr=lhansen).
http://hg.mozilla.org/tamarin-redux/rev/ef9e442690e0
Assignee | ||
Comment 37•13 years ago
|
||
(In reply to Lars T Hansen from comment #33)
> Comment on attachment 570950 [details] [diff] [review] [diff] [details] [review]
> patch D v4: jit simple lenflag
>
> This comment in ArrayObject.cpp is a little curious:
>
> // Bugzilla 688486: Should recheck above assertion after
> // jit speculative fast-path lands.
The "assertion" its referring to is this:
// This is a hot function.
I didn't want to leave the comment in there unchallenged. But it might well still be a hot function even with this optimization in place, especially for interpreted systems... so I did not want to just delete it.
(But at this point I'm thinking I might just delete it.)
Comment 38•13 years ago
|
||
changeset: 6719:b15c9b7c1bcb
user: Felix S Klock II <fklockii@adobe.com>
summary: Bug 688486: make jit emit fast path for simple arrays (r=lhansen).
http://hg.mozilla.org/tamarin-redux/rev/b15c9b7c1bcb
Assignee | ||
Comment 39•13 years ago
|
||
(In reply to Lars T Hansen from comment #32)
> Perhaps file a followup bug to address the expedient solution used in
> try_splice (not sure that it's worth it).
Forked off as Bug 701097 as a minor enhancement.
Comment 40•13 years ago
|
||
changeset: 6720:cddb276c4350
user: Felix S Klock II <fklockii@adobe.com>
summary: Bug 688486: windows builds broke so backing out to rev 6715:a2224e2b2a43; will revisit tomorrow (r=fklockii).
http://hg.mozilla.org/tamarin-redux/rev/cddb276c4350
Comment 41•13 years ago
|
||
changeset: 6721:4f7123852123
user: Felix S Klock II <fklockii@adobe.com>
summary: Bug 688486: attempt to fix windows build by putting FASTCALL after fcn return type in signature (r=fklockii).
http://hg.mozilla.org/tamarin-redux/rev/4f7123852123
Assignee | ||
Comment 42•13 years ago
|
||
(In reply to Tamarin Bot from comment #35)
> changeset: 6717:2a58c781d85d
> user: Felix S Klock II <fklockii@adobe.com>
> summary: Bug 688486: state tracking array simpleness (r=lhansen).
>
> http://hg.mozilla.org/tamarin-redux/rev/2a58c781d85d
(In reply to Tamarin Bot from comment #36)
> changeset: 6718:ef9e442690e0
> user: Felix S Klock II <fklockii@adobe.com>
> summary: Bug 688486: mark arrays constructed via 'Array()' as simple
> (r=treilly, sr=lhansen).
>
> http://hg.mozilla.org/tamarin-redux/rev/ef9e442690e0
changeset 6718 injected a failure in the deep builds; the actual bug is really in changeset 6717, in my opinion.
Filed as Bug 702262.
Comment 43•13 years ago
|
||
changeset: 6728:ca173c61d9fb
user: Felix S Klock II <fklockii@adobe.com>
summary: Bug 688486: fix template instantiations to fix AOT builds (r=fklockii).
http://hg.mozilla.org/tamarin-redux/rev/ca173c61d9fb
Assignee | ||
Updated•13 years ago
|
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•