631081 - Activation record entries larger than 8 bytes are sometimes not 8 byte-aligned.

Assignee

Description

•

15 years ago

Observed empirically. CodegenLIR pads AS method arguments based on the assumption that space allocated by LIR_allocp is dword aligned when 2 or more words are allocated.

Krzysztof Palacz

Assignee

Comment 1

•

15 years ago

Attached patch patch (obsolete) — Details — Splinter Review

Assignee: nobody → kpalacz

Attachment #509296 - Flags: superreview?

Attachment #509296 - Flags: review?(rreitmai)

Krzysztof Palacz

Assignee

Updated

•

15 years ago

Attachment #509296 - Flags: superreview? → superreview?(edwsmith)

Krzysztof Palacz

Assignee

Comment 2

•

15 years ago

For general amusement, performance impact on jsbench/typed and asmicro benchmarks (5 iterations, MacOS X on a 2.8 intel core 2 Penryn laptop). avm: ./avm-baseline version: cyclone avm2: ./avm version: cyclone iterations: 5 avm avm2 test best avg best avg %dBst %dAvg Metric: time Dir: jsbench/typed/ Crypt 929 929.8 893 893.4 3.9 3.9 + Euler 8439 8456.4 8721 8744.2 -3.3 -3.4 - FFT 1651 1655.4 1643 1648 0.5 0.4 HeapSort 1294 1296.2 1293 1295 0.1 0.1 LUFact 1180 1182.6 1185 1188.4 -0.4 -0.5 - Moldyn 4200 4210.4 4016 4026.2 4.4 4.4 + RayTracer 1383 1384.4 1403 1405.6 -1.4 -1.5 - SOR 4506 4510.6 4524 4526.6 -0.4 -0.4 - Series 7824 7829.8 7825 7832.4 -0.0 -0.0 SparseMatmult 2261 2272.4 2248 2249.6 0.6 1.0 avm avm2 test best avg best avg %dBst %dAvg Metric: iterations/second Dir: asmicro/ alloc-1 38.8 38.7 40.3 40.3 4.0 4.0 + alloc-10 12.7 12.7 12.9 12.8 1.5 1.2 + alloc-11 12.3 12.3 12.3 12.2 -0.4 -0.3 alloc-12 12.4 12.3 12.4 12.4 0.3 0.3 alloc-13 89.4 89.3 88.3 88.2 -1.2 -1.3 - alloc-14 69.5 68.8 68.2 67.3 -1.8 -2.2 alloc-2 19.4 19.4 19.6 19.6 0.9 0.9 alloc-3 15.2 15.1 15.3 15.2 0.7 0.7 alloc-4 45.5 45.4 45.0 44.9 -1.2 -1.2 - alloc-5 32.0 31.9 31.7 31.6 -0.9 -0.9 alloc-6 68.7 68.7 67.9 67.9 -1.2 -1.1 - alloc-7 35.1 35.1 35.1 35.1 0.1 0.1 alloc-8 13.9 13.9 13.9 13.8 -0.2 -0.3 alloc-9 13.9 13.9 13.9 13.9 0.1 -0.0 arguments-1 723.3 722.6 703.3 703.3 -2.8 -2.7 - arguments-2 437.6 437.0 428.1 426.5 -2.2 -2.4 - arguments-3 19.2 19.2 19.2 19.1 0 -0.1 arguments-4 26.5 26.4 26.0 25.9 -1.9 -1.9 - array-1 2209.8 2205.8 2242.8 2240.8 1.5 1.6 + array-2 577.4 575.8 592.8 592.7 2.7 2.9 + array-pop-1 348.0 345.6 355.9 353.9 2.3 2.4 + array-push-1 278.2 277.8 278.4 278.3 0.1 0.2 array-shift-1 137.3 136.2 137.2 136.3 -0.1 0.1 array-slice-1 21.0 21.0 20.6 20.6 -1.6 -1.7 - array-sort-1 28.2 28.2 27.9 27.8 -1.2 -1.3 - array-sort-2 2.5 2.4 2.4 2.4 -1.7 -1.8 - array-sort-3 21.6 21.5 21.7 21.6 0.2 0.0 array-sort-4 9.5 9.5 9.9 9.9 4.0 3.9 + array-unshift-1 156.5 156.4 157.8 157.7 0.8 0.8 closedvar-read-1 4621.4 4617.2 4714.3 4701.5 2.0 1.8 + closedvar-write-1 3850.1 3843.0 3879.1 3875.3 0.8 0.8 + closedvar-write-2 3937.1 3932.5 3622.4 3603.0 -8.0 -8.4 -- do-1 3974.0 3966.6 3973.0 3971.4 -0.0 0.1 for-1 3962.0 3959.4 3961.0 3960.2 -0.0 0.0 for-2 2669.3 2668.9 2669.3 2668.1 0 -0.0 for-3 2776.2 2773.2 2777.2 2774.8 0.0 0.1 for-in-1 341.7 339.3 341.3 339.3 -0.1 -0.0 for-in-2 319.0 318.6 318.4 318.0 -0.2 -0.2 funcall-1 307.7 307.1 307.7 307.7 0 0.2 funcall-2 215.8 215.7 204.4 204.3 -5.3 -5.2 -- funcall-3 285.4 284.7 264.2 264.1 -7.4 -7.2 -- funcall-4 98.5 98.3 99.2 99.2 0.7 0.9 + globalvar-read-1 4793.2 4788.8 4794.2 4779.8 0.0 -0.2 globalvar-write-1 3383.6 3373.6 4056.9 4030.0 19.9 19.5 ++ isNaN-1 3912.1 3910.9 3907.1 3905.5 -0.1 -0.1 - isNaN-2 3961.0 3959.4 3962.0 3958.4 0.0 -0.0 isNaN-3 3956.0 3950.8 3956.0 3955.0 0 0.1 lookup-array-fetch-1 753.2 752.7 763.2 762.9 1.3 1.4 + lookup-array-in-1 1776.2 1774.6 1778.2 1777.8 0.1 0.2 + lookup-negindex-array-1 425.1 424.9 425.6 424.9 0.1 0.0 lookup-negindex-array-2 372.6 372.6 373.3 372.7 0.2 0.0 lookup-negindex-object-1 445.7 445.2 445.6 445.2 -0.0 -0.0 lookup-negindex-object-2 428.7 428.2 428.6 428.4 -0.0 0.0 lookup-object-fetch-1 790.2 789.5 801.4 801.3 1.4 1.5 + lookup-object-in-1 1459.5 1459.1 1458.5 1457.7 -0.1 -0.1 - number-toString-1 5.0 5.0 5 5.0 -0.5 -0.6 number-toString-2 72.3 72.3 68.2 67.9 -5.8 -6.0 -- oop-1 3.9 3.9 3.9 3.9 -1.0 -1.0 parseFloat-1 74.5 74.4 75.3 75.3 1.1 1.2 + parseInt-1 160.2 158.1 161.4 160.1 0.7 1.2 regex-exec-1 53.9 53.8 54.2 54.2 0.4 0.6 regex-exec-2 65.3 65.3 65.3 65.2 0 -0.1 regex-exec-3 99.8 99.7 100.3 100.3 0.5 0.6 regex-exec-4 254.0 253.7 252.7 252.4 -0.5 -0.5 - restarg-1 723.3 721.9 710.3 709.4 -1.8 -1.7 - restarg-2 439.6 439.1 444.1 442.9 1.0 0.9 + restarg-3 37.7 37.7 37.5 37.5 -0.6 -0.6 restarg-4 26.7 26.7 25.1 25.0 -6.1 -6.2 -- string-casechange-1 26.3 26.2 26.2 26.1 -0.4 -0.3 string-casechange-2 26.4 26.4 25.3 25.3 -4.0 -4.0 - string-charAt-1 1432.6 1430.8 1443.6 1442.6 0.8 0.8 + string-charAt-2 78.7 78.6 79.5 79.4 1.1 0.9 + string-charCodeAt-1 957.0 956.0 1389.6 1387.6 45.2 45.1 ++ string-charCodeAt-2 1269.7 1268.9 1267.7 1265.9 -0.2 -0.2 string-charCodeAt-3 1014.0 1013.6 1066.9 1066.9 5.2 5.3 ++ string-charCodeAt-4 1878.1 1877.5 1963.0 1958.4 4.5 4.3 + string-charCodeAt-5 628.7 628.2 930.1 929.3 47.9 47.9 ++ string-charCodeAt-6 957.0 955.7 1347.7 1346.1 40.8 40.9 ++ string-charCodeAt-7 1949.1 1947.3 1942.1 1941.1 -0.4 -0.3 - string-fromCharCode-1 272.9 272.5 272.5 272.3 -0.2 -0.1 - string-fromCharCode-2 64.7 64.7 65.8 65.8 1.6 1.7 + string-indexOf-1 204.4 204.1 203.6 203.0 -0.4 -0.5 string-indexOf-2 123.8 123.7 123.8 123.8 0 0.0 string-indexOf-3 112.4 112.3 109.0 109.0 -3.0 -3.0 - string-lastIndexOf-1 464.5 464.1 525.5 524.8 13.1 13.1 ++ string-lastIndexOf-2 158.2 158.2 160.9 160.8 1.7 1.7 + string-lastIndexOf-3 151.4 151.3 149.7 149.7 -1.1 -1.1 - string-slice-1 118.1 118.0 120.8 120.7 2.3 2.3 + string-split-1 9.5 9.5 9.5 9.4 -0.6 -0.5 string-split-2 9.3 9.3 9.3 9.3 -0.3 -0.2 string-substring-1 123.9 123.9 129.2 128.9 4.3 4.1 + switch-1 372.6 372.1 366.3 366.0 -1.7 -1.7 - switch-2 130.1 130.0 129.6 129.5 -0.4 -0.4 - switch-3 246.3 246.1 246.3 246.2 0 0.1 try-1 202.6 202.2 208.8 208.5 3.1 3.2 + try-2 14.6 14.6 14.6 14.5 0.1 -0.1 try-3 45.8 45.8 45.7 45.7 -0.3 -0.3 vector-push-1 41.0 41.0 41.0 41.0 0 -0.0 while-1 3961.0 3960.2 3962.0 3960.8 0.0 0.0

Edwin Smith

Comment 3

•

15 years ago

whoa. Would another assert be good? I'm thinking of a downstream check in Asssembler::gen() case LIR_alloc, to make sure the displacement assigned to any LIR_alloc >= 8 bytes is in fact 8-aligned.

Rick Reitmaier

Comment 4

•

15 years ago

Comment on attachment 509296 [details] [diff] [review] patch Fix looks good; amazing this didn't come up earlier...adding Nick. BTW, I found the current logic confusing and re-wrote the routine as part of reviewing this fix. If others feel the same I can post it. Also if someone could explain why we start the probe at nStackSlots rather than at 0. I.e. ' uint32_t const start = nStackSlots + (nStackSlots & 1); ' We certainly don't preserve this 'alignment' if we need to take the space past the high-water mark, so its unclear to me why we start the probe this way.

Attachment #509296 - Flags: review?(rreitmai)

Attachment #509296 - Flags: review+

Attachment #509296 - Flags: feedback?(nnethercote)

Nicholas Nethercote [inactive]

Comment 5

•

15 years ago

Comment on attachment 509296 [details] [diff] [review] patch I'm grumpy today. Apologies in advance. I don't like this code. I want this patch to improve. Specifically I hate the fact that _entries is tracking integers 0,1,2,3,4 but these actually end up as negative offsets like 0,-4,-8,-12,-16, which means that the first aligned 8-byte entry covers 1 and 2 because that translates to -4 and -8, rather than 0 and 1 which is what you'd expect. I don't that I had to spend 15 minutes reading the code and studying the TMFLAGS=native,activation output to understand this. This whole piece of code needs better comments.

Attachment #509296 - Flags: feedback?(nnethercote) → feedback-

Edwin Smith

Comment 6

•

15 years ago

Comment on attachment 509296 [details] [diff] [review] patch sr- based on njn's feedback. How about we try to get this patch to a state where the code after is more "obviously correct" than before, even though it will have one more suble tweak in it. Short of overhauling the AR structure, which seems out of scope for this fix, could we beef up the commenting so it is more clear whats going on? Rick, your help would be appreciated if you have the time.

Attachment #509296 - Flags: superreview?(edwsmith) → superreview-

Dan Smith

Updated

•

14 years ago

Severity: minor → normal

Flags: flashplayer-qrb+

Priority: -- → P2

Target Milestone: --- → Q3 11 - Serrano

Edwin Smith

Updated

•

14 years ago

Blocks: 638533

Steven Johnson

Comment 7

•

14 years ago

(In reply to comment #5) > I don't like this code. I want this patch to improve. Specifically I hate > the fact that _entries is tracking integers 0,1,2,3,4 but these actually end > up as negative offsets like 0,-4,-8,-12,-16, which means that the first > aligned 8-byte entry covers 1 and 2 because that translates to -4 and -8, > rather than 0 and 1 which is what you'd expect. Changing that will require nontrivial reworking of the way _entries is used. It's worth pointing out that the existing constraints on _entries is well-documented in Assembler.h (though, unfortunately, the comments call it "_entry" -- bit rot in the comments?). I agree that the situation is suboptimal, but as this is now actually blocking a real bug (on MIPS machines), IMHO we should augment this patch with more commenting and land it, rather than entertain an overhaul of this structure.

Severity: normal → minor

Priority: P2 → --

Target Milestone: Q3 11 - Serrano → ---

Steven Johnson

Comment 8

•

14 years ago

(In reply to comment #4) > BTW, I found the current logic confusing and re-wrote the routine as part of > reviewing this fix. If others feel the same I can post it. If you have an alternate patch that achieves the same result, but with clearer code,by all means post it ASAP -- this is essentially a blocking bug for Serrano so I'm eager to get consensus on a fix. > Also if someone could explain why we start the probe at nStackSlots rather > than at 0. I.e. ' uint32_t const start = nStackSlots + (nStackSlots & 1); No clue.

Edwin Smith

Comment 9

•

14 years ago

(In reply to comment #8) > (In reply to comment #4) > > BTW, I found the current logic confusing and re-wrote the routine as part of > > reviewing this fix. If others feel the same I can post it. > > If you have an alternate patch that achieves the same result, but with > clearer code,by all means post it ASAP -- this is essentially a blocking bug > for Serrano so I'm eager to get consensus on a fix. > > > Also if someone could explain why we start the probe at nStackSlots rather > > than at 0. I.e. ' uint32_t const start = nStackSlots + (nStackSlots & 1); > > No clue. Not sure, but probably so slots are re-used in low-to-high address order. If the search order matters, then we sure need a comment explaining why.

Krzysztof Palacz

Assignee

Comment 10

•

14 years ago

Attached patch Rick's cleanup of AR::reserveEntry() (obsolete) — Details — Splinter Review

The patch is a refactoring, the other patch still needed to fix behavior.

Attachment #535117 - Flags: review?(edwsmith)

Trevor Baker

Updated

•

14 years ago

Severity: minor → normal

Priority: -- → P2

Target Milestone: --- → Q3 11 - Serrano

Steven Johnson

Comment 12

•

14 years ago

Comment on attachment 535117 [details] [diff] [review] Rick's cleanup of AR::reserveEntry() Review of attachment 535117 [details] [diff] [review]: ----------------------------------------------------------------- ::: nanojit/Assembler.cpp @@ +2275,4 @@ > else > { > // alloc larger block on 8byte boundary. > + uint32_t const slotsNeeded = nStackSlots + (nStackSlots & 1); if nStackSlots = 1, this always rounds up to 2. Seems pointlessly wasteful. @@ +2276,5 @@ > { > // alloc larger block on 8byte boundary. > + uint32_t const slotsNeeded = nStackSlots + (nStackSlots & 1); > + uint32_t foundIt = 0; > + for (uint32_t i = 0; i <= _highWaterMark; i += 2) We previously began searching at "start", not zero. Intentional? @@ +2284,4 @@ > } > > + // No empty entry found in the middle so tack it on the end. > + if (foundIt == 0) The above loop starts searching at 0, so 0 is not a valid not-found marker. @@ +2288,5 @@ > { > + // Make sure there is enough room after aligning > + foundIt = _highWaterMark + (_highWaterMark & 1); > + if (foundIt + slotsNeeded >= NJ_MAX_STACK_ENTRY) > + return 0; // sorry no room. Invoking njn here: "sorry no room" -> "Sorry, no room." @@ +2290,5 @@ > + foundIt = _highWaterMark + (_highWaterMark & 1); > + if (foundIt + slotsNeeded >= NJ_MAX_STACK_ENTRY) > + return 0; // sorry no room. > + > + debug_only (if (_highWaterMark & 1) _entries[_highWaterMark] = NULL; ) We lost the assert that the entry was previously BAD_ENTRY. Intentional?

Rick Reitmaier

Comment 13

•

14 years ago

(In reply to comment #9) > (In reply to comment #8) > > (In reply to comment #4) > > > Also if someone could explain why we start the probe at nStackSlots rather > > > than at 0. I.e. ' uint32_t const start = nStackSlots + (nStackSlots & 1); > > Not sure, but probably so slots are re-used in low-to-high address order. > If the search order matters, then we sure need a comment explaining why. I believe the code is attempting to maintain alignment. E.g. if a request is made for 4 slots; @4B/slot=16Bytes, we start the search at 16. Added my patch; see attachment 535117 [details] [diff] [review]

Rick Reitmaier

Comment 14

•

14 years ago

(In reply to comment #12) > Comment on attachment 535117 [details] [diff] [review] [review] > Rick's cleanup of AR::reserveEntry() > I should have mentioned that attachment 535117 [details] [diff] [review] is untested and was a result of my own re-working of the code on how I believe it ought to function, not necessarily how it is current working.

Krzysztof Palacz

Assignee

Comment 15

•

14 years ago

Patch in attachment 509296 [details] [diff] [review] pushed to tr-serrano, changeset 6253:b05930b0733 *Not* pushed to nanojit-central or tr, pending resolution.

Andre Kruetzfeldt

Updated

•

14 years ago

Whiteboard: fixed-in-tr-serrano

Trevor Baker

Updated

•

14 years ago

Whiteboard: fixed-in-tr-serrano → fixed-in-serrano

Edwin Smith

Comment 16

•

14 years ago

Comment on attachment 535117 [details] [diff] [review] Rick's cleanup of AR::reserveEntry() Rick's comment that this is just a work-in-progress inclines me to R- until something done-ish is ready for nanojit-central/tamarin-redux. Should we reassign to Rick? What is the general status?

Rick Reitmaier

Comment 17

•

14 years ago

> What is the general status? Good question. I'd like to see Krzysztof's straightforward fix land, since it seems reckless not to correct this behaviour ASAP. As for the more general clean-up code (my wip patch), I'd suggest opening a new bug and posting the patch there.

Dan Smith

Updated

•

14 years ago

Target Milestone: Q3 11 - Serrano → Q4 11 - Anza

Rick Reitmaier

Updated

•

14 years ago

Attachment #535117 - Flags: review?(edwsmith)

Rick Reitmaier

Comment 18

•

14 years ago

Attached patch base fix — Details — Splinter Review

Pulling back from all the discussion around fixing this code in general for the long term, which we all agree is a great idea, but seeing as we haven't made any progress on that front let's get a fix in for the original, immediate issue. Opened tracker bug 687582 for managing the larger effort.

Attachment #509296 - Attachment is obsolete: true

Attachment #535117 - Attachment is obsolete: true

Attachment #560982 - Flags: review?(nnethercote)

Nicholas Nethercote [inactive]

Updated

•

14 years ago

Attachment #560982 - Attachment is patch: true

Nicholas Nethercote [inactive]

Updated

•

14 years ago

Attachment #560982 - Flags: review?(nnethercote) → review+

Dan Smith

Updated

•

14 years ago

Flags: flashplayer-injection-

Flags: flashplayer-bug-

Target Milestone: Q4 11 - Anza → Q1 12 - Brannan

patch 15 years ago Krzysztof Palacz 1.19 KB, patch	rreitmai : review+ edwsmith : superreview- n.nethercote : feedback-	Details \| Diff \| Splinter Review
Rick's cleanup of AR::reserveEntry() 14 years ago Krzysztof Palacz 3.06 KB, patch		Details \| Diff \| Splinter Review
base fix 14 years ago Rick Reitmaier 1.60 KB, patch	n.nethercote : review+	Details \| Diff \| Splinter Review
just like attachment 560982, generalized to 16- and 8-byte alignments. 14 years ago Felix S. Klock II [:pnkfelix, :fklock] 2.29 KB, patch		Details \| Diff \| Splinter Review
just like attachment 560982, generalized to 16- and 8-byte alignments. 14 years ago Felix S. Klock II [:pnkfelix, :fklock] 2.29 KB, patch	wmaddox : feedback+	Details \| Diff \| Splinter Review
more like attachment 560982 (less agressive, less general) 14 years ago Felix S. Klock II [:pnkfelix, :fklock] 1.62 KB, patch	wmaddox : feedback-	Details \| Diff \| Splinter Review
partial reserveEntry() cleanup 14 years ago William Maddox 5.94 KB, patch		Details \| Diff \| Splinter Review