Bugzilla

Reporter

Comment 1

•

13 years ago

Maybe the title is a little too broad, I have only seen this issue with the above code, but I have not been doing day-day work on windows.

Reporter

Comment 2

•

13 years ago

Ok I am also seeing an intermittent failure in as3/Types/Float4/flt4_4_6_5_1.as

  as3/Types/Float4/flt4_4_6_5_1.abc : float4Array.every() sum up = 8,8,8,8 FAILED! expected: 6,6,6,6
  as3/Types/Float4/flt4_4_6_5_1.abc : float4Array.forEach() sum up = 8,8,8,8 FAILED! expected: 6,6,6,6
  as3/Types/Float4/flt4_4_6_5_1.abc : float4Array.some() sum up = 8,8,8,8 FAILED! expected: 6,6,6,6
  as3/Types/Float4/flt4_4_6_5_1.abc : float4Array.join() = 0,0,0,0|2,2,2,2|2,2,2,2|2,2,2,2 FAILED! expected: 0,0,0,0|1,1,1,1|2,2,2,2|3,3,3,3
  as3/Types/Float4/flt4_4_6_5_1.abc : float4Array.indexOf(float4(2f)) = 1 FAILED! expected: 2
  as3/Types/Float4/flt4_4_6_5_1.abc : float4Array.lastIndexOf(float4(2f)) = 4 FAILED! expected: 3

Reporter

Comment 3

•

13 years ago

On Virgil's suggestion I also checked on a debug build to see if there was an assertion being fired. I was able to repo the issue with a debug build but no assertion was fired.

Assignee

Comment 4

•

13 years ago

Seems to come from the bibopFloat4Allocator - it returns unaligned values on x32 in Debugger builds, causing issues at runtime.
Easiest to reproduce - run with -Dverbose, you're likely to get a crash during the parsing of the float4 pool.

Comment 5

•

13 years ago

Specifically the objects are not 16-byte aligned, and there's no reliable way to get the C++ compilers (not the jit) to use load-unaligned SSE instructions.

In general MMgc has not been guaranteeing anything but 8-byte alignment.

In practice in Release and Release-Debugger builds, if the object size is divisible by 2^k for k >= 3 then GCAlloc delivers 2^k-byte alignment due to how the block is carved: the block is carved into objects starting from the end of the block and going toward lower addresses, until we run out of space, as opposed to starting at an 8-byte boundary after the header and going forward until we run out of space. This is true for both 32-bit and 64-bit builds.

In Debug and Debug-Debugger builds we add space for debugging information at the beginning and end of the object: notably we add 8 bytes at the beginning and so the pointer returned from the GC to the client is 8-byte aligned always.

I'm worried about some of the information in this bug report because it conflicts with my view that alignment should be a problem only in Debug builds: Brent reports failures in Release shells, Virgil talks about Debugger builds. I'm pretty sure my mental model of GC and GCAlloc is correct so there may be other things going on too.

I see two ways of dealing with this. One is to make DebugSize 16 bytes (half of which is unused); this wastes space but only in Debug builds and that may be good enough. The other is to make DebugSize a variable and store it in the block and/or the alloc - both block and alloc are easily available given an object without needing to know DebugSize, both in FixedAlloc and GCAlloc, by masking off the low bits. There would be added cost, but again only in Debug builds - a small space cost and a slightly larger time cost to read and add/subtract the value, instead of a constant. For a quick experiment it might be sufficient to make DebugSize 16 bytes and make some adjustments throughout the code to deal with the ramifications, then we can discuss the ideal solution later.

Comment 6

•

13 years ago

Ah, additionally the object sizes in Debug builds will need to be divisible by 16 to guarantee alignment of course.

On 32-bit builds the DebugSize has been 16 and float4 objects have been 32 bytes.  With DebugSize=24 then the object size is 40 bytes which is not divisible by 16; the objects will thus have to be 48 bytes (four times larger than the payload - I think the "ideal solution" is something else).

On 64-bit builds the DebugSize has been 24 but will now be 32; 32+16=48 bytes again.

(It still remains to be investigated why there would be a problem in Release builds.)

Comment 7

•

13 years ago

This is useful:

     template<>
     REALLY_INLINE void*
     GC::AllocBibopType<avmplus::AtomConstants::kBibopFloat4Type>()
     {
-        return AllocBibop(bibopAllocFloat4);
+        void* p = AllocBibop(bibopAllocFloat4);
+        if (uintptr_t(p) & 15)
+        {
+            printf("Bad alignment of float4\n");
+            GCHeap::GetGCHeap()->Abort();
+        }
+        return p;
     }
 
As expected it never fires in Release or Release-Debugger builds (MacOS X), but fires in Debug and Debug-Debugger builds.

Comment 8

•

13 years ago

I can repro the issue with -Ojit on both Mac and Windows 7 in Release-Debugger builds but as the previous remark makes clear, in non-Debug builds is not an object alignment issue, it is something else.  In fact with -Dinterp the tests pass as expected, and the interpreter boxes everything and should hit the allocator the hardest.

Assignee

Comment 9

•

13 years ago

Hmmm.. I'll look into it some more, seems to be two different issues then.

Comment 10

•

13 years ago

Attached patch Align float4 payload on 16-byte boundary in Debug builds — Details — Splinter Review

The patch is a stopgap solution to the alignment issue in Debug builds: it inserts four words of metadata at the beginning of an object instead of two.  When the allocation size is then rounded up to 16 bytes we're ensured that the user data for a float4 box is aligned at a 16-byte boundary.  There's a space cost to this, which I think we should ignore for now - we can afford to use this patch while we think about other solutions.

Suggest you incorporate this into the patch queue to see how it works for you (it passes testing here).

Reporter

Comment 11

•

13 years ago

Comment on attachment 569618 [details] [diff] [review]
Align float4 payload on 16-byte boundary in Debug builds

Is this patch only expected to fix "debug" builds or should it also be a stopgap for release builds? I am only wondering because I am seeing a lot of random failures in windows32 when the JIT is enabled. When running the float4 tests. I will hit between 10 -> 500 failures depending on how my machine feels.

This patch has not made its way into the patch queue yet.

Reporter

Comment 12

•

13 years ago

(In reply to Brent Baker from comment #11)
> Is this patch only expected to fix "debug" builds or should it also be a
> stopgap for release builds? I am only wondering because I am seeing a lot of
> random failures in windows32 when the JIT is enabled. When running the
> float4 tests. I will hit between 10 -> 500 failures depending on how my
> machine feels.
> 

To answer my own question, no this does not help the random failures I see when running a release shell with -Ojit against the as3/Types/Float4 tests

Assignee

Comment 13

•

13 years ago

Fixed into changeset 298:1b58f0aa83fc (which also imports Lars' patch)

Assignee

Updated

•

13 years ago

Status: ASSIGNED → RESOLVED

Closed: 13 years ago

Resolution: --- → FIXED

Comment 14

•

13 years ago

I'll create a separate bug to track the padding issue and the proposed fix.

Comment 15

•

13 years ago

(In reply to Lars T Hansen from comment #14)
> I'll create a separate bug to track the padding issue and the proposed fix.

Forked as bug #697672.