Closed
Bug 689433
Opened 13 years ago
Closed 13 years ago
Align nsAutoTArray<E>'s elements to their natural alignment
Categories
(Core :: XPCOM, defect)
Core
XPCOM
Tracking
()
RESOLVED
FIXED
mozilla10
People
(Reporter: justin.lebar+bug, Assigned: justin.lebar+bug)
References
Details
Attachments
(2 files, 4 obsolete files)
29.07 KB,
patch
|
roc
:
review+
emorley
:
checkin+
|
Details | Diff | Splinter Review |
998 bytes,
patch
|
roc
:
review+
|
Details | Diff | Splinter Review |
nsAutoTArray<E> aligns its elements to 8-byte boundaries. Or at least, it tries to do this; it apparently aligns to 4-bytes on Mac (see bug 681755) and maybe on Linux. This wastes 4 bytes of space on Windows 32 for each nsAutoTArray, which is annoying. But it also means that nsAutoTArray<E> itself must be aligned to 8-byte boundaries, which increases the size of structs containing auto arrays.
Assignee | ||
Updated•13 years ago
|
Assignee: nobody → justin.lebar+bug
Assignee | ||
Comment 1•13 years ago
|
||
Attachment #563195 -
Flags: review?(roc)
Assignee | ||
Comment 2•13 years ago
|
||
Oops; v1 is an old version of the patch.
Attachment #563198 -
Flags: review?(roc)
Assignee | ||
Updated•13 years ago
|
Attachment #563195 -
Attachment is obsolete: true
Attachment #563195 -
Flags: review?(roc)
Assignee | ||
Comment 3•13 years ago
|
||
The change nsBaseAppShell.cpp is to prevent a build error. ipdl includes its own set of types and suppresses their redeclaration. If we don't make the ipdl header the first one, nsTArray pulls in mfbt/Util.h and we redeclare uint8 and friends.
As I understand it, there are really only two auto-array layouts we care about in 32-bit systems, and one layout for 64-bit. On 64-bit we have Header* mHdr; // 0-7 Header header; // 8-15 T[] data; // 16- Everything's 8-byte aligned and we don't support 16-byte alignment, so we're all good. On 32-bit we want Header* mHdr; // 0-3 Header header; // 4-11 T[] data; // 12- when alignof(T) is <= 4, otherwise Header* mHdr; // 0-3 Header header; // 4-11 char[4] padding;// 12-15 T[] data; // 16- when alignof(T) is 8. So what if we passed the alignment as an extra template parameter to nsAutoArrayBase, statically assert that it's 1, 2, 4, or 8, and then for the "32-bit platform, align-8" case, have a specialized implementation that aligns mAutoBuf appropriately? And, add the alignment as a parameter to nsTArray_base as well and specialize GetAutoArrayBufferUnsafe() based on it, to add 4 bytes to the offset in the 32-bit platform, align-8 case? It seems to me that would be a bit simpler, and less reliant on the compiler optimizing away lots of checks in GetAutoArrayBufferUnsafe().
Actually we don't need to add any new parameters to nsTArray_base. We should be able to write an expression using align_of on the element type.
Assignee | ||
Comment 6•13 years ago
|
||
> It seems to me that would be a bit simpler, and less reliant on the compiler optimizing away lots of > checks in GetAutoArrayBufferUnsafe(). Just to be clear, I don't think there's a speed issue. We don't call GetAutoArrayBuffer except in expensive operations like ShrinkCapacity and SwapArrayElements. > Actually we don't need to add any new parameters to nsTArray_base. We should be able to write an > expression using align_of on the element type. I don't understand what you're suggesting here.
Assignee | ||
Comment 7•13 years ago
|
||
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #5) > Actually we don't need to add any new parameters to nsTArray_base. We should > be able to write an expression using align_of on the element type. Do you mean just that we could get rid of the |elemAlign == 1| and |elemAlign == 2| checks in GetAutoArrayBuffer? I think we could. I left them in because I thought removing them was a premature optimization; I was trying to make nsTArray_base rely on as few implementation details of nsAutoArrayBase as possible, although of course I didn't get very far with that goal. :)
(In reply to Justin Lebar [:jlebar] from comment #6) > > It seems to me that would be a bit simpler, and less reliant on the compiler > > optimizing away lots of checks in GetAutoArrayBufferUnsafe(). > > Just to be clear, I don't think there's a speed issue. We don't call > GetAutoArrayBuffer except in expensive operations like ShrinkCapacity and > SwapArrayElements. Er, yeah. Right. > > Actually we don't need to add any new parameters to nsTArray_base. We should be > > able to write an expression using align_of on the element type. > > I don't understand what you're suggesting here. Yeah, it made no sense. I mean change AutoArray to be struct AutoArray { Header *mHdr; PRUint32 maybeAligned; }; and then Header* GetAutoArrayBufferUnsafe(PRUint32 aAlignment) { PRUint32* buf = &(reinterpret_cast<AutoArray*>(&mHdr))->maybeAligned; #ifdef IS_32BIT // 4 bytes of padding needed for 8-byte-aligned data on 32-bit systems buf += aAlignment == 8; #endif return reinterpret_cast<Header*>(buf)); }
Assignee | ||
Comment 9•13 years ago
|
||
I'm happy to do it this way, but isn't this basically the same as dropping the unnecessary elemAlign checks? +nsTArrayHeader* nsTArray_base<Alloc>::GetAutoArrayBufferUnsafe(size_t elemAlign) { +# define GET_AUTOBUF(align) \ + &(reinterpret_cast<nsAutoArrayBase<nsTArray<mozilla::AlignedElem<align> >, 1>*>(this)->mAutoBuf); + + NS_ASSERTION(elemAlign <= 8, "Unsupported alignment."); + void* autoBuf = nsnull; + if (sizeof(void*) == 4 && elemAlign <= 4) { + autoBuf = GET_AUTOBUF(4); + } + else { + autoBuf = GET_AUTOBUF(8); + }
Assignee | ||
Comment 10•13 years ago
|
||
Attachment #563529 -
Flags: review?(roc)
Assignee | ||
Updated•13 years ago
|
Attachment #563198 -
Attachment is obsolete: true
Attachment #563198 -
Flags: review?(roc)
Assignee | ||
Comment 11•13 years ago
|
||
Roc, I'm not in a rush here, but I wanted to make sure this patch didn't fall through the cracks.
Comment on attachment 563529 [details] [diff] [review] Patch v3 Review of attachment 563529 [details] [diff] [review]: ----------------------------------------------------------------- ::: xpcom/glue/nsTArray-inl.h @@ +76,5 @@ > + > + if (MOZ_ALIGNOF(void*) == 4 && elemAlign <= 4) { > + return reinterpret_cast<Header*>(&THIS_AS_AUTOARRAY(4)->mAutoBuf); > + } > + return reinterpret_cast<Header*>(&THIS_AS_AUTOARRAY(8)->mAutoBuf); I had a comment which I lost ... Here, I think you can just make nsAutoArrayBase assume 4-byte element alignment and write ... get pointer to elements in nsAutoArrayBase ... if (sizeof(void*) == 4 && elemAlign == 8) { ... add 4 to pointer ... } You don't actually need AlignedElem.
Assignee | ||
Comment 13•13 years ago
|
||
> You don't actually need AlignedElem.
I see. This concerns me some because AIUI |alignof(void*) == 4 && elemAlign == 8| does not necessarily mean we'll have an extra 4 bytes of structure padding to skip over on all architectures. I'm not sure we have many auto arrays with 8-byte-aligned elements, so code wouldn't necessarily break spectacularly.
We could add a test to the TArray tests, which is probably good enough, so long as they don't get turned off again...
Unfortunately we still need AlignedElem in nsAutoArrayBase because the argument to __attribute__((aligned(foo)) must be a parse-time constant (or something like that, anyway).
+ // Declare mAutoBuf aligned to the maximum of the header's alignment and
+ // elem_type's alignment. We need to use a union rather than
+ // MOZ_ALIGNED_DECL because GCC is picky about what goes into
+ // __attribute__((aligned(foo))).
union {
- char mAutoBuf[sizeof(Header) + N * sizeof(elem_type)];
- PRUint64 dummy;
+ char mAutoBuf[sizeof(nsTArrayHeader) + N * sizeof(elem_type)];
+ mozilla::AlignedElem<PR_MAX(MOZ_ALIGNOF(Header), MOZ_ALIGNOF(elem_type))> mAlign;
+ }
(In reply to Justin Lebar [:jlebar] from comment #13) > > You don't actually need AlignedElem. > > I see. This concerns me some because AIUI |alignof(void*) == 4 && elemAlign > == 8| does not necessarily mean we'll have an extra 4 bytes of structure > padding to skip over on all architectures. This is true in principle, but not in practice. > I'm not sure we have many auto > arrays with 8-byte-aligned elements, so code wouldn't necessarily break > spectacularly. You can statically assert that on a 32-bit architecture, sizeof(nsAutoTArray<double,1>) is exactly what we expect it to be. > Unfortunately we still need AlignedElem in nsAutoArrayBase because the > argument to __attribute__((aligned(foo)) must be a parse-time constant (or > something like that, anyway). I don't know why you need __attribute__((aligned(foo)).
Assignee | ||
Comment 15•13 years ago
|
||
> You can statically assert that on a 32-bit architecture, sizeof(nsAutoTArray<double,1>) > is exactly what we expect it to be. Sold. > I don't know why you need __attribute__((aligned(foo)). When we're declaring storage for the auto buffer, we need to align the storage to 4 or 8 bytes. We could do this directly, as __attribute__((aligned(PR_MAX(...))) char mAutoBuf[N * sizeof...] except that what goes inside __attribute__((aligned)) has to be a parse-time constant, or something. So instead, we need to use a union. Which means, I think, we need to use template specialization to choose a type to union with which has alignment 4 or 8 bytes. That is, we need to write union { char mAutoBuf[...]; AlignedType<PR_MAX(...)> mAlign; }; So we don't need the general-purpose mozilla::AlignedElem specifically, but we do need something similar, which is defined for 4 and 8 byte alignment.
Assignee | ||
Comment 16•13 years ago
|
||
Attachment #564837 -
Flags: review?(roc)
Assignee | ||
Updated•13 years ago
|
Attachment #563529 -
Attachment is obsolete: true
Attachment #563529 -
Flags: review?(roc)
Assignee | ||
Comment 17•13 years ago
|
||
Comment on attachment 564837 [details] [diff] [review] Patch v4 This is unexpectedly burning on mac...
Attachment #564837 -
Flags: review?(roc) → review-
Assignee | ||
Comment 18•13 years ago
|
||
...and it's burning because MOZ_ALIGNOF(double) == 4 on *nix x86-32.
Assignee | ||
Comment 19•13 years ago
|
||
Assignee | ||
Updated•13 years ago
|
Attachment #564837 -
Attachment is obsolete: true
Assignee | ||
Comment 20•13 years ago
|
||
https://tbpl.mozilla.org/?tree=Try&rev=15aa916a0a58
Assignee | ||
Comment 21•13 years ago
|
||
Comment on attachment 564962 [details] [diff] [review] Patch v5 Latest push seems to be fine on try. But I'm not sure if this is what you're looking for, Roc.
Attachment #564962 -
Flags: review?(roc)
Comment on attachment 564962 [details] [diff] [review] Patch v5 Review of attachment 564962 [details] [diff] [review]: ----------------------------------------------------------------- This'll do!
Attachment #564962 -
Flags: review?(roc) → review+
Assignee | ||
Comment 23•13 years ago
|
||
Thanks! Inbound: https://hg.mozilla.org/integration/mozilla-inbound/rev/9adcf4cbfc71
Whiteboard: [inbound]
Comment 24•13 years ago
|
||
You added a bunch of PR_STATIC_ASSERTs, but also a definition for MOZ_STATIC_ASSERT. Did you mean to?
Assignee | ||
Comment 25•13 years ago
|
||
I'd been using the MOZ_STATIC_ASSERT in mfbt/Util.h, but it later became unnecessary. I did mean to use PR_STATIC_ASSERT in the TArray code. I can get rid of MOZ_STATIC_ASSERT; yagni and all that.
Assignee | ||
Comment 26•13 years ago
|
||
Attachment #565219 -
Flags: review?(roc)
Comment 27•13 years ago
|
||
Comment on attachment 564962 [details] [diff] [review] Patch v5 https://hg.mozilla.org/mozilla-central/rev/9adcf4cbfc71
Attachment #564962 -
Flags: checkin+
Comment 28•13 years ago
|
||
Leaving open for remaining patch.
Status: NEW → ASSIGNED
Whiteboard: [inbound]
Updated•13 years ago
|
Target Milestone: --- → mozilla10
Attachment #565219 -
Flags: review?(roc) → review+
Comment 29•13 years ago
|
||
Followup: https://hg.mozilla.org/mozilla-central/rev/831fcf93a1fe
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•