The default bug view has changed. See this FAQ.

Align nsAutoTArray<E>'s elements to their natural alignment

RESOLVED FIXED in mozilla10

Status

()

Core
XPCOM
RESOLVED FIXED
6 years ago
6 years ago

People

(Reporter: Justin Lebar (not reading bugmail), Assigned: Justin Lebar (not reading bugmail))

Tracking

unspecified
mozilla10
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(2 attachments, 4 obsolete attachments)

(Assignee)

Description

6 years ago
nsAutoTArray<E> aligns its elements to 8-byte boundaries.  Or at least, it tries to do this; it apparently aligns to 4-bytes on Mac (see bug 681755) and maybe on Linux.

This wastes 4 bytes of space on Windows 32 for each nsAutoTArray, which is annoying.  But it also means that nsAutoTArray<E> itself must be aligned to 8-byte boundaries, which increases the size of structs containing auto arrays.
(Assignee)

Updated

6 years ago
Assignee: nobody → justin.lebar+bug
(Assignee)

Comment 1

6 years ago
Created attachment 563195 [details] [diff] [review]
Patch v1
Attachment #563195 - Flags: review?(roc)
(Assignee)

Comment 2

6 years ago
Created attachment 563198 [details] [diff] [review]
Patch v2

Oops; v1 is an old version of the patch.
Attachment #563198 - Flags: review?(roc)
(Assignee)

Updated

6 years ago
Attachment #563195 - Attachment is obsolete: true
Attachment #563195 - Flags: review?(roc)
(Assignee)

Comment 3

6 years ago
The change nsBaseAppShell.cpp is to prevent a build error.  ipdl includes its own set of types and suppresses their redeclaration.  If we don't make the ipdl header the first one, nsTArray pulls in mfbt/Util.h and we redeclare uint8 and friends.
As I understand it, there are really only two auto-array layouts we care about in 32-bit systems, and one layout for 64-bit.

On 64-bit we have
  Header* mHdr;   // 0-7
  Header  header; // 8-15
  T[]     data;   // 16-
Everything's 8-byte aligned and we don't support 16-byte alignment, so we're all good.

On 32-bit we want
  Header* mHdr;   // 0-3
  Header  header; // 4-11
  T[]     data;   // 12-
when alignof(T) is <= 4, otherwise
  Header* mHdr;   // 0-3
  Header  header; // 4-11
  char[4] padding;// 12-15
  T[]     data;   // 16-
when alignof(T) is 8.

So what if we passed the alignment as an extra template parameter to nsAutoArrayBase, statically assert that it's 1, 2, 4, or 8, and then for the "32-bit platform, align-8" case, have a specialized implementation that aligns mAutoBuf appropriately? And, add the alignment as a parameter to nsTArray_base as well and specialize GetAutoArrayBufferUnsafe() based on it, to add 4 bytes to the offset in the 32-bit platform, align-8 case?

It seems to me that would be a bit simpler, and less reliant on the compiler optimizing away lots of checks in GetAutoArrayBufferUnsafe().
Actually we don't need to add any new parameters to nsTArray_base. We should be able to write an expression using align_of on the element type.
(Assignee)

Comment 6

6 years ago
> It seems to me that would be a bit simpler, and less reliant on the compiler optimizing away lots of 
> checks in GetAutoArrayBufferUnsafe().

Just to be clear, I don't think there's a speed issue.  We don't call GetAutoArrayBuffer except in expensive operations like ShrinkCapacity and SwapArrayElements.

> Actually we don't need to add any new parameters to nsTArray_base. We should be able to write an 
> expression using align_of on the element type.

I don't understand what you're suggesting here.
(Assignee)

Comment 7

6 years ago
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #5)
> Actually we don't need to add any new parameters to nsTArray_base. We should
> be able to write an expression using align_of on the element type.

Do you mean just that we could get rid of the |elemAlign == 1| and |elemAlign == 2| checks in GetAutoArrayBuffer?  I think we could.  I left them in because I thought removing them was a premature optimization; I was trying to make nsTArray_base rely on as few implementation details of nsAutoArrayBase as possible, although of course I didn't get very far with that goal.  :)
(In reply to Justin Lebar [:jlebar] from comment #6)
> > It seems to me that would be a bit simpler, and less reliant on the compiler
> > optimizing away lots of checks in GetAutoArrayBufferUnsafe().
> 
> Just to be clear, I don't think there's a speed issue.  We don't call
> GetAutoArrayBuffer except in expensive operations like ShrinkCapacity and
> SwapArrayElements.

Er, yeah. Right.

> > Actually we don't need to add any new parameters to nsTArray_base. We should be
> > able to write an expression using align_of on the element type.
> 
> I don't understand what you're suggesting here.

Yeah, it made no sense.

I mean change AutoArray to be
  struct AutoArray {
    Header *mHdr;
    PRUint32 maybeAligned;
  };

and then
  Header* GetAutoArrayBufferUnsafe(PRUint32 aAlignment) {
    PRUint32* buf = &(reinterpret_cast<AutoArray*>(&mHdr))->maybeAligned;
#ifdef IS_32BIT
    // 4 bytes of padding needed for 8-byte-aligned data on 32-bit systems
    buf += aAlignment == 8;
#endif
    return reinterpret_cast<Header*>(buf));
  }
(Assignee)

Comment 9

6 years ago
I'm happy to do it this way, but isn't this basically the same as dropping the unnecessary elemAlign checks?

+nsTArrayHeader* nsTArray_base<Alloc>::GetAutoArrayBufferUnsafe(size_t elemAlign) {
+# define GET_AUTOBUF(align) \
+    &(reinterpret_cast<nsAutoArrayBase<nsTArray<mozilla::AlignedElem<align> >, 1>*>(this)->mAutoBuf);
+
+  NS_ASSERTION(elemAlign <= 8, "Unsupported alignment.");
+  void* autoBuf = nsnull;
+  if (sizeof(void*) == 4 && elemAlign <= 4) {
+    autoBuf = GET_AUTOBUF(4);
+  }
+  else {
+    autoBuf = GET_AUTOBUF(8);
+  }
(Assignee)

Comment 10

6 years ago
Created attachment 563529 [details] [diff] [review]
Patch v3
Attachment #563529 - Flags: review?(roc)
(Assignee)

Updated

6 years ago
Attachment #563198 - Attachment is obsolete: true
Attachment #563198 - Flags: review?(roc)
(Assignee)

Comment 11

6 years ago
Roc, I'm not in a rush here, but I wanted to make sure this patch didn't fall through the cracks.
Comment on attachment 563529 [details] [diff] [review]
Patch v3

Review of attachment 563529 [details] [diff] [review]:
-----------------------------------------------------------------

::: xpcom/glue/nsTArray-inl.h
@@ +76,5 @@
> +
> +  if (MOZ_ALIGNOF(void*) == 4 && elemAlign <= 4) {
> +    return reinterpret_cast<Header*>(&THIS_AS_AUTOARRAY(4)->mAutoBuf);
> +  }
> +  return reinterpret_cast<Header*>(&THIS_AS_AUTOARRAY(8)->mAutoBuf);

I had a comment which I lost ...

Here, I think you can just make nsAutoArrayBase assume 4-byte element alignment and write
  
  ... get pointer to elements in nsAutoArrayBase ...
  if (sizeof(void*) == 4 && elemAlign == 8) {
    ... add 4 to pointer ...
  }

You don't actually need AlignedElem.
(Assignee)

Comment 13

6 years ago
> You don't actually need AlignedElem.

I see.  This concerns me some because AIUI |alignof(void*) == 4 && elemAlign == 8| does not necessarily mean we'll have an extra 4 bytes of structure padding to skip over on all architectures.  I'm not sure we have many auto arrays with 8-byte-aligned elements, so code wouldn't necessarily break spectacularly.

We could add a test to the TArray tests, which is probably good enough, so long as they don't get turned off again...

Unfortunately we still need AlignedElem in nsAutoArrayBase because the argument to __attribute__((aligned(foo)) must be a parse-time constant (or something like that, anyway).

+  // Declare mAutoBuf aligned to the maximum of the header's alignment and
+  // elem_type's alignment.  We need to use a union rather than
+  // MOZ_ALIGNED_DECL because GCC is picky about what goes into
+  // __attribute__((aligned(foo))).
   union {
-    char mAutoBuf[sizeof(Header) + N * sizeof(elem_type)];
-    PRUint64 dummy;
+    char mAutoBuf[sizeof(nsTArrayHeader) + N * sizeof(elem_type)];
+    mozilla::AlignedElem<PR_MAX(MOZ_ALIGNOF(Header), MOZ_ALIGNOF(elem_type))> mAlign;
+  }
(In reply to Justin Lebar [:jlebar] from comment #13)
> > You don't actually need AlignedElem.
> 
> I see.  This concerns me some because AIUI |alignof(void*) == 4 && elemAlign
> == 8| does not necessarily mean we'll have an extra 4 bytes of structure
> padding to skip over on all architectures.

This is true in principle, but not in practice.

> I'm not sure we have many auto
> arrays with 8-byte-aligned elements, so code wouldn't necessarily break
> spectacularly.

You can statically assert that on a 32-bit architecture, sizeof(nsAutoTArray<double,1>) is exactly what we expect it to be.

> Unfortunately we still need AlignedElem in nsAutoArrayBase because the
> argument to __attribute__((aligned(foo)) must be a parse-time constant (or
> something like that, anyway).

I don't know why you need __attribute__((aligned(foo)).
(Assignee)

Comment 15

6 years ago
> You can statically assert that on a 32-bit architecture, sizeof(nsAutoTArray<double,1>) 
> is exactly what we expect it to be.

Sold.

> I don't know why you need __attribute__((aligned(foo)).

When we're declaring storage for the auto buffer, we need to align the storage to 4 or 8 bytes.

We could do this directly, as

  __attribute__((aligned(PR_MAX(...))) char mAutoBuf[N * sizeof...]

except that what goes inside __attribute__((aligned)) has to be a parse-time constant, or something.

So instead, we need to use a union.  Which means, I think, we need to use template specialization to choose a type to union with which has alignment 4 or 8 bytes.  That is, we need to write

  union {
    char mAutoBuf[...];
    AlignedType<PR_MAX(...)> mAlign;
  };

So we don't need the general-purpose mozilla::AlignedElem specifically, but we do need something similar, which is defined for 4 and 8 byte alignment.
(Assignee)

Comment 16

6 years ago
Created attachment 564837 [details] [diff] [review]
Patch v4
Attachment #564837 - Flags: review?(roc)
(Assignee)

Updated

6 years ago
Attachment #563529 - Attachment is obsolete: true
Attachment #563529 - Flags: review?(roc)
(Assignee)

Comment 17

6 years ago
Comment on attachment 564837 [details] [diff] [review]
Patch v4

This is unexpectedly burning on mac...
Attachment #564837 - Flags: review?(roc) → review-
(Assignee)

Comment 18

6 years ago
...and it's burning because MOZ_ALIGNOF(double) == 4 on *nix x86-32.
(Assignee)

Comment 19

6 years ago
Created attachment 564962 [details] [diff] [review]
Patch v5
(Assignee)

Updated

6 years ago
Attachment #564837 - Attachment is obsolete: true
(Assignee)

Comment 20

6 years ago
https://tbpl.mozilla.org/?tree=Try&rev=15aa916a0a58
(Assignee)

Comment 21

6 years ago
Comment on attachment 564962 [details] [diff] [review]
Patch v5

Latest push seems to be fine on try.  But I'm not sure if this is what you're looking for, Roc.
Attachment #564962 - Flags: review?(roc)
Comment on attachment 564962 [details] [diff] [review]
Patch v5

Review of attachment 564962 [details] [diff] [review]:
-----------------------------------------------------------------

This'll do!
Attachment #564962 - Flags: review?(roc) → review+
(Assignee)

Comment 23

6 years ago
Thanks!  Inbound: https://hg.mozilla.org/integration/mozilla-inbound/rev/9adcf4cbfc71
Whiteboard: [inbound]
You added a bunch of PR_STATIC_ASSERTs, but also a definition for MOZ_STATIC_ASSERT. Did you mean to?
(Assignee)

Comment 25

6 years ago
I'd been using the MOZ_STATIC_ASSERT in mfbt/Util.h, but it later became unnecessary.  I did mean to use PR_STATIC_ASSERT in the TArray code.

I can get rid of MOZ_STATIC_ASSERT; yagni and all that.
(Assignee)

Comment 26

6 years ago
Created attachment 565219 [details] [diff] [review]
Remove MOZ_STATIC_ASSERT
Attachment #565219 - Flags: review?(roc)
Comment on attachment 564962 [details] [diff] [review]
Patch v5

https://hg.mozilla.org/mozilla-central/rev/9adcf4cbfc71
Attachment #564962 - Flags: checkin+
Leaving open for remaining patch.
Status: NEW → ASSIGNED
Whiteboard: [inbound]
Target Milestone: --- → mozilla10
Attachment #565219 - Flags: review?(roc) → review+
Followup: https://hg.mozilla.org/mozilla-central/rev/831fcf93a1fe
Status: ASSIGNED → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.