Bug 871596 - investigate serialization/deserialization IPC overhead
Status: RESOLVED FIXED (opened 12 years ago, closed 11 years ago)
Component: Core :: IPC (defect)
Target Milestone: mozilla24
People: Reporter: vlad; Assignee: Unassigned
References: Blocks 1 open bug
Attachments (3 files, 4 obsolete files)
  patch, 5.50 KB (bent.mozilla: review+)
  patch, 7.13 KB (bent.mozilla: review+)
  patch, 10.00 KB (bent.mozilla: review+)
Description
Our IPC code is directly in the critical path, especially on B2G (e.g. the time from finger-down to the action being reflected on the screen has IPC right in the middle of it). While looking into some unrelated things, I was reading the serialization/deserialization code.
The higher-level code (e.g. generated struct serializers) calls down to lower-level primitives, which eventually reach code in Pickle (ReadInt32 and friends). That code isn't in a header file, so it likely doesn't get inlined, and each of these functions looks something like:
bool Pickle::ReadInt32(void** iter, int32_t* result) const {
  DCHECK(iter);
  if (!*iter)
    *iter = const_cast<char*>(payload());
  if (!IteratorHasRoomFor(*iter, sizeof(*result)))
    return false;
  memcpy(result, *iter, sizeof(*result));
  UpdateIter(iter, sizeof(*result));
  return true;
}
IteratorHasRoomFor() does checks on the current iter pointer, the message header, and the remaining size to make sure that we can read sizeof(*result) from it.
This all seems like a large amount of overhead to me. There's no reason why we shouldn't be able to generate code to directly [de]serialize all the primitive data types straight in the higher-level generated message/struct code for this, including doing one size check early on, instead of calling multiple layers to eventually get to a ReadInt32 above.
Comment 1•12 years ago
Word of warning: this code has never shown up in profiles, and my gut intuition is that it's not hot enough to bother with. There are other parts of IPC that are more expensive than this, like the work covered by bug 787363. We'd definitely want to start here with a testcase.
Reporter | Comment 2•12 years ago
Yeah, I haven't seen it in profiles either, which is a good argument for not poking at it... but at the same time, by inspection, it's not how you'd want to write it for performance. Testcases would definitely be good (I may poke at this and get a standalone IPC setup going), as would force-inlining all the Pickle functions and looking at the generated code the compiler spits out for the higher-level message-sending functions.
It might also be useful to make sure that our PGO automation script exercises some IPC code. Maybe then we could just let the compiler take care of this.
Comment 4•12 years ago
PGO won't help on B2G, but it's true that our current PGO profile-gathering script doesn't do a whole lot of useful things (including exercising the IPC code).
Comment 5•12 years ago
The Pickle::Read* functions have a couple of issues. Take a look at Pickle::ReadInt64:
bool Pickle::ReadInt64(void** iter, int64_t* result) const {
  DCHECK(iter);
  if (!*iter)
    *iter = const_cast<char*>(payload());
  if (!IteratorHasRoomFor(*iter, sizeof(*result)))
    return false;
  memcpy(result, *iter, sizeof(*result));
  UpdateIter(iter, sizeof(*result));
  return true;
}
which compiles to the following assembly on m-c:
ca7744: b570 push {r4, r5, r6, lr}
ca7746: 680b ldr r3, [r1, #0]
ca7748: 460c mov r4, r1
ca774a: 4616 mov r6, r2
ca774c: b91b cbnz r3, ca7756 <NS_InvokeByIndex+0x23c5a> (0)
ca774e: 6802 ldr r2, [r0, #0]
ca7750: 6843 ldr r3, [r0, #4]
ca7752: 18d3 adds r3, r2, r3
ca7754: 600b str r3, [r1, #0]
ca7756: 6825 ldr r5, [r4, #0]
ca7758: 2208 movs r2, #8
ca775a: 4629 mov r1, r5
ca775c: f7ff fe84 bl ca7468 <NS_InvokeByIndex+0x2396c> (1)
ca7760: b160 cbz r0, ca777c <NS_InvokeByIndex+0x23c80>
ca7762: 4629 mov r1, r5
ca7764: 2208 movs r2, #8
ca7766: 4630 mov r0, r6
ca7768: f632 cf76 blx 2da658 <JSD_JSContextInUse-0x51c0> (2)
ca776c: 2008 movs r0, #8
ca776e: 2104 movs r1, #4
ca7770: f7ff fe90 bl ca7494 <NS_InvokeByIndex+0x23998> (3)
ca7774: 6823 ldr r3, [r4, #0]
ca7776: 1818 adds r0, r3, r0
ca7778: 6020 str r0, [r4, #0]
ca777a: 2001 movs r0, #1
ca777c: bd70 pop {r4, r5, r6, pc}
(0) is a check for NULL *iter. See below.
(1) is a call to IteratorHasRoomFor. This is a complicated function, so it can
reasonably stay out-of-line. I don't see a good way to improve on this or (0) short of
rejiggering all of the IPC code so that we're only checking bounds once, rather
than on every read/write.
(2) is a call to memcpy that goes through the PLT. This could be better optimized
with newer versions of GCC, I think (since they are more aware of ARM's
unaligned memory access instructions), but given that we're aligning all of
Pickle's writes anyway, we ought to be able to do a better job here. Still
working out the patch for this.
(3) is a call to UpdateIter. We shouldn't be calling this out-of-line, except that
the sole function called in UpdateIter, AlignInt, is written in a too-general
fashion:
// Aligns 'i' by rounding it up to the next multiple of 'alignment'.
static uint32_t AlignInt(uint32_t i, int alignment) {
  return i + (alignment - (i % alignment)) % alignment;
}
which results in the following assembly:
ca7494: b570 push {r4, r5, r6, lr}
ca7496: 460c mov r4, r1
ca7498: 4605 mov r5, r0
ca749a: f633 ca9a blx 2da9d0 <JSD_JSContextInUse-0x4e48>
ca749e: 1a60 subs r0, r4, r1
ca74a0: 4621 mov r1, r4
ca74a2: f633 ca96 blx 2da9d0 <JSD_JSContextInUse-0x4e48>
ca74a6: 1948 adds r0, r1, r5
ca74a8: bd70 pop {r4, r5, r6, pc}
Those blx instructions are calls--through the PLT (!)--to __aeabi_idivmod, I
believe. Which is stupid, because |alignment| here is only ever a power of 2,
and we ought to have inlined the whole ball of wax. (AlignInt as written works
for any number, but I believe the inliner is reluctant to inline because the
divisions are out-of-line function calls and/or because of compiling with -Os.)
The posted patch reworks the alignment bits so that we can verify at compile-time
that we're aligning to a power of 2. And then we can do an efficient alignment
which the compiler appears to be more amenable to inlining.
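For reference, a minimal sketch of the idea (assuming the usual mask formula; the actual patch may differ in details):
// With a compile-time power-of-two alignment, rounding up needs no division
// or modulus, so there is nothing left to call out-of-line.
template<uint32_t alignment>
struct ConstantAligner {
  static uint32_t align(int bytes) {
    MOZ_STATIC_ASSERT((alignment & (alignment - 1)) == 0,
                      "alignment must be a power of two");
    return (bytes + (alignment - 1)) & ~(alignment - 1);
  }
};
// e.g. ConstantAligner<4>::align(13) == 16, compiled to an add and an and.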
I'll double-check with a try push, but this seems to DTRT on opt+debug builds on
x86-64 Linux with GCC 4.4. Like Chris said, doesn't show up on profiles, but a
couple fewer instructions plus less jumping out of libxul can't be a bad thing,
right?
Writes are another story. Writes all go through a general mechanism
based on memcpy, so optimizing things down to single-instruction writes
or thereabouts will take a little work.
Attachment #750576 -
Flags: review?(bent.mozilla)
Comment 6•12 years ago
(In reply to Nathan Froyd (:froydnj) from comment #5)
> The posted patch reworks the alignment bits so that we can verify at
> compile-time
> that we're aligning to a power of 2. And then we can do an efficient
> alignment
> which the compiler appears to be more amenable to inlining.
>
> I'll double-check with a try push, but this seems to DTRT on opt+debug
> builds on
> x86-64 Linux with GCC 4.4. Like Chris said, doesn't show up on profiles,
> but a
> couple fewer instructions plus less jumping out of libxul can't be a bad
> thing,
> right?
Assembly from ReadInt64 on a try push:
caa628: b570 push {r4, r5, r6, lr}
caa62a: 680b ldr r3, [r1, #0]
caa62c: 460c mov r4, r1
caa62e: 4616 mov r6, r2
caa630: b91b cbnz r3, caa63a <NS_InvokeByIndex+0x23c96>
caa632: 6802 ldr r2, [r0, #0]
caa634: 6843 ldr r3, [r0, #4]
caa636: 18d3 adds r3, r2, r3
caa638: 600b str r3, [r1, #0]
caa63a: 6825 ldr r5, [r4, #0]
caa63c: 2208 movs r2, #8
caa63e: 4629 mov r1, r5
caa640: f7ff fe9e bl caa380 <NS_InvokeByIndex+0x239dc>
caa644: b140 cbz r0, caa658 <NS_InvokeByIndex+0x23cb4>
caa646: 4630 mov r0, r6
caa648: 4629 mov r1, r5
caa64a: 2208 movs r2, #8
caa64c: f630 cef8 blx 2db440 <JSD_JSContextInUse-0x5218>
caa650: 6823 ldr r3, [r4, #0] (*)
caa652: 2001 movs r0, #1
caa654: 3308 adds r3, #8
caa656: 6023 str r3, [r4, #0]
caa658: bd70 pop {r4, r5, r6, pc}
The code at (*) and following is the new hotness of UpdateIter with better alignment code. So the optimization does work.
Comment 7•12 years ago
Comment on attachment 750576 [details] [diff] [review]
rewrite Pickle's alignment mechanism to be more obviously optimizable
Review of attachment 750576 [details] [diff] [review]:
-----------------------------------------------------------------
::: ipc/chromium/src/base/pickle.h
@@ +247,5 @@
> bool Resize(uint32_t new_capacity);
>
> + // Round 'bytes' up to the next multiple of 'alignment'. 'alignment' must be
> + // a power of 2.
> + template<uint32_t alignment> struct ConstantAligner {
So does the optimizer just do a better job with the addition of this struct, the static assert, or the reworked rounding operation? Are all three changes needed?
@@ +257,5 @@
> + };
> + // Round 'bytes' up to the next multiple of 'sizeof(T)'. 'sizeof(T)' must be
> + // a power of 2.
> + template<typename T> struct Aligner {
> + static uint32_t align(int bytes) { return ConstantAligner<sizeof(T)>::align(bytes); }
Instead of this (which forces you to write |Aligner<uint32>::align(foo)| all over) how about:
static uint32 AlignInt(int bytes) { return ConstantAligner<sizeof(uint32_t)>::align(bytes); }
There's no real reason for the extra struct I don't think.
Comment 8•12 years ago
(In reply to ben turner [:bent] from comment #7)
> > + // Round 'bytes' up to the next multiple of 'alignment'. 'alignment' must be
> > + // a power of 2.
> > + template<uint32_t alignment> struct ConstantAligner {
>
> So does the optimizer just do a better job with the addition of this struct,
> the static assert, or the reworked rounding operation? Are all three changes
> needed?
I believe the optimizer would do a better job with just the reworked rounding optimization. The struct and static assert are so we don't accidentally try to align to non-powers-of-two. Maybe that's overkill for this code, but it still seemed like a useful thing to do.
> @@ +257,5 @@
> > + };
> > + // Round 'bytes' up to the next multiple of 'sizeof(T)'. 'sizeof(T)' must be
> > + // a power of 2.
> > + template<typename T> struct Aligner {
> > + static uint32_t align(int bytes) { return ConstantAligner<sizeof(T)>::align(bytes); }
>
> Instead of this (which forces you to write |Aligner<uint32>::align(foo)| all
> over) how about:
>
> static uint32 AlignInt(int bytes) { return
> ConstantAligner<sizeof(uint32_t)>::align(bytes); }
>
> There's no real reason for the extra struct I don't think.
Good point. Will fix.
Comment 9•12 years ago
Comment on attachment 750576 [details] [diff] [review]
rewrite Pickle's alignment mechanism to be more obviously optimizable
Ok, I'll stamp that patch
Attachment #750576 -
Flags: review?(bent.mozilla)
Reporter | Comment 10•12 years ago
Awesome, thanks for looking into this! Is there value in moving the Read* functions into the header? Give the compiler more opportunity to inline/optimize across functions, especially without PGO.
Comment 11•12 years ago
(In reply to Vladimir Vukicevic [:vlad] [:vladv] from comment #10)
> Awesome, thanks for looking into this! Is there value in moving the Read*
> functions into the header? Give the compiler more opportunity to
> inline/optimize across functions, especially without PGO.
I don't think it's worth thinking about moving these functions into the header until the call to memcpy() goes away (not too hard; I need to post patches when I'm on a different machine). There also needs to be a better way of reading datatypes from the packet without checking for a NULL *iter and calling IteratorHasRoomFor on every read. Some sort of RawRead/RawWrite interface that doesn't require the NULL/IteratorHasRoomFor checks, for instance.
I think doing clever things with Tuple is the way to safely call RawRead/RawWrite under the hood (Read/Write of Tuple can just figure out how long the entire Tuple is going to be once, and then dispatch to their Raw* counterparts), but I haven't figured out how to do that in a transparent way (or if it can be done, or if it's worth trying to protect clients of Read/Write from themselves by not exposing RawRead/RawWrite).
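A rough sketch of that shape, with purely illustrative names (RawRead, AlignedSize, and ReadTuple don't exist; this just shows the single-check dispatch idea):
// All names here are hypothetical.
template<typename T>
uint32_t AlignedSize() {                // members are padded to 4-byte slots
  return (sizeof(T) + 3) & ~uint32_t(3);
}

template<typename T>
void RawRead(void** iter, T* result) {  // no NULL or bounds checks here
  memcpy(result, *iter, sizeof(T));
  *iter = static_cast<char*>(*iter) + AlignedSize<T>();
}

template<typename A, typename B>
bool ReadTuple(const Pickle& msg, void** iter, A* a, B* b) {
  if (!msg.IteratorHasRoomFor(*iter, AlignedSize<A>() + AlignedSize<B>()))
    return false;                       // one check covers the whole tuple
  RawRead(iter, a);
  RawRead(iter, b);
  return true;
}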
Comment 12•11 years ago
Now with hiding everything behind AlignInt.
Attachment #756020 -
Flags: review?(bent.mozilla)
Comment 13•11 years ago
Comment on attachment 756020 [details] [diff] [review]
part 1 - rewrite Pickle's alignment mechanism to be more obviously optimizable
Review of attachment 756020 [details] [diff] [review]:
-----------------------------------------------------------------
Looks good, thanks!
Attachment #756020 -
Flags: review?(bent.mozilla) → review+
Comment 14•11 years ago
This is just a cleanup, but since we're adding a few things that are going to require
specific alignments, it seemed good to give the type we're aligning against a name.
Attachment #756880 -
Flags: review?(bent.mozilla)
Comment 15•11 years ago
This patch is mostly for ARM (any machine without unaligned load support, really);
GCC won't optimize:
memcpy(aligned_ptr, void_star_ptr, sizeof(*aligned_ptr));
to inline loads and stores--at least not with the GCC version B2G (and maybe even
Fennec) is currently compiling with, I think. But we know something about the
alignment of void_star_ptr, so we can help the compiler realize what's going on.
Ideally the comment in front of Copier covers all this. Please let me know if you
don't think it's clear enough.
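Roughly, the shape of the change is the following sketch (the exact code is in the attached patch; this only illustrates the idea):
// When T's alignment requirement is no stricter than what Pickle guarantees
// for its payload, copy with a plain dereference that the compiler turns
// into inline loads/stores; otherwise fall back to memcpy.
template<typename T, bool hasSufficientAlignment>
struct Copier {
  static void Copy(T* dest, void** iter) {
    memcpy(dest, *iter, sizeof(T));
  }
};

template<typename T>
struct Copier<T, true> {
  static void Copy(T* dest, void** iter) {
    *dest = *reinterpret_cast<const T*>(*iter);
  }
};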
Attachment #756883 -
Flags: review?(bent.mozilla)
Comment 16•11 years ago
This patch is a minor optimization: ReadBytes, BeginWrite, and WriteBytes all take
an alignment parameter that defaults to sizeof(memberAlignmentType). It's mostly
used in modulus operations and since the functions are too big to inline, we wind
up doing divisions because the alignment is technically unknown at runtime. Since
we never pass anything other than the default for the alignment parameter, just delete it, say that it's always going
to be equal to sizeof(memberAlignmentType), and let the compiler turn those modulus
operations into something smaller and faster (particularly on ARM).
Attachment #756884 -
Flags: review?(bent.mozilla)
Updated•11 years ago
Attachment #750576 -
Attachment is obsolete: true
Comment 17•11 years ago
Comment on attachment 756884 [details] [diff] [review]
part 4 - un-generalize alignment required for Read*/Write* in Pickle
Bleh, I hadn't looked far enough; we rely on 64-bit alignment for SerializedStructuredCloneBuffer.
Attachment #756884 -
Attachment is obsolete: true
Attachment #756884 -
Flags: review?(bent.mozilla)
Comment 18•11 years ago
Comment on attachment 756880 [details] [diff] [review]
part 2 - add a new memberAlignmentType to replace the scattered uint32_t alignments
Looks good.
Attachment #756880 -
Flags: review?(bent.mozilla) → review+
Comment 19•11 years ago
Comment on attachment 756883 [details] [diff] [review]
part 3 - replace memcpys of POD datatypes in Pickle with something smarter
Review of attachment 756883 [details] [diff] [review]:
-----------------------------------------------------------------
::: ipc/chromium/src/base/pickle.cc
@@ +34,5 @@
> +
> +template<typename T, bool hasSufficientAlignment>
> +struct Copier
> +{
> + static void copy(T* dest, void** iter) {
Nit: Function names should begin with capital letters. There are two 'copy' functions here, and then 'copyFromIter' too.
@@ +39,5 @@
> + memcpy(dest, *iter, sizeof(T));
> + }
> +};
> +template<typename T>
> +struct Copier<T, true>
I wonder if we shouldn't #ifdef this to the platforms where we know we need it. Opting in to an optimization like this seems safer, doesn't it?
@@ +45,5 @@
> + static void copy(T* dest, void** iter) {
> + // The reinterpret_cast is only safe if two conditions hold:
> + // (1) If the alignment of T* is the same as void*;
> + // (2) The alignment of the data in *iter is at least as
> + // bit as MOZ_ALIGNOF(T).
s/bit/big/
@@ +55,5 @@
> + }
> +};
> +
> +template<typename T>
> +void copyFromIter(T* dest, void** iter) {
Is it worth specializing this for 64-bit ints?
@@ +76,5 @@
> : header_(NULL),
> header_size_(AlignInt(header_size)),
> capacity_(0),
> variable_buffer_offset_(0) {
> + DCHECK(static_cast<memberAlignmentType>(header_size) >= sizeof(Header));
This belongs in the part 2 patch fwiw.
Comment 20•11 years ago
Will fix the capitalization. I'm not sure why I thought local style in this file was lowercaseCamelCase, but it's clearly not.
(In reply to ben turner [:bent] from comment #19)
> @@ +39,5 @@
> > + memcpy(dest, *iter, sizeof(T));
> > + }
> > +};
> > +template<typename T>
> > +struct Copier<T, true>
>
> I wonder if we shouldn't #ifdef this to the platforms where we know we need
> it. Opting in to an optimization like this seems safer, doesn't it?
If this optimization is valid on $PLATFORM, then it's valid everywhere else, too. Seems weird to just enable it for a particular platform, especially since none of the other code depends on details of the platform.
> @@ +55,5 @@
> > + }
> > +};
> > +
> > +template<typename T>
> > +void copyFromIter(T* dest, void** iter) {
>
> Is it worth specializing this for 64-bit ints?
What benefit would you get out of that?
> @@ +76,5 @@
> > : header_(NULL),
> > header_size_(AlignInt(header_size)),
> > capacity_(0),
> > variable_buffer_offset_(0) {
> > + DCHECK(static_cast<memberAlignmentType>(header_size) >= sizeof(Header));
>
> This belongs in the part 2 patch fwiw.
Whoops, sorry about that. Will move.
Comment 21•11 years ago
(In reply to Nathan Froyd (:froydnj) from comment #20)
> If this optimization is valid on $PLATFORM, then it's valid everywhere else,
> too. Seems weird to just enable it for a particular platform, especially
> since none of the other code depends on details of the platform.
We're basically telling all compilers to not use memcpy but instead use normal load/store for types that fit. Is it not possible that some compilers have their own specialized memcpy that could do something different here? I can't really come up with any idea about what they would do, but it just seems wrong to say we always know best. That's why I said opting in seems safer. What do you think?
> > Is it worth specializing this for 64-bit ints?
>
> What benefit would you get out of that?
We have several calls below that copy 64-bit integers, and that will always use memcpy with your current patch. Is that faster or slower than doing two 32-bit stores?
Comment 22•11 years ago
(In reply to ben turner [:bent] from comment #21)
> (In reply to Nathan Froyd (:froydnj) from comment #20)
> > If this optimization is valid on $PLATFORM, then it's valid everywhere else,
> > too. Seems weird to just enable it for a particular platform, especially
> > since none of the other code depends on details of the platform.
>
> We're basically telling all compilers to not use memcpy but instead use
> normal load/store for types that fit. Is it not possible that some compilers
> have their own specialized memcpy that could do something different here? I
> can't really come up with any idea about what they would do, but it just
> seems wrong to say we always know best. That's why I said opting in seems
> safer. What do you think?
I know the ARM AEABI has specialized memcpy functions (with different names, obviously), but those are for copying when the pointer alignments are already known, so they don't apply here.
I have a hard time imagining how any out-of-line memcpy (especially one that doesn't live in libxul) is going to win over inline loads and stores (two, maybe four, instructions, which is about as many instructions as it's going to take just to get to the beginning of memcpy--and some of those instructions are going to be jumps). I think selectively enabling this is being overly cautious.
> > > Is it worth specializing this for 64-bit ints?
> >
> > What benefit would you get out of that?
>
> We have several calls below that copy 64-bit integers, and that will always
> use memcpy with your current patch. Is that faster or slower than doing two
> 32-bit stores?
OK, bizarre: x86 only requires 4-byte alignment for 64-bit ints, but ARM and other 32-bit platforms appear to require 8-byte alignment (why?!). I assumed 64-bit ints would only require 4-byte alignment on all 32-bit platforms. My mistake.
I imagine two inline stores would be faster than calling out to memcpy--but the single inline store generated on a 64-bit platform would be faster than two 32-bit stores. Can probably take care of that with some template magic.
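Something like the following is the kind of specialization under discussion (a sketch, assuming the iterator is at least 4-byte aligned, which Pickle guarantees for its payload; the caller still advances the iterator afterwards, as ReadInt64 already does):
// Copy a 64-bit POD as two inline 32-bit word moves instead of an
// out-of-line memcpy on 32-bit platforms. Illustrative only.
template<typename T>
static void Copy64As32BitWords(T* dest, void** iter) {
  uint32_t* src = *reinterpret_cast<uint32_t**>(iter);
  uint32_t* dst = reinterpret_cast<uint32_t*>(dest);
  dst[0] = src[0];
  dst[1] = src[1];
}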
Comment 23•11 years ago
(In reply to Nathan Froyd (:froydnj) from comment #22)
> I have a hard time imagining how any out-of-line memcpy
I didn't mean out-of-line, I would expect inlines and maybe some compiler intrinsics for some of this stuff. Can we at least look at the disassembly on Windows and Mac with and without this to make sure we're not shooting ourselves in the foot here? As long as the output isn't much different I am fine here.
> I assumed 64-bit ints would only require 4-byte alignment on all 32-bit
> platforms. My mistake.
Ha! I assumed the opposite! :)
Looks like we're both sorta wrong though:
http://en.wikipedia.org/wiki/Data_structure_alignment#Typical_alignment_of_C_structs_on_x86
Comment 24•11 years ago
(In reply to ben turner [:bent] from comment #23)
> http://en.wikipedia.org/wiki/
> Data_structure_alignment#Typical_alignment_of_C_structs_on_x86
Assuming that is accurate...
Comment 25•11 years ago
(In reply to ben turner [:bent] from comment #23)
> (In reply to Nathan Froyd (:froydnj) from comment #22)
> > I have a hard time imagining how any out-of-line memcpy
>
> I didn't mean out-of-line, I would expect inlines and maybe some compiler
> intrinsics for some of this stuff. Can we at least look at the disassembly
> on Windows and Mac with and without this to make sure we're not shooting
> ourselves in the foot here? As long as the output isn't much different I am
> fine here.
For a 32-bit int, the best that a compiler can do with the original memcpy is to turn it into a load and a store. The new optimization should do just that, too. Similarly for other sizes (assuming they don't have to be aligned any more than a uint32_t). I will confirm Mac and Windows disassembly, though.
> > I assumed 64-bit ints would only require 4-byte alignment on all 32-bit
> > platforms. My mistake.
>
> Ha! I assumed the opposite! :)
>
> Looks like we're both sorta wrong though:
>
> http://en.wikipedia.org/wiki/
> Data_structure_alignment#Typical_alignment_of_C_structs_on_x86
Well, I know long longs are 4-byte aligned on Linux and Mac:
struct s {
  char c;
  long long ll;
};
int foo()
{
  return __builtin_offsetof(struct s, ll);
}
-O2 -m32 -S -o - output:
pushl %ebp
movl %esp, %ebp
movl $4, %eax
popl %ebp
ret
but I could believe Windows is different (again, why?!).
Comment 26•11 years ago
(In reply to ben turner [:bent] from comment #23)
> (In reply to Nathan Froyd (:froydnj) from comment #22)
> > I have a hard time imagining how any out-of-line memcpy
>
> I didn't mean out-of-line, I would expect inlines and maybe some compiler
> intrinsics for some of this stuff. Can we at least look at the disassembly
> on Windows and Mac with and without this to make sure we're not shooting
> ourselves in the foot here? As long as the output isn't much different I am
> fine here.
On Mac (x86/x86-64), clang already turns the memcpys into inline assembly. So these patches don't make things any worse or any better.
On Windows, MSVC does not inline the memcpys here. So these patches improve codesize and performance.
Comment 27•11 years ago
OK, here's something that seems to generate sane assembly on clang and
gcc for x86{,-64} and ARM.
Attachment #756883 -
Attachment is obsolete: true
Attachment #756883 -
Flags: review?(bent.mozilla)
Attachment #762652 -
Flags: review?(bent.mozilla)
Comment 28•11 years ago
(In reply to Nathan Froyd (:froydnj) from comment #27)
> Created attachment 762652 [details] [diff] [review]
> part 3 - replace memcpys of POD datatypes in Pickle with something smarter
>
> OK, here's something that seems to generate sane assembly on clang and
> gcc for x86{,-64} and ARM.
The HAVE_64BIT_OS check is not perfect (I think), but it's good enough for the platforms we care about.
Comment 29•11 years ago
Here's something that might actually compile on non-x86oid platforms and that avoids any language lawyering around how unions work.
Attachment #762652 -
Attachment is obsolete: true
Attachment #762652 -
Flags: review?(bent.mozilla)
Attachment #763908 -
Flags: review?(bent.mozilla)
Comment 30•11 years ago
Comment on attachment 763908 [details] [diff] [review]
part 3 - replace memcpys of POD datatypes in Pickle with something smarter
Review of attachment 763908 [details] [diff] [review]:
-----------------------------------------------------------------
r=me with these addressed:
::: ipc/chromium/src/base/pickle.cc
@@ +48,5 @@
> +// if 64-bit types aren't sufficiently aligned; the alignment
> +// requirements for them vary between 32-bit platforms.
> +#ifndef HAVE_64BIT_OS
> +template<typename T>
> +struct Copier<T, 8, false>
Nit: s/8/sizeof(int64_t)/ is more readable
@@ +52,5 @@
> +struct Copier<T, 8, false>
> +{
> + static void Copy(T* dest, void** iter) {
> + MOZ_STATIC_ASSERT(MOZ_ALIGNOF(uint32_t) <= MOZ_ALIGNOF(Pickle::memberAlignmentType),
> + "Insufficient alignment");
Can you move this somewhere else (like, top of the file or something)? It should always be true given our current memberAlignmentType definition, not just if we're copying a 64 bit integer.
@@ +54,5 @@
> + static void Copy(T* dest, void** iter) {
> + MOZ_STATIC_ASSERT(MOZ_ALIGNOF(uint32_t) <= MOZ_ALIGNOF(Pickle::memberAlignmentType),
> + "Insufficient alignment");
> + MOZ_STATIC_ASSERT(MOZ_ALIGNOF(T) >= MOZ_ALIGNOF(uint32_t),
> + "Insufficient alignment");
This looks insufficient to me... If you match this template then |MOZ_ALIGNOF(T) <= sizeof(Pickle::memberAlignmentType)| must be false, and since memberAlignmentType is just a uint32_t the only useful thing you could enforce here is |MOZ_ALIGNOF(T) > sizeof(Pickle::memberAlignmentType)|, right?
@@ +56,5 @@
> + "Insufficient alignment");
> + MOZ_STATIC_ASSERT(MOZ_ALIGNOF(T) >= MOZ_ALIGNOF(uint32_t),
> + "Insufficient alignment");
> +#if MOZ_LITTLE_ENDIAN
> + const int loIndex = 0, hiIndex = 1;
Nit: static here and below
@@ +61,5 @@
> +#else
> + const int loIndex = 1, hiIndex = 0;
> +#endif
> + // We're not using memcpy here because some compilers don't inline
> + // properly.
Nit: This comment is redundant.
@@ +62,5 @@
> + const int loIndex = 1, hiIndex = 0;
> +#endif
> + // We're not using memcpy here because some compilers don't inline
> + // properly.
> + uint32_t* src = *reinterpret_cast<uint32_t**>(iter);
Do you think it's worth adding:
MOZ_STATIC_ASSERT(MOZ_ALIGNOF(uint32_t*) == MOZ_ALIGNOF(void*))
here?
@@ +88,5 @@
> +};
> +
> +template<typename T>
> +void CopyFromIter(T* dest, void** iter) {
> + MOZ_STATIC_ASSERT(mozilla::IsPod<T>::value, "copied type must be a pod type");
Nit: s/copied/Copied/ and s/pod/POD/
Attachment #763908 -
Flags: review?(bent.mozilla) → review+
Comment 31•11 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/ca677b5c6229
https://hg.mozilla.org/integration/mozilla-inbound/rev/3e8212722888
https://hg.mozilla.org/integration/mozilla-inbound/rev/e582f705dafa
Marking as [leave open] for future enhancements.
Whiteboard: [leave open]
Comment 32•11 years ago
(In reply to Nathan Froyd (:froydnj) from comment #31)
> Marking as [leave open] for future enhancements.
Eh, let's please just file a followup if you have additional things to add? Once things land it gets complicated to remember which patches are in, etc.
Comment 33•11 years ago
(In reply to ben turner [:bent] from comment #32)
> (In reply to Nathan Froyd (:froydnj) from comment #31)
> > Marking as [leave open] for future enhancements.
>
> Eh, let's please just file a followup if you have additional things to add?
> Once things land it gets complicated to remember which patches are in, etc.
OK, new bug created then for trying to remove redundant checking.
Whiteboard: [leave open]
Comment 34•11 years ago
https://hg.mozilla.org/mozilla-central/rev/ca677b5c6229
https://hg.mozilla.org/mozilla-central/rev/3e8212722888
https://hg.mozilla.org/mozilla-central/rev/e582f705dafa
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla24