Closed
Bug 660335
Opened 14 years ago
Closed 13 years ago
Force nsID to align to 64-bit boundary
Categories
(Core :: XPCOM, defect)
Core
XPCOM
Tracking
()
RESOLVED
WONTFIX
People
(Reporter: justin.lebar+bug, Assigned: justin.lebar+bug)
Details
Attachments
(1 file, 2 obsolete files)
2.06 KB,
patch
|
benjamin
:
review+
|
Details | Diff | Splinter Review |
Per the discussion in bug 659456, this might be necessary. Anyway, it shouldn't hurt.
Patch in a moment.
Assignee | ||
Comment 1•14 years ago
|
||
Seems to work on Linux. Haven't tested on Windows.
Assignee | ||
Comment 2•14 years ago
|
||
Typo'ed the SunPro declaration.
Assignee | ||
Updated•14 years ago
|
Attachment #535744 -
Attachment is obsolete: true
Comment 3•14 years ago
|
||
Comment on attachment 535745 [details] [diff] [review]
Patch v2
I don't think we should try to do the #pragma thing with sunpro: pragmas in defines are not portable in general and very icky. Just leave them to the default case.
Assignee | ||
Comment 4•14 years ago
|
||
This really only needs to align to 32 bits on 32-bit platforms (except weird Sparc platforms which have 32-bit address spaces but also a 64-bit load instruction).
I doubt it makes a difference either way, though.
This seems to work on Windows (that is, the compile doesn't error out in the first few minutes).
Assignee | ||
Comment 5•14 years ago
|
||
Attachment #535750 -
Flags: review?(benjamin)
Assignee | ||
Updated•14 years ago
|
Attachment #535745 -
Attachment is obsolete: true
Updated•14 years ago
|
Attachment #535750 -
Flags: review?(benjamin) → review+
Comment 6•14 years ago
|
||
Is it also a good idea to add a PR_STATIC_ASSERT(sizeof(nsID)==64)?
Comment 7•14 years ago
|
||
That does sound useful, but not essential.
Assignee | ||
Updated•14 years ago
|
Assignee: nobody → justin.lebar+bug
Assignee | ||
Comment 8•14 years ago
|
||
This landed then bounced due to red on 32-bit systems. Apparently forcing 64-bit alignment does something there that I didn't anticipate.
http://hg.mozilla.org/mozilla-central/rev/a968740930d3
http://hg.mozilla.org/mozilla-central/rev/e0cf03a641cf
http://hg.mozilla.org/mozilla-central/rev/6a90d831b11c
Assignee | ||
Comment 9•14 years ago
|
||
I wish we could just use memcmp here and be done with it. gcc 4.5 on Linux uses an x86 string operation for the memcmp here, which is sensible (I dunno if it's faster). But Mac gcc calls memcmp, which isn't helpful.
Things apparently break very hard when the align does something, so I'm not sure it's worth putting in, even only on 64-bit systems. I think we should either leave it be, or explicitly use 64-bit reads on 64-bit systems and 32-bit reads on 32-bit systems.
Comment 10•14 years ago
|
||
Is there any case where the size of the data structure is correct but the alignment is not?
Assignee | ||
Comment 11•14 years ago
|
||
The string operation appears to be Very Slow. AFAICT it's slower than glibc's memcmp for both small and large inputs.
> Is there any case where the size of the data structure is correct but the alignment is not?
What do you mean?
Comment 12•14 years ago
|
||
(In reply to comment #11)
> What do you mean?
I mean, if we had the static assert, under what situations would a compiler pass that and still get the wrong alignment for the data structure?
Assignee | ||
Comment 13•14 years ago
|
||
I haven't investigated deeply, but I suspect that what's going wrong is that the compiler inserts unexpected gaps into other data structures to force nsID to be aligned.
Comment 14•14 years ago
|
||
(In reply to comment #13)
> I haven't investigated deeply, but I suspect that what's going wrong is that
> the compiler inserts unexpected gaps into other data structures to force nsID
> to be aligned.
I think so too. My question is: doesn't the static assert on the total size of the data structure implicitly guarantee that there are no extra padding bytes?
Assignee | ||
Comment 15•14 years ago
|
||
To be clear, I think the gaps are in *other* data structures -- that is, they're not in nsID. All instances of a class must have the same layout.
Assignee | ||
Comment 16•13 years ago
|
||
I don't think we're going to fix this. If the 64-bit reads are a problem on some new system, we'll find out very quickly. For now, things seem to work...
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → WONTFIX
Comment 17•13 years ago
|
||
(In reply to Justin Lebar [:jlebar] from comment #16)
> I don't think we're going to fix this. If the 64-bit reads are a problem on
> some new system, we'll find out very quickly. For now, things seem to
> work...
It's a problem on ia64.
Assignee | ||
Comment 18•13 years ago
|
||
(In reply to Mike Hommey [:glandium] from comment #17)
> (In reply to Justin Lebar [:jlebar] from comment #16)
> > I don't think we're going to fix this. If the 64-bit reads are a problem on
> > some new system, we'll find out very quickly. For now, things seem to
> > work...
>
> It's a problem on ia64.
But the fact that we're not crashing here means that it's properly aligned, right?
Comment 19•13 years ago
|
||
(In reply to Justin Lebar [:jlebar] from comment #18)
> (In reply to Mike Hommey [:glandium] from comment #17)
> > (In reply to Justin Lebar [:jlebar] from comment #16)
> > > I don't think we're going to fix this. If the 64-bit reads are a problem on
> > > some new system, we'll find out very quickly. For now, things seem to
> > > work...
> >
> > It's a problem on ia64.
>
> But the fact that we're not crashing here means that it's properly aligned,
> right?
ia64 allows different behaviors on unaligned accesses (and so does ARM, for that matter). So it pretty much depends how the kernel is configured. In most cases, we're filling kernel logs with unaligned access warnings on ia64, and on ARM, there are risks of random failures when e.g. using Equals, if e.g. one of the two values being compared is aligned and the other is not (ARM likes to invert msb and lsb on unaligned accesses, or to pick next word, i don't remember exactly).
FWIW, when I tried building with 64-bit aligned IIDs, the resulting binary only grew by 64 bytes, which suggests only 16 IIDs were originally unaligned. Random luck.
Assignee | ||
Comment 20•13 years ago
|
||
So do you think we should back out bug 164580?
Comment 21•13 years ago
|
||
I think bug 659546 shouldn't have been ignored (which its fix really is about), and IIDs be aligned.
I'll double check comment 19 about ARM, see if it may get us actual problems already. I do know it is a problem on ia64, and i think it should be a problem on sparc too.
Comment 22•13 years ago
|
||
So, it's not a problem on ARM. Two-word operations don't actually require more than 32-bits alignment. It however is a problem on ia64, sparc and sparc64.
Comment 23•13 years ago
|
||
It looks like the patch confuses gcc 4.4 on x86 :(
(Getting a crash in xptiInterfaceInfoManager::VerifyAndAddEntryIfNew)
Assignee | ||
Comment 24•13 years ago
|
||
I'm not sure we actually want nsID to be 64-bit aligned on 32-bit systems, since that means that any structure which contains an nsID must itself be 64-bit aligned. The extra alignment can only hurt us...
Comment 25•13 years ago
|
||
(In reply to Justin Lebar [:jlebar] from comment #24)
> I'm not sure we actually want nsID to be 64-bit aligned on 32-bit systems
sparc is 32-bit, and non-aligned 64-bit accesses hurt there. I guess we could have an AC_TRY_RUN test to choose if we need 64-bit alignment, though that wouldn't work when cross compiling... That wouldn't solve the compiler problem, though (which, btw, is not limited to gcc 4.4... 4.6 is affected, too). Interestingly, there is no problem with the patch on FF7, only with FF8 onwards.
Assignee | ||
Comment 26•13 years ago
|
||
I don't mind if we fall back to something slow on non-tier-1 platforms. We can do a byte-by-byte comparison, memcmp, or return the first bit of /dev/random. :)
What matters is figuring out something for x86, x64, and ARM.
If we care about this being performant -- and honestly, I'm not so sure we do -- we could use SSE2 on x64. I think it's four instructions. We could just use 32-bit loads on x86/ARM.
Or we could use 32-bit loads everywhere...
Comment 27•13 years ago
|
||
Nowadays, I'd expect gcc to do the right thing when inlining memcmp, though I seem to remember there were pathetical cases where the inlined version was slower than actually calling the optimized glibc implementation.
Assignee | ||
Comment 28•13 years ago
|
||
gcc tries to be fast, but it appears they never benchmarked it. It's very slow, and I don't think they've fixed it yet.
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43052
Comment 29•13 years ago
|
||
Er, I was going to post the same link :)
Comment 30•13 years ago
|
||
Though, 16-bytes comparisons may be treated well.
Comment 31•13 years ago
|
||
Turns out, gcc doesn't even inline memcmp anymore in versions >= 4.5 (at least those in debian), even for sizes as small as 4 bytes, but 4.4 did.
You need to log in
before you can comment on or make changes to this bug.
Description
•