Closed Bug 660335 Opened 10 years ago Closed 10 years ago
ID to align to 64-bit boundary
Per the discussion in bug 659456, this might be necessary. Anyway, it shouldn't hurt. Patch in a moment.
Seems to work on Linux. Haven't tested on Windows.
Typo'ed the SunPro declaration.
Attachment #535744 - Attachment is obsolete: true
Comment on attachment 535745 [details] [diff] [review] Patch v2 I don't think we should try to do the #pragma thing with sunpro: pragmas in defines are not portable in general and very icky. Just leave them to the default case.
This really only needs to align to 32 bits on 32-bit platforms (except weird Sparc platforms which have 32-bit address spaces but also a 64-bit load instruction). I doubt it makes a difference either way, though. This seems to work on Windows (that is, the compile doesn't error out in the first few minutes).
Attachment #535745 - Attachment is obsolete: true
Is it also a good idea to add a PR_STATIC_ASSERT(sizeof(nsID)==64)?
That does sound useful, but not essential.
Assignee: nobody → justin.lebar+bug
This landed then bounced due to red on 32-bit systems. Apparently forcing 64-bit alignment does something there that I didn't anticipate. http://hg.mozilla.org/mozilla-central/rev/a968740930d3 http://hg.mozilla.org/mozilla-central/rev/e0cf03a641cf http://hg.mozilla.org/mozilla-central/rev/6a90d831b11c
I wish we could just use memcmp here and be done with it. gcc 4.5 on Linux uses an x86 string operation for the memcmp here, which is sensible (I dunno if it's faster). But Mac gcc calls memcmp, which isn't helpful. Things apparently break very hard when the align does something, so I'm not sure it's worth putting in, even only on 64-bit systems. I think we should either leave it be, or explicitly use 64-bit reads on 64-bit systems and 32-bit reads on 32-bit systems.
Is there any case where the size of the data structure is correct but the alignment is not?
The string operation appears to be Very Slow. AFAICT it's slower than glibc's memcmp for both small and large inputs. > Is there any case where the size of the data structure is correct but the alignment is not? What do you mean?
(In reply to comment #11) > What do you mean? I mean, if we had the static assert, under what situations would a compiler pass that and still get the wrong alignment for the data structure?
I haven't investigated deeply, but I suspect that what's going wrong is that the compiler inserts unexpected gaps into other data structures to force nsID to be aligned.
(In reply to comment #13) > I haven't investigated deeply, but I suspect that what's going wrong is that > the compiler inserts unexpected gaps into other data structures to force nsID > to be aligned. I think so too. My question is: doesn't the static assert on the total size of the data structure implicitly guarantee that there are no extra padding bytes?
To be clear, I think the gaps are in *other* data structures -- that is, they're not in nsID. All instances of a class must have the same layout.
I don't think we're going to fix this. If the 64-bit reads are a problem on some new system, we'll find out very quickly. For now, things seem to work...
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → WONTFIX
(In reply to Justin Lebar [:jlebar] from comment #16) > I don't think we're going to fix this. If the 64-bit reads are a problem on > some new system, we'll find out very quickly. For now, things seem to > work... It's a problem on ia64.
(In reply to Mike Hommey [:glandium] from comment #17) > (In reply to Justin Lebar [:jlebar] from comment #16) > > I don't think we're going to fix this. If the 64-bit reads are a problem on > > some new system, we'll find out very quickly. For now, things seem to > > work... > > It's a problem on ia64. But the fact that we're not crashing here means that it's properly aligned, right?
(In reply to Justin Lebar [:jlebar] from comment #18) > (In reply to Mike Hommey [:glandium] from comment #17) > > (In reply to Justin Lebar [:jlebar] from comment #16) > > > I don't think we're going to fix this. If the 64-bit reads are a problem on > > > some new system, we'll find out very quickly. For now, things seem to > > > work... > > > > It's a problem on ia64. > > But the fact that we're not crashing here means that it's properly aligned, > right? ia64 allows different behaviors on unaligned accesses (and so does ARM, for that matter). So it pretty much depends how the kernel is configured. In most cases, we're filling kernel logs with unaligned access warnings on ia64, and on ARM, there are risks of random failures when e.g. using Equals, if e.g. one of the two values being compared is aligned and the other is not (ARM likes to invert msb and lsb on unaligned accesses, or to pick next word, i don't remember exactly). FWIW, when I tried building with 64-bit aligned IIDs, the resulting binary only grew by 64 bytes, which suggests only 16 IIDs were originally unaligned. Random luck.
So do you think we should back out bug 164580?
I think bug 659546 shouldn't have been ignored (which its fix really is about), and IIDs be aligned. I'll double check comment 19 about ARM, see if it may get us actual problems already. I do know it is a problem on ia64, and i think it should be a problem on sparc too.
So, it's not a problem on ARM. Two-word operations don't actually require more than 32-bits alignment. It however is a problem on ia64, sparc and sparc64.
It looks like the patch confuses gcc 4.4 on x86 :( (Getting a crash in xptiInterfaceInfoManager::VerifyAndAddEntryIfNew)
I'm not sure we actually want nsID to be 64-bit aligned on 32-bit systems, since that means that any structure which contains an nsID must itself be 64-bit aligned. The extra alignment can only hurt us...
(In reply to Justin Lebar [:jlebar] from comment #24) > I'm not sure we actually want nsID to be 64-bit aligned on 32-bit systems sparc is 32-bit, and non-aligned 64-bit accesses hurt there. I guess we could have an AC_TRY_RUN test to choose if we need 64-bit alignment, though that wouldn't work when cross compiling... That wouldn't solve the compiler problem, though (which, btw, is not limited to gcc 4.4... 4.6 is affected, too). Interestingly, there is no problem with the patch on FF7, only with FF8 onwards.
I don't mind if we fall back to something slow on non-tier-1 platforms. We can do a byte-by-byte comparison, memcmp, or return the first bit of /dev/random. :) What matters is figuring out something for x86, x64, and ARM. If we care about this being performant -- and honestly, I'm not so sure we do -- we could use SSE2 on x64. I think it's four instructions. We could just use 32-bit loads on x86/ARM. Or we could use 32-bit loads everywhere...
Nowadays, I'd expect gcc to do the right thing when inlining memcmp, though I seem to remember there were pathetical cases where the inlined version was slower than actually calling the optimized glibc implementation.
gcc tries to be fast, but it appears they never benchmarked it. It's very slow, and I don't think they've fixed it yet. http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43052
Er, I was going to post the same link :)
Though, 16-bytes comparisons may be treated well.
Turns out, gcc doesn't even inline memcmp anymore in versions >= 4.5 (at least those in debian), even for sizes as small as 4 bytes, but 4.4 did.
You need to log in before you can comment on or make changes to this bug.