In bug 145975, I encountered a loop that ran 25% slower when I made code changes elsewhere in my file. Disassembling the code showed no difference in the function apart from its placement in the object file (and thus its alignment). (There's a comment in that bug about a loop that I thought was suffering from alignment problems but wasn't. The loop I refer to here is a different one.) We currently compile with -Os on Linux, which disables -falign-functions, among other things. Perhaps this isn't the right thing to do.
I just verified that my problem is fixed by using -Os -freorder-blocks -falign-functions -falign-jumps -falign-loops -freorder-blocks-and-partition (i.e., -Os plus the flags it disables from -O2, except -fprefetch-loop-arrays, which is incompatible with -Os, and -ftree-vect-loop-version). In fact, parts of my benchmark run 15% faster with these settings than in the previous best case. Unless we're sure we're going to upgrade to GCC 4.5 (plus, perhaps, PGO) for FF4, it might be worth experimenting to find an optimal set of flags.
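One way to run that experiment would be to enumerate every combination of the flags listed above on top of -Os and benchmark each build. The following is only a rough sketch: the flag list comes from this comment, but the compiler name `gcc`, the source file `bench.c`, and the helper names `flag_sets` and `build_commands` are hypothetical placeholders, not part of our build system.

```python
from itertools import combinations

# Flags taken from the comment above; everything else is illustrative.
CANDIDATE_FLAGS = [
    "-freorder-blocks",
    "-falign-functions",
    "-falign-jumps",
    "-falign-loops",
    "-freorder-blocks-and-partition",
]


def flag_sets(base="-Os", candidates=CANDIDATE_FLAGS):
    """Yield every subset of the candidate flags, prefixed with the base flag."""
    for size in range(len(candidates) + 1):
        for subset in combinations(candidates, size):
            yield [base, *subset]


def build_commands(source="bench.c", compiler="gcc"):
    """Return one compile command per flag set.

    `source` and `compiler` are hypothetical placeholders; the output
    name just numbers the builds so they can be benchmarked separately.
    """
    commands = []
    for index, flags in enumerate(flag_sets()):
        commands.append([compiler, *flags, source, "-o", f"bench_{index}"])
    return commands


if __name__ == "__main__":
    # Print the commands rather than running them.
    for command in build_commands():
        print(" ".join(command))
```

With five candidate flags this gives 32 builds, which is small enough to benchmark exhaustively. Each build would still need several timed runs, since the slowdown described here depends on where the function happens to land.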
blocking2.0: --- → ?
GCC 4.5 probably won't happen for FF4. Even with 4.5 we need to switch flags, so this is independent of the GCC version. I suspect this is a DUP of bug 590181.
Status: NEW → RESOLVED
Last Resolved: 8 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 590181