Enable x86 alignment checking for JITted code to catch misaligned reads/writes

REOPENED
Assigned to

Status

()

Core
JavaScript Engine
--
minor
REOPENED
10 years ago
7 years ago

People

(Reporter: Waldo, Assigned: Waldo)

Tracking

Trunk
x86
All
Points:
---
Bug Flags:
blocking1.9.1 -

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment)

(Assignee)

Description

10 years ago
Bug 472791 was a large slowdown due to a misalignment problem which would have been caught if x86 alignment checking were enabled.  This checking can be enabled by setting bit 18 of eflags, assuming the operating system sets the alignment mask bit (bit 18) in cr0.  I'm not certain yet (but should be very soon), but I think Windows and OS X don't set this bit while Linux does; we should enable checking so that we learn of such problems when they occur, at least on Linux if not elsewhere.

It's conceivable that some CPUs, when both alignment-check bits are set, can perform pipeline optimizations by speculating that all accesses are aligned, but I have no idea whether anyone does it.  If this were true, tho, it could conceivably be a performance win to enable this.

See also bug 472791 comment 11.
(Assignee)

Comment 1

10 years ago
jst verifies that the AM bit in cr0 is 1 on AMD in Linux, which exactly matches the behavior I expected there and the values demonstrated on Linux on an Intel machine, so I think we can do this but should expect it to only work on Linux.  Still, that'd be much better than now where we only discover misalignment after laboriously investigating perf losses.
(Assignee)

Comment 2

10 years ago
Playing whack-a-mole with perf bugs that just happen to be due to misalignment isn't a good strategy; we need to learn about these when they happen, not when we happen to stumble across them because they make a noticeable performance difference.  This shouldn't be too much more work than figuring out the right inline assembly for a few different compilers and #ifdef-ing appropriately, but it would mean anyone testing on Linux would fail-fast if an alignment problem happens.
Flags: blocking1.9.1?

Comment 3

10 years ago
It might be good to fuzz-test with this change before we unleash it on nightly users.
(Assignee)

Comment 4

10 years ago
Created attachment 359866 [details] [diff] [review]
Patch

This compiles on OS X, but Linux is the place to really test it, since that's the only place where it has any effect.  The Windows change isn't necessary, but in the event Windows ever allows alignment checking this could be useful.

Need to fire up the Linux VM and run with this before doing anything else here, for sure...anyone else who wants to build and play along, feel free.
for(let y in ['', '', '', null, ]) for(let y in []);

crashes dbg js shell in TM with this patch (doesn't occur without this patch) at NumberToStringWithBase:

(gdb) bt
#0  0x080b49d3 in NumberToStringWithBase (cx=0x982cd90, d=3, base=10) at ../jsnum.cpp:833
#1  0x080b4a71 in js_NumberToString (cx=0x982cd90, d=3) at ../jsnum.cpp:845
#2  0x0810d476 in js_ValueToString (cx=0x982cd90, v=7) at ../jsstr.cpp:3065
#3  0x080b086f in CallEnumeratorNext (cx=0x982cd90, iterobj=0x9830220, flags=1, rval=0xbfb12e80) at ../jsiter.cpp:582
#4  0x080b0ba7 in js_CallIteratorNext (cx=0x982cd90, iterobj=0x9830220, rval=0xbfb12e80) at ../jsiter.cpp:604
#5  0x081357e6 in CallIteratorNext_tn (cx=0x982cd90, iterobj=0x9830220) at ../jstracer.cpp:9023
#6  0xb7b25fd8 in ?? ()
#7  0xbfb15548 in ?? ()
#8  0x081634c1 in js_MonitorLoopEdge (cx=0x982cd90, inlineCallCount=@0xbfb15cfc) at ../jstracer.cpp:4243
#9  0x081a8e62 in js_Interpret (cx=0x982cd90) at ../jsinterp.cpp:3094
#10 0x080acd50 in js_Execute (cx=0x982cd90, chain=0x9830000, script=0x98372a8, down=0x0, flags=0, result=0xbfb15ea4) at ../jsinterp.cpp:1564
#11 0x0805800b in JS_ExecuteScript (cx=0x982cd90, obj=0x9830000, script=0x98372a8, rval=0xbfb15ea4) at ../jsapi.cpp:5135
#12 0x0805125e in Process (cx=0x982cd90, obj=0x9830000, filename=0x0, forceTTY=0) at ../../shell/js.cpp:491
#13 0x08051a55 in ProcessArgs (cx=0x982cd90, obj=0x9830000, argv=0xbfb16018, argc=1) at ../../shell/js.cpp:791
#14 0x08051cfd in main (argc=1, argv=0xbfb16018, envp=0xbfb16020) at ../../shell/js.cpp:4556
(gdb)
for(let y in ['', '', '', null]) for(let y in []);

is a slightly-tidied up testcase. Also, the crash doesn't occur when TM is disabled, i.e. running ./js without -j doesn't crash.
(Assignee)

Comment 7

10 years ago
gcc is compiling that code to copy a double (size 8 bytes) to a 4-byte-aligned address that's not also 8-byte-aligned.  I wonder if we're missing a compiler flag somewhere; I wouldn't have expected it to do that by default.  The other possibility is that it's an ABI issue that I'm not aware of; I'm not actually sure how doubles are marshalled for function calls in C on x86.
(In reply to comment #6)
> for(let y in ['', '', '', null]) for(let y in []);
> 
> is a slightly-tidied up testcase. Also, the crash doesn't occur when TM is
> disabled, i.e. running ./js without -j doesn't crash.

Forgot to add, this result was from a 32-bit Ubuntu. jsfunfuzz kept on hitting this crash in debug js shell for every run.
(Assignee)

Comment 9

10 years ago
http://refspecs.freestandards.org/elf/abi386-4.pdf

The note and caution on 3-18 say:

NOTE The Intel386 architecture does not require doubleword alignment for double-precision values. Nevertheless, for data structure compatibility with other Intel 
architectures, compilers may provide a method to align double-precision 
values on doubleword boundaries.

CAUTION A compiler that provides the doubleword alignment mentioned above would have to maintain doubleword alignment for the stack. Moreover, the arguments in the preceding example would appear in different positions. Programs built with the doubleword alignment facility would not conform to the Intel386 ABI, and their function calling sequence would not be compatible with conforming Intel386 programs.

So excepting x86-64, where any stacked values are 8-byte-aligned, it looks like we can't do this with the patch provided here.  However, if we set and unset the bit for every builtin call, we certainly could still do this.  That might be worthwhile as a debug-only measure, as it'd be a more significant slowdown than just a single enable/disable per trace execution, but it probably couldn't be used by default.

Updated

10 years ago
Flags: blocking1.9.1? → blocking1.9.1-
Obsolete with the removal of tracejit.
Status: NEW → RESOLVED
Last Resolved: 7 years ago
Resolution: --- → WONTFIX
(Assignee)

Comment 11

7 years ago
Not obsolete due to tracejit removal, although I'm not sure if compilers generate properly-aligned code such that it'd be possible to flip this bit...
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
Summary: Enable x86 alignment checking for traced code to catch misaligned reads/writes → Enable x86 alignment checking for JITted code to catch misaligned reads/writes
You need to log in before you can comment on or make changes to this bug.