Presumably this happens when we have to go from int to double, and presumably the various elemInc changes to make it work on different types of objects since then fixed this in Fx4...
I consistently get what looks like a null deref in a slightly different location with this testcase, e.g.: bp-2252357e-3e0e-42e6-bf97-cbd242110509 But I'm using a 3.6.18 nightly--there might be some fixes in it that affect this crash (seems unlikely but I haven't checked what has landed yet).
dmandelin: please have someone on the JS team investigate this.
Yarr update is about done, so I can get back to this now.
I think I have the cause. We end up OOMing on (obj)++, specifically the ELEMINC op, and then we crash trying to finish it in the interpreter. Just before the ELEMINC, the stack looks like this: object: obj int: 0 The code the tracer generates for ELEMINC is: 1. guard that obj is a dense array 2. guard that index is in bounds 3. do the increment, storing result to stack 4. box the result -- may OOM_EXIT 5. store result to the array Here, we OOM at step 4, which means the stack is now: double: obj + 1 int: 0 But by design, when the tracer exits, it wants to restart the op in the interpreter, thus it assumes the stack contents are the same as they were at the start of the op. So it copies half of the bits of the double value to the stack as |obj|, and we go into that in the interpreter. In this test, we've heap-sprayed '0x0c0c0c0c' all over, and the fake |obj| pointer happens to point into that region, so we crash trying to call '0x0c0c0c0c' via the map. This affects eleminc and propinc ops. It doesn't affect 2.0 because we don't have to allocate memory to box a double, so we can't OOM on step 4. Options for a fix: A. Don't trace these ops. The seem to be uncommon, so it may have a negligible effect on perf. B. From dvander: don't trace these ops if the array/object contains a double. That's the only case where we can OOM. Even less likely to affect perf. C. Refactor the code to box before storing to the stack. Doesn't look too hard, maybe half a day's work or so. I think I'm going to start with B. If that's OK on benchmarks, should be good enough, and is a safer fix than C. Otherwise I'll do C.
dmandelin: thanks for the patch! Does this affect the 1.9.1 branch code as well, or is this new in 3.6? I think I confused this with another bug when I marked it "unaffected", I'm not sure I tested this there.
Comment on attachment 533128 [details] [diff] [review] Patch, strategy B What are those weird JSVAL things? (Just kidding)
(In reply to comment #8) > dmandelin: thanks for the patch! Does this affect the 1.9.1 branch code as > well, or is this new in 3.6? I think I confused this with another bug when I > marked it "unaffected", I'm not sure I tested this there. Yes, the bug is present in 1.9.1 as well. The same patch should apply cleanly there.
Comment on attachment 533128 [details] [diff] [review] Patch, strategy B Approved for 126.96.36.199 and 188.8.131.52, a=dveditz for release-drivers
Created attachment 537000 [details] new testcase I want to appeal the bounty for this bug. The reason given to me for the amount of the bounty ($1,000) is the low reliability of my exploit. The original testcase uses %cccc%cccc as shellcode, which will execute an "int 3" instruction. When attached to a debugger, the program breaks. Attached is a new testcase (I just modified the original PoC) that executes "calc.exe" in windows XP, if successful. Please test. I tested the reliability of the exploit proof-of-concept this way: I changed the shellcode in the PoC to launch calc.exe. Also, I made a batch script that rebooted an XP virtual machine (from inside the VM), made firefox open the PoC, and then sent some information about success/failure to the host machine. From my test, I got like 50% of success. On windows Vista+, the exploit will only crash firefox, because of DEP. Best regards, secenv.
On WinXP running a pre-fixed 3.6.x the testcases crash nearly immediately, sometimes in exploitable-looking ways: bp-5b123ed9-8e09-4432-91bf-17ccf2110602 Process monitor didn't show the memory use spiking before the crash but maybe it was just a single super-large request that failed (oom) rather than the overall process being OOM. Never got the "calc.exe" testcase to actually run calc.exe but did see evidence it was at least making it to the 0x0c0c NOP sled. After the fix I get stuck looping in the slow-script dialog and memory use does grow a lot. Had to kill the process.
Thanks Daniel Veditz for your help. I suppose I understand what's happening here: hardware DEP will prevent the execution of the exploit payload ("new testcase") on XP SP2+ (that's why it worked in my VM: no software or hardware DEP). So, a good way to test the reliability of this PoC is running it on windows 2000 / XP SP0 or SP1, or, as Daniel Veditz said, by looking if there is some evidence of "hitting" the NOP sled (attaching a debugger and looking at the eip register value). By controlling eip, the exploit controls the program flow. There are some techniques to bypass DEP (without ASLR) on these systems, but I'm just learning about them. Also, most advanced techniques bypass DEP+ASLR too, but those are more difficult to apply, or not applicable at all. Best regards, secenv.