Closed
Bug 1013056
Opened 11 years ago
Closed 10 years ago
Crash at a weird memory address involving RegExp and filterPar
Categories
(Core :: JavaScript Engine, defect)
Tracking
()
VERIFIED
FIXED
mozilla33
People
(Reporter: gkw, Assigned: mjrosenb)
References
Details
(4 keywords, Whiteboard: [jsbugmon:][adv-main31+])
Attachments
(3 files, 4 obsolete files)
1.25 KB,
text/plain
|
Details | |
2.68 KB,
patch
|
dougc
:
review+
abillings
:
approval-mozilla-aurora+
abillings
:
approval-mozilla-beta+
abillings
:
sec-approval+
|
Details | Diff | Splinter Review |
1.49 KB,
patch
|
Details | Diff | Splinter Review |
b = RegExp("(?=())")
y = new Object;
a = b.exec('')
a[3] = undefined
Array.prototype.unshift.call(a, undefined)
try {
a.filterPar(function() {})
} catch (e) {}
crashes js opt shell on m-c changeset 2893f60d5903 with --ion-parallel-compile=off --ion-eager at a weird memory address intermittently.
Setting s-s and assuming sec-critical prior to further analysis, as crashing at a weird address seems bad.
My configure flags are:
CC="gcc-4.7 -mfloat-abi=softfp -B/usr/lib/gcc/arm-linux-gnueabi/4.7" CXX="g++-4.7 -mfloat-abi=softfp -B/usr/lib/gcc/arm-linux-gnueabi/4.7" AR=ar sh /home/fuzz5lin/trees/mozilla-central/js/src/configure --target=arm-linux-gnueabi --enable-optimize --disable-debug --enable-profiling --enable-gczeal --enable-debug-symbols --disable-tests --enable-more-deterministic --with-ccache --enable-threadsafe <other NSPR options>
Run this anytime from 1-100 times, or (set disable-randomization if necessary) within gdb, repeatedly run it till it crashes.
I will try to see if I can get a bisection, but no promises yet.
Setting needinfo? from Marty to see what's going on here.
Flags: needinfo?(mrosenberg)
Reporter | ||
Comment 1•11 years ago
|
||
autoBisect shows this is probably related to the following changeset:
The first bad revision is:
changeset: http://hg.mozilla.org/mozilla-central/rev/7a1c696cade6
user: Shu-yu Guo
date: Mon Apr 07 13:02:20 2014 -0700
summary: Bug 974201 - Remove filterPar chunking. (r=nmatsakis)
Shu-yu, is bug 974201 likely related in any way?
Blocks: 974201
Flags: needinfo?(shu)
Summary: Crash at a weird memory address involving RegExp → Crash at a weird memory address involving RegExp and filterPar
Reporter | ||
Updated•11 years ago
|
status-firefox30:
--- → unaffected
status-firefox31:
--- → affected
status-firefox32:
--- → affected
Comment 2•11 years ago
|
||
I don't have any ARM hardware to try to reproduce this. Did this start after irregexp landed?
Comment 3•11 years ago
|
||
(In reply to Shu-yu Guo [:shu] from comment #2)
> I don't have any ARM hardware to try to reproduce this. Did this start after
> irregexp landed?
That is, can you reproduce with YARR?
Comment 4•11 years ago
|
||
I am unable to reproduce this with the ARM simulator after running the test in a loop for about half an hour.
Comment 5•11 years ago
|
||
(In reply to Shu-yu Guo [:shu] from comment #2)
> I don't have any ARM hardware to try to reproduce this. Did this start after
> irregexp landed?
The bisection says bug 974201 is where it starts happening, and that was before irregexp landed.
Comment 6•10 years ago
|
||
Clearing NI because I have no idea how to reproduce this. Gary, please re-NI me if you come up with a machine I can ssh into or something.
Flags: needinfo?(shu)
Reporter | ||
Comment 7•10 years ago
|
||
(Shu-yu and I are trying to sort this out asap)
Updated•10 years ago
|
Whiteboard: [jsbugmon:update]
Updated•10 years ago
|
Whiteboard: [jsbugmon:update] → [jsbugmon:]
Comment 8•10 years ago
|
||
JSBugMon: Cannot process bug: Unable to automatically reproduce, please track manually.
Updated•10 years ago
|
status-firefox33:
--- → affected
Reporter | ||
Comment 10•10 years ago
|
||
This still reproduces with m-c rev 4876594bacaa.
Attachment #8425230 -
Attachment is obsolete: true
Flags: needinfo?(gary)
Reporter | ||
Comment 11•10 years ago
|
||
(In reply to Shu-yu Guo [:shu] from comment #6)
> Clearing NI because I have no idea how to reproduce this. Gary, please re-NI
> me if you come up with a machine I can ssh into or something.
Shu-yu, I now have a machine with access and an updated rev - let's sync up on IRC sometime soon.
Flags: needinfo?(shu)
Reporter | ||
Comment 12•10 years ago
|
||
Shu-yu handed this off to Marty, who took a look at it on my machine. Any discoveries?
Flags: needinfo?(shu)
Assignee | ||
Comment 13•10 years ago
|
||
; so here isthe code that was likely executed right before the code that crashed:
0xb52feec4: beq 0xb52feedc
0xb52feec8: cmn r3, #123 ; 0x7b
0xb52feecc: beq 0xb52feedc
0xb52feed0: ldr r9, [r9, #4]
0xb52feed4: ldr r0, [r9]
0xb52feed8: bx r0
0xb52feedc: mov pc, lr
0xb52feee0: andeq r0, r0, r0
0xb52feee4: andeq r0, r0, r0
0xb52feee8: andeq r0, r0, r0
0xb52feeec: ldrlt r0, [r4, #-2936]! ; 0xb78
0xb52feef0: ldr lr, [r9]
0xb52feef4: movw r12, #59424 ; 0xe820
0xb52feef8: movt r12, #46387 ; 0xb533
0xb52feefc: cmp lr, r12
0xb52fef00: bne 0xb52fef44
0xb52fef04: cmn r10, #127 ; 0x7f
0xb52fef08: bne 0xb52fef44
0xb52fef0c: mov r11, r3
0xb52fef10: push {r9} ; (str r9, [sp, #-4]!)
0xb52fef14: ldr r9, [r9, #12]
0xb52fef18: ldr r12, [r9, #-12]
0xb52fef1c: cmp r12, r11
0xb52fef20: bls 0xb52fef40
0xb52fef24: add r12, r9, r11, lsl #3
0xb52fef28: ldr r11, [r12]
0xb52fef2c: ldr r0, [r12, #4]
0xb52fef30: cmn r0, #124 ; 0x7c
0xb52fef34: beq 0xb52fef40
0xb52fef38: pop {r9} ; (ldr r9, [sp], #4)
0xb52fef3c: b 0xb52ee224
0xb52fef40: pop {r9} ; (ldr r9, [sp], #4)
0xb52fef44: b 0xb52ee8a4
0xb52fef48: b 0xb52fef58
0xb52fef4c: ; <UNDEFINED> instruction: 0xffff0003
0xb52fef50: cdple 14, 10, cr11, cr13, cr15, {7}
0xb52fef54: cdple 14, 10, cr11, cr13, cr15, {7}
0xb52fef58: andeq r0, r0, r8
0xb52fef5c: andeq r0, r0, r0
0xb52fef60: andeq r0, r0, r0
0xb52fef64: andeq r0, r0, r0
0xb52fef68: andeq r0, r0, r0
0xb52fef6c: andeq r0, r0, r0
---Type <return> to continue, or q <return> to quit---
0xb52fef70: andeq r0, r0, r0
0xb52fef74: andeq r0, r0, r0
0xb52fef78: andeq r0, r0, r0
0xb52fef7c: andeq r0, r0, r0
0xb52fef80: andeq r0, r0, r0
0xb52fef84: andeq r0, r0, r0
0xb52fef88: andeq r0, r0, r0
0xb52fef8c: andeq r0, r0, r0
0xb52fef90: andeq r0, r0, r0
0xb52fef94: andeq r0, r0, r0
0xb52fef98: andeq r0, r0, r0
0xb52fef9c: andeq r0, r0, r0
0xb52fefa0: andeq r0, r0, r0
0xb52fefa4: andeq r0, r0, r0
0xb52fefa8: andeq r0, r0, r0
0xb52fefac: andeq r0, r0, r0
0xb52fefb0: andeq r0, r0, r0
0xb52fefb4: andeq r0, r0, r0
0xb52fefb8: andeq r0, r0, r0
0xb52fefbc: andeq r0, r0, r0
0xb52fefc0: andeq r0, r0, r0
0xb52fefc4: andeq r0, r0, r0
0xb52fefc8: andeq r0, r0, r0
0xb52fefcc: andeq r0, r0, r0
0xb52fefd0: andeq r0, r0, r0
0xb52fefd4: andeq r0, r0, r0
0xb52fefd8: andeq r0, r0, r0
0xb52fefdc: andeq r0, r0, r0
0xb52fefe0: andeq r0, r0, r0
0xb52fefe4: andeq r0, r0, r0
0xb52fefe8: andeq r0, r0, r0
0xb52fefec: andeq r0, r0, r0
0xb52feff0: andeq r0, r0, r0
0xb52feff4: andeq r0, r0, r0
0xb52feff8: andeq r0, r0, r0
0xb52feffc: andeq r0, r0, r0
the tldr is normal code; normal code; branch around pool; goose eggs.
My current guesses are 1) for whatever reason, we got here and the copying never finished.
or 2) the correct code was stomped over with all zeros, which are a pretty nice nop. I'll look into this more on monday.
Flags: needinfo?(mrosenberg)
Reporter | ||
Comment 14•10 years ago
|
||
(In reply to Marty Rosenberg [:mjrosenb] from comment #13)
> the tldr is normal code; normal code; branch around pool; goose eggs.
> My current guesses are 1) for whatever reason, we got here and the copying
> never finished.
> or 2) the correct code was stomped over with all zeros, which are a pretty
> nice nop. I'll look into this more on monday.
Thanks Marty! Assigning to you...
Assignee: nobody → mrosenberg
Status: NEW → ASSIGNED
Assignee | ||
Updated•10 years ago
|
Flags: needinfo?(mrosenberg)
Assignee | ||
Comment 16•10 years ago
|
||
interesting that adding a cc cleared needinfo.
After lots of debugging, I now suspect that the issue is a lack of cache flushing.
Changing the code to always flush the caches immediately seems to make the issue go away (currently in the middle of running it 1,000 times to verify). gary, can you possibly test this with a simulator build, and ARM_SIM_ICACHE_CHECKS=1 (it is an environment variable)?
At this point, I can just munge the auto flush cache code until this issue goes away, or I can attempt to actually figure out what is going wrong beyond "caches aren't flushed." I'm leaning towards the latter, but the former is certainly doable, if we want this to go away asap.
Flags: needinfo?(gary)
Reporter | ||
Comment 17•10 years ago
|
||
(In reply to Marty Rosenberg [:mjrosenb] from comment #16)
> gary, can you possibly test this with a simulator build, and
> ARM_SIM_ICACHE_CHECKS=1 (it is an environment variable)?
No, this does not reproduce with a simulator build with or without that environment variable. (I ran about 500 times) Tested on m-c rev da1dbcff9493.
Flags: needinfo?(gary)
Assignee | ||
Comment 18•10 years ago
|
||
dougc for the refactoring of the AutoFlushICache. I didn't see any way to make the current flusher flush its contents. I suspect this may not actually be needed, but it certainly seems simpler than adding in a new AFIC for ever call between the IC top level, and here where code can be modified.
shu for the actual location of forceFlush. I perused the code, and I *think* that is the only place where it is needed. I ran the test 100 times with this in place, and didn't see any crashes.
Attachment #8446176 -
Flags: review?(shu)
Attachment #8446176 -
Flags: review?(dtc-moz)
Comment 19•10 years ago
|
||
Comment on attachment 8446176 [details] [diff] [review]
fixCacheFlusher-r0.patch
Review of attachment 8446176 [details] [diff] [review]:
-----------------------------------------------------------------
Great work, Marty.
::: js/src/jit/IonCaches.cpp
@@ +334,5 @@
> // have not yet executed the code we're patching the jump in.
> nextStubOffset_.fixup(&masm);
> CodeLocationJump nextStubJump(code, nextStubOffset_);
> PatchJump(nextStubJump, CodeLocationLabel(cache_.firstStub_));
> + AutoFlushICache::ForceFlush();
Nit: keep the newline after the ForceFlush
::: js/src/jit/IonCode.h
@@ +787,5 @@
>
> public:
> static void setRange(uintptr_t p, size_t len);
> static void flush(uintptr_t p, size_t len);
> + static void ForceFlush();
Does SM capitalize static methods?
Attachment #8446176 -
Flags: review?(shu) → review+
Comment 20•10 years ago
|
||
Comment on attachment 8446176 [details] [diff] [review]
fixCacheFlusher-r0.patch
Review of attachment 8446176 [details] [diff] [review]:
-----------------------------------------------------------------
There should be no forcing of the instruction cache flushing because the region pending flushing should never be executed. Flushing that occurs outside the pending region occurs immediately. The PatchJump should be flushing the cache immediately, unless it is in a pending region in which case it should not be executed until after the AutoFlushICache destructor is called. The patch might be a good clue, but the problem needs to be analysed first and there might be some other issue causing problems here.
Updated•10 years ago
|
Attachment #8446176 -
Flags: review?(dtc-moz) → review-
Comment 21•10 years ago
|
||
Comment on attachment 8446176 [details] [diff] [review]
fixCacheFlusher-r0.patch
Review of attachment 8446176 [details] [diff] [review]:
-----------------------------------------------------------------
In IonCache::attachStub, |attacher.patchStubCodePointer(masm, code);| should be called before |attacher.patchNextStubJump(masm, code);| too, since patchNextStubJump is what makes the new stub reachable.
Comment 22•10 years ago
|
||
Here's part of the problem. I would prefer to fix it this way, to keep the code as simple as possible. Strange that the ARM simulator did not pick this up, maybe because it's a race.
Shu: could you combine this with the changes you advised in comment 21?
Flags: needinfo?(shu)
Reporter | ||
Comment 24•10 years ago
|
||
(In reply to Douglas Crosher [:dougc] from comment #23)
> Sorry, I created this problem in Bug 988789.
Updating flags - bug 988789 landed in 30, 31 and 32, but too late for 30. Thanks for fixing this!
No longer blocks: 974201
Updated•10 years ago
|
status-b2g-v1.3:
--- → unaffected
status-b2g-v1.3T:
--- → unaffected
status-b2g-v1.4:
--- → affected
status-b2g-v2.0:
--- → affected
status-b2g-v2.1:
--- → affected
status-firefox-esr24:
--- → unaffected
Comment 25•10 years ago
|
||
Add the changes advised in comment 21, and add some comments.
Attachment #8446272 -
Attachment is obsolete: true
Flags: needinfo?(shu)
Comment 26•10 years ago
|
||
Are you able to verify that the patch in comment 25 corrects this issue?
Flags: needinfo?(gary)
Reporter | ||
Comment 27•10 years ago
|
||
I'm behind on stuff, but nonetheless this is on my list.
Reporter | ||
Comment 28•10 years ago
|
||
(In reply to Douglas Crosher [:dougc] from comment #26)
> Are you able to verify that the patch in comment 25 corrects this issue?
Yes, without the patch, it crashes within 40-50 runs most of the time. However, with the patch, it did not crash after 400-odd runs.
Flags: needinfo?(gary) → needinfo?(dtc-moz)
Updated•10 years ago
|
Attachment #8446280 -
Flags: review?(shu)
Flags: needinfo?(dtc-moz)
Updated•10 years ago
|
Attachment #8446176 -
Attachment is obsolete: true
Comment 29•10 years ago
|
||
The same patch also applies to Aurora and Beta. Try builds:
m-c: https://tbpl.mozilla.org/?tree=Try&rev=478254293a5b
Aurora: https://tbpl.mozilla.org/?tree=Try&rev=cc6035bbbd70
Beta: https://tbpl.mozilla.org/?tree=Try&rev=3fe8d322ab28
Comment 30•10 years ago
|
||
Comment on attachment 8446280 [details] [diff] [review]
Exit an AutoFlushICache dynamic context to ensure the cache has been flushed before the code is executed.
Review of attachment 8446280 [details] [diff] [review]:
-----------------------------------------------------------------
Clean fix. Thanks!
::: js/src/jit/IonCaches.cpp
@@ +393,5 @@
> // as it can be kept alive even if the cache is flushed (see
> // MarkJitExitFrame).
> attacher.patchStubCodePointer(masm, code);
> +
> + // Update the failure path.
Could you add a comment for posterity that this patch is what makes the stub accessible for parallel ICs, and should not be moved unless the author really knows what's going on?
Attachment #8446280 -
Flags: review?(shu) → review+
Comment 31•10 years ago
|
||
Address reviewer feedback. Carrying forward r+.
Attachment #8446280 -
Attachment is obsolete: true
Attachment #8452036 -
Flags: review+
Comment 32•10 years ago
|
||
Comment on attachment 8452036 [details] [diff] [review]
Exit an AutoFlushICache dynamic context to ensure the cache has been flushed before the code is executed.
[Security approval request comment]
How easily could an exploit be constructed based on the patch? Difficult, it's a thread race issue.
Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem? For someone looking, yes.
Which older supported branches are affected by this flaw? Release 30, Beta 21, Aurora 32.
If not all supported branches, which bug introduced the flaw? bug 988789.
Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be? The same patch applies to beta 31 and aurora 32.
How likely is this patch to cause regressions; how much testing does it need? It was hard to reproduce, and has been tested to resolve the issue, see comment 28.
Attachment #8452036 -
Flags: sec-approval?
Comment 33•10 years ago
|
||
Comment on attachment 8452036 [details] [diff] [review]
Exit an AutoFlushICache dynamic context to ensure the cache has been flushed before the code is executed.
Approval Request Comment
[Feature/regressing bug #]: bug 988789.
[User impact if declined]: Intermittent crashes in some JS code.
[Describe test coverage new/current, TBPL]: Tested locally, tbpl try builds.
[Risks and why]: Low. It was an obvious programming mistake. The problem has been reproduced, and the patch tested to resolve the issue.
[String/UUID change made/needed]: n/a
Attachment #8452036 -
Flags: approval-mozilla-beta?
Attachment #8452036 -
Flags: approval-mozilla-aurora?
Comment 34•10 years ago
|
||
Comment on attachment 8452036 [details] [diff] [review]
Exit an AutoFlushICache dynamic context to ensure the cache has been flushed before the code is executed.
sec-approval+ for trunk. I approved Aurora and Branch for after trunk is green.
Does this affect ESR24?
Attachment #8452036 -
Flags: sec-approval?
Attachment #8452036 -
Flags: sec-approval+
Attachment #8452036 -
Flags: approval-mozilla-beta?
Attachment #8452036 -
Flags: approval-mozilla-beta+
Attachment #8452036 -
Flags: approval-mozilla-aurora?
Attachment #8452036 -
Flags: approval-mozilla-aurora+
Reporter | ||
Comment 35•10 years ago
|
||
Landed after obfuscating commit message:
https://hg.mozilla.org/integration/mozilla-inbound/rev/5c805d803e11
(and no, looking at bug 988789, ESR24 should not be affected.)
Target Milestone: --- → mozilla33
Comment 36•10 years ago
|
||
Looking at this further, part of the fix addresses an issue introduced in bug 849469 which landed on 22. This patch includes only the part of the patch that needs backporting to 29 to 22 inclusive. This patch is relative to m-c but should be easy to backport if there is any interest.
Reporter | ||
Comment 37•10 years ago
|
||
Changing ESR24 to be "affected" then, based on comment 36.
Comment 38•10 years ago
|
||
Might as well request esr24 approval on the patch then.
Comment 39•10 years ago
|
||
Comment on attachment 8452807 [details] [diff] [review]
Sub-patch for backporting to ESR 24.
[Approval Request Comment]
If this is not a sec:{high,crit} bug, please state case for ESR consideration: It is thread race with a simple fix.
User impact if declined: Low. Crashes in some rare JS code. No attempt has been made to reproduce a crash for this issue alone and it might be hard to exploit.
Fix Landed on Version: 33
Risk to taking this patch (and alternatives if risky): Low because it's a small obvious fix.
String or UUID changes made by this patch: n/a
shu: could you double check this?
Attachment #8452807 -
Flags: feedback?(shu)
Attachment #8452807 -
Flags: approval-mozilla-esr24?
Comment 40•10 years ago
|
||
Comment on attachment 8452807 [details] [diff] [review]
Sub-patch for backporting to ESR 24.
Review of attachment 8452807 [details] [diff] [review]:
-----------------------------------------------------------------
You don't need to backport this. PJS isn't turned on anywhere except on Nightly, for precisely this reason of not wanting to have to backport.
Attachment #8452807 -
Flags: feedback?(shu)
Attachment #8452807 -
Flags: approval-mozilla-esr24?
Comment 41•10 years ago
|
||
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Comment 42•10 years ago
|
||
Updated•10 years ago
|
Updated•10 years ago
|
Whiteboard: [jsbugmon:] → [jsbugmon:][adv-main31+]
Updated•10 years ago
|
Group: javascript-core-security
Comment 43•10 years ago
|
||
Gary, can we assume that this was verified on Fx31 based on comment 28?
Flags: needinfo?(gary)
Reporter | ||
Comment 44•10 years ago
|
||
(In reply to Matt Wobensmith from comment #43)
> Gary, can we assume that this was verified on Fx31 based on comment 28?
I just retested that the intermittent crash does not occur on mozilla-beta tip rev 6befadcaa685 but occurs on mozilla-beta regressor rev 7a1c696cade6, so yes, verified.
Status: RESOLVED → VERIFIED
Flags: needinfo?(gary)
Updated•9 years ago
|
Group: core-security → core-security-release
Updated•8 years ago
|
Group: core-security-release
You need to log in
before you can comment on or make changes to this bug.
Description
•