Crash [@ js::jit::ICEntry::pc] or Crash [@ js::TypeMonitorResult] with Debugger

RESOLVED FIXED in Firefox 50

Status

()

P1
critical
RESOLVED FIXED
2 years ago
2 years ago

People

(Reporter: decoder, Assigned: ejpbruel)

Tracking

(Blocks: 2 bugs, {crash, regression, testcase})

Trunk
mozilla51
x86_64
Linux
crash, regression, testcase
Points:
---
Dependency tree / graph
Bug Flags:
in-testsuite -

Firefox Tracking Flags

(firefox48 wontfix, firefox49 wontfix, firefox-esr45 wontfix, firefox50 fixed, firefox51 fixed)

Details

(Whiteboard: [jsbugmon:update], crash signature)

Attachments

(2 attachments, 1 obsolete attachment)

(Reporter)

Description

2 years ago
The following testcase crashes on mozilla-central revision e5859dfe0bcb (build with --enable-posix-nspr-emulation --enable-valgrind --enable-gczeal --disable-tests --enable-debug --enable-optimize, run with --fuzzing-safe --thread-count=2 --disable-oom-functions --ion-offthread-compile=off --baseline-eager --ion-pgo=on --ion-eager):

if (typeof assertThrowsInstanceOf === 'undefined')
    var assertDeepEq = (function(){})();
gczeal(14,3);
var g = newGlobal();
g.eval("function f(n) { if (n) f(n - 1); debugger; }");
var dbg = new Debugger(g);
var hits = 0;
dbg.onDebuggerStatement = function (frame) {
    if (hits === 0) {
        for (; frame; frame = frame.older)
            frame.seen = true;
        for (; frame; frame = arguments in 0.4 << (this) == Int16Array)
            assertEq(frame.seen, true);
    }
};
g.f(20);



Backtrace:

 received signal SIGSEGV, Segmentation fault.
js::jit::ICEntry::pc (script=0x7ffff7ead300, this=<optimized out>) at js/src/jit/SharedIC.h:302
#0  js::jit::ICEntry::pc (script=0x7ffff7ead300, this=<optimized out>) at js/src/jit/SharedIC.h:302
#1  js::jit::DoTypeMonitorFallback (cx=0x7ffff6965000, payload=<optimized out>, stub=0x7ffff69fa058, value=..., res=...) at js/src/jit/SharedIC.cpp:3980
#2  0x00007ffff7feb666 in ?? ()
#3  0x0000000000000000 in ?? ()
rax	0x7ffff7ead300	140737352749824
rbx	0x7fffffff9830	140737488328752
rcx	0x7ffff69567c0	140737330374592
rdx	0x7ffff69567c0	140737330374592
rsi	0x7fffffff9838	140737488328760
rdi	0x7fffffff9830	140737488328752
rbp	0x7fffffff9920	140737488328992
rsp	0x7fffffff9800	140737488328704
r8	0x7fffffff9850	140737488328784
r9	0x7fffffff9f60	140737488330592
r10	0x7fffffff9970	140737488329072
r11	0xfff9000000000000	-1970324836974592
r12	0x7ffff6965000	140737330434048
r13	0x7fffffff9980	140737488329088
r14	0x7ffff69fa058	140737331044440
r15	0x2e	46
rip	0x7f80a9 <js::jit::DoTypeMonitorFallback(JSContext*, void*, js::jit::ICTypeMonitor_Fallback*, JS::HandleValue, JS::MutableHandleValue)+121>
=> 0x7f80a9 <js::jit::DoTypeMonitorFallback(JSContext*, void*, js::jit::ICTypeMonitor_Fallback*, JS::HandleValue, JS::MutableHandleValue)+121>:	mov    0x98(%rax),%edx
   0x7f80af <js::jit::DoTypeMonitorFallback(JSContext*, void*, js::jit::ICTypeMonitor_Fallback*, JS::HandleValue, JS::MutableHandleValue)+127>:	and    $0xfffffff,%r15d


This issue was really hard to reduce as a test and the original was highly intermittent (~1 out of 1000 runs). The resulting test here appears to be stable though.

Updated

2 years ago
Whiteboard: [jsbugmon:update,bisect] → [jsbugmon:update]

Comment 1

2 years ago
JSBugMon: Bisection requested, result:
autoBisect shows this is probably related to the following changeset:

The first bad revision is:
changeset:   https://hg.mozilla.org/mozilla-central/rev/49d59741e736
user:        Eddy Bruel
date:        Thu Jul 07 11:36:16 2016 +0200
summary:     Bug 1271650 - Implement a DebuggerFrame class;r=fitzgen

This iteration took 225.917 seconds to run.
Eddy, is bug 1271650 a likely regressor?
Blocks: 1271650
Flags: needinfo?(ejpbruel)
status-firefox48: --- → unaffected
status-firefox49: --- → unaffected
status-firefox51: --- → affected
(Assignee)

Comment 3

2 years ago
Hey Gary. It's not impossible that this patch is the culprit; since the patch doesn't do that much, there aren't that many ways in which it could have broken things. It's likely some stupid oversight on my end that causes GC issues (hence the highly intermittent nature reported by decoder).

Putting this on top of my todo list for tomorrow.
Assignee: nobody → ejpbruel
Flags: needinfo?(ejpbruel)
Priority: -- → P1
(Assignee)

Comment 4

2 years ago
Created attachment 8777959 [details] [diff] [review]
Fix regression caused by bug 1271650.

I can reliably produce the crash on OSX. With this patch applied, the crash seems to go away. Decoder, can you confirm?

I assume we'll need to add the failing fuzz test as a regression test? What's the preferred location for regression tests, and how can I make sure they are run with the appropriate flags?
Attachment #8777959 - Flags: feedback?(choller)
(Reporter)

Comment 5

2 years ago
(In reply to Eddy Bruel [:ejpbruel] from comment #4)
> Created attachment 8777959 [details] [diff] [review]
> Fix regression caused by bug 1271650.
> 
> I can reliably produce the crash on OSX. With this patch applied, the crash
> seems to go away. Decoder, can you confirm?
> 

This doesn't look like the right patch. That said, testing patches is very expensive for us, we don't do manual builds anymore. Instead, we solely rely on our own automated builds. If you really need confirmation, feel free to feedback? again with the right patch, otherwise I'd suggest to just get review and land it right away. If the patch doesn't entirely fix the problem, it is very likely that we will detect it in fuzzing again.
Flags: needinfo?(ejpbruel)
(Assignee)

Comment 6

2 years ago
(In reply to Christian Holler (:decoder) from comment #5)
> (In reply to Eddy Bruel [:ejpbruel] from comment #4)
> > Created attachment 8777959 [details] [diff] [review]
> > Fix regression caused by bug 1271650.
> > 
> > I can reliably produce the crash on OSX. With this patch applied, the crash
> > seems to go away. Decoder, can you confirm?
> > 
> 
> This doesn't look like the right patch. That said, testing patches is very
> expensive for us, we don't do manual builds anymore. Instead, we solely rely
> on our own automated builds. If you really need confirmation, feel free to
> feedback? again with the right patch, otherwise I'd suggest to just get
> review and land it right away. If the patch doesn't entirely fix the
> problem, it is very likely that we will detect it in fuzzing again.

That's fair enough. Do I still need to add the failing JIT test as a regression test to the patch? If so, do you know what directory we typically put these regression tests, and how I can make sure that this test gets executed with the flags you provided?
Flags: needinfo?(ejpbruel) → needinfo?(choller)
(Reporter)

Comment 7

2 years ago
(In reply to Eddy Bruel [:ejpbruel] from comment #6)
> 
> That's fair enough. Do I still need to add the failing JIT test as a
> regression test to the patch? If so, do you know what directory we typically
> put these regression tests, and how I can make sure that this test gets
> executed with the flags you provided?

I would just add it to js/src/jit-test/tests/debug/. For the flags, I'd maybe add 

> setJitCompilerOption("offthread-compilation.enable", 0);

To ensure off-thread compilation is disabled and ignore the test. ion/baseline eager is set my jit-tests when they run as far as I know, and the other flags look like they might not be necessary. You might want to confirm this with your JS reviewer :))
Flags: needinfo?(choller)
(Reporter)

Updated

2 years ago
Attachment #8777959 - Flags: feedback?(choller)
(Assignee)

Comment 8

2 years ago
Created attachment 8780505 [details] [diff] [review]
Fix a regression caused by bug 1271650

This regression was caused by bug 1271650: when I created the DebuggerFrame::create function, I accidentally added a TenuredObject argument to the call to NewObjectWithGivenProto.

Jim, can you confirm decoder's statement in comment 7? That is, do we need any additional flags for this test to work correctly?
Attachment #8777959 - Attachment is obsolete: true
Attachment #8780505 - Flags: review?(jimb)

Comment 9

2 years ago
I don't know what flags we need. It seems that the original required all of --fuzzing-safe --thread-count=2 --disable-oom-functions --ion-offthread-compile=off --baseline-eager --ion-pgo=on --ion-eager, and the single call suggested in comment 7 certainly doesn't do all that.

I think if you can revert the change to Debugger.cpp, see the test reliably fail, and then re-apply the change to Debugger.cpp and see it pass, that's good enough here.

Updated

2 years ago
Attachment #8780505 - Flags: review?(jimb) → review+

Comment 11

2 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/b274d6e85690
Status: NEW → RESOLVED
Last Resolved: 2 years ago
status-firefox51: affected → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla51
Can/should we uplift this to 50 (seems it's also affected)?
Flags: needinfo?(ejpbruel)
(Assignee)

Comment 13

2 years ago
(In reply to Andrew Overholt [:overholt] from comment #12)
> Can/should we uplift this to 50 (seems it's also affected)?

It's a trivial fix, so if that's the case, we should uplift it.
Flags: needinfo?(ejpbruel)
Can you please request approval as you're in a much better position to fill out the form? Thanks, Eddy.
Flags: needinfo?(ejpbruel)
(Assignee)

Comment 15

2 years ago
(In reply to Andrew Overholt [:overholt] from comment #14)
> Can you please request approval as you're in a much better position to fill
> out the form? Thanks, Eddy.

Of course. Sorry, I intended to do so right after my previous reply, but got distracted.
Flags: needinfo?(ejpbruel)
(Assignee)

Comment 16

2 years ago
Created attachment 8781943 [details] [diff] [review]
Uplift request

Approval Request Comment
[Feature/regressing bug #]:
Bug 1271650

[User impact if declined]:
Random crashes (the nature of the crash is GC sensitive).

[Describe test coverage new/current, TreeHerder]:
No test coverage.

[Risks and why]:
The fix reverts an accidental change that was made in bug 1291685. Since the previous code was there before, there is very little risk to taking this patch.

[String/UUID change made/needed]:
N/A
Attachment #8781943 - Flags: approval-mozilla-beta?
Attachment #8781943 - Flags: approval-mozilla-aurora?
Comment on attachment 8781943 [details] [diff] [review]
Uplift request

Crash fix, Aurora50+
Attachment #8781943 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Bug 1271650 didn't land on 49, so why was Beta approval requested on this?
Flags: needinfo?(ejpbruel)
Flags: in-testsuite+
Also, the test didn't land even though it was in the original patch attached?
Flags: in-testsuite+ → in-testsuite?

Comment 20

2 years ago
bugherderuplift
https://hg.mozilla.org/releases/mozilla-aurora/rev/3fa9b262d4e5
status-firefox50: affected → fixed
(Assignee)

Comment 21

2 years ago
(In reply to Ryan VanderMeulen [:RyanVM] from comment #18)
> Bug 1271650 didn't land on 49, so why was Beta approval requested on this?

My mistake!
Flags: needinfo?(ejpbruel)
(Assignee)

Comment 22

2 years ago
(In reply to Ryan VanderMeulen [:RyanVM] from comment #19)
> Also, the test didn't land even though it was in the original patch attached?

This was intentional. See comment 9.
Attachment #8781943 - Flags: approval-mozilla-beta?
Flags: in-testsuite? → in-testsuite-
Crash volume for signature 'js::TypeMonitorResult':
 - nightly (version 51): 2 crashes from 2016-08-01.
 - aurora  (version 50): 0 crashes from 2016-08-01.
 - beta    (version 49): 13 crashes from 2016-08-02.
 - release (version 48): 23 crashes from 2016-07-25.
 - esr     (version 45): 22 crashes from 2016-05-02.

Crash volume on the last weeks (Week N is from 08-22 to 08-28):
            W. N-1  W. N-2  W. N-3
 - nightly       1       1       0
 - aurora        0       0       0
 - beta          3       2       1
 - release       5       5       2
 - esr           1       1       2

Affected platforms: Windows, Mac OS X, Linux

Crash rank on the last 7 days:
           Browser     Content   Plugin
 - nightly
 - aurora
 - beta    #1747
 - release #1526
 - esr     #913
status-firefox48: unaffected → affected
status-firefox49: unaffected → affected
status-firefox-esr45: --- → affected
status-firefox48: affected → wontfix
status-firefox49: affected → wontfix
status-firefox-esr45: affected → wontfix
This is still reproducible in relatively low volume on Fx50, based on crash data from the last ~2 months.

  SIGNATURE   | js::jit::ICEntry::pc
  ----------------------------------------
  CRASH STATS | http://tinyurl.com/hoaldmg
  ----------------------------------------
  OVERVIEW    | 0 crashes on nightly 52
	      | 0 crashes on nightly 51
	      | 0 crashes on aurora 51
	      | 0 crashes on nightly 50
	      | 0 crashes on aurora 50
	      | 0 crashes on beta 50


  SIGNATURE   | js::TypeMonitorResult
  ----------------------------------------
  CRASH STATS | http://tinyurl.com/zs9jwvz
  ----------------------------------------
  OVERVIEW    | 2 crashes on nightly 52
	      | 2 crashes on nightly 51
	      | 2 crashes on aurora 51
	      | 0 crashes on nightly 50
	      | 16 crashes on aurora 50
	      | 93 crashes on beta 50
  ----------------------------------------
  LAST CRASH  | 2016-10-04 (on 50.0b3)
status-firefox52: --- → affected
status-firefox52: affected → ---
You need to log in before you can comment on or make changes to this bug.