Closed Bug 1402701 Opened 7 years ago Closed 3 years ago

Crash in mozilla::dom::GenericBindingGetter

Categories

(Core :: JavaScript Engine: JIT, defect, P2)

Unspecified
Windows 10
defect

Tracking


RESOLVED WORKSFORME

People

(Reporter: jseward, Unassigned)

Details

(Keywords: crash)

Crash Data

This bug was filed from the Socorro interface and is 
report bp-eafae362-ff85-4d42-ac2b-a4fe80170922.
=============================================================

This is topcrash #21 in the Windows nightly 20170921100141.
Flags: needinfo?(bzbarsky)
Not too much to go on here...  There's a ton of EXCEPTION_ACCESS_VIOLATION_READ at 0xffffffffffffffff and some EXCEPTION_ACCESS_VIOLATION_EXEC at various addresses (0x22eea984500, 0x1a121a4e580, 0x255dbeeef00).

All the ones I spot-checked are crashing on this line:

  bool ok = getter(cx, obj, self, JSJitGetterCallArgs(args));

which presumably means one of "args", "obj", "self", "cx", or "getter" is bogus.

All the spot-check stacks show only jitcode above the GenericBindingGetter call, afaict.

"cx" and "args" are just arguments passed by the jit.  "obj" is args.thisv().  "getter" comes from args.calleev() and its jitinfo.  "self" comes from "obj" (maybe-unwrapping it, then extracting the C++ pointer).

So best guesses are either the JIT passing in bogus information or the bindings generating incorrect jitinfo or bogus binding objects.
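For reference, a rough sketch of that data flow, assuming the usual jitinfo-driven getter path (this is not the actual code in dom/bindings/BindingUtils.cpp, and UnwrapAndGetNative is a made-up placeholder for the maybe-unwrap-then-extract-C++-pointer step):

  // Hypothetical approximation of GenericBindingGetter, for orientation only.
  #include "jsfriendapi.h"   // JSJitInfo, JSJitGetterOp, FUNCTION_VALUE_TO_JITINFO
  #include "js/CallArgs.h"   // JS::CallArgs, JS::CallArgsFromVp

  // Placeholder for the maybe-unwrap-then-extract-C++-pointer step.
  void* UnwrapAndGetNative(JSContext* cx, JS::HandleObject obj);

  bool
  GenericBindingGetter(JSContext* cx, unsigned argc, JS::Value* vp)
  {
    // "cx" and "args" come straight from the JIT.
    JS::CallArgs args = JS::CallArgsFromVp(argc, vp);

    // "obj" is args.thisv(); the real code reports a proper exception if it
    // is not an object of the expected binding class.
    if (!args.thisv().isObject()) {
      return false;
    }
    JS::Rooted<JSObject*> obj(cx, &args.thisv().toObject());

    // "getter" comes from args.calleev() via the bindings-generated jitinfo.
    const JSJitInfo* info = FUNCTION_VALUE_TO_JITINFO(args.calleev());
    JSJitGetterOp getter = info->getter;

    // "self" comes from obj: maybe-unwrap it, then extract the C++ pointer.
    void* self = UnwrapAndGetNative(cx, obj);
    if (!self) {
      return false;
    }

    // The line all the spot-checked reports crash on:
    bool ok = getter(cx, obj, self, JSJitGetterCallArgs(args));
    return ok;
  }

In other words, a bad "getter" points at bad jitinfo on the callee, a bad "self" points at a bad binding object or unwrap, and a bad "obj"/"args"/"cx" points back at the JIT's call setup.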

Looking at just the binding end, the most recent checkins to Codegen.py, in reverse chronological order, as of right now are:

1) Bug 1369533.  This didn't land on m-c until 2017-09-22, so presumably not present in the above-linked nightly.
2) Bug 1396613.  This landed on 2017-09-19 mid-day.  I would have expected problems from it to show up in the 2017-09-20 nightly.
3) Bug 1400139.  Also landed 2017-09-19.
4) Bug 991271 -- same.
5) Bug 1400275 -- landed 2017-09-16.

There were various spidermonkey changes on 2017-09-20; still digging through those.
Apparently the minidump for the above-linked crash is not debuggable with msvc.  :(

I'd love it if someone could sanity-check which JS changes might be new in the Sept 21 nightly.
Flags: needinfo?(jdemooij)
Flags: needinfo?(bzbarsky)
Flags: needinfo?(andrebargull)
OK.  So if I'm using https://dbaron.org/mozilla/crashes-by-build correctly, the regression range here (crashes first appear in the 20170921100141 build) is: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=319a34bea9e4f3459886b5b9e835bd338320f1fd&tochange=47f7b6c64265bc7bdd22eef7ab71abc97cf3f8bf

Definitely lots of JS changes in there and no DOM changes.  In particular, 

  hg log -r "ancestors(47f7b6c64265bc7bdd22eef7ab71abc97cf3f8bf) - ancestors(319a34bea9e4f3459886b5b9e835bd338320f1fd)" js/src/

shows 22 JS changesets landing on that day.  That said, none of those are jumping out at me as causing something like this...
dmajor says, on irc, "either getter is garbage, or the entire stack is garbage and GenericBindingGetter is an innocent bystander (maybe leftover from a previous call?)"
Looking at padenot's computer, the code around the PC looks like out-of-line code generated by IonMonkey.
This can be identified by the push & jump a bit before it, which are generated to push the snapshot key identifying how to bail out from a specific code location.

While looking briefly at the stack, I can identify a few JitFrames, which should help in figuring out the nearest C++ caller if needed.

The "GenericBindingGetter" frame is likely a left-over from Baseline stubs.
Nicolas, is it likely that this is a spidermonkey jit bug?
Flags: needinfo?(nicolas.b.pierron)
(In reply to Boris Zbarsky [:bz] (still digging out from vacation mail) from comment #6)
> Nicolas, is it likely that this is a spidermonkey jit bug?

I cannot tell for sure with the little information I saw in this crash dump, but so far this is the most likely hypothesis.
More specifically, I would point at IonMonkey bailouts / out-of-line code paths, or the code which branches to them.
Flags: needinfo?(nicolas.b.pierron)
Yeah, I agree the stack is likely bogus. It's also curious that the 20170921220243 Nightly was crashy but newer Nightly builds are not. I've seen the same thing with other crash signatures - PGO issue?

CC'ing tcampbell, who has been looking into Nightly JIT crashes this week.
Component: DOM → JavaScript Engine: JIT
Flags: needinfo?(jdemooij)
I've seen weird issues with 20170921220243 in cases like:
https://crash-stats.mozilla.org/report/index/6acc47e5-ae16-455a-9605-300130170923
https://crash-stats.mozilla.org/report/index/c0a55d73-a9c1-4de5-9825-364930170923

There are two of these crashes, but they seem to be from the same user. In both of them, we are executing at js::jit::BytecodeAnalysis::init+0x214, but the nearest real instruction is at js::jit::BytecodeAnalysis::init+0x213. Trying to execute this misaligned instruction stream generates large-address writes, causing the crash. A PGO issue seems possible.
Priority: -- → P2
Flags: needinfo?(andrebargull)

According to the crash stats, there has been only one crash in the last 6 months, and that was on Firefox 60.9.0esr (not on the latest Firefox versions).
Closing this as RESOLVED WORKSFORME; please re-open it if this crash appears again on the latest version of Firefox.

Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → WORKSFORME