crash in mozilla::dom::CallbackObject::CallSetup::CallSetup(mozilla::dom::CallbackObject*, mozilla::ErrorResult&, mozilla::dom::CallbackObject::ExceptionHandling, JSCompartment*, bool)

RESOLVED FIXED in Firefox 36

Status

()

--
critical
RESOLVED FIXED
4 years ago
4 years ago

People

(Reporter: dmajor, Assigned: padenot)

Tracking

({crash})

unspecified
mozilla38
x86
Windows NT
crash
Points:
---

Firefox Tracking Flags

(firefox35 unaffected, firefox36+ fixed, firefox37+ fixed, firefox38+ fixed)

Details

(crash signature)

(Reporter)

Description

4 years ago
This bug was filed from the Socorro interface and is 
report bp-2b3bdf59-5bad-4535-85d5-1408e2150215.
=============================================================

New top crash in 36b9.

There is a null pointer somewhere in here (hard to say because of inlining):
js::UncheckedUnwrap(aCallback->CallbackPreserveColor());


0 	xul.dll 	mozilla::dom::CallbackObject::CallSetup::CallSetup(mozilla::dom::CallbackObject*, mozilla::ErrorResult&, mozilla::dom::CallbackObject::ExceptionHandling, JSCompartment*, bool) 	dom/bindings/CallbackObject.cpp
1 	xul.dll 	mozilla::dom::Function::Call<nsCOMPtr<nsISupports> >(nsCOMPtr<nsISupports> const&, nsTArray<JS::Value> const&, JS::MutableHandle<JS::Value>, mozilla::ErrorResult&, mozilla::dom::CallbackObject::ExceptionHandling) 	obj-firefox/dist/include/mozilla/dom/FunctionBinding.h
2 	xul.dll 	nsGlobalWindow::RunTimeoutHandler(nsTimeout*, nsIScriptContext*) 	dom/base/nsGlobalWindow.cpp
3 	xul.dll 	nsGlobalWindow::RunTimeout(nsTimeout*) 	dom/base/nsGlobalWindow.cpp
4 	xul.dll 	nsGlobalWindow::TimerCallback(nsITimer*, void*) 	dom/base/nsGlobalWindow.cpp
(Reporter)

Comment 1

4 years ago
[Tracking Requested - why for this release]: Regression in 36b9. 

Unfortunately aurora and nightly don't have enough volume to draw any conclusions about whether they are affected or when this was introduced.
status-firefox36: --- → affected
status-firefox37: --- → ?
status-firefox38: --- → ?
tracking-firefox36: --- → ?
(Reporter)

Comment 2

4 years ago
Looking at this list of changes, nothing really stands out. http://release.mozilla.org/statistics/36/2015/02/13/fx-36-b8-to-b9.html -- maaaaybe froydnj's timer change but it doesn't fit super well.

So I'm totally guessing on these ni's. Any help redirecting to a better owner would be appreciated.
Flags: needinfo?(nfroyd)
Flags: needinfo?(bzbarsky)
The timer change really shouldn't have effects like this...but I don't see any significantly better options.  Maybe the JS changes?  The crash reports look like the crashing thread doesn't unwind beyond nsGlobalWindow::TimerCallback; is that because I just don't know how to look at crash reports (i.e. there's options for expanding the stacks), or because we're calling those bits from JIT code?
Flags: needinfo?(nfroyd) → needinfo?(dmajor)
(Reporter)

Comment 4

4 years ago
Neither -- sometimes Socorro just doesn't do a great job with the stack. Here's the stack for bp-2b3bdf59-5bad-4535-85d5-1408e2150215 according to my debugger:

0032ee14 67abe183 xul!mozilla::dom::CallbackObject::CallSetup::CallSetup+0x72
0032ef24 677d46ff xul!mozilla::dom::Function::Call<nsCOMPtr<nsISupports> >+0x43
0032f068 677d49f6 xul!nsGlobalWindow::RunTimeoutHandler+0x112
0032f0dc 67bcbab7 xul!nsGlobalWindow::RunTimeout+0x1ed
0032f0f0 67a56e93 xul!nsGlobalWindow::TimerCallback+0x21
0032f1e0 678d1701 xul!nsTimerImpl::Fire+0x194
0032f214 67a585c7 xul!nsTimerEvent::Run+0x37
0032f32c 67a57aa0 xul!nsThread::ProcessNextEvent+0x631
0032f35c 67a57549 xul!mozilla::ipc::MessagePump::Run+0x5f
0032f394 67a57a30 xul!MessageLoop::RunHandler+0x20
0032f3b4 67a59091 xul!MessageLoop::Run+0x19
0032f3c0 67a59c2a xul!nsBaseAppShell::Run+0x32
0032f3cc 6786d733 xul!nsAppShell::Run+0x1b
0032f3dc 67aea34e xul!nsAppStartup::Run+0x20
0032f4b8 67ae9e20 xul!XREMain::XRE_mainRun+0x46f
0032f4d4 67c086c7 xul!XREMain::XRE_main+0x108
0032f658 01001635 xul!XRE_main+0x39
0032f7ec 010012dc firefox!do_main+0x125
0032f884 010010dc firefox!NS_internal_main+0xec
0032f898 01002484 firefox!wmain+0xbc
Flags: needinfo?(dmajor)
The stack in comment 4 is what I'd expect for a setTimout/setInterval timer firing.

OK, so what could be null here?  Some options:

1)  aCallback.  This is being passed in from Call() like so:

      CallSetup s(this, aRv, aExceptionHandling, aCompartment);

so it can only be null if the dom::Function was null.  It's not null when we initially create the nsIScriptTimeoutHandler, but maybe that got unlinked or something?

2)  aCallback->CallbackPreserveColor().  This returns mCallback, basically, so mCallback would have to be null.  This can really only happen if DropJSObjects happened on the dom::Function, which means it was unlinked by the cycle collector.  Unless something went _really_ wrong somewhere...

I agree that bug 1036515 looks most obviously related to this stuff, but I don't see how the patch in that bug would cause things to be unlinked when they shouldn't be...
Flags: needinfo?(bzbarsky)
Tracking as it is a top crash.
Today is gtb for beta 10 (before the RC later this week).
Can we backout bug 1036515?
tracking-firefox36: ? → +
Bz, what do you think we should do here? Thanks
Flags: needinfo?(bzbarsky)
We could try backing out bug 1036515 and seeing if it helps.  Past that, the most useful thing to do would be if we could figure out a way to reproduce this....
Flags: needinfo?(bzbarsky)
OK, thanks. Backout of bug 1036515 has been done.
Blocks: 1036515
(Reporter)

Comment 11

4 years ago
Also there is a pretty strong correlation with avrt:
    100% (426/428) vs.  53% (26110/49176) avrt.dll

Which seems to be used only by webrtc and libcubeb: https://dxr.mozilla.org/mozilla-central/search?q=avrt&case=true

The latter makes me wonder if this is yet another variant of bug 1133190 (which has been addressed by backout on beta). Sorry for not noticing this earlier.
(Reporter)

Comment 12

4 years ago
(On the other hand, if the true cause is something in those solitaire apps, and they happen to do audio things, then the avrt correlation would be just a symptom)
Tracking this crash. Bug 1036515 has been backed out of 36 and ESR, so they should no longer be affected by this bug. We still need to deal with this for 37+.
status-firefox35: --- → unaffected
status-firefox36: affected → unaffected
status-firefox37: ? → affected
status-firefox38: ? → affected
tracking-firefox37: --- → +
tracking-firefox38: --- → +
Nathan - Given that the current assumption is that this is fallout from bug 1036515, can you take this bug to investigate?
Flags: needinfo?(nfroyd)
(In reply to Lawrence Mandel [:lmandel] (use needinfo) from comment #14)
> Nathan - Given that the current assumption is that this is fallout from bug
> 1036515, can you take this bug to investigate?

Given that we were shooting in the dark backing out the patch from bug 1036515, and given comment 11 and comment 12, I'd like to wait a week or so to see what the crash volume looks like on release and/or beta.  Is that feasible?
Flags: needinfo?(nfroyd) → needinfo?(lmandel)
Yes. I think that's reasonable. 37 Beta 1 is scheduled to ship on Thu, Feb 26. We should have crash data by the following Mon, Mar 2.
Flags: needinfo?(lmandel)
Whiteboard: [checkin on Mar 2]
How's the Fx37 crash data looking? Any signs of this?
Flags: needinfo?(dmajor)
(Reporter)

Comment 18

4 years ago
This crash doesn't appear in 36b10 or 37b1.

Since bug 1133190 was also backed out for 37, and bug 1036515 was not, this must have been caused by the former, right?
Flags: needinfo?(dmajor)
Sure sounds like it to me, thanks. I'm going to go ahead and re-land bug 1036515 on <37 then.
Bug 1036515 re-landed. Calling this fixed by the backout of bug 1133190.
Assignee: nobody → padenot
Blocks: 1133190
No longer blocks: 1036515
Status: NEW → RESOLVED
Last Resolved: 4 years ago
status-firefox36: unaffected → fixed
status-firefox37: affected → fixed
status-firefox38: affected → fixed
Resolution: --- → FIXED
Whiteboard: [checkin on Mar 2]
Target Milestone: --- → mozilla38
You need to log in before you can comment on or make changes to this bug.