Closed Bug 1443746 Opened 2 years ago Closed 2 years ago

Crash in mozilla::CycleCollectedJSContext::PerformMicroTaskCheckPoint

Categories

(Core :: DOM: Core & HTML, defect, critical)

Unspecified
All
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla60
Tracking Status
firefox-esr52 --- unaffected
firefox58 --- unaffected
firefox59 --- unaffected
firefox60 --- fixed

People

(Reporter: calixte, Assigned: smaug)

References

(Blocks 1 open bug)

Details

(Keywords: crash, regression)

Crash Data

Attachments

(1 file)

This bug was filed from the Socorro interface and is
report bp-8758b2a7-6697-4677-a5b4-6c4820180305.
=============================================================

Top 10 frames of crashing thread:

0 libxul.so mozilla::CycleCollectedJSContext::PerformMicroTaskCheckPoint [clone .cold.252] 
1 libxul.so mozilla::EventListenerManager::HandleEventSubType 
2 libxul.so mozilla::EventListenerManager::HandleEventInternal 
3 libxul.so mozilla::EventTargetChainItem::HandleEventTargetChain 
4 libxul.so mozilla::EventDispatcher::Dispatch 
5 libxul.so mozilla::EventDispatcher::DispatchDOMEvent 
6 libxul.so nsINode::DispatchEvent 
7 libxul.so mozilla::dom::EventTarget::DispatchEvent dom/events/EventTarget.cpp:102
8 libxul.so mozilla::dom::EventTargetBinding::dispatchEvent 
9 libxul.so mozilla::dom::EventTargetBinding::genericMethod 

=============================================================

There are 41 crashes (from 9 installations) in nightly 60 starting with buildid 20180305100134. In analyzing the backtrace, the regression may have been introduced by patch [1] to fix bug 1193394.

[1] https://hg.mozilla.org/mozilla-central/rev?node=068c59c7c4ec
Flags: needinfo?(arai.unmht)
The interesting part is way higher up in the stack. Someone is doing bad things either by scheduling mozilla::dom::ImageLoadTask::Run to run at stable state or more like by dispatching events inside a contentpolicy.
I wonder if ckerschb or tanvi recalls any suspicious JS implemented nsIContentPolicy implementations.
Or Kris, is it possible that some addon code dispatches DOM events at wrong time?
(I'm not at all familiar with web extensions and content policies. The crash report does have ublock)
OS: Linux → All
And to clarify, as far as I see, bug 1193394 just revealed some existing bug.
Hiro, are you able to look at this?
Flags: needinfo?(hikezoe)
Sure I can, I am super unfamiliar with nsIContentPolicy but I can check the implementors tomorrow.
Hello, i one of affected peoples.

at least these crashes is mine :

bp-0c46b97b-9830-4fb2-aa84-a86490180307
	3/7/18	5:16 PM
bp-73399732-7af0-4cf7-8deb-55a2a0180307
	3/7/18	3:28 AM
bp-5722cbae-273b-45ee-868b-9dfd70180307
	3/7/18	3:28 AM
bp-8e27bc79-ad56-4b21-8468-77f6d0180307
	3/7/18	3:28 AM
bp-4061dcbe-bdd7-4f3c-b484-ac7bd0180307
	3/7/18	3:13 AM
bp-5e38d5d8-2394-4f73-b43b-931be0180307
	3/7/18	3:13 AM


Extensions i have doesn't include UBlock :

 Name 	Version 	Enabled 	ID
All Tabs Helper	1.0.19	true	alltabshelper@alltabshelper.org
Ecosia - The search engine that plants trees	4.0.2	true	{d04b0b40-3dab-4f0b-97a6-04ec3eddbfb0}
HTTPS Everywhere	2018.2.26	true	https-everywhere-eff@eff.org
NoScript	5.1.8.4	true	{73a6fe31-595d-460b-a920-fcc0f8843232}
Video DownloadHelper	7.2.0	true	{b9db16a4-6edc-47ec-a1f4-b86292ed211d}
Web ScrapBook	0.26.2	true	webscrapbook@danny0838.addons.mozilla.org
Zoom Page WE	8.1	true	zoompage-we@DW-dev
(In reply to Olli Pettay [:smaug] (unusual timezone [JST] for a week) from comment #3)
> Or Kris, is it possible that some addon code dispatches DOM events at wrong
> time?

Not a WebExtension. The problem here, though, is a legacy NoScript extension, which dispatches a DOM event during a content policy shouldLoad callback. Which should never happen.

That's one of the problems, anyway. The other is that we try to perform a microtask checkpoint while JS from an XPC callback is on the stack. I know we can't perform a checkpoint after every XPC callback, but we should probably note when JS from an XPC callback is on the stack, and delay any microtask checkpoints that we need to perform from other contexts until the last callback exits.
Clearing NI to me since :kmag already found the problem.
Flags: needinfo?(hikezoe)
(In reply to Kris Maglione [:kmag] (long backlog; ping on IRC if you're blocked) from comment #8)
> That's one of the problems, anyway. The other is that we try to perform a
> microtask checkpoint while JS from an XPC callback is on the stack. I know
> we can't perform a checkpoint after every XPC callback, but we should
> probably note when JS from an XPC callback is on the stack, and delay any
> microtask checkpoints that we need to perform from other contexts until the
> last callback exits.
That doesn't sound reasonable. In case of event dispatching for example that would reveal
that there is xpc js on stack to the event listeners.
for this particular issue... I guess we should just prevent event dispatch when it is not right time to do so. And assert strongly in debug builds at least.
(In reply to Olli Pettay [:smaug] (unusual timezone [JST] for a week) from comment #10)
> That doesn't sound reasonable. In case of event dispatching for example that
> would reveal that there is xpc js on stack to the event listeners.

It would reveal that, yes. But this is the behavior any time we dispatch an event from chrome JS code that's running in a microtask. It seems more than a little counter-intuitive that when chrome JS calls a function, sometimes it may trigger a microtask checkpoint, but sometimes it won't. I don't think there's any situation where that can happen for content JS (correct me if I'm wrong), and I'd be surprised if exposing chrome JS to that situation doesn't cause any serious bugs.
It breaks the whole model of microtasks, which is to have microtask checkpoint at the end of the outermost script execution (and from content point of view there is no chrome JS).

If there is chrome JS dispatching events to content while we're inside a microtask (nsAutoMicroTask for example), that is a bug to fix.
It is hard to get this right, I know, but we should all the time think this from web content point of view.
(In reply to Olli Pettay [:smaug] (unusual timezone [JST] for a week) from comment #13)
> It breaks the whole model of microtasks, which is to have microtask
> checkpoint at the end of the outermost script execution (and from content
> point of view there is no chrome JS).
> 
> If there is chrome JS dispatching events to content while we're inside a
> microtask (nsAutoMicroTask for example), that is a bug to fix.
> It is hard to get this right, I know, but we should all the time think this
> from web content point of view.

OK. So if we want to draw that distinction, and only consider content micro-tasks on the stack when deciding whether to perform a checkpoint for content, then I think we need separate queues for chrome and content tasks. Because the situation right now for chrome is a mess.

If we trigger content micro-task while a chrome micro-task is on the stack, everything works as expected from the chrome point-of-view, but not from the content point of view. If we trigger a content micro-task from chrome JS from an observer callback, we wind up running chrome promise resolution handlers that we're probably not expecting to run. If we resolve a chrome promise from an event listener or a runnable in the event loop, the handlers run almost immediately. If we resolve them from an observer, they run at some arbitrary point in the future.
Anyhow, for this particular bug I'd like to do this. Legacy addons may not like this, but oh well. I was told that NoScript already has a non-legacy version.

remote: View your change here:
remote:   https://hg.mozilla.org/try/rev/a49063af17e27128ef21dd8064829edb1588b70c
remote: 
remote: Follow the progress of your build on Treeherder:
remote:   https://treeherder.mozilla.org/#/jobs?repo=try&revision=a49063af17e27128ef21dd8064829edb1588b70c


For generic microtask handling in chrome vs content case, which this bug is _not_ about, we can think of something better than what we have now. Like, we can do something similar-ish what we currently do with nsIGlobalObject::IsInSyncOperation()
Assignee: nobody → bugs
Attachment #8957032 - Flags: review?(masayuki)
Comment on attachment 8957032 [details] [diff] [review]
safer_event_dispatch.diff

It might be better to report to the console why the dispatching failed, though, since it may be hard to understand for the developers meeting this error.
Attachment #8957032 - Flags: review?(masayuki) → review+
true, though legacy addons are going away, and normal chrome JS code should just trigger the MOZ_ASSERT.
Pushed by opettay@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/95d553c23d52
ensure DOM events aren't dispatched at unexpected time, r=masayuki
Flags: needinfo?(arai.unmht)
https://hg.mozilla.org/mozilla-central/rev/95d553c23d52
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla60
While going through nightly crash triage I still see a few signatures in 61 and 60 (https://bit.ly/2H3LOV2). One of them has Crash Reason: MOZ_RELEASE_ASSERT(!mDoingStableStates). Perhaps right now it is such low volume we shouldn't worry much.
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.