crash at null in [@ NS_CycleCollectorSuspect3]
Categories
(Core :: DOM: Workers, defect, P3)
Tracking
()
People
(Reporter: tsmith, Assigned: smaug)
References
(Blocks 1 open bug, Regression)
Details
(Keywords: crash, regression, testcase, Whiteboard: [bugmon:bisected,confirmed])
Crash Data
Attachments
(4 files, 3 obsolete files)
Found while fuzzing m-c 20220517-8027f6771a74 (--enable-address-sanitizer --enable-fuzzing)
To reproduce via Grizzly Replay:
$ pip install fuzzfetch grizzly-framework
$ python -m fuzzfetch -a --fuzzing -n firefox
$ python -m grizzly.replay ./firefox/firefox testcase.html
==23648==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7f5d4a2006a7 bp 0x7f5d161fd850 sp 0x7f5d161fd780 T62)
==23648==The signal is caused by a READ memory access.
==23648==Hint: address points to the zero page.
#0 0x7f5d4a2006a7 in operator! /builds/worker/workspace/obj-build/dist/include/mozilla/RefPtr.h:311:36
#1 0x7f5d4a2006a7 in NS_CycleCollectorSuspect3 /gecko/xpcom/base/nsCycleCollector.cpp:3764:7
#2 0x7f5d4fceb523 in incr<&NS_CycleCollectorSuspect3> /builds/worker/workspace/obj-build/dist/include/nsISupportsImpl.h:248:7
#3 0x7f5d4fceb523 in incr<&NS_CycleCollectorSuspect3> /builds/worker/workspace/obj-build/dist/include/nsISupportsImpl.h:234:12
#4 0x7f5d4fceb523 in mozilla::DOMEventTargetHelper::AddRef() /gecko/dom/events/DOMEventTargetHelper.cpp:85:1
#5 0x7f5d4fe59ce8 in mozilla::dom::Blob::Create(nsIGlobalObject*, mozilla::dom::BlobImpl*) /gecko/dom/file/Blob.cpp:81:59
#6 0x7f5d4f6c9463 in mozilla::dom::OffscreenCanvas::CreateEncodeCompleteCallback(nsCOMPtr<nsIGlobalObject>&&, mozilla::dom::Promise*)::EncodeCallback::ReceiveBlobImpl(already_AddRefed<mozilla::dom::BlobImpl>) /gecko/dom/canvas/OffscreenCanvas.cpp:292:31
#7 0x7f5d4d5b9101 in mozilla::dom::EncodingCompleteEvent::Run() /gecko/dom/base/ImageEncoder.cpp
#8 0x7f5d4a3d3e8e in nsThread::ProcessNextEvent(bool, bool*) /gecko/xpcom/threads/nsThread.cpp:1174:16
#9 0x7f5d4a3cd48f in NS_ProcessPendingEvents(nsIThread*, unsigned int) /gecko/xpcom/threads/nsThreadUtils.cpp:430:19
#10 0x7f5d51faf8aa in mozilla::dom::WorkerPrivate::ClearMainEventQueue(mozilla::dom::WorkerPrivate::WorkerRanOrNot) /gecko/dom/workers/WorkerPrivate.cpp:3760:5
#11 0x7f5d51f9c804 in mozilla::dom::WorkerPrivate::ScheduleDeletion(mozilla::dom::WorkerPrivate::WorkerRanOrNot) /gecko/dom/workers/WorkerPrivate.cpp:3579:3
#12 0x7f5d51f83602 in mozilla::dom::workerinternals::(anonymous namespace)::WorkerThreadPrimaryRunnable::Run() /gecko/dom/workers/RuntimeService.cpp:2100:19
#13 0x7f5d4a3d3e8e in nsThread::ProcessNextEvent(bool, bool*) /gecko/xpcom/threads/nsThread.cpp:1174:16
#14 0x7f5d4a3dd7cc in NS_ProcessNextEvent(nsIThread*, bool) /gecko/xpcom/threads/nsThreadUtils.cpp:465:10
#15 0x7f5d4baf30d1 in mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) /gecko/ipc/glue/MessagePump.cpp:330:5
#16 0x7f5d4b978131 in RunInternal /gecko/ipc/chromium/src/base/message_loop.cc:380:10
#17 0x7f5d4b978131 in RunHandler /gecko/ipc/chromium/src/base/message_loop.cc:373:3
#18 0x7f5d4b978131 in MessageLoop::Run() /gecko/ipc/chromium/src/base/message_loop.cc:355:3
#19 0x7f5d4a3cba5b in nsThread::ThreadFunc(void*) /gecko/xpcom/threads/nsThread.cpp:378:10
#20 0x7f5d6f8d857e in _pt_root /gecko/nsprpub/pr/src/pthreads/ptthread.c:201:5
#21 0x7f5d70575608 in start_thread /build/glibc-SzIz7B/glibc-2.31/nptl/pthread_create.c:477:8
#22 0x7f5d7013c132 in __clone /build/glibc-SzIz7B/glibc-2.31/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:95
Reporter | ||
Comment 1•2 years ago
|
||
A Pernosco session is available here: https://pernos.co/debug/-uCYdFryO8hFvxXKTu5zCA/index.html
Comment 2•2 years ago
|
||
Bugmon Analysis
Verified bug as reproducible on mozilla-central 20220518031437-1e98fd258975.
The bug appears to have been introduced in the following build range:
Start: ca296f209b269a1ea7b29a2a36929c1f05f6b6d4 (20220325152959)
End: 683ad09c90953c4a288c5e625c2e64a87fe1ae27 (20220325161857)
Pushlog: https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=ca296f209b269a1ea7b29a2a36929c1f05f6b6d4&tochange=683ad09c90953c4a288c5e625c2e64a87fe1ae27
Comment 3•2 years ago
|
||
(In reply to Bugmon [:jkratzer for issues] from comment #2)
Bugmon Analysis
Verified bug as reproducible on mozilla-central 20220518031437-1e98fd258975.
The bug appears to have been introduced in the following build range:Start: ca296f209b269a1ea7b29a2a36929c1f05f6b6d4 (20220325152959)
End: 683ad09c90953c4a288c5e625c2e64a87fe1ae27 (20220325161857)
Pushlog: https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=ca296f209b269a1ea7b29a2a36929c1f05f6b6d4&tochange=683ad09c90953c4a288c5e625c2e64a87fe1ae27
Jens, Olli, could this have been caused by your commits in bug 1756172?
Kelsey, given this is a null crash and not common, should we decrease the severity?
Comment 4•2 years ago
•
|
||
(In reply to Tyson Smith [:tsmith] from comment #1)
A Pernosco session is available here: https://pernos.co/debug/-uCYdFryO8hFvxXKTu5zCA/index.html
Thanks, always appreciated! I put some notes.
(In reply to Marco Castelluccio [:marco] from comment #3)
Jens, Olli, could this have been caused by your commits in bug 1756172?
Yes, I think so. It seems we anticipated the moment the cycle collector is shut down before the last round of event processing which happens inside WorkerPrivate::ScheduleDeletion
and thus changing the refcount of cycle collected objects as happening here for the global (which is kept alive by a callback beyond the worker's lifetime) will just fail.
Actually I think we should inhibit any further dispatch to the worker thread starting from here before we do our last explicit round of processing before the cycle collector shuts down. And WorkerPrivate::ScheduleDeletion
IMHO should not process any further events on the worker thread at all.
WDYT, Olli?
Assignee | ||
Comment 5•2 years ago
|
||
Yes. We seem to have yet another place where event loop is cleared.
https://searchfox.org/mozilla-central/rev/b462b11e71b500e084f51e61fbd9e19ea0122c78/dom/workers/WorkerPrivate.cpp#3575
We need to mark the thread as shutting down before that or something, or hmm, EncodingCompleteEvent should use Worker Refs to detect shutdown I think.
Assignee | ||
Updated•2 years ago
|
Updated•2 years ago
|
Updated•2 years ago
|
Updated•2 years ago
|
Comment 7•2 years ago
|
||
Updated•2 years ago
|
Comment 8•2 years ago
|
||
Comment on attachment 9278488 [details]
Bug 1769878: Ensure WorkerThreadPrimaryRunnable will not run any events after the cycle collector is destroyed. r?#dom-worker-reviewers
Revision D147516 was moved to bug 1772084. Setting attachment 9278488 [details] to obsolete.
Comment 9•2 years ago
|
||
Comment 10•2 years ago
|
||
Depends on D147857
Comment 11•2 years ago
|
||
Depends on D147858
Comment 12•2 years ago
|
||
Depends on D147858
Updated•2 years ago
|
Updated•2 years ago
|
Comment 13•2 years ago
|
||
:jgilbert, sorry for not clearing your ni?, I already decreased severity.
Assignee | ||
Updated•2 years ago
|
Updated•2 years ago
|
Updated•2 years ago
|
Updated•2 years ago
|
Assignee | ||
Comment 14•2 years ago
|
||
Depends on D147858
Comment 15•2 years ago
|
||
Comment 16•2 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/b202476826d2
https://hg.mozilla.org/mozilla-central/rev/851d0144fd91
https://hg.mozilla.org/mozilla-central/rev/cd4ca1691c0b
Comment 17•2 years ago
|
||
Bugmon Analysis
Verified bug as fixed on rev mozilla-central 20220623035450-d26536dbf462.
Removing bugmon keyword as no further action possible. Please review the bug and re-add the keyword for further analysis.
Updated•2 years ago
|
Updated•2 years ago
|
Updated•2 years ago
|
Comment 18•2 years ago
|
||
Per discussion with smaug, the relevant API affected by this bug is enabled on ESR, but the use cases are pretty limited and it's not clear how high-impact it will actually be there. Let's keep an eye on it for a couple cycles and we can decide later about a possible backport. The patches do graft cleanly.
Updated•2 years ago
|
Description
•