Closed Bug 1208418 Opened 5 years ago Closed 5 years ago

Mochitest shutdown crash in NetlinkPoller dtor

Categories

(Firefox OS Graveyard :: Emulator, defect, P1)

ARM
Gonk (Firefox OS)
defect

Tracking

(firefox46 fixed)

RESOLVED FIXED
Tracking Status
firefox46 --- fixed

People

(Reporter: cyu, Assigned: cyu)

References

Details

(Whiteboard: [EMU] [CI])

Attachments

(1 file, 2 obsolete files)

When I run mochitest on emulator-x86-kk, I got this crash on shutdown:

The crash stack:
#0  0x00000000 in ?? ()
#1  0xb318498d in event_del (ev=0xb13f7790) at /mnt/SSD/data/hg/mozilla-central/ipc/chromium/src/third_party/libevent/event.c:2186
#2  0xb318a720 in base::MessagePumpLibevent::FileDescriptorWatcher::StopWatchingFileDescriptor (this=0xb165e00c) at /mnt/SSD/data/hg/mozilla-central/ipc/chromium/src/base/message_pump_libevent.cc:82
#3  0xb335a985 in mozilla::hal_impl::NetlinkPoller::~NetlinkPoller (this=0xb165e000, __in_chrg=<optimized out>) at /mnt/SSD/data/hg/mozilla-central/hal/gonk/UeventPoller.cpp:54
#4  0xb335a9be in mozilla::hal_impl::NetlinkPoller::~NetlinkPoller (this=0xb165e000, __in_chrg=<optimized out>) at /mnt/SSD/data/hg/mozilla-central/hal/gonk/UeventPoller.cpp:54
#5  0xb31195ac in nsAutoPtr<mozilla::net::ChannelEvent>::~nsAutoPtr (this=this@entry=0xb6cc68b0 <mozilla::hal_impl::sPoller>, __in_chrg=<optimized out>) at ../../../dist/include/nsAutoPtr.h:74
#6  0xb75ef038 in __cxa_finalize (dso=dso@entry=0x0) at bionic/libc/stdlib/atexit.c:150
#7  0xb75ef42c in exit (status=0) at bionic/libc/stdlib/exit.c:57
#8  0xb75ae535 in __libc_init (raw_args=0xbfd054e0, onexit=0x0, slingshot=0xb77042d4 <main(int, char const**)>, structors=0xbfd054c0) at bionic/libc/bionic/libc_init_dynamic.cpp:112
#9  0xb7703e12 in _start ()

Note that this is in __cxa_finalize(), where main() already finishes. That means the IO thread already shutdown. We need to Shutdown NetlinkPoller before main() finishes to fix this crash.
Assignee: nobody → cyu
Attachment #8671253 - Flags: review?(dhylands)
Comment on attachment 8671253 [details] [diff] [review]
Shut down the UeventPoller on XPCOM shutdown

Cancel the review since there is regression found in try push log.
Attachment #8671253 - Flags: review?(dhylands)
Whiteboard: [EMU] [CI]
This uses pthread key destructor to safely shut down the UeventPoller instance when the main thread shuts down the IPC thread.
Attachment #8671253 - Attachment is obsolete: true
Attachment #8704497 - Flags: review?(dhylands)
Summary: Mochitest shutdown crash on in NetlinkPoller dtor → Mochitest shutdown crash in NetlinkPoller dtor
Comment on attachment 8704497 [details] [diff] [review]
Shutdown UeventPoller on thread termination

Review of attachment 8704497 [details] [diff] [review]:
-----------------------------------------------------------------

I think that there are simpler ways of dealing with this.

1 - Use a StaticRefPtr or StaticAutoPtr rather than an nsAutoPtr when declaring sPoller. Then the destructor isn't called after main exits.

2 - Why not use xpcom-shutdown? If you really want it to shutdown.
(In reply to Dave Hylands [:dhylands] from comment #5)
> Comment on attachment 8704497 [details] [diff] [review]
> Shutdown UeventPoller on thread termination
> 
> Review of attachment 8704497 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I think that there are simpler ways of dealing with this.
> 
> 1 - Use a StaticRefPtr or StaticAutoPtr rather than an nsAutoPtr when
> declaring sPoller. Then the destructor isn't called after main exits.
> 

I am not familiar with libevent. But it seems to shut down the epoll fd after event_free(). If that's the case then using StaticRefPtr will work (by leaking the sPoller instance).

> 2 - Why not use xpcom-shutdown? If you really want it to shutdown.
It will also work with we need more care in which part runs on the main thread and which part on the IPC thread. I am trying this approach in https://hg.mozilla.org/try/rev/4e3e54e1b3b7
(In reply to Cervantes Yu [:cyu] [:cervantes] from comment #6)
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=4e3e54e1b3b7

I am seeing memory corruption revealed by this patch. MessagePumpLibevent is corrupted that the space allocated to it is allocated to another object before it's freed. The patch isn't likely to have refcount bug in MessagePumpLibevent used by the MessageLoop of the IPC thread. But I see corruption of the instance of MessagePumpLibevent when its dtor is not called yet. I once saw MessagePumpLibevent's refcount updated in nsStringBuffer::Alloc():

#0  nsStringBuffer::Alloc (aSize=aSize@entry=25) at xpcom/string/nsSubstring.cpp:222
#1  0x40ae77fe in nsACString_internal::MutatePrep (this=this@entry=0xbed51720, aCapacity=aCapacity@entry=24, aOldData=aOldData@entry=0xbed51678, aOldFlags=aOldFlags@entry=0xbed5167c) at /
mnt/SSD/data/hg/mozilla-central/xpcom/string/nsTSubstring.cpp:133
#2  0x40ae782c in nsACString_internal::ReplacePrepInternal (this=this@entry=0xbed51720, aCutStart=aCutStart@entry=0, aCutLen=aCutLen@entry=0, aFragLen=aFragLen@entry=24, aNewLen=24) at /m
nt/SSD/data/hg/mozilla-central/xpcom/string/nsTSubstring.cpp:195
#3  0x40ae8edc in nsACString_internal::ReplacePrep (this=this@entry=0xbed51720, aCutStart=aCutStart@entry=0, aCutLength=0, aNewLength=aNewLength@entry=24) at /mnt/SSD/data/hg/mozilla-cent
ral/xpcom/string/nsTSubstring.cpp:186
#4  0x40ae8f76 in nsACString_internal::Assign (this=0xbed51720, aData=0x4268f245 <mozilla::dom::quota::(anonymous namespace)::kTestingPref> "dom.quotaManager.testing", aLength=<optimized 
out>, aFallible=...) at xpcom/string/nsTSubstring.cpp:342
#5  0x40ae8fc6 in nsACString_internal::Assign (this=this@entry=0xbed51720, aData=aData@entry=0x4268f245 <mozilla::dom::quota::(anonymous namespace)::kTestingPref> "dom.quotaManager.testin
g", aLength=aLength@entry=4294967295) at xpcom/string/nsTSubstring.cpp:319
#6  0x40b28c14 in nsCString::nsCString (this=0xbed51720, aData=0x4268f245 <mozilla::dom::quota::(anonymous namespace)::kTestingPref> "dom.quotaManager.testing", aLength=4294967295) at /mn
t/SSD/data/git/emulator/B2G/objdir-gecko-nodbg/dist/include/nsTString.h:41
#7  0x40b2cac4 in ValueObserverHashKey (aCallback=0x415b9ea1 <mozilla::dom::quota::(anonymous namespace)::TestingPrefChangedCallback(char const*, void*)>, aPref=<optimized out>, this=0xbe
d5171c) at modules/libpref/Preferences.cpp:110
#8  mozilla::Preferences::UnregisterCallback (aCallback=0x415b9ea1 <mozilla::dom::quota::(anonymous namespace)::TestingPrefChangedCallback(char const*, void*)>, aPref=<optimized out>, aCl
osure=aClosure@entry=0x0) at modules/libpref/Preferences.cpp:1698
#9  0x415c003e in mozilla::dom::quota::QuotaManagerService::Destroy (this=0x44555de0) at dom/quota/QuotaManagerService.cpp:321
#10 0x415c0078 in mozilla::dom::quota::QuotaManagerService::Release (this=<optimized out>) at dom/quota/QuotaManagerService.cpp:492
#11 0x40b21e52 in mozilla::KillClearOnShutdown () at /mnt/SSD/data/git/emulator/B2G/objdir-gecko-nodbg/dist/include/mozilla/ClearOnShutdown.h:101
#12 0x40b24340 in mozilla::ShutdownXPCOM (aServMgr=0x44fce1a4) at xpcom/build/XPCOMInit.cpp:905
#13 0x41afcf74 in ScopedXPCOMStartup::~ScopedXPCOMStartup (this=0x4022f180, __in_chrg=<optimized out>) at toolkit/xre/nsAppRunner.cpp:1479
#14 0x41afcfa2 in mozilla::DefaultDelete<ScopedXPCOMStartup>::operator() (aPtr=0x4022f180, this=<optimized out>) at /mnt/SSD/data/git/emulator/B2G/objdir-gecko-nodbg/dist/include/mozilla/
UniquePtr.h:482
#15 0x41b009c2 in reset (aPtr=0x0, this=0xbed51804) at /mnt/SSD/data/git/emulator/B2G/objdir-gecko-nodbg/dist/include/mozilla/UniquePtr.h:309
#16 operator= (this=0xbed51804) at /mnt/SSD/data/git/emulator/B2G/objdir-gecko-nodbg/dist/include/mozilla/UniquePtr.h:279
#17 XREMain::XRE_main (this=this@entry=0xbed517f0, argc=argc@entry=1, argv=argv@entry=0x40238188, aAppData=aAppData@entry=0x2ca88 <_ZL8sAppData>) at toolk
it/xre/nsAppRunner.cpp:4449
#18 0x41b00b42 in XRE_main (argc=1, argv=0x40238188, aAppData=0x2ca88 <_ZL8sAppData>, aFlags=<optimized out>) at toolkit/xre/nsAppRunner.cpp:4525
#19 0x00010768 in do_main (argc=argc@entry=1, argv=argv@entry=0x40238188) at b2g/app/nsBrowserApp.cpp:167
#20 0x000108b4 in b2g_main (argc=1, argv=<optimized out>) at b2g/app/nsBrowserApp.cpp:299
#21 0x00010620 in RunProcesses (aReservedFds=..., argv=0xbed52ac4, argc=1) at b2g/app/B2GLoader.cpp:233
#22 main (argc=1, argv=0xbed52ac4) at b2g/app/B2GLoader.cpp:298
(In reply to Cervantes Yu [:cyu] [:cervantes] from comment #10)
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=67ee951d039d

Fix the bug in using capture by reference in the lambda that causes the crash.
Comment on attachment 8704497 [details] [diff] [review]
Shutdown UeventPoller on thread termination

Clearing review request - this patch seems to be obsolete now.
Attachment #8704497 - Flags: review?(dhylands)
Attachment #8704497 - Attachment is obsolete: true
Attachment #8708171 - Flags: review?(dhylands)
Priority: -- → P1
Comment on attachment 8708171 [details] [diff] [review]
Shut down the UeventPoller on XPCOM shutdown

Review of attachment 8708171 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry to take so long.

This looks much better - it uses a more traditional approach that other people will understand.
Attachment #8708171 - Flags: review?(dhylands) → review+
https://hg.mozilla.org/integration/b2g-inbound/rev/9f42d92f870561be92409b819aa3ab36e3e91950
Bug 1208418: Shut down UeventPoller on XPCOM shutdown to fix the crash when the chrome process exits. r=dhylands
https://hg.mozilla.org/mozilla-central/rev/9f42d92f8705
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.