Closed Bug 954452 Opened 7 years ago Closed 7 years ago

Shutdown crash [@ purple_blist_node_set_ui_data ]

Categories

(Chat Core :: General, defect)

defect
Not set
critical

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: florian, Assigned: florian)

References

()

Details

(Whiteboard: [1.2-blocking])

Attachments

(1 file, 1 obsolete file)

*** Original post on bio 1017 at 2011-09-05 09:28:00 UTC ***

*** Due to BzAPI limitations, the initial description is in comment 1 ***
Attached patch Patch (WIP) (obsolete) — Splinter Review
*** Original post on bio 1017 as attmnt 805 at 2011-09-05 09:28:00 UTC ***

We received several (~1/week) crash reports for this. We crash at shutdown because a purpleAccountBuddy is uninitialized at during a JS GC, probably after the purple core has been uninitialized.

I think I have a fix for this (patch attached), but I would like to look at it later to double check the usage of the libpurple blist signals/uiops which is a bit hackish...

Apparently the buddy-removed signal is fired by libpurple when a buddy is permanently removed from the list (so it's fired when we delete the accounts while uninitializing libpurple), but not when a buddy is destroyed because the whole list is destroyed. The remove uiop is called in both cases, so I think it's more reliable to use it. I would still like to understand how come deleting all the accounts don't delete all the buddies, causing some to be left over for when we uninitialize the libpurple core (a bug in libpurple? in a prpl? Some of our code returning early in an NS_ENSURE_* macro?).

I'm not really confident in my patch as it causes lots of "WARNING: NS_ENSURE_TRUE(accountBuddy) failed: file /Users/florian/buildhg/hg.instantbird.org/purple/purplexpcom/src/purpleInitContacts.cpp, line 210" messages to appear in my terminal, meaning some buddies are already uninitialized when the remove uiop is called...
Attached patch PatchSplinter Review
*** Original post on bio 1017 as attmnt 998 at 2011-11-15 15:10:00 UTC ***

Different approach, less scary.
Comment on attachment 8352547 [details] [diff] [review]
Patch (WIP)

*** Original change on bio 1017 attmnt 805 at 2011-11-15 15:10:20 UTC was without comment, so any subsequent comment numbers will be shifted ***
Attachment #8352547 - Attachment is obsolete: true
*** Original post on bio 1017 at 2012-01-16 17:57:14 UTC ***

People seem to be frequently crashing with different but similar stacks on current nightly builds, so this should probably block 1.2.
Whiteboard: [1.2-blocking]
*** Original post on bio 1017 at 2012-01-16 22:48:19 UTC ***

(In reply to comment #1)
> Created attachment 8352740 [details] [diff] [review] (bio-attmnt 998) [details]
> Patch
> 
> Different approach, less scary.

Checked in as http://hg.instantbird.org/instantbird/rev/be416e3b6a3f

Hopefully it fixes it! :-D
*** Original post on bio 1017 at 2012-01-16 22:52:36 UTC ***

I've pushed attachment 8352740 [details] [diff] [review] (bio-attmnt 998) as https://hg.instantbird.org/instantbird/rev/be416e3b6a3f, we will see if it improves the situation on the next few nightlies :).
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → 1.2
*** Original post on bio 1017 at 2012-01-28 23:54:52 UTC ***

I played with valgrind on an up to date linux debug build today, and I saw this at shutdown:

==27722== Thread 1:
==27722== Invalid write of size 8
==27722==    at 0x1A4065CD: purple_blist_node_set_ui_data (blist.c:901)
==27722==    by 0x1A8D579B: purpleAccountBuddy::UnInit() (purpleAccountBuddy.cpp:97)
==27722==    by 0x6FAAA3C: NS_InvokeByIndex_P (xptcinvoke_x86_64_unix.cpp:195)
==27722==    by 0x6802E3F: CallMethodHelper::Invoke() (xpcwrappednative.cpp:3147)
==27722==    by 0x680094E: CallMethodHelper::Call() (xpcwrappednative.cpp:2370)
==27722==    by 0x68007E5: XPCWrappedNative::CallMethod(XPCCallContext&, XPCWrappedNative::CallMode) (xpcwrappednative.cpp:2334)
==27722==    by 0x680E2BB: XPC_WN_CallMethod(JSContext*, unsigned int, JS::Value*) (xpcwrappednativejsops.cpp:1629)
==27722==    by 0x746E7E2: js::CallJSNative(JSContext*, int (*)(JSContext*, unsigned int, JS::Value*), js::CallArgs const&) (jscntxtinlines.h:296)
==27722==    by 0x7446D9D: js::InvokeKernel(JSContext*, js::CallArgs const&, js::MaybeConstruct) (jsinterp.cpp:660)
==27722==    by 0x745C4B6: js::Interpret(JSContext*, js::StackFrame*, js::InterpMode) (jsinterp.cpp:4036)
==27722==    by 0x7446B17: js::RunScript(JSContext*, JSScript*, js::StackFrame*) (jsinterp.cpp:614)
==27722==    by 0x7446E87: js::InvokeKernel(JSContext*, js::CallArgs const&, js::MaybeConstruct) (jsinterp.cpp:678)
==27722==  Address 0x22458400 is 48 bytes inside a block of size 128 free'd
==27722==    at 0x4C282ED: free (vg_replace_malloc.c:366)
==27722==    by 0x1A407ADC: purple_buddy_destroy (blist.c:1423)
==27722==    by 0x1A40B56A: purple_blist_node_destroy (blist.c:2879)
==27722==    by 0x1A40B4DF: purple_blist_node_destroy (blist.c:2866)
==27722==    by 0x1A40B4DF: purple_blist_node_destroy (blist.c:2866)
==27722==    by 0x1A40C408: purple_blist_uninit (blist.c:3264)
==27722==    by 0x1A41F482: purple_core_quit (core.c:222)
==27722==    by 0x1A8DCA0E: purpleCoreService::Quit() (purpleCoreService.cpp:225)
==27722==    by 0x1A8DC435: purpleCoreService::Observe(nsISupports*, char const*, unsigned short const*) (purpleCoreService.cpp:122)
==27722==    by 0x6F3B91E: nsObserverList::NotifyObservers(nsISupports*, char const*, unsigned short const*) (nsObserverList.cpp:130)
==27722==    by 0x6F3D37D: nsObserverService::NotifyObservers(nsISupports*, char const*, unsigned short const*) (nsObserverService.cpp:182)
==27722==    by 0x6FAAA3C: NS_InvokeByIndex_P (xptcinvoke_x86_64_unix.cpp:195)
==27722== 

So this still isn't fixed :(.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
*** Original post on bio 1017 at 2012-03-07 00:54:55 UTC ***

Fixed: https://hg.instantbird.org/instantbird/rev/931175959e82
Status: REOPENED → RESOLVED
Closed: 7 years ago7 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.