Closed
Bug 144838
Opened 22 years ago
Closed 22 years ago
[viewpoint] Crashed destroying ?scriptable interface? after reload
Categories
(Core Graveyard :: Plug-ins, defect, P1)
Tracking
(Not tracked)
VERIFIED
DUPLICATE
of bug 148889
mozilla1.0
People
(Reporter: aberger, Assigned: peterl-bugs)
References
()
Details
(Keywords: crash)
From Bugzilla Helper: User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0) BuildID: 1.0 branch When our plugin tries to load content and is told that it is too old, it begins an autoupdate process. It brings down a new dll and xpt (which we use for sending javascript in and out of the plugin). After we have the new dll and xpt in the correct folders, we GetURL ("javascript:navigator.plugins.refresh(true)"). This causes the page to get reloaded, and our new plugin is loaded and starts to show content. A few seconds later, Mozilla crashes. I think that it is the old scriptable interface being destroyed (garbage collection?), but I could be wrong. Here is the call stack: nsCOMPtr<nsIClassInfo>::~nsCOMPtr<nsIClassInfo>() line 490 + 13 bytes XPCWrappedNativeProto::~XPCWrappedNativeProto() line 83 + 11 bytes XPCWrappedNativeProto::`scalar deleting destructor'(unsigned int 1) + 15 bytes DyingProtoKiller(JSDHashTable * 0x010fc8c8, JSDHashEntryHdr * 0x03950000, unsigned long 10, void * 0x00000000) line 200 + 28 bytes JS_DHashTableEnumerate(JSDHashTable * 0x010fc8c8, int (JSDHashTable *, JSDHashEntryHdr *, unsigned long, void *)* 0x014e9e50 DyingProtoKiller (JSDHashTable *, JSDHashEntryHdr *, unsigned long, void *), void * 0x00000000) line 599 + 34 bytes XPCWrappedNativeProtoMap::Enumerate(JSDHashOperator (JSDHashTable *, JSDHashEntryHdr *, unsigned long, void *)* 0x014e9e50 DyingProtoKiller (JSDHashTable *, JSDHashEntryHdr *, unsigned long, void *), void * 0x00000000) line 639 + 27 bytes XPCJSRuntime::GCCallback(JSContext * 0x0352b428, JSGCStatus JSGC_FINALIZE_END) line 485 DOMGCCallback(JSContext * 0x0352b428, JSGCStatus JSGC_FINALIZE_END) line 1622 + 23 bytes js_GC(JSContext * 0x0352b428, unsigned int 0) line 1345 + 12 bytes js_ForceGC(JSContext * 0x0352b428, unsigned int 0) line 979 + 13 bytes JS_GC(JSContext * 0x0352b428) line 1644 + 11 bytes nsJSContext::Notify(nsJSContext * const 0x0352b248, nsITimer * 0x03859340) line 1569 + 13 bytes nsTimerImpl::Fire() line 348 nsTimerManager::FireNextIdleTimer(nsTimerManager * const 0x01792f48) line 591 nsAppShell::Run(nsAppShell * const 0x016d0b10) line 134 nsAppShellService::Run(nsAppShellService * const 0x016d0568) line 451 main1(int 1, char * * 0x00307c48, nsISupports * 0x00000000) line 1431 + 32 bytes main(int 1, char * * 0x00307c48) line 1779 + 37 bytes mainCRTStartup() line 338 + 17 bytes KERNEL32! 77e97d08() In the NSCOMPTR destructor: if ( mRawPtr ) NSCAP_RELEASE(this, mRawPtr); } mRawPtr is not NULL, but in the debugger it looks like it is probably not valid. Looks to me like some timer is going off, which causes this object to get destroyed, but it is already gone (because the page was reloaded already? Because we replaced it with a new xpt?) Reproducible: Sometimes Steps to Reproduce: 1.Giving steps to reproduce this will be difficult. If somebody needs to try to reproduce it, they can contact me. If there are any suggestions as to how I could debug it myself and what to look for, I can work with someone.
Comment 1•22 years ago
|
||
assigning to peterl
Assignee: beppe → peterl
Priority: -- → P1
Whiteboard: [ViewPoint]
Target Milestone: --- → mozilla1.0
Comment 2•22 years ago
|
||
....wonder if its something with garbage collection???? The javascript:navigator.plugins.refresh(1) command causes all running plugins to be shutdown, even in other windows.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Keywords: crash
Summary: Crashed destroying ?scriptable interface? after reload → [viewpoint] Crashed destroying ?scriptable interface? after reload
Whiteboard: [ViewPoint]
Comment 3•22 years ago
|
||
I would be very suprised if xpconnect is doing the refcounting wrong. I would bet that there is a refcount imbalance either in the plugin subsystem or in the plugin itself. If the plugin is unloading itself (and in effect deleting the objects it has created - well maybe not the object itself if you are using the shared heap functions, but certainly the vtbls and method code) while there are still valid references held to those objects then you are going to crash like this. We saw something like this once before where a dll component was explicitly deleting objects and unloading its dll at app shutdown time with no regard to the objects' ref counts and the normal course of the release calls as other code released references during the shutdown sequence. That is certainly bad. I'm guessing that you are using the nsIClassInfo mixin code in your plugin. It is probably that classinfo that is being released here. JS GC happens when it happens. Objects need to respect the references held and not delete or unload prematurely.
Comment 4•22 years ago
|
||
Ari, do you see someone calling ~nsIClassInfo first? I believe we should be keeping all scriptable plugins in memory and no unloading them but that could be the bug here, especially with a TRUE argument to navigator.plugins.refresh(1) which is supposed to shutdown and unload. Hm...this maybe a long shot, but I wonder if calling NPN_SetValue(NPPVpluginKeepLibraryInMemory, TRUE) helps?
Reporter | ||
Comment 5•22 years ago
|
||
Of course, while I could repeat this crash easily yesterday, and QA can still repeat it, I am having trouble trying to repeat it today. However, I do have this information to offer: 1) We never explicitly destroy our scriptable class (derived off of classInfoMixin). We merely call Release. Normally, this causes it to get destroyed on exit: CPluginInstanceScriptable::~CPluginInstanceScriptable() line 4142 CPluginInstanceScriptable::`scalar deleting destructor'(unsigned int 1) + 15 bytes CPluginInstanceScriptable::Release(CPluginInstanceScriptable * const 0x037f8d48) line 4158 + 68 bytes CPluginInstance::Destroy() line 736 NPP_Destroy(_NPP * 0x03a57400, _NPSavedData * * 0x0012e2c0) line 381 ns4xPluginInstance::Stop(ns4xPluginInstance * const 0x03a573e8) line 710 + 68 bytes nsActivePluginList::stopRunning() line 629 nsPluginHostImpl::ReloadPlugins(nsPluginHostImpl * const 0x0182a278, int 1) line 2751 PluginArrayImpl::Refresh(PluginArrayImpl * const 0x038b5300, int 1) line 204 + 27 bytes I do NOT think that the object is being deleted twice when it crashes. I think what happens is that you try to delete the object (through the callstack that I posted yesterday) but the dll has already been unloaded! The way that I see this is that if I add the code that Peter suggested trying (NPN_SetValue(NPPVpluginKeepLibraryInMemory, TRUE)) Then the destructor is NOT called on page refresh, and I get the same crash as yesterday (the crash is in trying to destroy the nsiclassinfo). When I look at the modules currently loaded in memory, our new plugin (different filename) is currently running, but the OLD plugin (whose nsiclassinfo is being deleted) is not in memory! So i think that whoever is deciding to not destroy the scriptable until garbage collection time is still allowing the library to actually get unloaded.
Comment 6•22 years ago
|
||
Sounds like an apt assessment to me. Unfortunately, (as I understand things) the mechanisms of JS GC and xpcom refcounting are quite independent of the plugin manager's mechanisms for dll unloading. I don't have a handy solution to offer. Any one-off solutions that might be suggested for this specific case regarding xpconnect would be, I think, off the mark. Once plugins are doing xpcom, then *any* code might be holding valid references to objects created by the plugin. I'd say the plugin manager has a real problem on its hands here in deciding (in concert with the plugin itself probably) whether of not it is truly safe to unload a given plugin dll at a given time. This is an issue me decided was too thorny to handle for generic xpcom component dlls - for which we opted to simply not implement dll unloading. Perhaps that is not an option here. But, I can't say what the best solution will be.
Comment 7•22 years ago
|
||
FWIW, your stack trace doesn't look like one that is caused by an access violation from a DLL that was unloaded. If the DLL was unloaded you should have something like 12345678() as the last entry in the stack (was that the case and it was just omitted?). The jump will actually be made, and the violation occurs at the former location of the DLL.
Reporter | ||
Comment 8•22 years ago
|
||
good point re: access violation. But since I don't have nay better ideas at the moment, I will try calling loadLibrary on ourselves to avoid getting unloaded until the process goes away, and see if QA sees any difference. Any other thoughts?
Comment 9•22 years ago
|
||
I think there may be a mess-up in the refcount. This is starting to show up in a few places. Possible dups: bug 148889 bug 144838 bug 109281
Comment 10•22 years ago
|
||
So bug 150764 also has a similar stack. In that bug, it appears that if the plugin asks the host not to unload the library, we don't crash. Since all these stacks looks like they are going through garbage collection, would it be possible to fix this crash by forcing a GC to happen before unloading the library?
Comment 11•22 years ago
|
||
> ...would it be possible to fix this crash by forcing a GC... ?
No, that would be a very bad bet. It assumes that doing a gc will for sure
release all references. But if the JS code happens to hold a reference then that
assumption will be incorrect. e.g...
var foo = some_plugin_interface;
// do the stuff that unloads with a forced gc. xpconnect does not release
// references into plugin here because the JS code holds a ref.
var foo = null;
// gc happens sometime later and xpconnect does try to release. bad things happen
The above noted bugs are clearly dups - and should be marked as such.
Either there is a refcounting error - some code is releasing refs held by other
code - or you are unloading a dll for which there are still live referenced
objects. You need to figure out how to fix the real problem. I don't buy the
idea that the occurance of gc is the source of the problem or that invoking fgc
at a particular time is a viable solution.
Comment 12•22 years ago
|
||
Well...yeah...I think we are running into the problem where we still have a ref to one of the plugin's objects and we are unloading their DLL. In bug 148889, Jeff from EA say if he tells the us not to unload his plugin (through a special NPAPI call), the crash does not happen. Serge attached a patch that also fixes this problem by basically automatically doing the same thing if the plugin tells us it's scriptable. How can we tell if all references to the plugin have been released? It looks like we may just have to leak the library if no other solution can be found but it could break plugin upgrade procedures where they depend that a plugins.refresh(1) unloads the DLL. Since bug 148889 has a patch and most of the same comments, I'm gona make that the bug others are dupped against. *** This bug has been marked as a duplicate of 148889 ***
Status: NEW → RESOLVED
Closed: 22 years ago
Resolution: --- → DUPLICATE
Comment 13•22 years ago
|
||
mass duplicate verifications . For filtering purposes, pls use keywd "massdupverification"
Status: RESOLVED → VERIFIED
Updated•2 years ago
|
Product: Core → Core Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•