Closed Bug 386286 Opened 18 years ago Closed 17 years ago

firefox.exe process running, stuck in ::OleUninitialize() in OleRegisterMgr::~OleRegisterMgr()

Categories

(Firefox :: General, defect)

x86
Windows XP
defect
Not set
normal

Tracking

()

VERIFIED FIXED

People

(Reporter: moco, Assigned: aaronlev)

References

(Blocks 1 open bug)

Details

Attachments

(1 file, 3 obsolete files)

firefox.exe process running, stuck in ::OleUninitialize() in OleRegisterMgr::~OleRegisterMgr() I attached in the debugger, and I had one thread here: ntdll.dll!7c90eb94() [Frames below may be incorrect and/or missing, no symbols loaded for ntdll.dll] ntdll.dll!7c90e9ab() kernel32.dll!7c8094e2() user32.dll!7e4195f9() user32.dll!7e42e7d6() ole32.dll!77512235() ole32.dll!7751235c() ole32.dll!776017a2() ole32.dll!776011b6() ole32.dll!7760109a() rpcrt4.dll!77e7be9e() rpcrt4.dll!77e7beb5() ole32.dll!7750efd4() rpcrt4.dll!77e7c2c7() ole32.dll!77512409() ole32.dll!775123b2() ole32.dll!77510414() rpcrt4.dll!77e7b78f() rpcrt4.dll!77ef3db5() rpcrt4.dll!77ef3eac() ntdll.dll!7c910732() ntdll.dll!7c910732() ntdll.dll!7c9106ab() ntdll.dll!7c9106eb() kernel32.dll!7c80a9b6() ole32.dll!775779e4() ole32.dll!77577a03() kernel32.dll!7c80e4a4() kernel32.dll!7c80e4a4() kernel32.dll!7c80e5ab() kernel32.dll!7c80e62b() ntdll.dll!7c911e75() ntdll.dll!7c910d5c() ntdll.dll!7c910e91() ntdll.dll!7c91056d() ntdll.dll!7c91056d() ntdll.dll!7c910e91() rpcrt4.dll!77ef3e42() rpcrt4.dll!77e89aa4() ole32.dll!77511410() ole32.dll!775113b1() ole32.dll!77510052() ole32.dll!7750fe4b() ole32.dll!77554a4f() ole32.dll!77554a30() ole32.dll!77564daf() uxtheme.dll!5ad74d8d() uxtheme.dll!5ad73623() ole32.dll!77502b8a() uxtheme.dll!5ad74e29() ole32.dll!774fd20a() ole32.dll!774fd1d7() ole32.dll!7750206d() ole32.dll!77502b8a() user32.dll!7e4193e9() user32.dll!7e4193a8() user32.dll!7e419402() user32.dll!7e41929b() ole32.dll!774fda8c() ole32.dll!774fd9cd() ole32.dll!774fd1d7() ole32.dll!77501f2e() ole32.dll!7750221e() ole32.dll!77502171() ole32.dll!774ff221() ole32.dll!774fee88() > gkwidget.dll!OleRegisterMgr::~OleRegisterMgr() Line 411 C++ gkwidget.dll!`dynamic atexit destructor for 'OleRegisterMgr::mSingleton''() + 0xd bytes C++ gkwidget.dll!_CRT_INIT(void * hDllHandle=0x022e0000, unsigned long dwReason=0, void * lpreserved=0x00000001) Line 417 C gkwidget.dll!__DllMainCRTStartup(void * hDllHandle=0x022e0000, unsigned long dwReason=0, void * lpreserved=0x00000001) Line 509 + 0x11 bytes C gkwidget.dll!_DllMainCRTStartup(void * hDllHandle=0x022e0000, unsigned long dwReason=0, void * lpreserved=0x00000001) Line 459 + 0x11 bytes C ntdll.dll!7c9011a7() ntdll.dll!7c923f31() ntdll.dll!7c910945() ntdll.dll!7c91094e() ntdll.dll!7c9105c8() kernel32.dll!7c81cd76() ntdll.dll!7c90f0aa() kernel32.dll!7c80e62b() kernel32.dll!7c80e45c() kernel32.dll!7c81cdee() msvcr80d.dll!__crtExitProcess(int status=0) Line 684 C msvcr80d.dll!doexit(int code=0, int quick=0, int retcaller=0) Line 596 + 0x9 bytes C msvcr80d.dll!exit(int code=0) Line 398 + 0xd bytes C firefox.exe!__tmainCRTStartup() Line 599 C firefox.exe!mainCRTStartup() Line 403 C kernel32.dll!7c816fd7() And the second thread here: > ntdll.dll!7c90eb94() [Frames below may be incorrect and/or missing, no symbols loaded for ntdll.dll] ntdll.dll!7c90e9c0() ntdll.dll!7c91901b() ntdll.dll!7c90104b() ntdll.dll!7c927357() ntdll.dll!7c90eac7() When in this state, when I click on a link in tbird, a browser window comes up, but the toolbar is empty. in the js console, I see: Error: [Exception... "Component returned failure code: 0x80570016 (NS_ERROR_XPC_GS_RETURNED_FAILURE) [nsIJSCID.getService]" nsresult: "0x80570016 (NS_ERROR_XPC_GS_RETURNED_FAILURE)" location: "JS frame :: file:///C:/Documents%20and%20Settings/work/Desktop/trunk/components/nsBrowserGlue.js :: bg__initPlaces :: line 297" data: no] Source File: file:///C:/Documents%20and%20Settings/work/Desktop/trunk/components/nsBrowserGlue.js Line: 297 That line of code is: var histsvc = Cc["@mozilla.org/browser/nav-history-service;1"]. getService(Ci.nsINavHistoryService); note, besides the places problems (no bookmark menu, no toolbar, empty bm manager dialog, etc) the browser appears usable fwiw, see ancient bug #38776
note, there was a places-journal.sqlite file in my profile directory.
s/places-journal.sqlite/places.sqlite-journal
this was with Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a6pre) Gecko/20070628 Minefield/3.0a6pre, on windows xp and it was a release build.
just hit this again with a debug build. on my console: ###!!! ASSERTION: Fonts still alive while shutting down gfxFontCache: 'mFonts.Co unt() == 0', file c:\builds\trunk\mozilla\ff-debug\dist\include\thebes\gfxFont.h , line 161 Assertion failed at c:/builds/trunk/mozilla/gfx/cairo/cairo/src/cairo-hash.c:196 : hash_table->live_entries == 0 Stuart, could that be related?
hit this again, with a recent build: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a7pre) Gecko/2007071219 Minefield/3.0a7pre From the console: WARNING: Textrun cache not empty!: 'mCache.Count() == 0', file c:/builds/trunk/m ozilla/gfx/thebes/src/gfxTextRunWordCache.cpp, line 51 WARNING: Fonts still alive while shutting down gfxFontCache: 'mFonts.Count() == 0', file c:\builds\trunk\mozilla\ff-debug\dist\include\thebes\gfxFont.h, line 16 0 Assertion failed at c:/builds/trunk/mozilla/gfx/cairo/cairo/src/cairo-hash.c:196 : hash_table->live_entries == 0 WARNING: Leaking the RDF Service.: file c:/builds/trunk/mozilla/rdf/build/nsRDFM odule.cpp, line 236 Stack in debugger similar to last time.
I filed a similar issue (may even be a duplicate), bug 392626 , and found similar results using WinDBG. That other bug has a WinDBG output attached, with symbols. Maybe that helps in this case as well!
Depends on: zombieproc
Seth, I wonder what you and Marco have in common, that you keep experiencing the same bug. Marco is using JAWS which activates accessibility. You could see if something is activating a11y on your system by using the about:accessibilityenabled extension.
Doug, Ere any ideas on this one? Why would it hang in ::OleUninitialize()?
> You could see if something is activating a11y on your system by using the > about:accessibilityenabled extension. I installed that extension, and about:accessibilityenabled loads, but it gives me a blank page.
We should read in bug 38776 where this code was added, maybe it provides some hints. Also, I saw this hint on the web in someone else's bug list when I searched around: "Added a check in the <snip> object to check if OLEInitialize succeeded so OLEUninitialize won't be called unless it was property initialized. This can cause the lockups on shut down." That sounds similar. We should make sure we uninit the same # times we |successfully| init.
tried about:accessibilityenabled from fx2, and I got: Firefox currently does *not* have accessibility API support active.
Just a guess -- maybe OleInitialize is failing(). From my reading of the code, we already ensure that there is exactly 1 call to OleInitialize(), in the constructor for the singleton OleRegisterMgr(). Right now if that fails we still call OleUninitialize(). Marco thought that perhaps this bug started happening more often once Firefox went to shared libs recently. Maybe when Marco is running with JAWS, JAWS will call OleInitialize() if it doesn't see that the app called it already, and going to shared libraries made us call that later, since the library with OleRegisterMgr() would probably be loaded later. We probably should have this change even if it doesn't fix the problem. It's just an HRESULT check.
Attachment #286675 - Flags: superreview?(neil)
Attachment #286675 - Flags: review?
OK, I believe I found the exact build this started happening. It's the build dated 2007-06-12-06, which is the first libXUL build. See bug 345517, which was checked in 2007-06-12 at 06:22 PDT. See also bug 392626.
Attachment #286675 - Flags: review? → review?(dougt)
The problem is that we are calling OleUnitialize from with DllMain (via a static object destructor). That is not allowed per the rules of DllMain, because that method holds a global lock and any operations that could cause threads to start or exit will cause a deadlock. We should be uninitializing OLE much sooner... say, during XPCOM shutdown.
Comment on attachment 286675 [details] [diff] [review] Check error code -- if it failed don't uninitialize We should still check the error code, but even more importantly do what bsmedberg suggests -- perhaps unregister at xpcom-shutdown.
Attachment #286675 - Attachment is obsolete: true
Attachment #286675 - Flags: superreview?(neil)
Attachment #286675 - Flags: review?(dougt)
(In reply to comment #17) > @@ -90,11 +90,10 @@ > @@ -371,38 +373,44 @@ static const char *sScreenManagerContrac This patch fails with a "malformed patch at line 9" error message.
Attachment #286692 - Attachment is obsolete: true
(In reply to comment #19) > Correctly formed patch -- Marco here you go This looks promising! I've been trying to repro with a build containing this patch, both using several pages I visit often, but also the steps I outlined in bug 392626 regarding the http://live.gnome.org/Accerciser link and clicking on "Eitan Isaacson" from there. So far, the process didn't hang a single time for me. I also no longer experience that sound card delay I also commented on in bug 392626. I'll keep watching this, but from what I've seen so far, this fixes the problem.
Attachment #286722 - Flags: superreview?(neil)
Attachment #286722 - Flags: review?(emaijala)
Comment on attachment 286722 [details] [diff] [review] Correctly formed patch -- Marco here you go >+ PRBool mIsInitialized; What does this actually do for us? You're only calling Init() once anyway.
Comment on attachment 286722 [details] [diff] [review] Correctly formed patch -- Marco here you go How about moving the functionality to nsWindow and getting rid of OleRegisterMgr class? As it's not a singleton anymore, I don't see the need to have a class for it.
Attachment #286722 - Flags: superreview?(neil)
Attachment #286722 - Flags: review?(emaijala)
Attachment #286722 - Flags: review-
(In reply to comment #21) > (From update of attachment 286722 [details] [diff] [review]) > >+ PRBool mIsInitialized; > What does this actually do for us? You're only calling Init() once anyway. > It tells whether OleInitialize actually succeeded and OleUnitialize should be called in the end.
(In reply to comment #23) >(In reply to comment #21) >>(From update of attachment 286722 [details] [diff] [review]) >>>+ PRBool mIsInitialized; >>What does this actually do for us? You're only calling Init() once anyway. >It tells whether OleInitialize actually succeeded and OleUnitialize should be >called in the end. OK... so then I think it should be set to false if OleUninitialize succeeds.
Attachment #286722 - Attachment is obsolete: true
Assignee: nobody → aaronleventhal
Comment on attachment 286830 [details] [diff] [review] Incorporate Ere's and Neil's suggestions -- it is cleaner to remove extra class & reset sIsOleInitialized >+ if (sIsOleInitialized) { >+ ::OleUninitialize(); >+ sIsOleInitialized = FALSE; >+ } I don't think this is right, you want to check for the instance count too.
We do, that entire block is wrapped in |if (sInstanceCount == 0)| My first patch had an extra sInstanceCount check that we don't need.
Attachment #286830 - Flags: superreview?(neil) → superreview+
Attachment #286830 - Flags: review?(emaijala) → review+
Attachment #286830 - Flags: approvalM9?
Flags: blocking-firefox3?
Flags: blocking-firefox3? → blocking-firefox3+
Comment on attachment 286830 [details] [diff] [review] Incorporate Ere's and Neil's suggestions -- it is cleaner to remove extra class & reset sIsOleInitialized a=endgame drivers for M9
Attachment #286830 - Flags: approvalM9? → approvalM9+
Checked in! Thanks to all -- especially Marco for tracking it down and Benjamin Smedberg for instantly knowing what the issue is caused by.
Status: NEW → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
Status: RESOLVED → VERIFIED
Depends on: 402439
Blocks: zombieproc
No longer depends on: zombieproc
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: