Closed Bug 1080388 Opened 10 years ago Closed 10 years ago

Assertion failure: !GetModuleHandleA("mozglue.dll")

Categories

(Firefox Build System :: General, defect)

x86_64
Windows 8
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
mozilla36

People

(Reporter: alessarik, Assigned: jimm)

Attachments

(3 files)

User Agent: Mozilla/5.0 (Windows NT 6.2; WOW64; rv:32.0) Gecko/20100101 Firefox/32.0
Build ID: 20140825202822

Steps to reproduce:

I built a debug build on m-c.


Actual results:

I tried to run Firefox, but it terminated at startup with the message:
Assertion failure: !GetModuleHandleA("mozglue.dll")


Expected results:

Firefox should start and work normally.
Blocks: 1042108
Can you attach your mozconfig and the full debug build start output?
Component: Untriaged → Build Config
Flags: needinfo?(alessarik)
Product: Firefox → Core
Blocks: 1023941
Any chance you could provide a stack for the loading of mozglue.dll? In WinDbg you can set a breakpoint on it with "sxe ld mozglue". (Unless it's already loaded by the time the debugger kicks in -- which would also be good to know.) I'm not sure how you would do this in Visual Studio.
Attached file .mozconfig
mozconfig is attached.

The startup output is very short:
>mach run
>0:07.25 s:\FireFox\sourceNINE\obj-i686-pc-mingw32\dist\bin\firefox.exe -no-remote -profile s:\FireFox\sourceNINE\obj-i686-pc-mingw32\tmp\scratch_user
>Assertion failure: !GetModuleHandleA("mozglue.dll"), at s:\firefox\sourcenine\toolkit\xre\WindowsCrtPatch.h:126

Do you need anything else?
> ac_add_options --enable-metro

I'll bet this has something to do with metro code.
Not sure why, but a normal desktop run of a build made on Win8 with metro enabled hits this:

'firefox.exe' (Win32): Loaded 'T:\Mozilla\DBG\dist\bin\firefox.exe'. Symbols loaded.
'firefox.exe' (Win32): Loaded 'C:\Windows\SysWOW64\ntdll.dll'. Symbols loaded.
'firefox.exe' (Win32): Loaded 'C:\Windows\SysWOW64\kernel32.dll'. Symbols loaded.
'firefox.exe' (Win32): Loaded 'C:\Windows\SysWOW64\KernelBase.dll'. Symbols loaded.
'firefox.exe' (Win32): Loaded 'T:\Mozilla\DBG\dist\bin\mozglue.dll'. Symbols loaded.
'firefox.exe' (Win32): Loaded 'C:\Windows\SysWOW64\version.dll'. Symbols loaded.
'firefox.exe' (Win32): Loaded 'C:\Windows\SysWOW64\msvcp110.dll'. Symbols loaded.
'firefox.exe' (Win32): Loaded 'C:\Windows\SysWOW64\msvcr110.dll'. Symbols loaded.
'firefox.exe' (Win32): Loaded 'C:\Windows\SysWOW64\msvcrt.dll'. Symbols loaded.
firefox.exe has triggered a breakpoint.

Not sure what's loading mozglue.dll; I will try to track that down.
Flags: needinfo?(alessarik)
I get a crash in a fresh Visual Studio 2012 build of m-c tip, without --enable-metro, which seems to be related.

 	ntdll.dll!_RtlReportCriticalFailure@8()	Unknown
 	ntdll.dll!_RtlpReportHeapFailure@4()	Unknown
 	ntdll.dll!_RtlpLogHeapFailure@24()	Unknown
 	ntdll.dll!_RtlFreeHeap@12()	Unknown
 	kernel32.dll!_HeapFree@12()	Unknown
>	firefox.exe!free(void * pBlock=0x00901000) Line 51	C
 	firefox.exe!_wsetenvp() Line 139	C
 	firefox.exe!__tmainCRTStartup() Line 259	C
 	kernel32.dll!@BaseThreadInitThunk@12()	Unknown
 	ntdll.dll!___RtlUserThreadStart@8()	Unknown
 	ntdll.dll!__RtlUserThreadStart@8()	Unknown
Flags: needinfo?(dmajor)
Never mind: after a full clobber, the problem went away. This definitely appears to be tied to --enable-metro.
Flags: needinfo?(dmajor)
This looks like what's described in bug 1023941, comment 32. I was able to break on malloc; the call came from the CRT function __crtGetEnvironmentStringsW, called from within __tmainCRTStartup. Not sure why we run into this but normal fx doesn't.

I'm having a hard time getting crt src matched up unfortunately so debugging is a bit of a pain. If anyone has any ideas, please post!
I'm unable to build with --enable-metro. First I got errors about windows.system.h that I hacked around by de-unifying the widget sources. Then I hit bug 1084323. Then I get these errors:

0:40.53 MSVCRT.lib(crtexe.obj) : error LNK2019: unresolved external symbol _main referenced in
0:40.53 CommandExecuteHandler.exe : fatal error LNK1120: 1 unresolved externals

This is my revision:
parent: 210904:33c0181c4a25 tip
 No bug, Automated blocklist update from host bld-linux64-spot-069 - a=blocklist-update

It's pretty clear that the metro build isn't getting much love. I'm not going to spend the time to work through all of these issues. The best I can do is just blindly turn off this function for metro since you don't need WinXP hacks.
Attached patch metro patch
Does this fix the issue?
Attachment #8507573 - Flags: feedback?(alessarik)
Output in Visual Studio:
>'firefox.exe' (Win32): Loaded 'S:\FireFox\sourceNINE\obj-i686-pc-mingw32\dist\bin\firefox.exe'. Symbols loaded.
>'firefox.exe' (Win32): Loaded 'C:\Windows\SysWOW64\ntdll.dll'. Cannot find or open the PDB file.
>'firefox.exe' (Win32): Loaded 'C:\Windows\SysWOW64\kernel32.dll'. Cannot find or open the PDB file.
>'firefox.exe' (Win32): Loaded 'C:\Windows\SysWOW64\KernelBase.dll'. Cannot find or open the PDB file.
>'firefox.exe' (Win32): Loaded 'S:\FireFox\sourceNINE\obj-i686-pc-mingw32\dist\bin\mozglue.dll'. Symbols loaded.
>'firefox.exe' (Win32): Loaded 'C:\Windows\SysWOW64\version.dll'. Cannot find or open the PDB file.
>'firefox.exe' (Win32): Loaded 'C:\Windows\SysWOW64\msvcp110.dll'. Symbols loaded.
>'firefox.exe' (Win32): Loaded 'C:\Windows\SysWOW64\msvcr110.dll'. Symbols loaded.
>'firefox.exe' (Win32): Loaded 'C:\Windows\SysWOW64\msvcrt.dll'. Cannot find or open the PDB file.
Call stack in Visual Studio:
>>firefox.exe!WindowsCrtPatch::Init() Line 126	C++
> firefox.exe!wmain(int argc, wchar_t * * argv) Line 88	C++
> firefox.exe!__tmainCRTStartup() Line 241	C
> kernel32.dll!754086e3()	Unknown
> [Frames below may be incorrect and/or missing, no symbols loaded for kernel32.dll]	
> ntdll.dll!779cbe99()	Unknown
> ntdll.dll!779cbe6c()	Unknown
Comment on attachment 8507573 [details] [diff] [review]
metro patch

(In reply to David Major [:dmajor] (UTC+13) from comment #10)
> Does this fix the issue?
Yes, it does.
Attachment #8507573 - Flags: feedback?(alessarik) → feedback+
But Firefox still does not work correctly. After the patch:
Output of "mach run" is empty.
Callstack in Visual Studio:
> ntdll.dll!77a4c1e5()	Unknown
> [Frames below may be incorrect and/or missing, no symbols loaded for ntdll.dll]	
> ntdll.dll!77a01a61()	Unknown
>>firefox.exe!_msize(void * pblock) Line 50	C
> firefox.exe!_recalloc(void * memblock, unsigned int count, unsigned int size) Line 65	C
> firefox.exe!_recalloc_crt(void * ptr, unsigned int count, unsigned int size) Line 91	C
> firefox.exe!__crtsetenv(char * * poption, const int primary) Line 249	C
> firefox.exe!__wtomb_environ() Line 72	C
> firefox.exe!_getenv_helper_nolock(const char * option) Line 131	C
> firefox.exe!getenv(const char * option) Line 83	C
> firefox.exe!XPCOMGlueLoad(const char * aXPCOMFile) Line 416	C++
> firefox.exe!XPCOMGlueStartup(const char * aXPCOMFile) Line 521	C++
> firefox.exe!InitXPCOMGlue(const char * argv0, nsIFile * * xreDirectory) Line 556	C++
> firefox.exe!NS_internal_main(int argc, char * * argv) Line 621	C++
> firefox.exe!wmain(int argc, wchar_t * * argv) Line 113	C++
> firefox.exe!__tmainCRTStartup() Line 241	C
> kernel32.dll!754086e3()	Unknown
> ntdll.dll!779cbe99()	Unknown
> ntdll.dll!779cbe6c()	Unknown
I found XRE_DONT_SUPPORT_XPSP2 as well, but defining that doesn't turn this off, we still end up with a weird memory related crash on startup.

(In reply to David Major [:dmajor] (UTC+13) from comment #9)
> I'm unable to build with --enable-metro. First I got errors about
> windows.system.h that I hacked around by de-unifying the widget sources.
> Then I hit bug 1084323. Then I get these errors:

We have fixes over on the metro repo for most of this. (I should merge those changes over to mc, will do that today.)
(In reply to Jim Mathies [:jimm] from comment #14)
> I found XRE_DONT_SUPPORT_XPSP2 as well, but defining that doesn't turn this
> off, we still end up with a weird memory related crash on startup.
> 
> (In reply to David Major [:dmajor] (UTC+13) from comment #9)
> > I'm unable to build with --enable-metro. First I got errors about
> > windows.system.h that I hacked around by de-unifying the widget sources.
> > Then I hit bug 1084323. Then I get these errors:
> 
> We have fixes over on the metro repo for most of this. (I should merge those
> changes over to mc, will do that today.)

See bug 1042108, comment 10 for a list.
(In reply to Jim Mathies [:jimm] from comment #14)
> I found XRE_DONT_SUPPORT_XPSP2 as well, but defining that doesn't turn this
> off, we still end up with a weird memory related crash on startup.
But the failure is now in a different place. Maybe we have found another issue.
The WindowsCrtPatch is super early in startup; it's the first thing we do at main(). XPCOMGlueLoad is relatively late (at least in my world) so I suspect it's a separate issue. But it's possible the two are merely symptoms of a common problem.
(In reply to David Major [:dmajor] (UTC+13) from comment #17)
> so I suspect it's a separate issue. 
Should we fix this bug and file a new one, or should we continue working in this bug?
> But it's possible the two are merely symptoms of a common problem.
Maybe one of the DLLs (one that contains system code) was not loaded. The cause could be inconsistent files.
Some investigation:
Some of the tests on the Try servers passed: https://tbpl.mozilla.org/?tree=Try&rev=386fb8ca1afd
And some of them failed with the current issue: !GetModuleHandleA("mozglue.dll")
It looks like some test configurations behave differently from others (maybe only on some of the test servers).
All the debug tests failed with this issue, which makes sense given that it's a debug-only assert. The opt tests generally passed except for the cppunittests which hit access violations, maybe the same thing Jim is seeing.

I downloaded the debug build and can confirm I see mozglue loading early in startup. This would be easier if we had debugging symbols, so I pushed one with the magic build flag: https://tbpl.mozilla.org/?tree=Try&rev=757435d7dd3c

Once that build finishes I'll be able to get to a diagnosis very quickly. At that point we'll have better information on whether to split up the issues we're seeing.
Argh, now with correct spelling: https://tbpl.mozilla.org/?tree=Try&rev=019ddb23351c
Ok, so: some code, somewhere in the metro version of firefox.exe, uses malloc (probably the |iniFile->Append| in nsBrowserApp.cpp, maybe others too). This links to mozglue as in bug 1023941 comment 32. It breaks the XP hack, but that's OK, metro doesn't care.

Best option is to just skip those asserts in the metro build. This issue is benign and I don't think it's related to the other crashes.
Attachment #8507573 - Flags: review?(jmathies)
Try build with the patch and --enable-metro: https://tbpl.mozilla.org/?tree=Try&rev=cf5b96c9fae3
Yes, the current issue !GetModuleHandleA("mozglue.dll") no longer appears in the failed tests.
But all the failed tests show a strange exit code: application terminated with exit code 3221225477
It is also very strange to me that some tests in mochitest-system passed.
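For reference, the exit code above reads more naturally in hex: 3221225477 is 0xC0000005, which is the Windows NTSTATUS value STATUS_ACCESS_VIOLATION. A small sketch of the conversion follows; the helper names ExitCodeToHex and IsAccessViolation are made up for illustration and are not part of the tree.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// 0xC0000005 is the NTSTATUS value STATUS_ACCESS_VIOLATION on Windows.
constexpr std::uint32_t kStatusAccessViolation = 0xC0000005u;

// Hypothetical helper: format a process exit code as an 8-digit hex string,
// the form in which NTSTATUS codes are normally looked up.
std::string ExitCodeToHex(std::uint32_t aExitCode) {
  char buf[11];  // "0x" + 8 hex digits + terminating NUL
  std::snprintf(buf, sizeof(buf), "0x%08X", static_cast<unsigned>(aExitCode));
  return std::string(buf);
}

// Hypothetical helper: true if the exit code is an access violation.
bool IsAccessViolation(std::uint32_t aExitCode) {
  return aExitCode == kStatusAccessViolation;
}
```

Calling ExitCodeToHex(3221225477u) yields "0xC0000005", so the failed tests most likely ended in an access violation and not in an ordinary non-zero exit.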
> Ok, so: some code, somewhere in the metro version of firefox.exe, uses
> malloc (probably the |iniFile->Append| in nsBrowserApp.cpp, maybe others
> too). This links to mozglue as in bug 1023941 comment 32. It breaks the XP
> hack, but that's OK, metro doesn't care.

So this is actually a problem after all. Firefox.exe is mixing allocators. The crash at comment 13 is because the CRT is calling its own _msize (RtlSizeHeap) on a piece of jemalloc memory. (Actually, it's a bad-pointer assertion inside the 'debug heap' which is only used when the process starts under a debugger. That's why the mochitests don't crash on startup.) Even if you can get past that, there's a crash on shutdown when jemalloc tries to free memory that was allocated by the CRT.
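A toy illustration of why mixing allocators goes wrong, as a sketch only: each heap here stamps its blocks with an owner tag, and a size query or free from the other heap is rejected. TaggedHeap and its tags are hypothetical stand-ins; the real CRT heap and jemalloc share no such tag, so in practice the mismatch shows up as a debug-heap assertion or heap corruption instead of a clean failure.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Bookkeeping stored in front of every block handed out by TaggedHeap.
struct Header {
  std::uint32_t owner;  // tag of the heap that allocated the block
  std::size_t size;     // usable size requested by the caller
};

// Hypothetical toy heap: each instance only understands its own blocks.
class TaggedHeap {
 public:
  explicit TaggedHeap(std::uint32_t aTag) : mTag(aTag) {}

  void* Alloc(std::size_t aSize) {
    auto* h = static_cast<Header*>(std::malloc(sizeof(Header) + aSize));
    if (!h) return nullptr;
    h->owner = mTag;
    h->size = aSize;
    return h + 1;  // user memory starts right after the header
  }

  // Analogue of _msize: succeeds only if this heap owns the block.
  bool SizeOf(void* aPtr, std::size_t* aOut) const {
    Header* h = static_cast<Header*>(aPtr) - 1;
    if (h->owner != mTag) return false;  // foreign block: refuse
    *aOut = h->size;
    return true;
  }

  // Analogue of free: releases the block only if this heap owns it.
  bool Free(void* aPtr) const {
    Header* h = static_cast<Header*>(aPtr) - 1;
    if (h->owner != mTag) return false;  // foreign block: refuse
    std::free(h);
    return true;
  }

 private:
  std::uint32_t mTag;
};
```

With two instances standing in for the CRT heap and jemalloc, a block obtained from one is rejected by SizeOf and Free on the other, which mirrors the _msize failure at startup and the free failure at shutdown described above.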

Ugh, this is a mess. You should probably just revert the changes in bug 1023941 Part 1 when the build is MOZ_METRO. Unless :glandium has any better ideas.

For try builds please add MOZ_CRASHREPORTER_UPLOAD_FULL_SYMBOLS=1 to your mozconfig. It helps me debug this stuff since I can't compile metro myself.
Attachment #8507573 - Flags: review?(jmathies)
This is kinda sad, but I really don't have the time to get into debugging those early allocations now. Will file a follow up.

This fixes the crash on startup for metro enabled desktop firefox running on Win7.

Maksim, can you test this in a metro build on Win8?
Assignee: nobody → jmathies
Attachment #8509507 - Flags: review?(dmajor)
Attachment #8509507 - Flags: feedback?(alessarik)
Status: UNCONFIRMED → NEW
Ever confirmed: true
Comment on attachment 8509507 [details] [diff] [review]
work around: disable static link for metro builds of firefox.exe

It looks like the patch resolves the issue.
My local Firefox build works.
Most of the tests on the Try servers pass:
https://tbpl.mozilla.org/?tree=Try&rev=15558643a5a9
Attachment #8509507 - Flags: feedback?(alessarik) → feedback+
There are still crashes on shutdown. I pushed a try with full symbols [1] and found that it's bug 1085662. So once you update past that rev, things should be good. Since it's a shutdown crash, I doubt there are other issues remaining after it, so I didn't bother pushing another try build.

[1] https://tbpl.mozilla.org/?tree=Try&rev=e2201b9d9592
Comment on attachment 8509507 [details] [diff] [review]
work around: disable static link for metro builds of firefox.exe

Review of attachment 8509507 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good to me but glandium should have a look.
You'll still need to pick up the assert fix. Want to include it here? Or you can review and I can land, either way.
Attachment #8509507 - Flags: review?(mh+mozilla)
Attachment #8509507 - Flags: review?(dmajor)
Attachment #8509507 - Flags: feedback+
Attachment #8507573 - Flags: review+
I'll roll all this together and take care of the landing, no problem.
Attachment #8509507 - Flags: review?(mh+mozilla) → review+
Should we land both patches?
Maybe the second one alone (work around: disable static link for metro builds of firefox.exe) fixes this issue.
Try build with only the second patch (without the first patch):
https://tbpl.mozilla.org/?tree=Try&rev=4b27fe3b2759
Disabling the static link guarantees that those asserts will always fail :) You'll definitely want both patches.
Thanks everyone!
Product: Core → Firefox Build System