The default bug view has changed. See this FAQ.

Webapps crash (hang) on start because of missing ICU delayload DLL

VERIFIED FIXED in Firefox 29

Status

Firefox Graveyard
Webapp Runtime
P1
normal
VERIFIED FIXED
3 years ago
a year ago

People

(Reporter: bsmedberg, Assigned: fzzzy)

Tracking

unspecified
Firefox 29
x86_64
Windows 8.1
Dependency tree / graph

Details

(crash signature)

Attachments

(1 attachment)

(Reporter)

Description

3 years ago
I saw a twitter link to a webapp and tried to install it using nightly. The app wouldn't start, and I have it in a debugger. There's a cascade of crashes leading to a deadlock...

Main thread:

 	kernel32.dll!_WaitForSingleObject@8() 	
 	xul.dll!google_breakpad::ExceptionHandler::WriteMinidumpOnHandlerThread(exinfo=0x0046e768, assertion=0x00000000)  Line 697	C++
 	xul.dll!google_breakpad::ExceptionHandler::HandleException(exinfo=0x0046e768)  Line 488	C++
 	kernel32.dll!_UnhandledExceptionFilter@4() 	
 	ntdll.dll!___RtlUserThreadStart@8() 	
 	ntdll.dll!@_EH4_CallFilterFunc@8() 	
 	ntdll.dll!ExecuteHandler2@20() 	
 	ntdll.dll!ExecuteHandler@20() 	
 	ntdll.dll!_RtlDispatchException@8() 	
 	ntdll.dll!_KiUserExceptionDispatcher@8() 	
 	KernelBase.dll!_RaiseException@16() 	
>	mozjs.dll!__delayLoadHelper2(pidd=0x000000dc, ppfnIATEntry=0x0046ec14)  Line 331	C++
 	mozjs.dll!__tailMerge_icuuc50_dll() 	Unknown
 	xul.dll!NS_InitXPCOM2(result=0x0250513c, binDirectory=0x02518220, appFileLocationProvider=0x0046ed4c)  Line 553	C++
 	xul.dll!ScopedXPCOMStartup::Initialize()  Line 1224	C++
 	xul.dll!XREMain::XRE_main(argc=0x00000001, argv=0x00543f08, aAppData=0x005170c0)  Line 4088	C++
 	xul.dll!XRE_main(argc=0x00000001, argv=0x00543f08, aAppData=0x025170c0, aFlags=0x00000000)  Line 4345	C++
 	R.U.Nuts.exe!`anonymous namespace'::AttemptGRELoadAndLaunch(greDir=0x0046fa28)  Line 300	C++

So we're starting, calling JS_SetICUMemoryFunctions from within NS_InitXPCOM2 and then I think we're trying to delayload the ICU dll, except it's not on the PATH because this is webapprt. This can probably be fixed by adding the ICU DLLs to dependentlibs.list, but I'm not sure about that. I'm not sure why we delayloaded it if we always call into it in the startup path anyway.

The delayload handler throws an exception because it can't find the DLL, which is caught by breakpad (which has already been installed). So far so good, we start to try and get a crash report. But next on the breakpad thread we have:

 	ntdll.dll!_RtlEnterCriticalSection@4() 	
>	xul.dll!google_breakpad::AutoExceptionHandler::AutoExceptionHandler()  Line 417	C++
 	xul.dll!google_breakpad::ExceptionHandler::HandleException(exinfo=0x0521b4c4)  Line 456	C++
 	kernel32.dll!_UnhandledExceptionFilter@4() 	
 	ntdll.dll!___RtlUserThreadStart@8() 	
 	ntdll.dll!@_EH4_CallFilterFunc@8() 	
 	ntdll.dll!ExecuteHandler2@20() 	
 	ntdll.dll!ExecuteHandler@20() 	
 	ntdll.dll!_RtlDispatchException@8() 	
 	ntdll.dll!_KiUserExceptionDispatcher@8() 	
 	ntdll.dll!_RtlpWaitOnCriticalSection@8() 	
 	ntdll.dll!_RtlEnterCriticalSection@4() 	
 	mozglue.dll!`anonymous namespace'::DllBlockSet::Write(file=0x0000028c)  Line 371	C++
 	mozglue.dll!DllBlocklist_WriteNotes(file=0x0000028c)  Line 640	C++
 	xul.dll!CrashReporter::MinidumpCallback(dump_path=0x02528280, minidump_id=0x0251b100, context=0x00000000, exinfo=0x0046e768, assertion=0x00000000, succeeded=true)  Line 579	C++
 	xul.dll!google_breakpad::ExceptionHandler::WriteMinidumpWithException(requesting_thread_id=0x000017e8, exinfo=0x0046e768, assertion=0x00000001)  Line 831	C++
 	xul.dll!google_breakpad::ExceptionHandler::ExceptionHandlerThreadMain(lpParameter=0x025280c0)  Line 381	C++

We are trying to write the DLL blocklist annotations for the crash report, but the DLL blocklist hasn't been initialized yet. So in DllBlockSet::Write we try to lock the null "sLock" and throw *another* exception. This should probably be fixed in a separate bug (NI:dmajor).

Finally, breakpad tries to handle the second exception. This deadlocks at this line: http://hg.mozilla.org/mozilla-central/annotate/124d80d7556d/toolkit/crashreporter/google-breakpad/src/client/windows/handler/exception_handler.cc#l448 because we lock ExceptionHandler::handler_stack_critical_section_ which is already locked from the first exception on the main thread. This is *supposed* to be prevented by http://hg.mozilla.org/mozilla-central/annotate/124d80d7556d/toolkit/crashreporter/google-breakpad/src/client/windows/handler/exception_handler.cc#l421 but that appears to be broken in some way. Because this is a release-mode build I don't have great locals to figure out what the issue is, but it also ought to be a separate bug!

This bug will be about the first issue (webapprt crashing the first time because of ICU delayload).
(Reporter)

Comment 1

3 years ago
I tried adding the ICU libs to dependentlibs.list by adding the following lines before mozjs.dll and this didn't seem to help. Glandium do you have suggestions?

icudt50.dll
icuuc50.dll
icuin50.dll
Flags: needinfo?(mh+mozilla)
> but the DLL blocklist hasn't been initialized yet. So in DllBlockSet::Write
> we try to lock the null "sLock" and throw *another* exception. This should
> probably be fixed in a separate bug (NI:dmajor).
Opened bug 958344.

> Finally, breakpad tries to handle the second exception. This deadlocks
Opened bug 958348.
See Also: → bug 958344, bug 958348
(Assignee)

Updated

3 years ago
Assignee: nobody → dpreston

Updated

3 years ago
Crash Signature: [@ __delayLoadHelper2 | ffi_closure_STDCALL ]
Priority: -- → P1
(Assignee)

Comment 3

3 years ago
Which linux is this on?
(Reporter)

Comment 4

3 years ago
.dll == Windows ;-)
(Assignee)

Comment 5

3 years ago
Hmm ok, the platform said lunix. Can you change it to the appropriate windows? Thanks
(Assignee)

Comment 6

3 years ago
I was able to replicate this crash on Windows 8.1.
OS: Linux → Windows 8.1
Ehsan, can you remind me what impact you measured when you set icu to delay load?

I /guess/ the js engine actually needs only one of the ICU libs during initialization, so we might as well make that one non delay-loaded. That being said, that's unlikely to solve the whole problem (except if icu is dynamically opening the other libraries itself).

In principle, adding the libs to dependentlibs.list should work, so Benjamin, i'd suggest you add some debugging code to the xpcom standalone glue (or step through it).
Flags: needinfo?(mh+mozilla) → needinfo?(ehsan)
(In reply to Mike Hommey [:glandium] from comment #7)
> Ehsan, can you remind me what impact you measured when you set icu to delay
> load?

It was negligible.  I basically delay loaded those DLLs because I was under the impression that we don't need them during startup, which seems to not be the case.

> I /guess/ the js engine actually needs only one of the ICU libs during
> initialization, so we might as well make that one non delay-loaded. That
> being said, that's unlikely to solve the whole problem (except if icu is
> dynamically opening the other libraries itself).
> 
> In principle, adding the libs to dependentlibs.list should work, so
> Benjamin, i'd suggest you add some debugging code to the xpcom standalone
> glue (or step through it).

I think the right fix here is to remove <http://mxr.mozilla.org/mozilla-central/source/toolkit/library/Makefile.in#133>.
Flags: needinfo?(ehsan)
JS_SetICUMemoryFunctions in NS_InitXPCOM2 accesses ICU code, so delay load option may be useless.

As long as bp-6645491f-1677-4bfa-840d-9c6622140113, we should remove mozjs's delayload too.
It may or may not be useless. The icu data libs may not be required at startup.
(Assignee)

Comment 11

3 years ago
I removed the ifdef block at <http://mxr.mozilla.org/mozilla-central/source/toolkit/library/Makefile.in#133> as suggested by Ehsan in Comment 8. Strangely, I still get the exact same stack trace. I did ./mach clobber and tried removing the obj- directory and rebuilding, but that didn't help either. Any other ideas?
(In reply to comment #11)
> I removed the ifdef block at
> <http://mxr.mozilla.org/mozilla-central/source/toolkit/library/Makefile.in#133>
> as suggested by Ehsan in Comment 8. Strangely, I still get the exact same stack
> trace. I did ./mach clobber and tried removing the obj- directory and
> rebuilding, but that didn't help either. Any other ideas?

Hmm, try removing this as well? <http://mxr.mozilla.org/mozilla-central/source/js/src/Makefile.in#177>
(Assignee)

Comment 13

3 years ago
Awesome, comment 8 plus comment 12 fixed it! Thanks!! Will post the patch in a bit.
(Assignee)

Comment 14

3 years ago
Created attachment 8361288 [details] [diff] [review]
Remove the delayload option from the ICU libs on windows, because it was causing the webruntime to crash at startup.
Attachment #8361288 - Flags: review?(ehsan)
Comment on attachment 8361288 [details] [diff] [review]
Remove the delayload option from the ICU libs on windows, because it was causing the webruntime to crash at startup.

I wouldn't dare review build system code!
Attachment #8361288 - Flags: review?(ehsan) → review?(mh+mozilla)
Attachment #8361288 - Flags: review?(mh+mozilla) → review+
(Assignee)

Updated

3 years ago
Keywords: checkin-needed
https://hg.mozilla.org/integration/mozilla-inbound/rev/2ecb4c7413d5
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/2ecb4c7413d5
Status: NEW → RESOLVED
Last Resolved: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 29
(Reporter)

Updated

3 years ago
Blocks: 962163

Updated

3 years ago
status-firefox28: --- → affected
status-firefox29: --- → fixed
Crashstats data confirms this signature is no longer reported against Nightly since 2014-01-21 and the rank has dropped significantly out of topcrash range. Given this is the #3 topcrash on Aurora can this be nominated for uplift?
Status: RESOLVED → VERIFIED
status-firefox29: fixed → verified
(Reporter)

Comment 19

3 years ago
As discussed on Monday, this isn't going to hit beta because ICU is still disabled. And since we uplift Aurora next week, I don't think we need to worry about this further.
Depends on: 970123
Product: Firefox → Firefox Graveyard
You need to log in before you can comment on or make changes to this bug.