Closed
Bug 1321244
Opened 9 years ago
Closed 9 years ago
[App Verifier] WSAStartup() is called in loading xul.dll
Categories
(Core :: General, defect, P3)
Tracking
()
RESOLVED
FIXED
mozilla53
People
(Reporter: cyu, Assigned: cyu)
References
(Blocks 1 open bug)
Details
(Whiteboard: app_verifier)
Attachments
(3 files)
|
3.61 KB,
text/xml
|
Details | |
|
12.01 KB,
patch
|
Details | Diff | Splinter Review | |
|
59 bytes,
text/x-review-board-request
|
froydnj
:
review+
jcristau
:
approval-mozilla-beta+
|
Details |
This check is in the "Networking" test section and is off by default. When I turned the Networking section on, firefox fails to start because of the error that WSAStartup() is called in loading xul.dll. It turns out that PR_GetEnv() in the global scope (https://dxr.mozilla.org/mozilla-central/rev/a69583d2dbc6fdc18f63761a89cf539c356668be/dom/quota/ActorsParent.cpp#109 ) initializes NSPR, which initializes winsock: https://dxr.mozilla.org/mozilla-central/rev/a69583d2dbc6fdc18f63761a89cf539c356668be/nsprpub/pr/src/md/windows/w95io.c#48
According to MSDN, DllMain() "must not call the LoadLibrary or LoadLibraryEx function" (https://msdn.microsoft.com/zh-tw/library/windows/desktop/ms682583(v=vs.85).aspx ), and WSAStartup() might load other DLLs (https://msdn.microsoft.com/zh-tw/library/windows/desktop/ms742213(v=vs.85).aspx ) so should also be avoided in the global scope.
I am not sure how bad this is given that on my workstation, it doesn't result in deadlock in launching the browser as the main danger of calling LoadLibrary() in DllMain().
| Assignee | ||
Updated•9 years ago
|
Component: Print Preview → General
| Assignee | ||
Updated•9 years ago
|
Priority: P1 → P3
| Assignee | ||
Comment 1•9 years ago
|
||
The error stack:
nss3!_PR_MD_INIT_IO+28 (e:\hg\mozilla-central\nsprpub\pr\src\md\windows\w95io.c @ 76)
nss3!_PR_InitStuff+142 (e:\hg\mozilla-central\nsprpub\pr\src\misc\prinit.c @ 203)
nss3!PR_GetEnv+17 (e:\hg\mozilla-central\nsprpub\pr\src\misc\prenv.c @ 67)
xul!`dynamic initializer for mozilla::dom::quota::QuotaManager::kRunningXPCShellTests+11 (e:\hg\mozilla-central\dom\quota\actorsparent.cpp @ 109)
ucrtbase!initterm+8e ( @ 0)
xul!dllmain_crt_process_attach+bb (f:\dd\vctools\crt\vcstartup\src\startup\dll_dllmain.cpp @ 67)
xul!dllmain_dispatch+5d (f:\dd\vctools\crt\vcstartup\src\startup\dll_dllmain.cpp @ 190)
verifier!VerifierGetPropertyValueByName+6ff9 ( @ 0)
vrfcore!VerifierTlsSetValue+585 ( @ 0)
vfbasics!+7ffef37f32b9 ( @ 0)
ntdll!RtlDeactivateActivationContextUnsafeFast+1bf ( @ 0)
ntdll!RtlGetThreadErrorMode+28a ( @ 0)
ntdll!RtlGetThreadErrorMode+d7 ( @ 0)
ntdll!RtlDeleteBoundaryDescriptor+1cd ( @ 0)
ntdll!RtlFormatCurrentUserKeyPath+7b9 ( @ 0)
ntdll!RtlFormatCurrentUserKeyPath+1fd ( @ 0)
ntdll!LdrLoadDll+8c ( @ 0)
vfbasics!+7ffef37f384c ( @ 0)
mozglue!`anonymous namespace::patched_LdrLoadDll+5e9 (e:\hg\mozilla-central\mozglue\build\windowsdllblocklist.cpp @ 755)
KERNELBASE!LoadLibraryExW+16f ( @ 0)
firefox!ReadDependentCB+62 (e:\hg\mozilla-central\xpcom\glue\standalone\nsxpcomglue.cpp @ 168)
firefox!XPCOMGlueLoad+1be (e:\hg\mozilla-central\xpcom\glue\standalone\nsxpcomglue.cpp @ 323)
firefox!XPCOMGlueStartup+30 (e:\hg\mozilla-central\xpcom\glue\standalone\nsxpcomglue.cpp @ 425)
firefox!InitXPCOMGlue+113 (e:\hg\mozilla-central\browser\app\nsbrowserapp.cpp @ 371)
firefox!NS_internal_main+c3 (e:\hg\mozilla-central\browser\app\nsbrowserapp.cpp @ 451)
firefox!wmain+163 (e:\hg\mozilla-central\toolkit\xre\nswindowswmain.cpp @ 118)
firefox!__scrt_common_main_seh+11d (f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl @ 253)
KERNEL32!BaseThreadInitThunk+14 ( @ 0)
ntdll!RtlUserThreadStart+21 ( @ 0)
| Assignee | ||
Comment 2•9 years ago
|
||
I tried to lazily init the global variables, but it makes things worse because threads can be created during initialization of XPCOM, which may lead to race in NSPR initialization.
We may either 1. preload NSPR and initialize it explicitly before XPCOMGlueStartup(), or 2. provide a hook in the DLLs that we load for explicit initialization after the library is loaded.
Nathan, do you have any concerns with the above solutions? Thanks.
Flags: needinfo?(nfroyd)
Comment 3•9 years ago
|
||
(In reply to Cervantes Yu [:cyu] [:cervantes] from comment #2)
> I tried to lazily init the global variables, but it makes things worse
> because threads can be created during initialization of XPCOM, which may
> lead to race in NSPR initialization.
This solution is changing QuotaManager::kRunningXPCShellTests to QuotaManager::IsRunningXPCShellTests() and lazily doing the PR_GetEnv inside IsRunningXPCShellTests()? And that exposes some sort of badness because XPCOM initialization (or even something before it?) implicitly depended on NSPR already being initialized? Ugh.
> We may either 1. preload NSPR and initialize it explicitly before
> XPCOMGlueStartup(), or 2. provide a hook in the DLLs that we load for
> explicit initialization after the library is loaded.
I think I prefer option 1, but it would be much better if we didn't have to bother with preloading. I think bsmedberg's work on getting rid of the standalone glue may make your job much easier here: firefox.exe will load libxul and call into a prearranged function to make everything go, and we ought to be able to initialize NSPR explicitly there...assuming I'm understanding all of this correctly. Benjamin, does that sound like the right thing?
Flags: needinfo?(nfroyd) → needinfo?(benjamin)
Comment 4•9 years ago
|
||
I think we have to fix the XUL global initializers to not initialize XPCOM, no matter what else, because some of our binaries link directly against NSPR and xul (plugin-container for example) and so we don't have an opportunity to programmatically initialize NSPR before we hit this.
If it would help, we could call an explicit NSPR initialization function in NS_LogInit which should be well before we get to XPCOM init.
Flags: needinfo?(benjamin)
Comment 5•9 years ago
|
||
It sounds like we are agreed, then!
| Assignee | ||
Comment 6•9 years ago
|
||
This lazily init the global variable by wrapping the RHS in a lambda, which gets called when the variable is first used, much similar to what LazyLogModule does: https://dxr.mozilla.org/mozilla-central/rev/9b8bf5feb0b52aa4b03aa5fa3d4f0727b2974663/xpcom/base/Logging.h#163
In addition to QuotaManager::kRunningXPCShellTests, variable gWindowsLog in WinUtils.cpp and kWidgetPrintingLogMod in nsDeviceContextWin.cpp also calls PR_NewLogModule(), which may initialize NSPR.
| Assignee | ||
Comment 7•9 years ago
|
||
(In reply to Nathan Froyd [:froydnj] from comment #3)
> (In reply to Cervantes Yu [:cyu] [:cervantes] from comment #2)
> > I tried to lazily init the global variables, but it makes things worse
> > because threads can be created during initialization of XPCOM, which may
> > lead to race in NSPR initialization.
>
> This solution is changing QuotaManager::kRunningXPCShellTests to
> QuotaManager::IsRunningXPCShellTests() and lazily doing the PR_GetEnv inside
> IsRunningXPCShellTests()? And that exposes some sort of badness because
There are actually multiple global variables that may init NSPR. See comment #6, or refer to the LAZY_INIT_DEFINE() macro usage in attachment 8816406 [details] [diff] [review] for more detail.
> XPCOM initialization (or even something before it?) implicitly depended on
> NSPR already being initialized? Ugh.
>
Right. With this patch, I an error that setsockopt() gets called without prior call to WSAStartup() (and this happens on non-main thread). So I gave up lazy init the globals.
> > We may either 1. preload NSPR and initialize it explicitly before
> > XPCOMGlueStartup(), or 2. provide a hook in the DLLs that we load for
> > explicit initialization after the library is loaded.
>
> I think I prefer option 1, but it would be much better if we didn't have to
> bother with preloading. I think bsmedberg's work on getting rid of the
> standalone glue may make your job much easier here: firefox.exe will load
> libxul and call into a prearranged function to make everything go, and we
> ought to be able to initialize NSPR explicitly there...assuming I'm
> understanding all of this correctly. Benjamin, does that sound like the
> right thing?
After libxul is loaded, it would be too late to fix this problem because the global variables gets initialized during loading of libxul. We need to init NSPR without libxul is loaded to fix this issue.
BTW, how bad, do you think, this bug is? Browser works without any hiccup or loader deadlock on my workstation, but could some 3rd party software performing some magic like DLL injection lead to problem?
Comment 8•9 years ago
|
||
Comment on attachment 8816406 [details] [diff] [review]
Abandoned WIP
Review of attachment 8816406 [details] [diff] [review]:
-----------------------------------------------------------------
Thanks for tracking down all the places that initialize NSPR through static constructors.
::: dom/quota/QuotaManager.h
@@ +113,5 @@
>
> public:
> NS_INLINE_DECL_REFCOUNTING(QuotaManager)
>
> + static LAZY_INIT_DECL(const bool) kRunningXPCShellTests;
We could do this, but the way we've handled things in the past like this is to do:
static bool IsRunningXPCShellTests();
and then define:
/* static */ bool
QuotaManager::IsRunningXPCShellTests()
{
static bool kRunningXPCShellTests = PR_GetEnv("XPCSHELL_TEST_PROFILE_DIR");
return kRunningXPCShellTests;
}
which provides the same functionality as the proposed LAZY_INIT_* macros, but is standardized in the language and therefore (at least with our recent compilers) gets thread-safety for free. Note that the non-global-scope static only gets initialized when IsRunningXPCShellTests is called, not when the library is loaded.
::: widget/windows/WinUtils.cpp
@@ +59,5 @@
> #endif // #ifdef NS_ENABLE_TSF
>
> #include <shlwapi.h>
>
> +LAZY_INIT_DEFINE(PRLogModuleInfo*, gWindowsLog, PR_NewLogModule("Widget"));
WDYT about just changing all of the NSPR logging uses here and elsewhere in this patch to use LazyLogModule instead? We have wanted to move everything away from NSPR logging anyway, and the problem in this bug would provide a motivation to do some of that work.
Comment 9•9 years ago
|
||
(In reply to Cervantes Yu [:cyu] [:cervantes] from comment #7)
> (In reply to Nathan Froyd [:froydnj] from comment #3)
> > (In reply to Cervantes Yu [:cyu] [:cervantes] from comment #2)
> > XPCOM initialization (or even something before it?) implicitly depended on
> > NSPR already being initialized? Ugh.
>
> Right. With this patch, I an error that setsockopt() gets called without
> prior call to WSAStartup() (and this happens on non-main thread). So I gave
> up lazy init the globals.
Right, that's because this patch alone isn't enough: we'd need to call PR_Init() (or whatever the function is called) somewhere during process startup, before any NSPR calls were made. Benjamin's suggestion in comment 4 sounds like about the right place.
We might still have people add more static constructors that would call into NSPR, but perhaps we can catch that somehow (static analysis for common patterns? we already track # of static constructors on Linux builds).
> BTW, how bad, do you think, this bug is? Browser works without any hiccup or
> loader deadlock on my workstation, but could some 3rd party software
> performing some magic like DLL injection lead to problem?
Obviously we've gone for quite some time without anybody having to track it down, but dynamic loader deadlock during startup wouldn't be something we'd get crash reports or similar for, assuming I understand things correctly, and it'd be a terrible user experience. So fixing this bug sounds like a good idea, even if we're not sure how often it happens in practice.
| Comment hidden (mozreview-request) |
| Assignee | ||
Comment 11•9 years ago
|
||
After lazy-init the global variables, init of NSPR is done in NS_LogInit() in InitXPCOMGlue(), which should be safe.
Comment 12•9 years ago
|
||
| mozreview-review | ||
Comment on attachment 8820677 [details]
Bug 1321244 - Lazily init global variables that could implicitly initialize NSPR.
https://reviewboard.mozilla.org/r/100156/#review100632
::: dom/quota/QuotaManager.h:120
(Diff revision 1)
> public:
> NS_INLINE_DECL_REFCOUNTING(QuotaManager)
>
> - static const bool kRunningXPCShellTests;
> + static bool IsRunningXPCShellTests()
> + {
> + static bool kRunningXPCShellTests = PR_GetEnv("XPCSHELL_TEST_PROFILE_DIR");
Please make this `!!PR_GetEnv(...)` to make the conversion to bool more obvious.
Comment 13•9 years ago
|
||
| mozreview-review | ||
Comment on attachment 8820677 [details]
Bug 1321244 - Lazily init global variables that could implicitly initialize NSPR.
https://reviewboard.mozilla.org/r/100156/#review100634
Thanks for the patch. You say that the NSPR initialization will now be performed in `NS_LogInit`, but I don't see that happening before your patch and I don't see it happening in your patch. Did you mean to add it?
Attachment #8820677 -
Flags: review?(nfroyd) → review+
| Assignee | ||
Comment 14•9 years ago
|
||
| mozreview-review-reply | ||
Comment on attachment 8820677 [details]
Bug 1321244 - Lazily init global variables that could implicitly initialize NSPR.
https://reviewboard.mozilla.org/r/100156/#review100634
We already call it in InitXPCOMGlue(): https://dxr.mozilla.org/mozilla-central/source/browser/app/nsBrowserApp.cpp#374 shortly after the DLLs are loaded by the XPCOMGlueStartup() call. The initialization is implicit: NS_LogInit() calls nsTraceRefCnt::SetActivityIsLegal(), where PR_NewThreadPrivateIndex() performs implicit initialization.
| Comment hidden (mozreview-request) |
| Assignee | ||
Comment 16•9 years ago
|
||
| Assignee | ||
Comment 17•9 years ago
|
||
(In reply to Cervantes Yu [:cyu] [:cervantes] from comment #16)
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=d6599af5cf5b
This enforces that NS_LogInit() be the first to init NSPR (https://hg.mozilla.org/try/rev/d6599af5cf5b4162bf08c31d781618e6df864464 ). Let's see if this works.
| Assignee | ||
Comment 18•9 years ago
|
||
| Assignee | ||
Comment 19•9 years ago
|
||
The enforcement in NS_LogInit() breaks gtest and fennec so I'll just land the patch as is.
Comment 20•9 years ago
|
||
Pushed by cyu@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/55760fa58ad0
Lazily init global variables that could implicitly initialize NSPR. r=froydnj
Comment 21•9 years ago
|
||
| bugherder | ||
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla53
Comment 22•9 years ago
|
||
this has an improvement in the number of constructors during the build:
== Change summary for alert #4607 (as of December 23 2016 09:29 UTC) ==
Improvements:
1% compiler_metrics num_constructors linux32 pgo 98 -> 97
1% compiler_metrics num_constructors linux64 pgo 98 -> 97
1% compiler_metrics num_constructors linux32 opt 98 -> 97
For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=4607
Comment 23•9 years ago
|
||
Is this something we should consider taking on 52 ahead of the next ESR?
| Assignee | ||
Comment 24•9 years ago
|
||
Comment on attachment 8820677 [details]
Bug 1321244 - Lazily init global variables that could implicitly initialize NSPR.
Approval Request Comment
[Feature/Bug causing the regression]: No bug. This problem is common in several
[User impact if declined]: Potential loader lock (deadlock) during startup.
[Is this code covered by automated tests?]: No. We need dynamic checks in the system library to cover this.
[Has the fix been verified in Nightly?]: Yes
[Needs manual test from QE? If yes, steps to reproduce]: No. This needs to be verified with debugger or dynamic check tool.
[List of other uplifts needed for the feature/fix]: None.
[Is the change risky?]: No.
[Why is the change risky/not risky?]: We just move delay initializations of variables from load time of xul.dll to where the variables are first accessed. This is unlikely to be risky.
[String changes made/needed]: No.
Flags: needinfo?(cyu)
Attachment #8820677 -
Flags: approval-mozilla-beta?
Comment 25•9 years ago
|
||
Comment on attachment 8820677 [details]
Bug 1321244 - Lazily init global variables that could implicitly initialize NSPR.
avoid implicit early nspr init, beta52+
Attachment #8820677 -
Flags: approval-mozilla-beta? → approval-mozilla-beta+
Comment 26•9 years ago
|
||
| bugherder uplift | ||
Comment 27•9 years ago
|
||
| bugherder uplift | ||
status-firefox-esr52:
--- → fixed
You need to log in
before you can comment on or make changes to this bug.
Description
•