Closed
Bug 1366133
Opened 8 years ago
Closed 8 years ago
Calling InternetGetConnectedStateExW() in ReadInternetOption() may cause hangs
Categories
(Core :: Networking, enhancement)
Core
Networking
Tracking
()
Tracking | Status | |
---|---|---|
firefox55 | --- | fixed |
People
(Reporter: ehsan.akhgari, Assigned: xeonchen)
References
Details
(Whiteboard: [necko-active][bhr])
Attachments
(2 files)
In the BHR hang data I see hangs like this:
RtlpWaitOnAddress (in wntdll.pdb)<br>
RtlpWaitOnCriticalSection (in wntdll.pdb)<br>
RtlpEnterCriticalSectionContended (in wntdll.pdb)<br>
RtlEnterCriticalSection (in wntdll.pdb)<br>
IsDialUpConnection(CWxString *) (in wininet.pdb)<br>
InternetGetConnectedStateExW (in wininet.pdb)<br>
ReadInternetOption (in xul.pdb)<br>
nsWindowsSystemProxySettings::GetPACURI(nsACString &) (in xul.pdb)<br>
mozilla::net::nsProtocolProxyService::ReloadPAC() (in xul.pdb)<br>
mozilla::net::nsProtocolProxyService::ReloadNetworkPAC() (in xul.pdb)<br>
mozilla::net::nsProtocolProxyService::Observe(nsISupports *,char const *,char16_t const *) (in xul.pdb)<br>
nsObserverList::NotifyObservers(nsISupports *,char const *,char16_t const *) (in xul.pdb)<br>
nsObserverService::NotifyObservers(nsISupports *,char const *,char16_t const *) (in xul.pdb)<br>
nsNotifyAddrListener::ChangeEvent::Run() (in xul.pdb)<br>
nsThread::ProcessNextEvent(bool,bool *) (in xul.pdb)<br>
NS_ProcessNextEvent(nsIThread *,bool) (in xul.pdb)<br>
mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate *) (in xul.pdb)<br>
MessageLoop::RunHandler() (in xul.pdb)<br>
MessageLoop::Run() (in xul.pdb)<br>
nsBaseAppShell::Run() (in xul.pdb)<br>
nsAppShell::Run() (in xul.pdb)<br>
nsAppStartup::Run() (in xul.pdb)<br>
XREMain::XRE_mainRun() (in xul.pdb)<br>
As far as I can tell, we only call InternetGetConnectedStateExW() in <https://searchfox.org/mozilla-central/rev/24c443a440104dabd9447608bd41b8766e8fc2f5/toolkit/system/windowsproxy/nsWindowsSystemProxySettings.cpp#75> so that we fill in the pszConnection member for the third argument to InternetQueryOptionW(), however per MSDN <https://msdn.microsoft.com/en-us/library/windows/desktop/aa385101(v=vs.85).aspx> that is an out argument...
Masatoshi, do you know if we actually need to initialize this field to the name of the modem connection? I don't have a modem connection so I wouldn't be able to test what would happen if I changed this code myself...
Reporter | ||
Updated•8 years ago
|
Flags: needinfo?(VYV03354)
Comment 1•8 years ago
|
||
(In reply to :Ehsan Akhgari (super long backlog, slow to respond) from comment #0)
> As far as I can tell, we only call InternetGetConnectedStateExW() in
> <https://searchfox.org/mozilla-central/rev/
> 24c443a440104dabd9447608bd41b8766e8fc2f5/toolkit/system/windowsproxy/
> nsWindowsSystemProxySettings.cpp#75> so that we fill in the pszConnection
> member for the third argument to InternetQueryOptionW(), however per MSDN
> <https://msdn.microsoft.com/en-us/library/windows/desktop/aa385101(v=vs.85).
> aspx> that is an out argument...
Actually this is an in-out parameter. MSDN is inaccurate as always.
https://msdn.microsoft.com/en-us/library/windows/desktop/aa385146(v=vs.85).aspx
> pszConnection
> Pointer to a string that contains the name of the RAS connection or NULL, which indicates the default or LAN connection, to set or *query* options on.
(emphasis mine)
> Masatoshi, do you know if we actually need to initialize this field to the
> name of the modem connection?
INTERNET_CONNECTION_MODEM will correspond to [Internet Options] > [Connections] > [Dial-up and Virtual Private Network settings] (so it requires a connection name to identify which connection setting we are operating on) and INTERNET_CONNECTION_LAN will correspond to [Local Area Network (LAN) settings].
> I don't have a modem connection so I wouldn't
> be able to test what would happen if I changed this code myself...
The "modem" doesn't have to be a physical modem. It might be a PPPoE connection, it might be a VPN connection.
As an aside, we can remove the INTERNET_PER_CONN_FLAGS fallback because we no longer supports Windows Vista or earlier.
Flags: needinfo?(VYV03354)
Updated•8 years ago
|
Whiteboard: [qf][bhr] → [necko-backlog][qf][bhr]
Comment 2•8 years ago
|
||
I asked Honza about this bug and his response was:
> mayhemer: probably just make nsProtocolProxyService::Reload(Network)PAC happen on a background (best separate - new) thread?
> mayhemer: this is proxy related, maybe Gary should look at it
Gary, what do you think?
Flags: needinfo?(xeonchen)
Assignee | ||
Comment 3•8 years ago
|
||
(In reply to Valentin Gosu [:valentin] from comment #2)
> I asked Honza about this bug and his response was:
> > mayhemer: probably just make nsProtocolProxyService::Reload(Network)PAC happen on a background (best separate - new) thread?
> > mayhemer: this is proxy related, maybe Gary should look at it
>
> Gary, what do you think?
I agree, I'll take this bug.
Two parts to do:
- Remove the INTERNET_PER_CONN_FLAGS fallback
- Move nsProtocolProxyService::Reload(Network)PAC to another thread
Assignee: nobody → xeonchen
Flags: needinfo?(xeonchen)
Reporter | ||
Comment 4•8 years ago
|
||
Sounds great, thanks Masatoshi and Gary!
Updated•8 years ago
|
Whiteboard: [necko-backlog][qf][bhr] → [necko-backlog][qf:p1][bhr]
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Assignee | ||
Comment 9•8 years ago
|
||
Comment on attachment 8873794 [details]
Bug 1366133 - Part 1: make nsISystemProxySettings::GetPACURI happen in another thread;
After investigation, I found that |InternetGetConnectedStateExW()| is only called by two functions: |nsWindowsSystemProxySettings::GetPACURI| and |nsWindowsSystemProxySettings::GetProxyForURI|, and the latter is never called on main thread of Windows platform, therefore, only 2 occurrences of |GetPACURI| must be handled. I move the API calls and following steps into newly created function, |AsyncConfigureFromPAC|.
Comment 10•8 years ago
|
||
mozreview-review |
Comment on attachment 8873793 [details]
Bug 1366133 - Part 0: remove support for Windows Vista and earlier versions;
https://reviewboard.mozilla.org/r/145210/#review149590
::: toolkit/system/windowsproxy/nsWindowsSystemProxySettings.cpp:97
(Diff revision 2)
> unsigned long size = sizeof(INTERNET_PER_CONN_OPTION_LISTW);
> if (!InternetQueryOptionW(nullptr, INTERNET_OPTION_PER_CONNECTION_OPTION,
> &list, &size)) {
> - if (GetLastError() != ERROR_INVALID_PARAMETER) {
> - return NS_ERROR_FAILURE;
> + return NS_ERROR_FAILURE;
> - }
> + }
Can you clarify exactly how this removes support for Vista and earlier (maybe in the commit message)? Are those versions more likely tyo return ERROR_INVALID_PARAMETER and require that second code block?
Attachment #8873793 -
Flags: review?(daniel) → review+
Comment 11•8 years ago
|
||
mozreview-review |
Comment on attachment 8873794 [details]
Bug 1366133 - Part 1: make nsISystemProxySettings::GetPACURI happen in another thread;
https://reviewboard.mozilla.org/r/145212/#review149594
Attachment #8873794 -
Flags: review?(daniel) → review+
Comment 12•8 years ago
|
||
Just a heads-up: this bug is very similar to bug 1360164 so with a little care/luck I would say that we might be able to fix both at once.
Updated•8 years ago
|
Whiteboard: [necko-backlog][qf:p1][bhr] → [necko-active][qf:p1][bhr]
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment 22•8 years ago
|
||
Pushed by gachen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/40c08de101f1
Part 0: remove support for Windows Vista and earlier versions; r=bagder
https://hg.mozilla.org/integration/autoland/rev/9a76c93fb1b5
Part 1: make nsISystemProxySettings::GetPACURI happen in another thread; r=bagder
Comment 23•8 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/40c08de101f1
https://hg.mozilla.org/mozilla-central/rev/9a76c93fb1b5
Status: NEW → RESOLVED
Closed: 8 years ago
status-firefox55:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
Comment 25•8 years ago
|
||
IIUC we're still going to delay getting system proxy info until the 1st necko channel needs it? Would be it better to initialize the proxyProtocolSvc early in startup, so the 1st XHR, etc doesn't wind up waiting for init to finish?
Flags: needinfo?(daniel)
Comment 26•8 years ago
|
||
Yes, I agree.
Even though this landed fix will already make it faster since there were other main-thread operations waiting to get done, so now they'll get done faster. This async but still init-on-first-use still risks delaying the first network request with 100ms.
That said, the first network request done is often also doing DNS and many other slow requests so those 100ms might not be very notable in a lot of cases.
I think we should file a new bug for this.
Flags: needinfo?(daniel)
Comment 27•8 years ago
|
||
Filed follow up bug 1379872 per comment 26.
Comment 28•7 years ago
|
||
FWIW, I just got an assertion crash in ~AsyncGetPACURIRequest () on line 369:
367: ~AsyncGetPACURIRequest()
368: {
369: MOZ_ASSERT(NS_IsMainThread() == mIsMainThreadOnly);
370: NS_ReleaseOnMainThreadSystemGroup(
371: "AsyncGetPACURIRequest::mServiceHolder", mServiceHolder.forget());
372: }
on resume from hibernation on FF 56.0.1 x64 under Win7 SP1 x64. VS indicates that the code is running in the main thread in the chrome process. I have saved a dump about which I will answer questions (but not upload).
A comment on line 318 led me here:
318: // Bug 1366133: make GetPACURI off-main-thread since it may hang on Windows platform
Assignee | ||
Comment 29•7 years ago
|
||
(In reply to q1 from comment #28)
> FWIW, I just got an assertion crash in ~AsyncGetPACURIRequest () on line 369:
>
> 367: ~AsyncGetPACURIRequest()
> 368: {
> 369: MOZ_ASSERT(NS_IsMainThread() == mIsMainThreadOnly);
> 370: NS_ReleaseOnMainThreadSystemGroup(
> 371: "AsyncGetPACURIRequest::mServiceHolder",
> mServiceHolder.forget());
> 372: }
>
> on resume from hibernation on FF 56.0.1 x64 under Win7 SP1 x64. VS indicates
> that the code is running in the main thread in the chrome process. I have
> saved a dump about which I will answer questions (but not upload).
>
> A comment on line 318 led me here:
>
> 318: // Bug 1366133: make GetPACURI off-main-thread since it may hang on
> Windows platform
thank you for reporting this, the issue has been fixed in bug 1405496.
Updated•3 years ago
|
Performance Impact: --- → P1
Whiteboard: [necko-active][qf:p1][bhr] → [necko-active][bhr]
You need to log in
before you can comment on or make changes to this bug.
Description
•