Crash in mozilla::net::nsProtocolProxyService::LoadHostFilters

RESOLVED FIXED in Firefox -esr52

Status

()

defect
--
critical
RESOLVED FIXED
3 years ago
2 years ago

People

(Reporter: philipp, Assigned: valentin.gosu)

Tracking

({crash})

49 Branch
mozilla55
Points:
---
Dependency tree / graph
Bug Flags:
in-testsuite +

Firefox Tracking Flags

(firefox-esr45 unaffected, firefox51 wontfix, firefox52 wontfix, firefox-esr52 fixed, firefox53 wontfix, firefox54 fixed, firefox55 fixed)

Details

(Whiteboard: [necko-active], crash signature)

Attachments

(2 attachments, 1 obsolete attachment)

This bug was filed from the Socorro interface and is 
report bp-d8d2b0b6-bd72-4cc2-8344-455672170127.
=============================================================
Crashing Thread (0)
Frame 	Module 	Signature 	Source
0 	xul.dll 	mozilla::net::nsProtocolProxyService::LoadHostFilters(char const*) 	netwerk/base/nsProtocolProxyService.cpp:1576
1 	xul.dll 	mozilla::net::nsProtocolProxyService::PrefsChanged(nsIPrefBranch*, char const*) 	netwerk/base/nsProtocolProxyService.cpp:648
2 	xul.dll 	mozilla::net::nsProtocolProxyService::Init() 	netwerk/base/nsProtocolProxyService.cpp:451
3 	xul.dll 	nsProtocolProxyServiceConstructor 	netwerk/build/nsNetModule.cpp:70
4 	xul.dll 	nsComponentManagerImpl::CreateInstanceByContractID(char const*, nsISupports*, nsID const&, void**) 	xpcom/components/nsComponentManager.cpp:1170
5 	xul.dll 	nsComponentManagerImpl::GetServiceByContractID(char const*, nsID const&, void**) 	xpcom/components/nsComponentManager.cpp:1526
6 	xul.dll 	nsTHashtable<nsBaseHashtableET<nsCStringHashKey, nsFactoryEntry*> >::s_HashKey(void const*) 	obj-firefox/dist/include/nsTHashtable.h:376
7 	xul.dll 	nsGetServiceByContractIDWithError::operator()(nsID const&, void**) 	xpcom/glue/nsComponentManagerUtils.cpp:292
8 	xul.dll 	nsCOMPtr_base::assign_from_gs_contractid_with_error(nsGetServiceByContractIDWithError const&, nsID const&) 	xpcom/glue/nsCOMPtr.cpp:106
9 	xul.dll 	mozilla::net::nsHttpChannel::AsyncOpen(nsIStreamListener*, nsISupports*) 	netwerk/protocol/http/nsHttpChannel.cpp:5763
10 	xul.dll 	mozilla::net::nsHttpChannel::AsyncOpen2(nsIStreamListener*) 	netwerk/protocol/http/nsHttpChannel.cpp:5782
11 	xul.dll 	mozilla::dom::XMLHttpRequestMainThread::InitiateFetch(nsIInputStream*, __int64, nsACString_internal&) 	dom/xhr/XMLHttpRequestMainThread.cpp:2707
12 	xul.dll 	mozilla::dom::XMLHttpRequestMainThread::SendInternal(mozilla::dom::XMLHttpRequestMainThread::RequestBodyBase const*) 	dom/xhr/XMLHttpRequestMainThread.cpp:2916
13 	xul.dll 	mozilla::dom::XMLHttpRequestMainThread::Send(JSContext*, mozilla::ErrorResult&) 	dom/xhr/XMLHttpRequestMainThread.h:333
14 	xul.dll 	mozilla::dom::XMLHttpRequestBinding::send 	obj-firefox/dom/bindings/XMLHttpRequestBinding.cpp:650
15 	xul.dll 	js::InternalCallOrConstruct(JSContext*, JS::CallArgs const&, js::MaybeConstruct) 	js/src/vm/Interpreter.cpp:459
16 	xul.dll 	InternalCall 	js/src/vm/Interpreter.cpp:504
17 	xul.dll 	Interpret 	js/src/vm/Interpreter.cpp:2922
18 	xul.dll 	js::RunScript(JSContext*, js::RunState&) 	js/src/vm/Interpreter.cpp:405
19 	xul.dll 	js::InternalCallOrConstruct(JSContext*, JS::CallArgs const&, js::MaybeConstruct) 	js/src/vm/Interpreter.cpp:477
20 	xul.dll 	InternalCall 	js/src/vm/Interpreter.cpp:504
21 	xul.dll 	nsXPCWrappedJSClass::CallMethod(nsXPCWrappedJS*, unsigned short, XPTMethodDescriptor const*, nsXPTCMiniVariant*) 	js/xpconnect/src/XPCWrappedJSClass.cpp:1213
22 	xul.dll 	nsXPCWrappedJS::CallMethod(unsigned short, XPTMethodDescriptor const*, nsXPTCMiniVariant*) 	js/xpconnect/src/XPCWrappedJS.cpp:613
23 	xul.dll 	CallAddPropertyHook 	js/src/vm/NativeObject.cpp:1075
24 	xul.dll 	SharedStub 	xpcom/reflect/xptcall/md/win32/xptcstubs.cpp:112
25 	xul.dll 	mozilla::net::CaptivePortalService::Prepare() 	netwerk/base/CaptivePortalService.cpp:328
26 	xul.dll 	NS_InvokeByIndex 	xpcom/reflect/xptcall/md/win32/xptcinvoke_asm_x86_msvc.asm:54

crashes with this signature seem to be regressing since firefox 49 - over all they occur in rather low volume though. one user comment explicitly mentioned that they were changing their proxy settings after the crash happened: https://crash-stats.mozilla.com/report/index/e4fcff64-3b84-487f-a5c8-6956c2161020
The stack trace looked odd, so I took a look at the dissasembly:

        1597:             else if (*endhost == ']') // IPv6 address literals
      0FA40340  and         dword ptr [esp+14h],0
      0FA40345  cmp         al,5Dh
      0FA40347  cmove       edi,dword ptr [esp+14h]
        1598:                 portLocation = 0;
        1599:             endhost++;
      0FA4034C  inc         esi
-->   0FA4034D  mov         al,byte ptr [esi]
      0FA4034F  test        al,al
      0FA40351  jne         loser+59h (0FA40324h)
      0FA40353  mov         dword ptr [esp+1Ch],ecx
      0FA40357  mov         dword ptr [esp+10h],esi
        1600:         }

https://crash-stats.mozilla.com/report/index/3f27ac6f-b31a-41b2-8b91-41fcd2170126
In this case, esi is 0x15383000
So either there is a problem with the allocation, and we crash when we get to the next page, or the underlying buffer is deallocated by another thread.

Daniel, you know the code a bit better, can you take a look at this?
Flags: needinfo?(daniel)
I took a look. This reason for this is really not obvious to me.

The crash happens in LoadHostFilters(), which is only called from a single spot in the code. Based on the crash data, it seems it reads outside of its buffer/allocation.

The only caller of this method is PrefsChanged() and it only calls LoadHostFilters() at init or if the "no_proxies_on" pref changed. (https://dxr.mozilla.org/mozilla-central/source/netwerk/base/nsProtocolProxyService.cpp?q=nsProtocolProxyService%3A%3APrefsChanged&redirect_type=single#648)

The PrefsChanged method works on a local nsXPIDLCString.

However, the "no_proxies_on" prefs is handled differently than the other string prefs in the proxy code and it could be an idea to unify that by doing something like the patch below. But again, I have not identified a problem with the current code.

--- a/netwerk/base/nsProtocolProxyService.cpp
+++ b/netwerk/base/nsProtocolProxyService.cpp
@@ -640,14 +640,14 @@ nsProtocolProxyService::PrefsChanged(nsIPrefBranch *prefBranch,
     if (!pref || !strcmp(pref, PROXY_PREF("failover_timeout")))
         proxy_GetIntPref(prefBranch, PROXY_PREF("failover_timeout"),
                          mFailedProxyTimeout);
 
     if (!pref || !strcmp(pref, PROXY_PREF("no_proxies_on"))) {
-        rv = prefBranch->GetCharPref(PROXY_PREF("no_proxies_on"),
-                                     getter_Copies(tempString));
-        if (NS_SUCCEEDED(rv))
-            LoadHostFilters(tempString.get());
+        nsCString no_proxies;
+        proxy_GetStringPref(prefBranch, PROXY_PREF("no_proxies_on"), no_proxies);
+        if (no_proxies.Length())
+            LoadHostFilters(no_proxies.get());
     }
 
     // We're done if not using something that could give us a PAC URL
     // (PAC, WPAD or System)
     if (mProxyConfig != PROXYCONFIG_PAC && mProxyConfig != PROXYCONFIG_WPAD &&
Flags: needinfo?(daniel)
Correction, the change would not work like that, is should call LoadHostFilters() even for a blank string:

--- a/netwerk/base/nsProtocolProxyService.cpp
+++ b/netwerk/base/nsProtocolProxyService.cpp
@@ -640,14 +640,13 @@ nsProtocolProxyService::PrefsChanged(nsIPrefBranch *prefBranch,
     if (!pref || !strcmp(pref, PROXY_PREF("failover_timeout")))
         proxy_GetIntPref(prefBranch, PROXY_PREF("failover_timeout"),
                          mFailedProxyTimeout);
 
     if (!pref || !strcmp(pref, PROXY_PREF("no_proxies_on"))) {
-        rv = prefBranch->GetCharPref(PROXY_PREF("no_proxies_on"),
-                                     getter_Copies(tempString));
-        if (NS_SUCCEEDED(rv))
-            LoadHostFilters(tempString.get());
+        nsCString no_proxies;
+        proxy_GetStringPref(prefBranch, PROXY_PREF("no_proxies_on"), no_proxies);
+        LoadHostFilters(no_proxies.get());
     }
 
     // We're done if not using something that could give us a PAC URL
     // (PAC, WPAD or System)
     if (mProxyConfig != PROXYCONFIG_PAC && mProxyConfig != PROXYCONFIG_WPAD &&
Assignee: nobody → valentin.gosu
Status: NEW → ASSIGNED
I'm also not sure this will fix it, but this feels like a better style anyway.
Whiteboard: [necko-active]
Comment on attachment 8832028 [details] [diff] [review]
Crash in mozilla::net::nsProtocolProxyService::LoadHostFilters

Review of attachment 8832028 [details] [diff] [review]:
-----------------------------------------------------------------

Agreed, let's do this for the nicer style and then we keep checking if the problem remains.
Attachment #8832028 - Flags: review?(daniel) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/36839839cfa73ab9487169f4444582d788286eb8
Bug 1334443 - Crash in mozilla::net::nsProtocolProxyService::LoadHostFilters r=bagder
https://hg.mozilla.org/mozilla-central/rev/36839839cfa7
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla54
hi, there is one crash in a nightly build that contains the patch: https://crash-stats.mozilla.com/report/index/9a428534-9031-4828-b74c-9bd4c2170204
so the problem is still present as it seems...
Flags: needinfo?(valentin.gosu)
Status: RESOLVED → REOPENED
Flags: needinfo?(valentin.gosu)
Resolution: FIXED → ---
Comment on attachment 8848649 [details]
Bug 1334443 - Add gtest for nsProtocolProxyService::{LoadHostFilters,CanUseProxy} and fix IPv6 parsing

https://reviewboard.mozilla.org/r/121548/#review123890

Looks great!
Attachment #8848649 - Flags: review?(daniel) → review+
Comment on attachment 8848650 [details]
Bug 1334443 - Rewrite nsProtocolProxyService::LoadHostFilters to use Tokenizer

https://reviewboard.mozilla.org/r/121550/#review123892

::: commit-message-5a7ca:3
(Diff revision 1)
> +Bug 1334443 - Rewrite nsProtocolProxyService::LoadHostFilters to use Tokenizer r=bagder
> +This eliminates some of the pointer math and makes the method a bit safer.
> +I tried to keep the same behaviour as before.

"The function's behaviour remains as before" would sound better to me. No need for "I tried" ...
Attachment #8848650 - Flags: review?(daniel) → review+
Attachment #8832028 - Attachment is obsolete: true
Pushed by valentin.gosu@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/17ca3c39d1d9
Add gtest for nsProtocolProxyService::{LoadHostFilters,CanUseProxy} and fix IPv6 parsing r=bagder
https://hg.mozilla.org/integration/autoland/rev/c64a319909a8
Rewrite nsProtocolProxyService::LoadHostFilters to use Tokenizer r=bagder
https://hg.mozilla.org/mozilla-central/rev/17ca3c39d1d9
https://hg.mozilla.org/mozilla-central/rev/c64a319909a8
Status: REOPENED → RESOLVED
Closed: 3 years ago2 years ago
Resolution: --- → FIXED
Target Milestone: mozilla54 → mozilla55
We need to watch crash-stats to see if the crash still happens.
so far there are no new reports on 55.0a1 after the patch landed (though there weren't frequent crashes before that on nightly). do you think we should attempt to uplift the new fix?
Flags: needinfo?(valentin.gosu)
Comment on attachment 8848650 [details]
Bug 1334443 - Rewrite nsProtocolProxyService::LoadHostFilters to use Tokenizer

Given that we don't know for sure if the problem has been fixed, and the small possibility of regressions I'm only requesting uplift to aurora - and maybe it should ride the trains afterwards. Regressions would be worse than a small number of crashes.

Approval Request Comment
[Feature/Bug causing the regression]: Unknown.
[User impact if declined]: A low number of crashes.
[Is this code covered by automated tests?]: Yes. (See other patch)
[Has the fix been verified in Nightly?]: No regressions have been identified in nightly so far, and no crashes since landing.
[Needs manual test from QE?]: No.
[List of other uplifts needed for the feature/fix]: attachment 8848649 [details] containing unit tests.
[Is the change risky?]: There is a low risk involved.
[Why is the change risky/not risky?]: We are changing the implementation. Even though we aim for the exact same behaviour, there is a small chance of bugs/regressions.
[String changes made/needed]: none.
Flags: needinfo?(valentin.gosu)
Attachment #8848650 - Flags: approval-mozilla-aurora?
Comment on attachment 8848650 [details]
Bug 1334443 - Rewrite nsProtocolProxyService::LoadHostFilters to use Tokenizer

Fix a crash and no new reports on 55.0a1 after the patch landed. Aurora54+.
Attachment #8848650 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
needs rebasing for aurora 

grafting 406346:17ca3c39d1d9 "Bug 1334443 - Add gtest for nsProtocolProxyService::{LoadHostFilters,CanUseProxy} and fix IPv6 parsing r=bagder"
merging netwerk/base/nsProtocolProxyService.cpp
merging netwerk/test/gtest/TestProtocolProxyService.cpp
warning: conflicts while merging netwerk/test/gtest/TestProtocolProxyService.cpp! (edit, then use 'hg resolve --mark')
abort: unresolved conflicts, can't continue
(use 'hg resolve' and 'hg graft --continue')
Flags: needinfo?(valentin.gosu)
(In reply to Carsten Book [:Tomcat] from comment #22)
> needs rebasing for aurora 

I have no idea why it wouldn't work. I just tried applying the patches on the aurora repo, and it worked.
Flags: needinfo?(valentin.gosu)
No reports on 54/55 in the last week. If we want to consider this for Fx53, the request needs to happen ASAP as we're building the RC early next week.
Flags: needinfo?(valentin.gosu)
Comment on attachment 8848650 [details]
Bug 1334443 - Rewrite nsProtocolProxyService::LoadHostFilters to use Tokenizer

Approval Request Comment
[Feature/Bug causing the regression]: Unknown.
[User impact if declined]: A low number of crashes on windows.
[Is this code covered by automated tests?]: Yes. attachment 8848649 [details]
[Has the fix been verified in Nightly?]: No related regressions have been identified in nightly so or aurora since this landed. No crash reports since then.
[Needs manual test from QE?]: No.
[List of other uplifts needed for the feature/fix]: attachment 8848649 [details] containing unit tests.
[Is the change risky?]: There is a low risk involved.
[Why is the change risky/not risky?]: We are changing the implementation. Even though we aim for the exact same behaviour, there is a small chance of bugs/regressions. However, since it landed on aurora, none have popped up.
[String changes made/needed]: none.
Flags: needinfo?(valentin.gosu)
Attachment #8848650 - Flags: approval-mozilla-beta?
Comment on attachment 8848649 [details]
Bug 1334443 - Add gtest for nsProtocolProxyService::{LoadHostFilters,CanUseProxy} and fix IPv6 parsing

If I understood correctly, you want this one too.
Attachment #8848649 - Flags: approval-mozilla-beta?
Comment on attachment 8848649 [details]
Bug 1334443 - Add gtest for nsProtocolProxyService::{LoadHostFilters,CanUseProxy} and fix IPv6 parsing

Sounds like this has been successful in fixing the crash, I'd like to see it ride the trains long enough to have a chance to catch possible regressions though, based on Valentin's risk assessment here. Earlier in beta I could take it but given we're a week away from release, seems too late for beta uplift.
Attachment #8848649 - Flags: approval-mozilla-beta? → approval-mozilla-beta-
Attachment #8848650 - Flags: approval-mozilla-beta? → approval-mozilla-beta-
Attachment #8848650 - Flags: approval-mozilla-esr52?
Comment on attachment 8848650 [details]
Bug 1334443 - Rewrite nsProtocolProxyService::LoadHostFilters to use Tokenizer

There's ~100 instances of this crash on esr52.1 in the past 7 days. Let's uplift to ESR52.2+
Attachment #8848650 - Flags: approval-mozilla-esr52? → approval-mozilla-esr52+
The new GTest added by this bug is failing on ESR52:
https://treeherder.mozilla.org/logviewer.html#?job_id=96766093&repo=mozilla-esr52

Valentin, any ideas on what to do here?
Flags: needinfo?(valentin.gosu)
(In reply to Ryan VanderMeulen [:RyanVM] from comment #31)
> The new GTest added by this bug is failing on ESR52:
> https://treeherder.mozilla.org/logviewer.html#?job_id=96766093&repo=mozilla-
> esr52
> 
> Valentin, any ideas on what to do here?

I think this is due to bug 1028195 not being on ESR yet - and making the test fail. I suggest uplifting that too.
Gary, do you think it would be OK? If so, could you nominate the patch? Thanks!
Flags: needinfo?(valentin.gosu) → needinfo?(xeonchen)
(In reply to Valentin Gosu [:valentin] from comment #32)
> I think this is due to bug 1028195 not being on ESR yet - and making the
> test fail. I suggest uplifting that too.
> Gary, do you think it would be OK? If so, could you nominate the patch?
> Thanks!

OK!
Flags: needinfo?(xeonchen)
Looks like that worked, thanks!
You need to log in before you can comment on or make changes to this bug.