Crash in PL_HashTableLookupConst | SECOID_FindOID_Util correlated to cliqz addon

RESOLVED FIXED in Firefox 54

Status

()

Core
Security: PSM
P1
critical
RESOLVED FIXED
11 months ago
9 months ago

People

(Reporter: philipp, Assigned: keeler)

Tracking

({crash})

50 Branch
mozilla55
crash
Points:
---

Firefox Tracking Flags

(firefox51 wontfix, firefox52 wontfix, firefox-esr52 wontfix, firefox53 wontfix, firefox54 fixed, firefox55 fixed)

Details

(Whiteboard: [psm-assigned], crash signature)

MozReview Requests

()

Submitter Diff Changes Open Issues Last Updated
Loading...
Error loading review requests:

Attachments

(1 attachment)

(Reporter)

Description

11 months ago
This bug was filed from the Socorro interface and is 
report bp-a9fb4c8d-d2dd-43a7-a0db-a101e2170304.
=============================================================
Crashing Thread (63)
Frame 	Module 	Signature 	Source
0 	nss3.dll 	PL_HashTableLookupConst 	nsprpub/lib/ds/plhash.c:349
1 	nss3.dll 	SECOID_FindOID_Util 	security/nss/lib/util/secoid.c:2120
2 	nss3.dll 	NSS_CMSAttribute_GetType 	security/nss/lib/smime/cmsattr.c:118
3 	nss3.dll 	SECOID_GetAlgorithmTag_Util 	security/nss/lib/util/secalgid.c:17
4 	nss3.dll 	seckey_ExtractPublicKey 	security/nss/lib/cryptohi/seckey.c:577
5 	xul.dll 	mozilla::dom::CryptoKey::PublicKeyFromSpki(mozilla::dom::CryptoBuffer&, nsNSSShutDownPreventionLock const&) 	dom/crypto/CryptoKey.cpp:602
6 	xul.dll 	mozilla::dom::ImportRsaKeyTask::DoCrypto() 	dom/crypto/WebCryptoTask.cpp:1777
7 	xul.dll 	mozilla::dom::WebCryptoTask::Run() 	dom/crypto/WebCryptoTask.cpp:419
8 	xul.dll 	nsThreadPool::Run() 	xpcom/threads/nsThreadPool.cpp:225
9 	xul.dll 	nsThread::ProcessNextEvent(bool, bool*) 	xpcom/threads/nsThread.cpp:1067
10 	xul.dll 	NS_ProcessNextEvent(nsIThread*, bool) 	xpcom/glue/nsThreadUtils.cpp:311
11 	xul.dll 	mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) 	ipc/glue/MessagePump.cpp:338
12 	xul.dll 	MessageLoop::RunHandler() 	ipc/chromium/src/base/message_loop.cc:225
13 	xul.dll 	MessageLoop::Run() 	ipc/chromium/src/base/message_loop.cc:205
14 	xul.dll 	nsThread::ThreadFunc(void*) 	xpcom/threads/nsThread.cpp:465
15 	nss3.dll 	_PR_NativeRunThread 	nsprpub/pr/src/threads/combined/pruthr.c:397
16 	nss3.dll 	pr_root 	nsprpub/pr/src/md/windows/w95thred.c:95
17 	ucrtbase.dll 	_o__CIpow 	
18 	kernel32.dll 	BaseThreadInitThunk 	
19 	ntdll.dll 	__RtlUserThreadStart 	
20 	ntdll.dll 	_RtlUserThreadStart

this crash signature has been around for a while, but it seems to have sprung up in volume on windows and os x apparently correlated to the cliqz addon from https://addons.mozilla.org/firefox/addon/cliqz/
this pattern has started on 2016-12-30 which is coinciding with their 2.11.1 extension update and spiking up again in volume on 2017-02-09 where they released v2.13.0: http://bit.ly/2m5wqtO

Correlations for Firefox Release
(88.93% in signature vs 08.31% overall) useragent_locale = de [206.02% vs 15.08% if process_type = null]
(90.71% in signature vs 00.57% overall) address = 0xc
(87.86% in signature vs 00.20% overall) Addon "CLIQZ- search directly in your browser" = true
(29.64% in signature vs 99.96% overall) ipc_message_name = null
(99.64% in signature vs 35.68% overall) reason = EXCEPTION_ACCESS_VIOLATION_READ
(93.21% in signature vs 31.13% overall) app_init_dlls = null
(66.79% in signature vs 04.76% overall) shutdown_progress = profile-before-change
(26.43% in signature vs 02.41% overall) shutdown_progress = xpcom-shutdown [61.45% vs 03.79% if process_type = null]

Comment 1

11 months ago
Hi Philipp! How could we support in finding the cause of this issue?
(Reporter)

Comment 2

11 months ago
needinfoing david keeler about this...
Flags: needinfo?(dkeeler)
This looks like a duplicate of bug 702307.
(Assignee)

Comment 4

11 months ago
This is just to let you know I've seen the ni? request, but I haven't had time to dig into this yet.
(Assignee)

Comment 5

11 months ago
Given that most of those crashes are at address 0xc, it looks like ht is null in PL_HashTableLookupConst, meaning oidhash is null. oidhash is only null before NSS has been initialized and after it has been shut down. The NSS initialization/shutdown code in PSM should protect against code using NSS running before/after NSS has been initialized/shut down, but I think I found an explanation for when it would not. Each WebCryptoTask is an nsNSSShutDownObject, which in theory gives it the ability to ask if NSS has already been shut down. However, after nsNSSShutDownList::shutdown() has run, there's nothing I can see that prevents creating a new nsNSSShutDownObject. More importantly, the result of isAlreadyShutDown() will always be false for these new objects, leading them to erroneously think it's safe to call NSS functions. While this whole thing should be overhauled, there may be a simpler solution in the meantime: make it so nsNSSShutDownObjects know if NSS has already been shut down when they're created. I'll get together a patch for this.
Assignee: nobody → dkeeler
Flags: needinfo?(dkeeler)
Priority: -- → P1
Whiteboard: [psm-assigned]
Comment hidden (mozreview-request)

Comment 7

10 months ago
mozreview-review
Comment on attachment 8847832 [details]
bug 1344478 - isAlreadyShutDown should return true for nsNSSShutDownObjects created after NSS shut down

https://reviewboard.mozilla.org/r/120746/#review123484

Great stuff.
Attachment #8847832 - Flags: review?(cykesiopka.bmo) → review+

Comment 8

10 months ago
mozreview-review
Comment on attachment 8847832 [details]
bug 1344478 - isAlreadyShutDown should return true for nsNSSShutDownObjects created after NSS shut down

https://reviewboard.mozilla.org/r/120746/#review124430

LGTM.
Attachment #8847832 - Flags: review?(ttaubert) → review+

Comment 10

10 months ago
Pushed by dkeeler@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/9d6095db5090
isAlreadyShutDown should return true for nsNSSShutDownObjects created after NSS shut down r=Cykesiopka,ttaubert

Comment 11

10 months ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/9d6095db5090
Status: NEW → RESOLVED
Last Resolved: 10 months ago
status-firefox55: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
(Reporter)

Comment 12

10 months ago
so far there are no new reports on 55.0a1 after the patch landed (though there weren't frequent crashes before that on nightly). do you think we should attempt to uplift the fix?
Flags: needinfo?(dkeeler)
status-firefox51: affected → wontfix
status-firefox52: affected → wontfix
(Assignee)

Comment 13

10 months ago
Comment on attachment 8847832 [details]
bug 1344478 - isAlreadyShutDown should return true for nsNSSShutDownObjects created after NSS shut down

Let's start slow and see if this actually does stop the crashes.

Approval Request Comment
[Feature/Bug causing the regression]: long-standing issue
[User impact if declined]: shutdown crashes (particularly if using Cliqz)
[Is this code covered by automated tests?]: there are tests that exercise this area of code, but nothing specifically for this issue
[Has the fix been verified in Nightly?]: no, but we haven't seen new crashes with this signature on Nightly
[Needs manual test from QE? If yes, steps to reproduce]: no
[List of other uplifts needed for the feature/fix]: none
[Is the change risky?]: there is slight risk with this patch (see next line)
[Why is the change risky/not risky?]: this code pretty much makes PSM strictly more correct than what we had before. However, what could happen is it could expose already-incorrect behavior in code that uses PSM. This would probably be in the form JS code throwing exceptions (rather than crashing outright or succeeding by pure luck). If the calling code doesn't handle this, users may see broken behavior (but at least their browser shouldn't crash due to this).
[String changes made/needed]: none
Flags: needinfo?(dkeeler)
Attachment #8847832 - Flags: approval-mozilla-aurora?

Comment 14

10 months ago
Comment on attachment 8847832 [details]
bug 1344478 - isAlreadyShutDown should return true for nsNSSShutDownObjects created after NSS shut down

Fix a crash and there are no new reports on 55.0a1 after the patch landed. Let's start it slowly. Aurora54+.
Attachment #8847832 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+

Comment 15

10 months ago
bugherderuplift
https://hg.mozilla.org/releases/mozilla-aurora/rev/316ec3452a61
status-firefox54: affected → fixed
(Assignee)

Updated

10 months ago
Duplicate of this bug: 1346762
(Assignee)

Comment 18

10 months ago
I believe those are a result of an add-on using ctypes to directly call NSS functions, which isn't safe in general (and also probably isn't something we can prevent unless/until we outlaw using ctypes (in short, that's more or less a different bug).

0 	libnss3.dylib 	PL_HashTableLookupConst 	nsprpub/lib/ds/plhash.c:349
1 	libnss3.dylib 	SECOID_FindOID_Util 	security/nss/lib/util/secoid.c:2120
2 	libnss3.dylib 	SECOID_FindOIDTag_Util 	security/nss/lib/util/secoid.c:2136
3 	libnss3.dylib 	seckey_ExtractPublicKey 	security/nss/lib/cryptohi/seckey.c:577
4 	XUL 	ffi_call_unix64 	
5 	XUL 	ffi_call 	js/src/ctypes/libffi/src/x86/ffi64.c:535
6 	XUL 	js::ctypes::FunctionType::Call 	js/src/ctypes/CTypes.cpp:7144
Flags: needinfo?(dkeeler)
If that's a different signature, should we go ahead and request 53 uplift then too? We're due to build the RC early next week.
Flags: needinfo?(dkeeler)
(Assignee)

Comment 21

10 months ago
Comment on attachment 8847832 [details]
bug 1344478 - isAlreadyShutDown should return true for nsNSSShutDownObjects created after NSS shut down

Approval Request Comment
[Feature/Bug causing the regression]: long-standing issue
[User impact if declined]: shutdown crashes (particularly if using Cliqz)
[Is this code covered by automated tests?]: there are tests that exercise this area of code, but nothing specifically for this issue
[Has the fix been verified in Nightly?]: no, but we haven't seen new crashes with this signature on Nightly or Aurora
[Needs manual test from QE? If yes, steps to reproduce]: no
[List of other uplifts needed for the feature/fix]: none
[Is the change risky?]: there is slight risk with this patch (see next line)
[Why is the change risky/not risky?]: this code pretty much makes PSM strictly more correct than what we had before. However, what could happen is it could expose already-incorrect behavior in code that uses PSM. This would probably be in the form JS code throwing exceptions (rather than crashing outright or succeeding by pure luck). If the calling code doesn't handle this, users may see broken behavior (but at least their browser shouldn't crash due to this). So far as I know, we haven't seen this on Nightly or Aurora yet.
[String changes made/needed]: none
Attachment #8847832 - Flags: approval-mozilla-beta?
David - what kind of broken behavior? 
It does not look like that common of a shutdown crash even on release, and it's not a new issue. But, if you think it is important to uplift, and also safer for users and more correct, we can try it in beta 10.
Flags: needinfo?(dkeeler)
(Assignee)

Comment 23

10 months ago
Basically, if an add-on (like cliqz) or browser code (e.g. fx accounts) is trying to do something crypto-related at shutdown, this patch makes it more likely that that operation will throw a JS exception. If the calling code doesn't handle this, it could leave things in an inconsistent state. That said, since it's at shutdown, users probably won't notice as long as it doesn't cause loss of data.

The best-case scenario is that this patch takes only every instance that would have resulted in a crash and turns it into a JS exception (in which case it would be a clear win and I wouldn't be concerned about uplifting this). However, I'm fairly sure that's not the case. That is, in addition to turning these crashes into exceptions, I think there are some cases where the browser would not have crashed and now encounters an exception (where it wouldn't have before). Thus, to the user, there's the potential of more perceived breakage. Since there's not much more time left for 53 on beta, perhaps it would be best to wait on this.

In the meantime, cliqz can actually work around this bug themselves - by observing an early shutdown notification (something earlier than "profile-before-change" - "profile-change-teardown" perhaps), they can know to stop all activity that might require crypto operations.
Flags: needinfo?(dkeeler)
Comment on attachment 8847832 [details]
bug 1344478 - isAlreadyShutDown should return true for nsNSSShutDownObjects created after NSS shut down

From comment 23, sounds like there is a viable workaround for cliqz and it may be best to keep this riding the trains with 54 aurora. Thanks David.
Attachment #8847832 - Flags: approval-mozilla-beta? → approval-mozilla-beta-
status-firefox53: affected → wontfix

Comment 25

10 months ago
Thanks for the update guys! We will look into it and try to implement the suggestion you mentioned.
As we are in the process of moving towards WebExtension tech we were wondering if there is a way to get a similar signal to "profile-change-teardown" there. Any idea?
(In reply to lucian from comment #25)
> Thanks for the update guys! We will look into it and try to implement the
> suggestion you mentioned.
> As we are in the process of moving towards WebExtension tech we were
> wondering if there is a way to get a similar signal to
> "profile-change-teardown" there. Any idea?
Flags: needinfo?(aswan)
(In reply to lucian from comment #25)
> Thanks for the update guys! We will look into it and try to implement the
> suggestion you mentioned.
> As we are in the process of moving towards WebExtension tech we were
> wondering if there is a way to get a similar signal to
> "profile-change-teardown" there. Any idea?

I don't know exactly what that event is but it sounds like it is related to shutdown?  There is no explicit shutdown notification for webextensions but they can create a background page and listen for "unload" in the background page.  This sounds like it could easily be a longer conversation, not related to this bug, I would suggest either a chat in #webextensions on IRC or a new bug.
Flags: needinfo?(aswan)
Attachment #8847832 - Flags: approval-mozilla-esr52?

Comment 28

9 months ago
Comment on attachment 8847832 [details]
bug 1344478 - isAlreadyShutDown should return true for nsNSSShutDownObjects created after NSS shut down

Based on comment 23, I don't think this is a must fix for ESR52.1 given the risk assessment. Unless the severity of this problem changes, I'd like to wontfix for esr52
Attachment #8847832 - Flags: approval-mozilla-esr52? → approval-mozilla-esr52-

Updated

9 months ago
status-firefox-esr52: affected → wontfix
You need to log in before you can comment on or make changes to this bug.