Closed Bug 1344478 Opened 8 years ago Closed 8 years ago

Crash in PL_HashTableLookupConst | SECOID_FindOID_Util correlated to cliqz addon

Categories

(Core :: Security: PSM, defect, P1)

50 Branch
defect

Tracking

()

RESOLVED FIXED
mozilla55
Tracking Status
firefox51 --- wontfix
firefox52 --- wontfix
firefox-esr52 --- wontfix
firefox53 --- wontfix
firefox54 --- fixed
firefox55 --- fixed

People

(Reporter: philipp, Assigned: keeler)

References

Details

(Keywords: crash, Whiteboard: [psm-assigned])

Crash Data

Attachments

(1 file)

This bug was filed from the Socorro interface and is report bp-a9fb4c8d-d2dd-43a7-a0db-a101e2170304. ============================================================= Crashing Thread (63) Frame Module Signature Source 0 nss3.dll PL_HashTableLookupConst nsprpub/lib/ds/plhash.c:349 1 nss3.dll SECOID_FindOID_Util security/nss/lib/util/secoid.c:2120 2 nss3.dll NSS_CMSAttribute_GetType security/nss/lib/smime/cmsattr.c:118 3 nss3.dll SECOID_GetAlgorithmTag_Util security/nss/lib/util/secalgid.c:17 4 nss3.dll seckey_ExtractPublicKey security/nss/lib/cryptohi/seckey.c:577 5 xul.dll mozilla::dom::CryptoKey::PublicKeyFromSpki(mozilla::dom::CryptoBuffer&, nsNSSShutDownPreventionLock const&) dom/crypto/CryptoKey.cpp:602 6 xul.dll mozilla::dom::ImportRsaKeyTask::DoCrypto() dom/crypto/WebCryptoTask.cpp:1777 7 xul.dll mozilla::dom::WebCryptoTask::Run() dom/crypto/WebCryptoTask.cpp:419 8 xul.dll nsThreadPool::Run() xpcom/threads/nsThreadPool.cpp:225 9 xul.dll nsThread::ProcessNextEvent(bool, bool*) xpcom/threads/nsThread.cpp:1067 10 xul.dll NS_ProcessNextEvent(nsIThread*, bool) xpcom/glue/nsThreadUtils.cpp:311 11 xul.dll mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) ipc/glue/MessagePump.cpp:338 12 xul.dll MessageLoop::RunHandler() ipc/chromium/src/base/message_loop.cc:225 13 xul.dll MessageLoop::Run() ipc/chromium/src/base/message_loop.cc:205 14 xul.dll nsThread::ThreadFunc(void*) xpcom/threads/nsThread.cpp:465 15 nss3.dll _PR_NativeRunThread nsprpub/pr/src/threads/combined/pruthr.c:397 16 nss3.dll pr_root nsprpub/pr/src/md/windows/w95thred.c:95 17 ucrtbase.dll _o__CIpow 18 kernel32.dll BaseThreadInitThunk 19 ntdll.dll __RtlUserThreadStart 20 ntdll.dll _RtlUserThreadStart this crash signature has been around for a while, but it seems to have sprung up in volume on windows and os x apparently correlated to the cliqz addon from https://addons.mozilla.org/firefox/addon/cliqz/ this pattern has started on 2016-12-30 which is coinciding with their 2.11.1 extension update and spiking up again in volume on 2017-02-09 where they released v2.13.0: http://bit.ly/2m5wqtO Correlations for Firefox Release (88.93% in signature vs 08.31% overall) useragent_locale = de [206.02% vs 15.08% if process_type = null] (90.71% in signature vs 00.57% overall) address = 0xc (87.86% in signature vs 00.20% overall) Addon "CLIQZ- search directly in your browser" = true (29.64% in signature vs 99.96% overall) ipc_message_name = null (99.64% in signature vs 35.68% overall) reason = EXCEPTION_ACCESS_VIOLATION_READ (93.21% in signature vs 31.13% overall) app_init_dlls = null (66.79% in signature vs 04.76% overall) shutdown_progress = profile-before-change (26.43% in signature vs 02.41% overall) shutdown_progress = xpcom-shutdown [61.45% vs 03.79% if process_type = null]
Hi Philipp! How could we support in finding the cause of this issue?
needinfoing david keeler about this...
Flags: needinfo?(dkeeler)
This looks like a duplicate of bug 702307.
This is just to let you know I've seen the ni? request, but I haven't had time to dig into this yet.
Given that most of those crashes are at address 0xc, it looks like ht is null in PL_HashTableLookupConst, meaning oidhash is null. oidhash is only null before NSS has been initialized and after it has been shut down. The NSS initialization/shutdown code in PSM should protect against code using NSS running before/after NSS has been initialized/shut down, but I think I found an explanation for when it would not. Each WebCryptoTask is an nsNSSShutDownObject, which in theory gives it the ability to ask if NSS has already been shut down. However, after nsNSSShutDownList::shutdown() has run, there's nothing I can see that prevents creating a new nsNSSShutDownObject. More importantly, the result of isAlreadyShutDown() will always be false for these new objects, leading them to erroneously think it's safe to call NSS functions. While this whole thing should be overhauled, there may be a simpler solution in the meantime: make it so nsNSSShutDownObjects know if NSS has already been shut down when they're created. I'll get together a patch for this.
Assignee: nobody → dkeeler
Flags: needinfo?(dkeeler)
Priority: -- → P1
Whiteboard: [psm-assigned]
Comment on attachment 8847832 [details] bug 1344478 - isAlreadyShutDown should return true for nsNSSShutDownObjects created after NSS shut down https://reviewboard.mozilla.org/r/120746/#review123484 Great stuff.
Attachment #8847832 - Flags: review?(cykesiopka.bmo) → review+
Comment on attachment 8847832 [details] bug 1344478 - isAlreadyShutDown should return true for nsNSSShutDownObjects created after NSS shut down https://reviewboard.mozilla.org/r/120746/#review124430 LGTM.
Attachment #8847832 - Flags: review?(ttaubert) → review+
Pushed by dkeeler@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/9d6095db5090 isAlreadyShutDown should return true for nsNSSShutDownObjects created after NSS shut down r=Cykesiopka,ttaubert
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
so far there are no new reports on 55.0a1 after the patch landed (though there weren't frequent crashes before that on nightly). do you think we should attempt to uplift the fix?
Flags: needinfo?(dkeeler)
Comment on attachment 8847832 [details] bug 1344478 - isAlreadyShutDown should return true for nsNSSShutDownObjects created after NSS shut down Let's start slow and see if this actually does stop the crashes. Approval Request Comment [Feature/Bug causing the regression]: long-standing issue [User impact if declined]: shutdown crashes (particularly if using Cliqz) [Is this code covered by automated tests?]: there are tests that exercise this area of code, but nothing specifically for this issue [Has the fix been verified in Nightly?]: no, but we haven't seen new crashes with this signature on Nightly [Needs manual test from QE? If yes, steps to reproduce]: no [List of other uplifts needed for the feature/fix]: none [Is the change risky?]: there is slight risk with this patch (see next line) [Why is the change risky/not risky?]: this code pretty much makes PSM strictly more correct than what we had before. However, what could happen is it could expose already-incorrect behavior in code that uses PSM. This would probably be in the form JS code throwing exceptions (rather than crashing outright or succeeding by pure luck). If the calling code doesn't handle this, users may see broken behavior (but at least their browser shouldn't crash due to this). [String changes made/needed]: none
Flags: needinfo?(dkeeler)
Attachment #8847832 - Flags: approval-mozilla-aurora?
Comment on attachment 8847832 [details] bug 1344478 - isAlreadyShutDown should return true for nsNSSShutDownObjects created after NSS shut down Fix a crash and there are no new reports on 55.0a1 after the patch landed. Let's start it slowly. Aurora54+.
Attachment #8847832 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
I believe those are a result of an add-on using ctypes to directly call NSS functions, which isn't safe in general (and also probably isn't something we can prevent unless/until we outlaw using ctypes (in short, that's more or less a different bug). 0 libnss3.dylib PL_HashTableLookupConst nsprpub/lib/ds/plhash.c:349 1 libnss3.dylib SECOID_FindOID_Util security/nss/lib/util/secoid.c:2120 2 libnss3.dylib SECOID_FindOIDTag_Util security/nss/lib/util/secoid.c:2136 3 libnss3.dylib seckey_ExtractPublicKey security/nss/lib/cryptohi/seckey.c:577 4 XUL ffi_call_unix64 5 XUL ffi_call js/src/ctypes/libffi/src/x86/ffi64.c:535 6 XUL js::ctypes::FunctionType::Call js/src/ctypes/CTypes.cpp:7144
Flags: needinfo?(dkeeler)
If that's a different signature, should we go ahead and request 53 uplift then too? We're due to build the RC early next week.
Flags: needinfo?(dkeeler)
Comment on attachment 8847832 [details] bug 1344478 - isAlreadyShutDown should return true for nsNSSShutDownObjects created after NSS shut down Approval Request Comment [Feature/Bug causing the regression]: long-standing issue [User impact if declined]: shutdown crashes (particularly if using Cliqz) [Is this code covered by automated tests?]: there are tests that exercise this area of code, but nothing specifically for this issue [Has the fix been verified in Nightly?]: no, but we haven't seen new crashes with this signature on Nightly or Aurora [Needs manual test from QE? If yes, steps to reproduce]: no [List of other uplifts needed for the feature/fix]: none [Is the change risky?]: there is slight risk with this patch (see next line) [Why is the change risky/not risky?]: this code pretty much makes PSM strictly more correct than what we had before. However, what could happen is it could expose already-incorrect behavior in code that uses PSM. This would probably be in the form JS code throwing exceptions (rather than crashing outright or succeeding by pure luck). If the calling code doesn't handle this, users may see broken behavior (but at least their browser shouldn't crash due to this). So far as I know, we haven't seen this on Nightly or Aurora yet. [String changes made/needed]: none
Attachment #8847832 - Flags: approval-mozilla-beta?
David - what kind of broken behavior? It does not look like that common of a shutdown crash even on release, and it's not a new issue. But, if you think it is important to uplift, and also safer for users and more correct, we can try it in beta 10.
Flags: needinfo?(dkeeler)
Basically, if an add-on (like cliqz) or browser code (e.g. fx accounts) is trying to do something crypto-related at shutdown, this patch makes it more likely that that operation will throw a JS exception. If the calling code doesn't handle this, it could leave things in an inconsistent state. That said, since it's at shutdown, users probably won't notice as long as it doesn't cause loss of data. The best-case scenario is that this patch takes only every instance that would have resulted in a crash and turns it into a JS exception (in which case it would be a clear win and I wouldn't be concerned about uplifting this). However, I'm fairly sure that's not the case. That is, in addition to turning these crashes into exceptions, I think there are some cases where the browser would not have crashed and now encounters an exception (where it wouldn't have before). Thus, to the user, there's the potential of more perceived breakage. Since there's not much more time left for 53 on beta, perhaps it would be best to wait on this. In the meantime, cliqz can actually work around this bug themselves - by observing an early shutdown notification (something earlier than "profile-before-change" - "profile-change-teardown" perhaps), they can know to stop all activity that might require crypto operations.
Flags: needinfo?(dkeeler)
Comment on attachment 8847832 [details] bug 1344478 - isAlreadyShutDown should return true for nsNSSShutDownObjects created after NSS shut down From comment 23, sounds like there is a viable workaround for cliqz and it may be best to keep this riding the trains with 54 aurora. Thanks David.
Attachment #8847832 - Flags: approval-mozilla-beta? → approval-mozilla-beta-
Thanks for the update guys! We will look into it and try to implement the suggestion you mentioned. As we are in the process of moving towards WebExtension tech we were wondering if there is a way to get a similar signal to "profile-change-teardown" there. Any idea?
(In reply to lucian from comment #25) > Thanks for the update guys! We will look into it and try to implement the > suggestion you mentioned. > As we are in the process of moving towards WebExtension tech we were > wondering if there is a way to get a similar signal to > "profile-change-teardown" there. Any idea?
Flags: needinfo?(aswan)
(In reply to lucian from comment #25) > Thanks for the update guys! We will look into it and try to implement the > suggestion you mentioned. > As we are in the process of moving towards WebExtension tech we were > wondering if there is a way to get a similar signal to > "profile-change-teardown" there. Any idea? I don't know exactly what that event is but it sounds like it is related to shutdown? There is no explicit shutdown notification for webextensions but they can create a background page and listen for "unload" in the background page. This sounds like it could easily be a longer conversation, not related to this bug, I would suggest either a chat in #webextensions on IRC or a new bug.
Flags: needinfo?(aswan)
Attachment #8847832 - Flags: approval-mozilla-esr52?
Comment on attachment 8847832 [details] bug 1344478 - isAlreadyShutDown should return true for nsNSSShutDownObjects created after NSS shut down Based on comment 23, I don't think this is a must fix for ESR52.1 given the risk assessment. Unless the severity of this problem changes, I'd like to wontfix for esr52
Attachment #8847832 - Flags: approval-mozilla-esr52? → approval-mozilla-esr52-
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: