Crash in [@ PR_Unlock | nssTokenObjectCache_FindObjectsByTemplate | nssToken_FindObjectsByTemplate | nssToken_FindCertificateByIssuerAndSerialNumber | nssTrustDomain_UpdateCachedTokenCerts | PK11_DoPassword]
Categories
(NSS :: Libraries, defect, P1)
Tracking
(firefox-esr9199+ fixed, firefox97 wontfix, firefox98+ wontfix, firefox99+ fixed)
People
(Reporter: jesup, Assigned: jschanck)
References
(Blocks 1 open bug)
Details
(4 keywords, Whiteboard: [sec-moderate for Firefox][will be fixed in bug 1370866][sec-survey] [post-critsmash-triage][adv-main99+][adv-esr91.8+])
Crash Data
Attachments
(1 file, 1 obsolete file)
248 bytes,
text/plain
|
Details |
Long-term UAF, appears to go back to either 60 or 72. Hopefully a duplicate.
Crash report: https://crash-stats.mozilla.org/report/index/7f2e9311-1197-45bb-a6a4-134c10211117
Reason: EXCEPTION_ACCESS_VIOLATION_READ
Top 10 frames of crashing thread:
0 nss3.dll PR_Unlock nsprpub/pr/src/threads/combined/prulock.c:326
1 nss3.dll nssTokenObjectCache_FindObjectsByTemplate security/nss/lib/dev/devutil.c:736
2 nss3.dll nssToken_FindObjectsByTemplate security/nss/lib/dev/devtoken.c:412
3 nss3.dll nssToken_FindCertificateByIssuerAndSerialNumber security/nss/lib/dev/devtoken.c:832
4 nss3.dll nssTrustDomain_UpdateCachedTokenCerts security/nss/lib/pki/tdcache.c:474
5 nss3.dll PK11_DoPassword security/nss/lib/pk11wrap/pk11auth.c:640
6 nss3.dll PK11_Authenticate security/nss/lib/pk11wrap/pk11auth.c:324
7 xul.dll mozilla::psm::FindClientCertificatesWithPrivateKeys security/manager/ssl/nsNSSComponent.cpp:2684
8 xul.dll ClientAuthDataRunnable::RunOnTargetThread security/manager/ssl/nsNSSIOLayer.cpp:2317
9 xul.dll mozilla::psm::SyncRunnableBase::Run security/manager/ssl/PSMRunnable.cpp:31
Updated•3 years ago
|
Comment 1•3 years ago
|
||
Hey Benjamin, is this a sec-high?
Updated•3 years ago
|
Comment 2•3 years ago
•
|
||
Difficult to say, the crash address seem to have a fixed offset, so my understanding is that it should not be that easy to trigger.
Maybe sec-moderate, but I would trust Randall's and Dan's judgement more than mine...
Updated•3 years ago
|
Assignee | ||
Comment 3•3 years ago
|
||
In PK11_DoPassword(PK11SlotInfo *slot, ...)
we have a non-owning reference to slot->nssToken
that we pass to nssTrustDomain_UpdateCachedTokenCerts
. The crash happens when the reference count for slot->nssToken
is decremented to 0 in another thread (probably here, based on the fact that this affects Mac OS and Windows versions after osclientcerts landed). We should be able to reduce the crash frequency by taking an owning reference to slot->nssToken
in PK11_DoPassword
.
There's a deeper problem though, and I think that fixing this bug properly would involve finishing some of the abandoned "Stan" work, e.g. Bug 135532. As it stands, I see no thread-safe way to obtain an owning reference to slot->nssToken
from PK11SlotInfo *slot
; c.f. the extensive discussion PK11_GetNSSToken
in Bug 1366187. Performing a null check on slot->nssToken
before taking an owning reference, as in Bug 1370866, narrows the window for a race but does not eliminate it.
Assignee | ||
Comment 4•3 years ago
|
||
Comment 5•3 years ago
|
||
Hmm it seems that:
- the slot should own a reference to it's nssToken,
- There should be a function which "atomicly" returns the token for the slot.
2a) the returned token should be checked for null - The code that changes or clears nssToken should use the same lock as the one in function 2 to make sure #2 is atomic.
The biggest question is does 1) mess up the token life cycle?
bob
Comment 6•3 years ago
|
||
So we already have the appropriate calls (warning pseudo code without complete spellings):
PK11Slot_SetNSSToken() {
new == addref(nsst);
lock()
old = ssl->nssToken;
ssl->nssToken = new;
unlock();
if (old) freeref(old);
}
PK11Slot_GetNSSToken()
{
lock();
if (ssl->nssToken) {
nsst = addref(ssl->nssToken);
}
unlock();
return nsst;
}
For the one existing caller we need to add an extra destroy because it now returns a reference. Other places we can now use this function to get a durable token (until it's freed).
bob
Updated•3 years ago
|
Assignee | ||
Comment 7•3 years ago
|
||
Thanks, Bob. I've taken this approach in the latest revision. The patch also adds null checks for all but two of the PK11Slot_GetNSSToken
calls. There are two exceptions are in sha1_hash
and md5_hash
in devtoken.c
. Those require a bit of refactoring, and I think they should be handled in a separate bug.
Comment 8•3 years ago
|
||
Thanks John, I really had only one comment on the patch, other than that it looks good. Since it's so extensive, it would be good to make sure all our tests (including the memory leak tests) come back clean.
Assignee | ||
Updated•3 years ago
|
Assignee | ||
Comment 9•3 years ago
|
||
Comment on attachment 9261617 [details]
Bug 1745667 - Make PK11Slot_GetNSSToken thread safe. r?rrelyea,beurdouche
Security Approval Request
- How easily could an exploit be constructed based on the patch?: Not easily. The patch fixes data races that happen primarily at shutdown. The exact location of this UAF is not obvious from the patch. The fact that there may be UAFs due to threading issues in token destruction is well documented in public bugs (e.g. 1607588, 1366187, 1370866).
- Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?: No
- Which older supported branches are affected by this flaw?: all
- If not all supported branches, which bug introduced the flaw?: None
- Do you have backports for the affected branches?: No
- If not, how different, hard to create, and risky will they be?: Not difficult. The patch applies cleanly to NSS 3.68.2, but I have not reviewed it thoroughly.
- How likely is this patch to cause regressions; how much testing does it need?: The most likely type of regression would be a memory leak. A full nss-try run should be sufficient.
Comment 10•3 years ago
|
||
Comment on attachment 9261617 [details]
Bug 1745667 - Make PK11Slot_GetNSSToken thread safe. r?rrelyea,beurdouche
Approved to land and request uplift
Updated•3 years ago
|
Comment 12•3 years ago
|
||
Changing the priority to p1 as the bug is tracked by a release manager for the current beta.
See What Do You Triage for more information
Comment 13•3 years ago
|
||
In Firefox this crash happens in the parent process, and not directly in response to things happening in the client. It seems safe to call this race "sec-moderate for Firefox" and not have to backport to ESR.
On the other hand, it does not look like there's a big regression risk. It would be nice to have it backported
Comment 14•3 years ago
|
||
John mentioned wanting to check this patch in as part of bug 1370866 which seems like a good plan.
Updated•3 years ago
|
Assignee | ||
Comment 15•3 years ago
|
||
Current plan is to test on nss-try early next week and check in under bug 1370866 before the NSS 3.76 tree freeze on Thursday, Feb 24.
Comment 16•3 years ago
|
||
IIUC, we're having this all ride with the 99/91.8esr cycle.
Updated•3 years ago
|
Assignee | ||
Comment 17•3 years ago
|
||
Updated•3 years ago
|
Updated•3 years ago
|
Comment 19•3 years ago
|
||
As part of a security bug pattern analysis, we are requesting your help with a high level analysis of this bug. It is our hope to develop static analysis (or potentially runtime/dynamic analysis) in the future to identify classes of bugs.
Please visit this google form to reply.
Assignee | ||
Updated•3 years ago
|
Updated•3 years ago
|
Updated•3 years ago
|
Updated•3 years ago
|
Updated•3 years ago
|
Comment 20•3 years ago
|
||
Updated•3 years ago
|
Updated•3 years ago
|
Updated•3 years ago
|
Description
•