Closed
Bug 212112
Opened 21 years ago
Closed 21 years ago
A recursive deadlock on td->cache->lock
Categories
(NSS :: Libraries, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
3.9
People
(Reporter: wtc, Assigned: wtc)
References
Details
(Whiteboard: [3.8.2][3.7.8])
Attachments
(1 file)
922 bytes,
patch
|
nelson
:
review+
bugz
:
superreview+
|
Details | Diff | Splinter Review |
The NSS version is 3.7.7. An NSS-based application was hung because it tried to lock td->cache->lock recursively: (gdb) where #0 0x90034828 in semaphore_wait_trap () #1 0x90009d18 in pthread_mutex_lock () #2 0x003a45a0 in PR_Lock (lock=0xf3ce60) at ../../../../mozilla/nsprpub/pr/src/pthreads/ptsynch.c:190 #3 0x0095f868 in nssTrustDomain_RemoveTokenCertsFromCache (td=0xf46bb8, token=0xf4b838) at tdcache.c:486 #4 0x008ef4f0 in nssToken_NotifyCertsNotVisible (tok=0xf4b838) at dev3hack.c:311 #5 0x00967a14 in nssSlot_IsTokenPresent (slot=0xf4c248) at devslot.c:360 #6 0x00967ac4 in nssSlot_GetToken (slot=0xf4c248) at devslot.c:403 #7 0x0095e808 in nssTrustDomain_FindTrustForCertificate (td=0xf46bb8, c=0x37c3e80) at trustdomain.c:1194 #8 0x00965d60 in nssTrust_GetCERTCertTrustForCert (c=0x37c3e80, cc=0x37c4660) at pki3hack.c:537 #9 0x009663f0 in fill_CERTCertificateFields (c=0x37c3e80, cc=0x37c4660, forced=1) at pki3hack.c:690 #10 0x00966508 in stan_GetCERTCertificate (c=0x37c3e80, forceUpdate=1) at pki3hack.c:718 #11 0x00966594 in STAN_ForceCERTCertificateUpdate (c=0x37c3e80) at pki3hack.c:734 #12 0x0095f8fc in nssTrustDomain_RemoveTokenCertsFromCache (td=0xf46bb8, token=0xf4b838) at tdcache.c:492 #13 0x008ef4f0 in nssToken_NotifyCertsNotVisible (tok=0xf4b838) at dev3hack.c:311 #14 0x00967a14 in nssSlot_IsTokenPresent (slot=0xf4c248) at devslot.c:360 #15 0x0096bed0 in nssToken_IsPresent (token=0xf4b838) at devtoken.c:1620 #16 0x00911120 in pk11_IsPresentCertLoad (slot=0xf45140, loadCerts=1) at pk11slot.c:2012 #17 0x0091131c in PK11_IsPresent (slot=0xf45140) at pk11slot.c:2059 #18 0x0090da28 in PK11_FindSlotByName (name=0x3628ba0 "AOL Key") at pk11slot.c:584 We believe that the error occurs in the vicinity of functions nssTrustDomain_RemoveTokenCertsFromCache and STAN_ForceCERTCertificateUpdate. Two questions: 1. Does nssTrustDomain_RemoveTokenCertsFromCache need to call STAN_ForceCERTCertificateUpdate at all? 2. If the answer to question 1 is yes, does nssTrustDomain_RemoveTokenCertsFromCache need to call STAN_ForceCERTCertificateUpdate while holding td->cache->lock?
Assignee | ||
Comment 1•21 years ago
|
||
I looked at the other places where we call STAN_ForceCERTCertificateUpdate and concluded that it is not necessary to call that function while holding td->cache->lock. This patch moves the STAN_ForceCERTCertificateUpdate call outside the td->cache->lock critical section. I have to iterate through the whole dtor.certs array twice. It should be possible to avoid that, but I think these arrays (the certs on a removable token) should be small so that the cost of iterating through the whole array is not high.
Assignee | ||
Comment 2•21 years ago
|
||
Comment on attachment 127290 [details] [diff] [review] Call STAN_ForceCERTCertificateUpdate without holding td->cache->lock Nelson, please review this patch. You might also want to use LXR to verify that we are calling STAN_ForceCERTCertificateUpdate without holding td->cache->lock elsewhere in NSS. Ian, if you could answer my two questions and review this patch, I'd appreciate it.
Attachment #127290 -
Flags: superreview?(bugz)
Attachment #127290 -
Flags: review?(nelsonb)
Comment 3•21 years ago
|
||
Comment on attachment 127290 [details] [diff] [review] Call STAN_ForceCERTCertificateUpdate without holding td->cache->lock Certainly looks correct to me. I agree performance is hopefully not a problem in this are, since the number of token certs should be small (on a removable token).
Attachment #127290 -
Flags: superreview?(bugz) → superreview+
Assignee | ||
Comment 4•21 years ago
|
||
Fix checked into the NSS tip (NSS 3.9), NSS_3_8_BRANCH (NSS 3.8.2), and NSS_3_7_BRANCH (NSS 3.7.8).
Status: NEW → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
Whiteboard: [3.8.2][3.7.8]
Target Milestone: --- → 3.9
Comment 5•21 years ago
|
||
Comment on attachment 127290 [details] [diff] [review] Call STAN_ForceCERTCertificateUpdate without holding td->cache->lock The solution employed here solves the deadlock but doesn't address the recursion itself, which I believe it fundamentally unnecessary. I wish that recursion could actually be eliminated. But I'm satisfied that this fix is correct.
Attachment #127290 -
Flags: review?(nelsonb) → review+
Assignee | ||
Comment 6•21 years ago
|
||
Good point, Nelson. I opened bug 212474 for the elimination of the recursion.
Assignee | ||
Comment 7•21 years ago
|
||
*** Bug 216945 has been marked as a duplicate of this bug. ***
You need to log in
before you can comment on or make changes to this bug.
Description
•