Closed Bug 1297397 Opened 8 years ago Closed 8 years ago

race condition in nssSlot_IsTokenPresent() causes spurious SEC_ERROR_NO_TOKEN

Categories

(NSS :: Libraries, defect)

3.21
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: kdudka, Unassigned)

Details

Attachments

(1 file)

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Firefox/45.0
Build ID: 20160810203402

Steps to reproduce:

Run PK11_FindSlotByName() concurrently on a single slot that always has a token inside.  In my case it was a slot that does not define the GetTokenPresent callback.


Actual results:

Occasionally PK11_FindSlotByName() spuriously fails with SEC_ERROR_NO_TOKEN.


Expected results:

PK11_FindSlotByName() should work reliably in a multi-threaded environment.
This was originally reported by Peter Wang at the curl-library mailing list:

https://curl.haxx.se/mail/lib-2016-08/0119.html
https://curl.haxx.se/mail/lib-2016-08/0128.html

I was able to repeat the issue on RHEL-7 VM using the provided reproducer:

https://curl.haxx.se/mail/lib-2016-08/att-0119/curl_nss_test_case.c

After having a brief look at the code, I believe the race condition is obvious:

PK11_FindSlotByName() acquires only a "read" lock, which does not ensure 
mutual exclusion of readers.  So multiple calls of PK11_FindSlotByName()
can be executed in parallel.  As there is no additional lock on the way,
nssSlot_IsTokenPresent() can also be executed in parallel.

If there is a schedule where one thread has assigned slot->lastTokenPing
but not yet set the CKF_TOKEN_PRESENT flag in slot->ckFlags, all other
threads will keep returning SEC_ERROR_NO_TOKEN (without actually checking 
presence of the token) until the first thread continues (or the specified 
timeout elapses).

In order to fix it, we need to add a mutex to nssSlot_IsTokenPresent() or
at least reorder mutation of slot's state such that slot->lastTokenPing is
not updated before the CKF_TOKEN_PRESENT flag in slot->ckFlags is updated.
Kamil, thanks for your bug report and your analysis.

Would you like to suggest a patch to fix the issue?

Bob, do you have thoughts/suggestions on Kamil's report?
Status: UNCONFIRMED → NEW
Ever confirmed: true
Flags: needinfo?(rrelyea)
Attached patch proposed fixSplinter Review
Attachment #8787677 - Flags: review?
Kamil, when you request a review, please also set a suggested reviewer. Having a review-request flag has higher visibility than just an attachment without a comment, because there are so many emails.
Comment on attachment 8787677 [details] [diff] [review]
proposed fix

Bob, could you please review Kamil's patch?
Attachment #8787677 - Flags: review? → review?(rrelyea)
(In reply to Kai Engert (:kaie) from comment #4)
> Kamil, when you request a review, please also set a suggested reviewer.

Based on what?  What were the choices?  Where was it documented?

> Having a review-request flag has higher visibility than just an attachment

Needinfo on Bob is set since October 2016.  Attachment #8787677 [details] [diff] has review? flag set since September 2016.  I am not convinced that setting additional Bugzilla flags is going to help us...

> without a comment

The attachment is a properly formatted HG commit, will try to make it more obvious next time.

> , because there are so many emails.

It would be a good idea to search for open NSS bugs from time to time.  I am afraid this bug is not the only one that nobody has looked at for several releases of NSS.
(In reply to Kamil Dudka from comment #6)
> Needinfo on Bob is set since October 2016.

I meant September 2016.
(In reply to Kamil Dudka from comment #7)
> (In reply to Kamil Dudka from comment #6)
> > Needinfo on Bob is set since October 2016.
> 
> I meant September 2016.

Oops, August actually, but it does not really matter...
Comment on attachment 8787677 [details] [diff] [review]
proposed fix

Review of attachment 8787677 [details] [diff] [review]:
-----------------------------------------------------------------

The big semantic change this makes is 
  1. calls to is_present on another thread between the time of the delay period and before we get actual status will return incorrect data. this is the bug that is being fixed.
  2. checks within the delay period does not extend the delay period. 
         - This is may be a good thing, in that continual checks within the delay period would cause the us to never check the state of the token again (because we are always in the delay perios,
         - however --- it means that we may make a Get_SlotInfo() call in a middle of a transaction because is started before the last delay period.

I think on the whole both changes are desired. It may also be desirable to put almost the whole function inside the slot_monitor.

bob
Attachment #8787677 - Flags: review?(rrelyea) → review+
I've marked the patch r+ with my comments.
Flags: needinfo?(rrelyea)
Bob, it's not clear to me if you are requesting mandatory changes, or have suggested optional changes, can you please clarify?

Given that Bob has marked the patch r+, does that mean that Bob's comments are optional?

Kamil, please let me know, if you will do another patch, or if the current patch is ready for checkin. Once it's ready, please set the keyword "checkin-needed", or even better, please send me personal email, requesting me or Bob to check the patch in for you.
Flags: needinfo?(rrelyea)
Flags: needinfo?(kdudka)
Bob's comments are optional. They are merely documenting the full semantic of this patch in case others may object or some one running down this patch and finding this bug has a leg up in understanding it's implications.

bob
Flags: needinfo?(rrelyea)
(In reply to Kai Engert (:kaie) from comment #11)
> Kamil, please let me know, if you will do another patch, or if the current
> patch is ready for checkin. Once it's ready, please set the keyword
> "checkin-needed", or even better, please send me personal email, requesting
> me or Bob to check the patch in for you.

I am fine with the patch as it is.  This bug is worked around in libcurl since August 2016 anyway:

https://github.com/curl/curl/commit/3a5d5de9
Keywords: checkin-needed
Flags: needinfo?(kdudka)
Try build looks good:
https://treeherder.mozilla.org/#/jobs?repo=nss-try&revision=35a3f450170cf99c939451490215777c727f149e

Pushed:
https://hg.mozilla.org/projects/nss/rev/754a4a1f6220
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → 3.29
Keywords: checkin-needed
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: