Closed
Bug 225525
Opened 22 years ago
Closed 19 years ago
race assigning NSSCertificate fields leaks memory and slot reference
Categories
(NSS :: Libraries, defect, P1)
Tracking
(Not tracked)
RESOLVED
FIXED
3.11.3
People
(Reporter: nelson, Assigned: julien.pierre)
References
Details
Attachments
(1 file, 9 obsolete files)
37.90 KB,
patch
|
nelson
:
review+
rrelyea
:
superreview+
|
Details | Diff | Splinter Review |
The NSSCertificate structure has a member named "decoding", which points
to an nssDecodedCert structure. There are several places in NSS that
assign a value to this member (e.g. c->decoding). Typical code looks like
if (!c->decoding)
c->decoding = somefunction(some,arguments);
Such code is found in pki/certificate.c, pki/pki3hack.c, pki/pkibase.c
None of these places uses a lock or other mutual exclusion device to make
the test and the update atomic. Consequently, there is a race in which
multiple threads may find that the pointer is null, call the function to
create a new decoding, and assign it to c->decoding. All but one of
those assignments results in a leak.
It is unclear to me what existing lock, if any, should be used to achieve
mutually exclusive access to this member of nssCertificate. Perhaps it is
NSSCertificate.object.lock.
I wonder: should the lock already be held when the functions that manipulate
c->decoding are called? or should this function acquire and release the
lock?
Is it possible that sometimes (in some code paths) the lock IS held when these
functions are called, and at other times (in other paths) it is not?
If that were true, then acquiring the lock in the functions that manipulate
c->decoding could deadlock if the lock was already held.
I will shortly checkin a temporary workaround for this problem. The workaround
doesn't eliminate the race, but it does attempt to detect the race and free
duplicate copies rather than leaking them. It reduces the window of opportunity
for the leak, but doesn't eliminate it. The patch for that change will be
attached here imminently.
Someone who knows how object locking in Stan code is supposed to work should
fix this race.
Reporter | ||
Comment 1•22 years ago
|
||
Reporter | ||
Updated•20 years ago
|
QA Contact: bishakhabanerjee → jason.m.reid
Reporter | ||
Updated•19 years ago
|
Attachment #135374 -
Attachment description: Workaround race, reduce leaks, v1 → Workaround race, reduce leaks, v1 (checked in)
Reporter | ||
Updated•19 years ago
|
Assignee: wtchang → nobody
QA Contact: jason.m.reid → libraries
Assignee | ||
Comment 2•19 years ago
|
||
This workaround does not work on all platforms. On those without total store order, such as PowerPC, the test
if (c->decoding)
may give the wrong result, ie. the thread will see NULL, even if another thread already assigned it.
On those chips, going through a lock is required to synchronize the cache lines. So, the leak could still exist.
We need to fix this properly.
I'm wondering if this race could be related to bug 331164 .
Reporter | ||
Comment 3•19 years ago
|
||
Isn't PR_AtomicSet supposed to ahieve "total store order" ?
Assignee | ||
Comment 4•19 years ago
|
||
No. PR_AtomicSet only affects the memory area being set/read. It's not guaranteed to synchronize the cache for any other memory area.
Assignee | ||
Comment 5•19 years ago
|
||
The following code has problems regardless of the store order of the CPU architecture.
/* Once this race is fixed, an assertion should be put
** here to detect any regressions.
PORT_Assert(!c->decoding);
*/
if (!c->decoding) {
c->decoding = dc;
} else {
/* Reduce the leaks here, until the race is fixed. */
nssDecodedPKIXCertificate_Destroy(dc);
dc = c->decoding;
PR_AtomicIncrement(&race); /* my addition */
}
Multiple threads can find c->decoding to be NULL, and then get pre-empted right after the test. Only one of the threads will get to do the assignment. The other threads will still leak their object when they resume and overwrite the first copy .
I added the PR_AtomicIncrement (race is a global integer) to find out how commonly this was happening. It turns out it's happening a great deal in my test case for bug 331279 . When I get the assertion on the leaked reference and examine the core files, the value of race is between 0 and 2, for a strsclnt doing 10 connections, on a dual CPU AMD box.
When I run with two connections only, the value of race is *always* 1 when I get the assertion about the reference leak. At this point, I think this race is the cause of bug 331279.
Blocks: 331279
Assignee | ||
Comment 6•19 years ago
|
||
I also added a PORT_Assert(race == 0) in strsclnt *after* NSS_Shutdown() . It was hit. That means sometimes the code in the else statement in comment 5 is able to recover from the race, since NSS_Shutdown() doesn't report a reference leak.
A lock is needed, but where does it belong ? For now, I'm going to try the brutal way with a global lock in this function and see if that resolves the reference leak issue.
Assignee | ||
Comment 7•19 years ago
|
||
This global lock (which I initialized / destroyed in nssinit.c) fixes this bug and bug 331279 at the same time. Hopefully there is a better place for the lock than global scope, because it will reduce scalability.
Reporter | ||
Comment 8•19 years ago
|
||
Julien, I'm giving this bug to you, since you've already done most of the
work on it, and have even produced a preliminary (?) patch.
Assignee: nobody → julien.pierre.bugs
Priority: -- → P2
Target Milestone: --- → 3.11.2
Assignee | ||
Comment 9•19 years ago
|
||
This race condition is the leading cause of failures in __CERT_NewTempCertificate . See the test program attached in bug 331279 .
Assignee | ||
Comment 10•19 years ago
|
||
There are two possible fixes here :
1) the brutal one that I attached - use a global lock within stan_GetCERTCertificate.
2) add a lock to each NSSCertificate, which would need to be acquired before inspecting c->decoding and setting it. It may need to be held while calling fill_CERTCertificateFields as well.
Logically, the second fix makes the most sense. There may be other places where the NSSCertificate is written to or read from where the lock also would need to be acquired.
Of course, this will considerably change the lock usage of NSS if each cert suddenly has its own lock.
Bob, Nelson, I would like to have your opinions on this matter.
Assignee | ||
Comment 11•19 years ago
|
||
I tried to implement option 2. It looks like the base Stan nssPKIObject already has a lock. I tried to hold it throughout stan_GetCERTCertificate .
But I ran into a problem where nssPKIObject_GetInstances tries to acquire the lock again later, but it's not safe to be acquired again recursively, so NSPR asserts.
I don't think it's safe to release the NSSCertificate lock before calling fill_CERTCertificateFields .
Unfortunately,nssPKIObject_GetInstances gets used in many other places, so I would rather not change it .
So, should we change the Stan nssPKIObject lock to a recursive type ?
t@null (l@1) program terminated by signal ABRT (Abort)
0xfffffd7ffeeade5a: __lwp_kill+0x000a: jae __lwp_kill+0x18 [ 0xfffffd7ffeeade68, .+0xe ]
Current function is PR_Assert
538 abort();
(dbx) w
[1] __lwp_kill(0x1, 0x6, 0xffffffff88d72860, 0xfffffd7ffeeae74e, 0xfffffd7fffdfe440, 0xfffffd7ffeea0755), at 0xfffffd7ffeeade5a
[2] _thr_kill(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7ffeea8e83
[3] raise(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7ffee573e9
[4] abort(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7ffee3a3d0
=>[5] PR_Assert(s = 0xfffffd7fff0b9ab8 "0 == rv", file = 0xfffffd7fff0b9ac0 "../../../../pr/src/pthreads/ptsynch.c", ln = 207), line 538 in "prlog.c"
[6] PR_Lock(lock = 0x54d380), line 207 in "ptsynch.c"
[7] nssPKIObject_GetInstances(object = 0x5526d8), line 350 in "pkibase.c"
[8] get_cert_instance(c = 0x5526d8), line 634 in "pki3hack.c"
[9] fill_CERTCertificateFields(c = 0x5526d8, cc = 0x54c280, forced = 0), line 718 in "pki3hack.c"
[10] stan_GetCERTCertificate(c = 0x5526d8, forceUpdate = 0), line 824 in "pki3hack.c"
[11] STAN_GetCERTCertificateOrRelease(c = 0x5526d8), line 863 in "pki3hack.c"
[12] PK11_FindCertFromNickname(nickname = 0x43dba0 "monstre.red.iplanet.com", wincx = 0x43ef80), line 618 in "pk11cert.c"
[13] main(argc = 16, argv = 0xfffffd7fffdfeab8), line 1974 in "selfserv.c"
(dbx)
Assignee | ||
Comment 12•19 years ago
|
||
RE: comment 11, I have confirmed that it's not safe to call fill_CERTCertificateFields without holding the NSSCertificate lock. I tried to release the lock just before the call, since nssPKIObject_Instances needed to acquire it later. But unfortunately the reference leak appeared right away in my test. So we really need to deal with this problem either by making the lock safe for recursive usage or through some change to the Stan API.
Reporter | ||
Comment 13•19 years ago
|
||
NSSCertificate is a "subclass" of nssPKIObject. Every nssPKIObject has a lock.
So each NSSCertificate has a lock named NSSCertificate.object.lock.
One probably should look to see how it's used now, but I think I'd try to
use that existing lock rather than inventing YA one.
Assignee | ||
Comment 14•19 years ago
|
||
Nelson,
Yes, I found out about the Stan lock. See comment 11 and comment 12. I'm working on a patch now that uses that lock. I think I can take care of the recursivity issue by unlocking selectively before the lower-level calls that require the lock in the fill function.
Assignee | ||
Comment 15•19 years ago
|
||
This patch solves the race condition that causes the reference leak. It is not the most optimized patch, but it is safe. I ran my test case for 20 minutes with it without seeing a reference leak.
I'm basically holding the lock throughout stan_GetCERTCertificate except that I release and reacquire it :
1) around the call to fill_CERTCertificateFields
2) within fill_CERTCertificateFields, around the call to nssCryptoContext_FindTrustForCertificate(context, c);
3) within fill_CERTCertificateFields, around the call to nssTrust_GetCERTCertTrustForCert
The first of these unlock/locks could be optimized out by either merging fill_CERTCertificateFields with stan_GetCERTCertificate, since that's its only caller - or by changing the semantics of fill_CERTCertificateFields to assume that the cert object is locked coming in and going out.
I don't know for sure that the 2nd is necessary because that part of the code wasn't executed during my run.
The 3rd one is required - this is where I hit the recursivity problem.
Attachment #135374 -
Attachment is obsolete: true
Attachment #218507 -
Attachment is obsolete: true
Attachment #225508 -
Flags: superreview?(rrelyea)
Attachment #225508 -
Flags: review?(nelson)
Assignee | ||
Comment 16•19 years ago
|
||
There was one more place where an unlock/lock was needed :
4) In stan_GetCERTCertificate, there is also a call to nssTrust_GetCERTCertTrustForCert which requires the object lock not to be held on entry.
Attachment #225508 -
Attachment is obsolete: true
Attachment #225514 -
Flags: superreview?(rrelyea)
Attachment #225514 -
Flags: review?(nelson)
Attachment #225508 -
Flags: superreview?(rrelyea)
Attachment #225508 -
Flags: review?(nelson)
Assignee | ||
Comment 17•19 years ago
|
||
This is a very old race, not a regression. Moving to 3.11.3 target.
Target Milestone: 3.11.2 → 3.11.3
Reporter | ||
Comment 18•19 years ago
|
||
Comment on attachment 225514 [details] [diff] [review]
update
I suppose that this patch reduces the window of opportunity for a race
significantly, but I doubt that it eliminates that race. Each of the
places where the patched code releases the lock, in hopes of immediately
reacquiring it, gives another thread an opportunity to get the lock there,
with the result that the state of the object, after reaquiring the lock,
is not guaranteed to be the same as before releasing it. You might be
able to convince me, after MUCH work, that this is all OK, but it would
not be easy.
I think a better approach is to use re-entrant locks, a.k.a PRMonitor
instead of PRLock. It is straight-forward to replace
PRLock with PRMonitor
PR_NewLock() with PR_NewMonitor(level) // can use a const
PR_Lock with PR_EnterMonitor
PR_Unlock with PR_ExitMonitor
PR_DestroyLock with PR_DestroyMonitor
I could easily be convinced of the correctness if you switch to PRMonitors.
Assignee | ||
Comment 19•19 years ago
|
||
I'd like to switch from PRLock to PRMonitors. But I don't know if I should do it for the Stan nssPKIObject . Doing that might have unforeseen consequences on the rest of the Stan code. On the other hand, having an extra PRMonitor when there is already a PRLock might lead to deadlocks.
Reporter | ||
Comment 20•19 years ago
|
||
Another technique is to take existing functions that have the form
func_name(args)
{
get lock
do lots of stuff
release lock
}
and "factor out" the "do lots of stuff" part into a
separate function, called func_name_locked(args) and change func_name to
func_name(args)
{
get lock
func_name_locked(args)
release lock
}
The other functions that already have the lock call func_name_locked
instead of calling func_name.
One nice thing about PRMonitor is that it's easy to create a test macro
that can be used in assertions, e.g. PORT_Assert(iHaveObjectMonitor(object))
I use this technique in libSSL.
When I converted libSSL from being single-thread-only code to being
thread-safe, I found numerous cases of nested locking and unlocking, and
decided to use Monitors instead. Instead of converting the code directly
from using PR_Lock calls to PR_EnterMonitor calls, I converted them to use
macros. That way, it was easy to switch the implementation between PRLock
and PRMonitor with only a tiny set of macro changes in one header file.
See http://lxr.mozilla.org/security/source/security/nss/lib/ssl/sslimpl.h#1172
and following lines.
Also, You know about the duality of PR_Lock and PZ_Lock.
There is also a set of PZ_ macros/functions to match the PR_ functions for
monitors.
By the way, I've never regretted the conversion to using PRMonitors.
Comment 21•19 years ago
|
||
OK, so I have the opposite problem with the patch then Nelson.
The patch tries to hold a lock over lots of function calls and operations. It's not a lock on invariant operations as much as a monitor lock.
What I would prefer to see is lots of short snippets which do something like:
if (cert->value == NULL) {
value = some_other_function_to_decode();
lock();
if (cert->value == NULL) {
cert->value = value;
value = NULL;
}
unlock();
if (value) {
free(value);
}
}
if you are going to hold the lock over the whole thing, then Nelson is correct, monitors are the right way to go.
I'm leary of adding to many monitors into NSS, though. With monitors there are a higher chance of holding multiple types of monitors at once, which increases the chance of deadlock. Nelson's use in SSL is one of the classic safe cases because of good code layering enforced by shared libraries. (you know the cert library is never going to get it's lock before the SSL lock because it never calls SSL).
We're relatively safe if each layer only has one lock... as long as we manage callbacks carefully.
bob
Comment 22•19 years ago
|
||
Ideally we should figure out which data need to
be protected with locks and use regular locks
when we read or modify the data. Threading experts
recommend we reserve recursive locks for making
"legacy" code thread-safe.
In practice it is us, not the threading experts,
who need to make our code thread-safe. Sometimes
the engineering cost of doing "the right thing"
is prohibitive. Remember, the cost includes not only
Julien's time but also the code reviewer's time
and future maintainers' time. So I understand
Nelson's preference for recursive locks.
Since I am a threading expert, I agree with Bob.
Reporter | ||
Comment 23•19 years ago
|
||
IINM, in *this* case, the lock is serving the purpose of ensuring that the
object stays around continuously, and isn't deleted out from under us.
If we can fix that, perhaps with proper use of ref counts, then I agree that
we can switch to "fine grain" locking. Until then...
Assignee | ||
Comment 24•19 years ago
|
||
Nelson,
Re: comment 23, AFAIK there is no risk of the NSSCertificate object going away in this case. What we are trying to fix are the unsafe updates to NSSCertificate.decoding and NSSCertificate.instance->token->pk11slot, which cause memory leaks and slot reference leaks, as well as potential other writes to the NSSCertificate field which may be unsafe.
There is another race about preventing the object from being deleted, and that's bug 341323. Maybe that's the one you were thinking of.
Assignee | ||
Updated•19 years ago
|
Summary: race assigning NSSCertificate.decoding leaks memory → race assigning NSSCertificate.decoding leaks memory and slot
Assignee | ||
Comment 25•19 years ago
|
||
Re: comment #21,
Bob,
Please consider the rarity of contention on this lock. The lock will be acquired at the time that a certificate is decoded. But it is almost never contended. The only time it is contended is if multiple threads are trying to decode the same certificate at the same time. This can only happen if those threads all began decoding the certificate around the same time. Otherwise, the second thread will find an already-decoded copy and decoding will not proceed. The only programs that runs into this currently is our own strsclnt, which starts threads that run precisely at the same time and all get into the full SSL handshake and obtain the SSL server certificate at the same time. In fact, we never saw this race on the slot for years with RSA algorithms - we only started seeing it recently with ECC algorithms. Even strsclnt will only be affected a couple times around the start of the program, before a copy of the cert is put in the cert cache. Afterwards, this decoding code won't even get to run anymore. Real-world applications aren't likely to be affected. A client is unlikely to do what strsclnt is doing, and the penalty will only be at startup. A server probably will not run into it at all, because it will likely be decoding different certs from client in each of its threads most likely.
I understand the desire to optimize and hold the lock for as short a period of time as possible, but this is such a rare case that I don't think it matters.
In fact, for 99% of the cases when the lock is uncontended, having multiple fine-grain locking/unlocking for each field that needs to be protected may end up costing more in performance than one or a small number of lock acquisitions.
Re: comment #22,
Wan-Teh,
Unfortunately, we know fully well that application code already reads CERTCertificate fields content such as the pk11slot without using any locks, once they have obtained the pointer to the CERTCertificate. The only way to force apps to do start using locks for reads is to introduce binary incompatible changes, such as making the content of the structure private and providing accessor functions for each field.
We can be reasonably confident that application code isn't writing to these fields - only NSS code should be writing. My patch attempts to prevent problems due to multiple writers by serializing the writes as much as is possible. This is necessarily imperfect on platforms without total store order, as readers may still end up getting an "old" value if cache lines haven't been synchronized. Only an NSS API change can fix that. However, my patch appears to solve the problem for platforms that have total store order.
Logically, it would make more sense to use the monitor which is recursive, rather than the multiple lock/unlocks I have in my patch. However, this will have performance impact on the rest of Stan. Also, given that most readers of CERTCertificate already aren't using the lock at all as I established earlier, they won't be using the monitor either, and we will be paying the extra cost of a monitor vs lock without any benefit .
Assignee | ||
Updated•19 years ago
|
Attachment #225514 -
Flags: review?(wtchang)
Assignee | ||
Updated•19 years ago
|
Summary: race assigning NSSCertificate.decoding leaks memory and slot → race assigning NSSCertificate fields leaks memory and slot reference
Assignee | ||
Comment 26•19 years ago
|
||
*** Bug 331279 has been marked as a duplicate of this bug. ***
Assignee | ||
Comment 27•19 years ago
|
||
The test case for this bug is to run selfserv and strsclnt as follows :
[jp96085@monstre]/export/home/nss/tip/mozilla/dist/SunOS5.10_i86pc_DBG.OBJ/bin 158 % more serve_331279
#!/bin/tcsh
./selfserv -D -p 8443 -d ../../../tests_results/security/monstre.1/ext_server -n monstre.red.iplanet.com \
-w nss -r -r
[jp96085@monstre]/export/home/nss/tip/mozilla/dist/SunOS5.10_i86pc_DBG.OBJ/bin 159 % more cl_331279
#!/bin/tcsh
while (1)
strsclnt -o -o -q -p 8443 -d ../../../tests_results/security/monstre.1/ext_client -B -s -w nss -c 100 -C c -T -N -n ExtendedSSLUser \
monstre.red.iplanet.com
end
NSS_STRICT_SHUTDOWN must be set , and strsclnt will dump core because of the reference leak assertion .
The memory leak also occurs, but there is no easy way to verify that since strsclnt runs short-lived in the above script. If you run it longer, you will see the process memory grow. One can verify that this bug is the cause of the memory leak using the method I specified in comment 5 .
Assignee | ||
Comment 28•19 years ago
|
||
Regarding the changes suggested in our concall this morning :
nssDecodedPKIXCertificate_Create is the function that creates/allocates the CERTCertificate. It does so by calling CERT_DecodeDERCertificate . The resulting pointer is then stuck inside the NSSCertificate .
STAN_GetCERTCertificate fills other fields of this CERTCertificate structure, such as the slot, after nssDecodedPKIXCertificate_Create is called .
I looked into the possibility of doing filling the other fields within nssDecodedPKIXCertificate_Create . It appears that it might be possible. However, this will have a detrimental impact on performance. That is because nssDecodedPKIXCertificate_Create has multiple callers. The other caller is nssDecodedCert_Create, which is itself called from nssCertificate_GetDecoding - which has many callers - mainly filtering functions. In the case of that second caller, the other fields are presently never filled.
Apparently, the Stan code is designed to be able to operate on those incomplete CERTCertificates . It might be an optimization. The population operation for the remaining fields looks like it might be expensive.
So, If I make this field population unconditional, I might resolve the race condition, but I might also significantly impact performance.
Assignee | ||
Comment 29•19 years ago
|
||
Another problem is that there are existing code paths to cause the fields of the CERTCertificate filled by fill_CERTCertificateFields to be called again .
The function STAN_ForceCERTCertificateUpdate calls stan_GetCERTCertificate with forceUpdate = PR_TRUE . This function has 5 callers including 1 in pk11wrap (PK11_ImportCert) . The other 4 are in the trust domain cache.
Updating these fields in CERTCertificates that have already been given to the application is always an unsafe thing to do, since we know the readers of the CERTCertificate do not use any locks to access them.
This can only be fixed by an API change that would introduce locks.
Making copies of the CERTCertificate and modifying the copies rather than the original might be another way, but unfortunately this isn't possible either given some API definitions. If you take PK11_ImportCert, it takes a CERTCertificate as input and returns a SECStatus . There is no separate output CERTCertificate argument. When successful, that function updates the slot field of the CERTCertificate in place. If the CERTCertificate was previously given to other threads, this is a recipe for problems.
Unless we make API changes, we aren't going to fully resolve this problem. We can only make band-aid fixes.
1) The problems we have seen in our nightly QA with the leaked slot reference leak is only one of many possible problems that programs could encounter. Other fields can also be corrupt.
2) The memory leak of c->decoding is also only one problem - but this one can be fixed safely without an API change
3) The nickname field of the CERTCertificate may also be updated in an unsafe manner. In fill_CERTCertificateFields you can see the following assignment :
cc->nickname = PORT_ArenaAlloc(cc->arena, len);
Which is by definition unsafe given the lack of locks. Another thread could read the nickname at that point, get garbage, and crash. But none of our tests do that, so we haven't seen the problem.
4) The trust field is also updated unsafely.
We have public functions CERT_GetCertTrust / CERT_ChangeCertTrust which are supposed to acquire locks to protect this field. I took a quick lock, and it is a late-initialized, global lock which also suffers from problems ! But at least there is already an existing public API that we can fix. However, client will have problems if they read the field directly.
5) the ownSlot, pkcs11ID, dbhandle, istemp, isperm and nssCertificate fields of the CERTCertificate are all also updated without holding a lock in fill_CERTCertificateFields . The nssCertificate field is particularly problematic because that's the one from which we want to use the lock ...
How do we resolve this mess ? I offer two possibilities.
Plan A.
We should make all these fields of the CERTCertificate private in the header file, and create accessor (reader / writer) functions that would all acquire locks . We should be able to use the lock from the NSSCertificate hopefully in all cases.
I believe we can do this by maintaining binary compatibility but breaking source compatibility. Ie. we could simply rename these fields in the public header file, to something like "slot_use_accessor_to_read" and create the new accessor functions such as CERT_GetCertSlot, CERT_SetCertSlot . We probably only want to make the "getter" functions public, but we need the "setter" functions for internal use in NSS as well. Of course, breaking source compatibility will create noise from application groups, but it is the surest way to enforce that applications actually move to using the functions that use the locks.
If everybody agrees with this, then I can start writing a patch implementing this suggestion. Such as patch should be fairly clear and easy to document, since all access to the fields will be protected, although it might be a long patch with impact to many NSS units.
If not, there is Plan B :
We serialize only the writes to the problematic fields mentioned above. We don't add any accessor function using locks. This means that :
- any memory or slot leaks caused by concurrent writes will be resolved, as all our the writes will be serialized, whether during initialization of NSSCertificate / CERTCertificate, or updates
- problems caused by concurrent NSS writes and application reads will not be resolved . But they may be minimized by doing atomic sets of the fields if they are small enough for machines with Total Store Order
- things will continue to be bad on machines without Total Store Order, since readers won't use any locks and may have inconsistent views of memory (eg. they could see a pointer update, but not the memory that it points to)
This fix would essentially be built upon patch 225514, plus some comments, and some improvements in the field updates code .
Assignee | ||
Updated•19 years ago
|
Priority: P2 → P1
Reporter | ||
Comment 30•19 years ago
|
||
Comment on attachment 225514 [details] [diff] [review]
update
I think this patch is very close to the right answer, but not
close enough.
After reviewing this patch, I conclude that it will ensure
that we have leaks of trust objects, just as we now have leaks
of other components. It may also assure leaks of things changed
by the fill_CERTCertificateFields function. Let's not move the
problem around. Let's fix it as much as we can.
Fine-grained locking would be ideal, but we know we can't get
all the users of the struct to participate in that model.
I propose going to monitors. There seems to be fear of monitors,
based on the idea that they might be more costly than locks.
But, IMO, monitors may be less costly than locks. The act of
recursively acquiring a held monitor is LESS costly than the
act of releasing and reacquiring a lock. Likewise the act of
releasing a recursively locked monitor is less costly than
releasing and reacquiring a lock.
Don't put PZ_Lock/PZ_Unlock (or PR_EnterMonitor/PR_ExitMonitor)
calls all over, as this patch does. Instead, create a lock and
unlock method on the underlying superclass. e.g.
nssPKIObject_Lock() and nssPKIObject_Unlock(), and call them
wherever needed. Then you can play with the implementation of
that lock without having to change code all over NSS.
These methods can be functions, or mere macros.
I can provide a patch that converts all the existing P[RZ]_Lock
and P[RZ]_Unlock calls into invocations of the new nssPKIObject
methods. The initial implementation will just be a macro that
does the same as what the code is now doing, so it will have a
net zero effect on the execution, but will make it trivial to
then replace the object lock with a different underlying
implementation.
Once converted to monitors, then this patch can be easily changed
to not release/reacquire locks around calls to nssCryptoContext_FindTrustForCertificate,
nssTrust_GetCERTCertTrustForCert,
cert_trust_from_stan_trust, and fill_CERTCertificateFields.
I could be rather easily persuaded to r+ that patch, I think.
Attachment #225514 -
Flags: review?(nelson) → review-
Assignee | ||
Comment 31•19 years ago
|
||
I believe this is what we agreed to this afternoon. This patch differs very little from the previous one.
1) I changed the nickname construction logic so that the cc->nickname field is updated only after the string is constructed . This is what I discussed in comment 14, issue 3)
2) I changed the trust update field to occur only upon success.
I was going to also destroy the old trust object in this case, but that turned out to be impossible, since the trust objects are allocated in the CERTCertificate arena. However, these objects will not be leaked, but the arena will grow when there is contention . A leak of trust objects was Nelson's main objection in comment 30, but there is no such leak due to the use of the arena.
Attachment #225514 -
Attachment is obsolete: true
Attachment #232839 -
Flags: superreview?(rrelyea)
Attachment #232839 -
Flags: review?
Attachment #225514 -
Flags: superreview?(rrelyea)
Attachment #225514 -
Flags: review?(wtchang)
Assignee | ||
Comment 32•19 years ago
|
||
The other notable change in the "Update 2" patch is that I changed the protocol for the fill_CERTCertificateFields function to assume that the NSSCertificate lock is held upon entry, and held before exit as well. There is only one caller for this function, so this was OK . I also added a number of comments.
Assignee | ||
Updated•19 years ago
|
Attachment #232839 -
Flags: review? → review?(nelson)
Reporter | ||
Comment 33•19 years ago
|
||
As an experiment, I am developing an alternative approach to this problem.
This patch changes the type of the nssPKIObject's lock to be a PRMonitor.
It makes no changes to the places where locking or unlocking is done.
It merely replaces the PZ_Lock and PZ_Unlock calls with calls to new
methods on the nssPKIObject, and those methods use Monitors rather than locks.
The net effect of THIS patch should be nothing, except perhaps a small
performance change. No logic change here. Just a substitution of one
mutual exclusion device for another.
The next step will be to change some of the places wherfe locking/unlocking
is now done, to try to address this bug. I will not ask for review until
that is done. Then we'll have to compare the two.
Assignee | ||
Comment 34•19 years ago
|
||
I found a bug in the patch with the nickname construction . This is the fix.
Attachment #232839 -
Attachment is obsolete: true
Attachment #233029 -
Flags: superreview?(rrelyea)
Attachment #233029 -
Flags: review?(nelson)
Attachment #232839 -
Flags: superreview?(rrelyea)
Attachment #232839 -
Flags: review?(nelson)
Reporter | ||
Comment 35•19 years ago
|
||
untested patch. testing needed.
I think this is equivalent to Julien's "Update 2" patch, but using monitors.
Reporter | ||
Comment 36•19 years ago
|
||
Comment on attachment 233030 [details] [diff] [review]
equivalent to "Fix nickname construction" but using monitors
I've tested this somewhat, by running all.sh and seeing that it passes.
Attachment #233030 -
Attachment description: equivalent to Update 2" but using monitors → equivalent to "Fix nickname construction" but using monitors
Reporter | ||
Comment 37•19 years ago
|
||
Comment on attachment 233030 [details] [diff] [review]
equivalent to "Fix nickname construction" but using monitors
I'm asking for review for this alterantive patch. I'm not asking for checkin approval, but merely review that this is correct and equivalent to Julien's most recent patch, except for locks/monitors.
If we agree that both patches are equivalent (with respect to curing the race), then we can figure out how to pick one. :)
Attachment #233030 -
Flags: superreview?(rrelyea)
Attachment #233030 -
Flags: review?(julien.pierre.bugs)
Assignee | ||
Comment 38•19 years ago
|
||
Comment on attachment 233030 [details] [diff] [review]
equivalent to "Fix nickname construction" but using monitors
I think your patch fixes the race.
But it will also increase the memory requirements for all Stan objects - all NSS PKI Objects will have a monitor instead of a lock, which means an extra condition variable allocated.
If we are going to use monitors, I would prefer that we use them only for objects that need them.
Attachment #233030 -
Flags: review?(julien.pierre.bugs) → review-
Comment 39•19 years ago
|
||
Comment on attachment 233029 [details] [diff] [review]
Fix nickname construction
r- for just one issue.
when we are setting the trust value, we should set it on the condition that is not already set. Unfortunately we can't do that in the fill-in code because that's the purpose of the fill in code (to change the trust values based on new imports or token insertions).
The last place we set it though, we do so if only if cc->trust is NULL. We should check cc->trust again after reaquiring the lock and not change the cc->trust in that case.
with that change it's a r+ for correctness.
Attachment #233029 -
Flags: superreview?(rrelyea) → superreview-
Comment 40•19 years ago
|
||
Comment on attachment 233030 [details] [diff] [review]
equivalent to "Fix nickname construction" but using monitors
the r+ is answering Nelson's question about correctness.
Attachment #233030 -
Flags: superreview?(rrelyea) → superreview+
Comment 41•19 years ago
|
||
General comments:
Julien's patch is fine, subject to my suggestion about the last trust change. I would not object to it.
At this point I prefer Nelson's patch. I prefer fine grain locks, but since neither patch as them, monitors are more appropriate for locking larger swaths of code. The monitor patch does not have trust change issue Julien's patch has.
Nelson's patch does point out that we are clearly overloading a lock. It's clear exactly what that lock used to protect (the instance array in the object), now it protects that plus the CERTCertificate creation. We should probably document this.
I'm not really all that worried about the extra size of the monitor versus the the size of a lock structure.
Assignee | ||
Comment 42•19 years ago
|
||
Bob,
Re: comment 39, I'm not sure that we can tell what thread has the "most recent" trust value, since they are all computing it while not holding the lock. So how do you know which thread should do the cc->trust assignment ? If they are the same trust value, then the change you suggest will make no difference. Do you think they would not be ?
Ideally, I would like to solve the memory growth problem of the trust growth on refresh, which is why I filed an RFE to merge arenas - see bug 348291 . I didn't make it a dependency of this bug, because solving that has a much lower priority than soving the race. It may also cause problems in programs that have previously read cc->trust if it gets freed. So perhaps we can't safely solve that problem with the current CERTCertificate structure being public.
Perhaps the new trust structure could be created in a temporary arena, and then its values copied into the existing cc->trust, rather than assigning the cc->trust pointer to the new trust structure.
Re: comment 41 and the monitor size, I'm not worried about doing it for the NSSCertificate lock, but it would be good if the other objects didn't suffer unnecessarily . Avoiding the monitor would save not just memory but in number of malloc() calls. I think Nelson's patch can be improved to have arguments to select the type of lock needed at creation in nssPKIObject_NewLock, with a bit in the nssPKIObject to indicate the lock type (lock vs monitor), and then nssPKIObject_Unlock and nssPKIObject_Lock could examine that field to call the proper NSPR APIs. A little bit more work would be needed for the callers of nssPKIObject_NewLock to select the appropriate lock type argument . nssPKIObject_Create may need an extra argument .
Assignee: julien.pierre.bugs → nobody
Assignee | ||
Comment 43•19 years ago
|
||
I reassigned this in error. Reassigning back to me.
Assignee: nobody → julien.pierre.bugs
Reporter | ||
Comment 44•19 years ago
|
||
Comment on attachment 233029 [details] [diff] [review]
Fix nickname construction
> /* trust */
>- cc->trust = nssTrust_GetCERTCertTrustForCert(c, cc);
>+ PZ_Unlock(c->object.lock);
>+ trust = nssTrust_GetCERTCertTrustForCert(c, cc);
>+ PZ_Lock(c->object.lock);
>+ if (trust) {
>+ /* we should destroy cc->trust before replacing it, but it's
>+ allocated in cc->arena, so memory growth will occur on each
>+ refresh */
>+ cc->trust = trust;
>+ }
I think that test needs to be "if (trust && !cc->trust)"
to properly deal with the case where some other thread won the race that
occurred when we released the lock. That's just one place in this patch
where I see this issue.
Every place where we unlock, create, and relock, I think we MUST recheck
(after relocking) that the pointer to be filled is still unfilled, and not
fill it if we lost the race to do so. Getting this right and keeping it
right will be an ongoing issue if we go that route.
But I propose that we actually measure performance of the two approaches.
If the method of repeated unlocking and locking proves to be faster,
then I'll go along with it.
Attachment #233029 -
Flags: review?(nelson) → review-
Assignee | ||
Comment 45•19 years ago
|
||
Nelson,
The check you (and Bob) suggest is only necessary if you believe that one trust value is more accurate than the other, and thus preferable to assign. I contend that with the code written now, there is no way to tell which one is preferable, and that the check is thus unnecessary .
Let's examine the code that you pasted more closely and a possible execution scenario.
Let's say we have 2 threads, neither of which is holding the lock.
Both go into nssTrust_GetCERTCertTrustForCert, and compute their own trust (in the process each acquiring and releasing the lock - but that is irrelevant to the problem here). Call them trust1 and trust2 . Both have been allocated in the cert arena, and neither can be freed. At this point, we don't know which one of trust1 or trust2 got computed first or second. Neither one of them is better.
Thread 1 then acquires the lock, assigns trust1 to cc->trust, then releases the lock.
Thread 2 then acquires the lock.
You are telling me now that it should not assign trust2 to cc->trust just because cc->trust is already set to trust1. I don't see a good reason why not. What correctness problem is solved by not making this assignment ?
After all, trust2 could have been computer after trust1, so it might be the "more accurate" value. I'll grant you that trust1 could after been computed later also. We don't know exactly. They are close, and most likely, they are two pointers to trust objects with the same value. Just don't tell me trust2 should never be assigned.
Your suggestion in comment 44 to never change cc->trust after it has been initially assigned anywhere in the patch would cause the cc->trust never to refresh, even when the fill function is called explicitly to do a refresh. So clearly never updating cc->trust is not correct - it would just break fill function .
In comment 39, Bob suggested I check for (!cc->trust) only in one place, which is in stan_GetCERTCertificate . Hopefully now I have convinced you both that the lack of that check is not a "correctness issue" with my patch.
If the trust objects were timestamped during their computation (while holding the lock), then you would be able to know which one is better to have by doing a timestamp comparison, and therefore you could know which one to discard or to keep, but they are not, and thus the new object should not automatically be discarded.
As it is now, you could call the rand() function to make the decision of whether to keep the new object or not. That code would still be as correct as my patch, or the change suggested by Bob in comment 39. They would all end up with the same effective trust value, because the race is about multiple threads trying to create the same cert object at the same time. Thus, the timestamping is unnecessary (it would require modification of a public structure), since we don't care which trust goes in within that particular block of code that only happens during the short period of the cert creation, as long as the trust is filled.
Reporter | ||
Comment 46•19 years ago
|
||
(Julien, somehow you are entering comments that have lines longer than 80
characters. The bugzilla bug forms are designed to prevent that by limiting
the input to 80 columns. So I'm curious how you're doing that!)
The problem arises when one thread says "I won the race" and stores a struct
pointer (say, trust) into the cert, then releases the lock, remembering the
address of the trust struct that it used. Then another thread does the same
thing. At all times, the struct only points to one or the other, but both
threads have pointers to different values, each thinking it has the pointer
to the value in the cert struct. All but one such thread is mistaken.
The correctness of the code demands that all threads acting on this cert
structure see the SAME values for all the members, including the members
that are structs, pointed to by the cert structure. That property just isn't
true of the patch as it stands. This can be fixed by ensuring that the
FIRST thread to win the race is the one whose answer prevails, and all
other copies are discarded. I'm not saying the solution requires monitors,
(although I think that by now, it is apparent that it is easier to get the
code correct with them), but it does require a slightly different patch than this one.
Assignee | ||
Comment 47•19 years ago
|
||
Nelson,
(must be a bug in Mozilla, I just typed my answer in the field. I let the browser wrap the sentences for me. )
You need to be more specific about which code requires all threads to see the same values for all the members of the cert structure. I don't see any code that has that requirement.
To recap :
One of the main purposes of the fill_CERTCertificateFields function is to update cert->trust in certain situations, such as token insertion, cert import to token. If we use your recommendation of never reassigning cert->trust if it is non-NULL, then cert->trust will never be refreshed, and thus this function will be functionally broken. This is what Bob said in comment 39, and I agreed with that in comment 45 . Bob agreed that my patch for the fill_CERTCertificateFields function is correct.
Thus, the only part of the code that should still be under discussion is the assignment at the end of stan_GetCERTCertificate. This is the code pasted by Nelson in comment 44, so I won't repeat it again here. I believe I have demonstrated in comment 45 that it doesn't matter one way or the other whether we check for a non-NULL cc->trust before the assignment. I'm willing to make that change if it makes you guys happy, but it won't solve any correctness problem, since the trust in both threads in that code will be computed to the same value - it will only be multiple different trust pointers for a very short period of time, but that really doesn't matter as these pointers all point to correct trust. And they are declared as local to a small block within the stan_GetCERTCertificate, and they go out of scope as soon as that block finishes execution. These trust pointers all point to memory in the arena, so there is no memory leak, only growth.
Assignee | ||
Comment 48•19 years ago
|
||
Comment on attachment 233029 [details] [diff] [review]
Fix nickname construction
Bob,
Could you please read comment 45 and review this patch again ? The comment explains why the change you recommended in comment 39 isn't necessary for correctness (though it is not harmful, either).
Attachment #233029 -
Flags: superreview- → superreview?(rrelyea)
Assignee | ||
Comment 49•19 years ago
|
||
Comment on attachment 233029 [details] [diff] [review]
Fix nickname construction
Nelson,
Please review this patch again in light of the following facts :
1) One of the roles of fill_CERTCertificateFields is to update cc->trust, so we can't skip the assignment if cc->trust is non-NULL without breaking the code
2) for the last assignment in stan_GetCERTCertificate (which I thought was the one you were referring to in comment 44, but actually isn't) :
+ if (!cc->nssCertificate || forceUpdate) {
+ fill_CERTCertificateFields(c, cc, forceUpdate);
+ } else if (!cc->trust && !c->object.cryptoContext) {
+ /* if it's a perm cert, it might have been stored before the
+ * trust, so look for the trust again. But a temp cert can be
+ * ignored.
+ */
+ CERTCertTrust* trust = NULL;
+ PZ_Unlock(c->object.lock);
+ trust = nssTrust_GetCERTCertTrustForCert(c, cc);
+ PZ_Lock(c->object.lock);
+ cc->trust = trust;
}
+
+ loser:
+ PZ_Unlock(c->object.lock);
return cc;
}
The problem you mentioned in comment 46 does not exist in this code - it doesn't do anything with the local trust structure after assigning it to cc->trust - trust just goes out of scope shortly after the assignment. If that will satisfy you, I will add a
trust = NULL;
assignment right after the
cc->trust = trust;
statement . This way we will be sure that the thread never operates on a local copy that is different from the one assigned by another thread.
Attachment #233029 -
Flags: review- → review?(nelson)
Comment 50•19 years ago
|
||
Julien,
You are correct, but I prefer not changing the data structure if it's not necessary. Of course we are stuck in the "Fill-in" case because it's purpose is to replace the existing trust value.. in that case we can't tell the difference between the instance where we are replacing an out of date trust and an instance where we lost the race and end up replacing a recently replaced trust.
my review stands. I prefer you make the change to the non-Fill-in case for an r+.
my evaluation also stands. It's clear both patches are holding long term locks. In the case of long term locks, monitors are preferred, so I prefer nelson's patch.
bob
Assignee | ||
Comment 51•19 years ago
|
||
Bob,
If you can name one problem your recommended change solves in my patch, then I will make that change.
I agree that monitors are preferrable to solve this race, but I disagree that Nelson's patch is the way to go. See comment 38 for the reason why.
Comment 52•19 years ago
|
||
I saw your comment 38, and responded to it. The extra memory cost is minimal, certainly not enough to worry about.
The code change:
if (!cc->nssCertificate || forceUpdate) {
+ fill_CERTCertificateFields(c, cc, forceUpdate);
+ } else if (!cc->trust && !c->object.cryptoContext) {
+ /* if it's a perm cert, it might have been stored before the
+ * trust, so look for the trust again. But a temp cert can be
+ * ignored.
+ */
+ CERTCertTrust* trust = NULL;
+ PZ_Unlock(c->object.lock);
+ trust = nssTrust_GetCERTCertTrustForCert(c, cc);
+ PZ_Lock(c->object.lock);
+ cc->trust = trust;
}
In this case text c->trust is still null before setting it.
bob
Assignee | ||
Comment 53•19 years ago
|
||
Bob,
I'm concerned not just about the size of the condition variables, but also the extra mallocs that they need. This affects performance of all Stan code, not just race cases. So, I developed this patch to avoid creating them when not necessary.
This patch uses a flag to decide which type of synchronization object to create in nssPKIObject - either a lock or a monitor.
In almost all cases, the base (sometimes called proto) nssPKIObject and the real object are created very close to each other, so it was easy to determine the value of the argument to pass.
The only non-obvious case of which type of lock type to use was for collections, in the add_object_instance function. It creates the proto-objects long before the real objects, and the type of the objects is determined by the collection->createObject callback . Because the collections are only of objects of the same type, I added a lockType flag field to the collection structure to determine the correct lock type for new proto-objects. This flag is set at collection creation time.
Attachment #234344 -
Flags: superreview?(rrelyea)
Attachment #234344 -
Flags: review?(nelson)
Reporter | ||
Comment 54•19 years ago
|
||
Comment on attachment 234344 [details] [diff] [review]
Use monitors, but only for nssCertificate objects
I agree with this patch in principle, and would give it r+ except for a few
implementation details (including style issues).
> static void
> fill_CERTCertificateFields(NSSCertificate *c, CERTCertificate *cc, PRBool forced)
> {
>+ CERTCertTrust* trust = NULL;
> NSSTrust *nssTrust;
> NSSCryptoContext *context = c->object.cryptoContext;
>- nssCryptokiInstance *instance = get_cert_instance(c);
>+ nssCryptokiInstance *instance;
> NSSUTF8 *stanNick = NULL;
>+
>+ /* We are holding the base class object's lock on entry of this function
>+ * This lock protects writes to fields of the CERTCertificate .
>+ * It is also needed by some functions to compute values such as trust.
>+ */
I'd like to see this code assert that the caller is, in fact, holding the
back class object's lock here. That's easy to do with monitors. I will
attach a separate comment with notes on how to do that. This is a SHOULD,
not a MUST.
>@@ -731,114 +739,129 @@
> if (instance && !PK11_IsInternal(instance->token->pk11slot)) {
> tokenName = nssToken_GetName(instance->token);
> tokenlen = nssUTF8_Size(tokenName, &nssrv);
> } else {
> /* don't use token name for internal slot; 3.3 didn't */
> tokenlen = 0;
> }
> if (stanNick) {
> nicklen = nssUTF8_Size(stanNick, &nssrv);
I'd like to see a comment here, reminding the reader that nssUTF8_Size
returns a count that includes the trailing NUL byte. I'd also add that
this code doesn't assume that the byte with that trailing NUL exists
or contains NUL, so we put our own NUL termination on it, instead of
relying on the NUL in the source string. This code makes much more sense
once one knows those two facts. (Another SHOULD).
>Index: lib/pki/pkit.h
>===================================================================
>RCS file: /cvsroot/mozilla/security/nss/lib/pki/pkit.h,v
>retrieving revision 1.17
>diff -u -1 -0 -r1.17 pkit.h
>--- lib/pki/pkit.h 20 Jan 2005 02:25:49 -0000 1.17
>+++ lib/pki/pkit.h 18 Aug 2006 01:26:06 -0000
>@@ -86,33 +86,42 @@
>+typedef enum {
>+ nssPKILock = 0,
>+ nssPKIMonitor = 1
>+} nssPKILockType;
Please don't use 0 as one of the enumerated values. We want to be able to
tell the difference (e.g. in a core file) between a properly initialized
value, and one that is uninitialized or clobbered with zeros.
Please treat this change as a MUST.
But maybe the best thing is to not use this enum in the nssPKIObjectStr.
> struct nssPKIObjectStr
> {
> /* The arena for all object memory */
> NSSArena *arena;
> /* Atomically incremented/decremented reference counting */
> PRInt32 refCount;
> /* lock protects the array of nssCryptokiInstance's of the object */
>- PZLock *lock;
>+ union {
>+ PZLock* lock;
>+ PZMonitor *mlock;
>+ } sync;
>+ nssPKILockType lockType;
Rather than having a pointer of ambiguous type and a separate discriminant
value (lockType) that attempts to disambiguate the pointer type, I recommend
that we have two separate pointers, and no discriminant. The two pointer
approach will take up no more memory than the pointer+discrimanent approach,
and it is more robust. The code cannot pass the pointer to the wrong unlock
or destroy method when the type of each pointer is unambiguous. SHOULD.
>Index: lib/pki/pkibase.c
>===================================================================
>+NSS_IMPLEMENT void
>+nssPKIObject_Lock(nssPKIObject * object)
>+{
>+ switch (object->lockType)
>+ {
>+ case nssPKIMonitor:
>+ PZ_EnterMonitor(object->sync.mlock);
>+ break;
>+ case nssPKILock:
>+ PZ_Lock(object->sync.lock);
>+ break;
>+ default:
>+ PORT_Assert(0);
>+ }
>+}
These 4 new functions (only 1 shown here) don't use the coding style of
the rest of the file. The opening brace for the switch should be on
the same line as the switch itself, and the case labels (including the
default case label) should not be indented more than the switch itself.
>+NSS_IMPLEMENT void
>+nssPKIObject_Unlock(nssPKIObject * object)
>+{
>+ switch (object->lockType)
>+ {
>+ case nssPKIMonitor:
>+ PZ_ExitMonitor(object->sync.mlock);
>+ break;
>+ case nssPKILock:
>+ PZ_Unlock(object->sync.lock);
>+ break;
>+ default:
>+ PORT_Assert(0);
>+ }
>+}
If we eliminate the lockType discriminant, this code simply checks each
type of lock pointer for non-NULL and invokes the proper methond on it.
Code counts the number of methods invokes, and asserts that the count is 1.
>+NSS_IMPLEMENT void
>+nssPKIObject_DestroyLock(nssPKIObject * object)
Here too, rather than relying on a discriminant, simply destroy any
lock gizmo with a non-NULL pointer.
Attachment #234344 -
Flags: review?(nelson) → review-
Reporter | ||
Comment 55•19 years ago
|
||
There is a macro named PZ_InMonitor() that takes the address of a PRMonitor
as its argument, and returns 0 (false) or non-zero (true) reflecting (IIRC)
the truth of "my thread holds this monitor". You can use it in an assert.
I'd recommend creating a 5th function, e.g. nssPKIObject_HaveLock, that
returns this value and assert that this function returns non-false.
Reporter | ||
Comment 56•19 years ago
|
||
Just to be clear, the two MUSTs in the above review are:
a) don't use 0 as one of the values in the lock type enumeration
b) fix the style of the new nssPKIObject functions to match the file.
Assignee | ||
Comment 57•19 years ago
|
||
Nelson,
Regarding the suggestion to assert that the lock is held, I would rather not add an nssPKIObject_HaveLock, because that function can only be implemented with a monitor, but not a regular lock, and having a function that works only on one type of lock would break the abstraction. I think the comments are sufficiently clear. My preferred solution to clarify things would be to merge the fill function with its caller. Also note that your own patches for monitors didn't include this change and got two reviews - I used your patch as a starting point.
Re: the size issue, I don't want to add the comment to every nssUTF8_Size call. I think the call is OK given all the memcpy using x-1 as their length argument . This is also the same as in your patch.
Re: the discriminant value, I will change the enum to use 1 and 2. I don't think we should get rid of the discriminant. Keeping it will make it easier to add another type of lock later if need be (say, a r/w lock) by just adding case statements.
I will attach a patch that fixes the indentation and the enum values.
Assignee | ||
Comment 58•19 years ago
|
||
Attachment #233022 -
Attachment is obsolete: true
Attachment #233029 -
Attachment is obsolete: true
Attachment #233030 -
Attachment is obsolete: true
Attachment #234344 -
Attachment is obsolete: true
Attachment #234949 -
Flags: superreview?(rrelyea)
Attachment #234949 -
Flags: review?(nelson)
Attachment #233029 -
Flags: superreview?(rrelyea)
Attachment #233029 -
Flags: review?(nelson)
Attachment #234344 -
Flags: superreview?(rrelyea)
Reporter | ||
Comment 59•19 years ago
|
||
Comment on attachment 234949 [details] [diff] [review]
Fix discriminant value and indentation
r+ = nelson, with one more change (which I overlooked last time). In
>+nssPKIObject_NewLock(nssPKIObject * object, nssPKILockType lockType)
>+ default:
>+ PORT_Assert(0);
need to return NULL here.
>+ }
>+}
Attachment #234949 -
Flags: review?(nelson) → review+
Assignee | ||
Comment 60•19 years ago
|
||
You mean PR_FAILURE .
Assignee | ||
Comment 61•19 years ago
|
||
Thanks for the review, Nelson . I have checked in the patch to the tip, including the change from comment 60 .
Checking in certdb/stanpcertdb.c;
/cvsroot/mozilla/security/nss/lib/certdb/stanpcertdb.c,v <-- stanpcertdb.c
new revision: 1.72; previous revision: 1.71
done
Checking in pk11wrap/pk11cert.c;
/cvsroot/mozilla/security/nss/lib/pk11wrap/pk11cert.c,v <-- pk11cert.c
new revision: 1.153; previous revision: 1.152
done
Checking in pki/certificate.c;
/cvsroot/mozilla/security/nss/lib/pki/certificate.c,v <-- certificate.c
new revision: 1.59; previous revision: 1.58
done
Checking in pki/pki3hack.c;
/cvsroot/mozilla/security/nss/lib/pki/pki3hack.c,v <-- pki3hack.c
new revision: 1.89; previous revision: 1.88
done
Checking in pki/pkibase.c;
/cvsroot/mozilla/security/nss/lib/pki/pkibase.c,v <-- pkibase.c
new revision: 1.26; previous revision: 1.25
done
Checking in pki/pkim.h;
/cvsroot/mozilla/security/nss/lib/pki/pkim.h,v <-- pkim.h
new revision: 1.28; previous revision: 1.27
done
Checking in pki/pkit.h;
/cvsroot/mozilla/security/nss/lib/pki/pkit.h,v <-- pkit.h
new revision: 1.18; previous revision: 1.17
done
Checking in pki/tdcache.c;
/cvsroot/mozilla/security/nss/lib/pki/tdcache.c,v <-- tdcache.c
new revision: 1.44; previous revision: 1.43
done
Checking in pki/trustdomain.c;
/cvsroot/mozilla/security/nss/lib/pki/trustdomain.c,v <-- trustdomain.c
new revision: 1.53; previous revision: 1.52
done
I'll wait for Bob's review to check in to NSS_3_11_BRANCH .
Comment 62•19 years ago
|
||
Comment on attachment 234949 [details] [diff] [review]
Fix discriminant value and indentation
r+ yes, that should have been PR_FAILURE;)
bob
Attachment #234949 -
Flags: superreview?(rrelyea) → superreview+
Assignee | ||
Comment 63•19 years ago
|
||
Thanks for the review, Bob ! I have checked in the fix to NSS_3_11_BRANCH . Marking fixed.
Checking in certdb/stanpcertdb.c;
/cvsroot/mozilla/security/nss/lib/certdb/stanpcertdb.c,v <-- stanpcertdb.c
new revision: 1.67.28.5; previous revision: 1.67.28.4
done
Checking in pk11wrap/pk11cert.c;
/cvsroot/mozilla/security/nss/lib/pk11wrap/pk11cert.c,v <-- pk11cert.c
new revision: 1.143.2.9; previous revision: 1.143.2.8
done
Checking in pki/certificate.c;
/cvsroot/mozilla/security/nss/lib/pki/certificate.c,v <-- certificate.c
new revision: 1.56.2.3; previous revision: 1.56.2.2
done
Checking in pki/pki3hack.c;
/cvsroot/mozilla/security/nss/lib/pki/pki3hack.c,v <-- pki3hack.c
new revision: 1.86.28.3; previous revision: 1.86.28.2
done
Checking in pki/pkibase.c;
/cvsroot/mozilla/security/nss/lib/pki/pkibase.c,v <-- pkibase.c
new revision: 1.25.28.1; previous revision: 1.25
done
Checking in pki/pkim.h;
/cvsroot/mozilla/security/nss/lib/pki/pkim.h,v <-- pkim.h
new revision: 1.27.2.1; previous revision: 1.27
done
Checking in pki/pkit.h;
/cvsroot/mozilla/security/nss/lib/pki/pkit.h,v <-- pkit.h
new revision: 1.17.28.1; previous revision: 1.17
done
Checking in pki/tdcache.c;
/cvsroot/mozilla/security/nss/lib/pki/tdcache.c,v <-- tdcache.c
new revision: 1.42.2.2; previous revision: 1.42.2.1
done
Checking in pki/trustdomain.c;
/cvsroot/mozilla/security/nss/lib/pki/trustdomain.c,v <-- trustdomain.c
new revision: 1.51.28.2; previous revision: 1.51.28.1
done
Status: NEW → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•