Closed Bug 129067 Opened 22 years ago Closed 22 years ago

Deleted certs still appear in Cert Manager M1RC2 [@ nsNSSCertificate::GetCert]

Categories

(Core Graveyard :: Security: UI, defect, P1)

1.0 Branch
defect

Tracking

(Not tracked)

VERIFIED FIXED
psm2.3

People

(Reporter: junruh, Assigned: KaiE)

References

()

Details

(Keywords: crash, topcrash, Whiteboard: [adt2 RTM] [Needs a=])

Crash Data

Attachments

(1 file, 3 obsolete files)

3/4 Linux, Mac and Win32 trunk builds.
1.) Get a cert from the above URL or Verisign.
2.) Open the Cert Manager and delete the cert.
What happens: The cert still appears in the Cert Manager.
Hopefully this is solved by the latest fixes in NSS3.4. cc relyea and wtc.
Let's try today and/or tomorrow's builds.
Keywords: nsbeta1+
Priority: -- → P1
Target Milestone: --- → 2.2
*** Bug 130689 has been marked as a duplicate of this bug. ***
kai
Assignee: ssaux → kaie
if you attempt to delete the cert twice the browser crashes.
I believe this is NSS related.
nsbeta-, because there's a workaround:
delete the cert, then do nothing. Restarting the browser shows that the cert is
gone.
Keywords: nsbeta1+crash, nsbeta1-
Adding relnote keyword.
Keywords: relnote
*** Bug 135752 has been marked as a duplicate of this bug. ***
*** Bug 135606 has been marked as a duplicate of this bug. ***
Relnote - Certificates deleted from the Certificate Manager will still appear, 
until you restart the browser.
This is a pretty common crash. How about nsbeta1+ ?
http://climate.netscape.com/reports/VeryFastSearchStackSigNEW.cfm?stacksig=nsNSS
Certificate
Target Milestone: 2.2 → 2.3
Version: 2.2 → 2.3
Keywords: nsbeta1-nsbeta1+
Addition to comment #4: If you really want to crash Mozilla, you only have to
work with certificate manager. In my opinion this part of Mozilla is the most
instable. I always have a bad feeling if I use the cert manager. (Not a bug
report, only a personal comment)
I tried to find a fix for this bug, but I did not yet succeed.

I saw, the PSM memory object, which contains the reference to the
CERT_Certificate, which is being destroyed, is leaked in certificate manager.
Therefore, this memory objects does not get destroyed, and
CERT_DestroyCertificate does not get called.

Of course, I don't know whether that is even required, after the certificate has
been destroyed.

At least I was able to find some leaks in cert manager, and I was able to reach
the state, where the memory object does get deleted.

But that did not help, even after closing the cert manager, the deleted cert is
still found.

It looks like PK11_ListCerts(PK11CertListUser) does still deliver the deleted
cert, although it had been deleted using PK11_DeleteTokenCertAndKey.

The interesting thing is, using the cert's db key to lookup the deleted cert,
does no longer succeed (used by the cert manager UI).

Because the certificate is no longer delivered after an application restart, I
would expect that PK11_ListCerts should immediately stop delivering the cert
after it has been deleted.
Attached patch Suggested Fix (obsolete) — Splinter Review
- uses ugly slow workaround for unexpected NSS behaviour
- fixes the display
- avoids crash
- fixes a memory leak, which will help for bug 125561
Javi, can you please review?
Status: NEW → ASSIGNED
Bob Relyea can you review as well?
Note to help reviewing:

The difficult part of that patch is where I try to check whether a cert is still
findable in the cert db.

But this part was put together by using logic from two existent methods,
nsNSSCertificate::GetDbKey and nsNSSCertificateDB::GetCertByDBKey.

Note that I filed NSS bug 138626, which I believe is the real cause of our problem.
Comment on attachment 80057 [details] [diff] [review]
Suggested Fix

argh.  I reviewed this last week, but somehow my commit didn't go through.  :(

Everything looks good.	If there is a function provided by NSS for creatint the
db Key from a cert I think we should use it.  Don't want a future change to NSS
to hurt us too much.

r=javi
Attachment #80057 - Flags: review+
Depends on: 138626
Right, it would be better to use a NSS internal key. I don't know if there is
one. But even if I knew, I would have to test whether that key still helps for
the workaround.

NOTE TO SELF: Before I check this in, I will add a comment before the workaround
block, that indicates the block as being a workaround around bug 138626.
Whiteboard: [adt2]
relyea said, the patch looks ok.

Alec, can you please sr= ?
Alec, can you please review?
Attached patch Slightly corrected patch (obsolete) — Splinter Review
This patch has one added comment, and I removed one hunk from the patch, that
is unrelated to this bug, and I moved it over to the separate bug 139349.
Attachment #80057 - Attachment is obsolete: true
Comment on attachment 80509 [details] [diff] [review]
Slightly corrected patch

sr=alecf, assuming javi is still ok with this
Attachment #80509 - Flags: superreview+
Keywords: adt1.0.0
Checked in to trunk, fixed.
Status: ASSIGNED → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
Verified
Status: RESOLVED → VERIFIED
adding adt1.0.0+.  Please check this in to the branch as soon as possible and
add the fixed1.0.0 keyword.
Keywords: adt1.0.0adt1.0.0+
Attachment #80509 - Flags: needs-work+
Comment on attachment 80509 [details] [diff] [review]
Slightly corrected patch

After doing more testing I'm sorry, but I can not check this patch in.

I saw this patch works stable in debug mode.
It does work *sometimes* in optimized mode, but most of the time it works not.

Detailed explanation follows.


I would like to request permission to back out this patch from the trunk.
More explanation why the previous patch is bad.

The class nsNSSCertificate is using delayed deletion of the underlying NSS cert.
As long as this class is still refcounted, no deletion happens.

Because of the JS implementation of the cert manager, some references to the
deleted object stay around as long as the window is up.

Surprisingly, my code addition from the previous patch worked in debug mode. It
seems, my instructions, that assigned null to variables caused the references go
away.
But this did not work in optimized mode all the time, the order of deletion
seems to be random, and that influences whether the certs go away or not.

Because I don't have an influence on the order of deletion by Garbage Collection
from JS, I need to use a completely different approach to solve this bug.

Explanation of new approach follows.
With my next patch, I'll be using the following apporach, which seems to be
working very well. I did a lot of testing.

Instead of reloading the cert manager contents each time the user deletes a
cert, I'll now simply remove that cert from the list already in memory.
This makes it unnecessary to query the NSS cert database again as long as the
cert manager stays open.

Because when the cert manager gets closed, all objects will be freed, the
delayed deletion will happen at that time (this has always been that way).

With my new patch, the next time the cert manager gets opened, the deleted certs
do no longer appear in the list, i.e. the workaround for PK11_ListCerts from the
previous patch is not necessary.

Note that the new patch has a new method RemoveCert, which looks complicated.
But the logic were cloned from an already existing function (GetCertAtIndex) in
the same class.

In addition, another fix was necessary, that made the behaviour of certificate
deletion really bad.

For deleting a certificate, write access to the certificate storage is required.
If the user did not yet enter the master certificate in the session, he will
have to enter it before deletion can work.

When I did my additional testing with the previous patch, I randomly saw that I
got prompted after a random delay.
Sometimes I was not prompted before closing the cert manager dialog.

In order to add correctness, this patch calls the required NSS function, that
will ensure that the user already logged in, and will prompt the user
immediately before deletion is being attempted.
The approach to do that was explained to me on IRC by Ian, and in fact, I can
see that this approach is already used in different locations in the existing code.

I tested the patch with all types of certificates in certificate manager.
Status: VERIFIED → REOPENED
Resolution: FIXED → ---
New patch as explained in previous comments.
Removing adt1.0.0+, until the new patch is r/sr=, and verified by QA.
Keywords: adt1.0.0+
Comment on attachment 81109 [details] [diff] [review]
Correct patch, replacing previous attempt

looks good to me.

r=javi
Attachment #81109 - Flags: review+
Alec, could you please sr= the new patch?
Attachment #81109 - Flags: superreview+
Comment on attachment 81109 [details] [diff] [review]
Correct patch, replacing previous attempt

I'm not a huge fan of making "params" a global variable, but if you're going to
do that, you need to name it "gParams" to indicate that is global.

other than that, this looks fine. sr=alecf with that change.
Nominating ADT.
Status: REOPENED → ASSIGNED
Keywords: adt1.0.0
How often are users deleting certs?
Blocks: 136814
Good question, I don't know how often users would try to delete them.
Adding Dave, to give us the marketing analysis for how often users delet certs?
Keywords: approval
Whiteboard: [adt2] → [adt2] [Needs a= & ADT approval]
I don't have a good sense for how often people delete certs - my feeling is not
very often. I've never done this myself... will try to gather more data.
Attached patch Updated patchSplinter Review
A quick summary where we are:
=============================

1.)
The first patch from this bug is on the trunk already.
It works unreliably, as explained before.
I will back it out from the trunk at the time I check in the new patch to the
trunk.

2.)
I'm attaching an updated version of the better patch. It is identical to the
previous patch, it only has one variable renamed, as suggested by alecf.

3.)
Note that this bug is still blocked by 138626. The new patch does not work
without having the patch from bug 138626 applied.

4.)
We are free to check this in to the branch if we want to, as long as we check
in both 

5.)
Before we land on the trunk, we need to wait a little - we are currently
working on updating the NSS version used by Mozilla trunk to include all the
latest fixes to NSS.
Attachment #80509 - Attachment is obsolete: true
Removing adt1.0.0, as there is new patch without reviews.
Keywords: adt1.0.0
Whiteboard: [adt2] [Needs a= & ADT approval] → [adt2]
Comment on attachment 82106 [details] [diff] [review]
Updated patch

This patch is the same as the patch before, and has one change as requested by
the superreviewer. It still has r=javi and sr=alecf.
Attachment #82106 - Flags: superreview+
Attachment #82106 - Flags: review+
I'm undoing Jaime's changes to keywords and whiteboard. I should have marked the
new patch as having reviews earlier. Sorry for the confusion.
Keywords: adt1.0.0
Whiteboard: [adt2] → [adt2] [Needs a= & ADT approval]
Attachment #81109 - Attachment is obsolete: true
Let's review this one in the RTM timeframe. adt1.0.0- [adt2 RTM] because
deleting certs does not appear something most users will do.
Keywords: adt1.0.0adt1.0.0-
Whiteboard: [adt2] [Needs a= & ADT approval] → [adt2 RTM]
Blocks: 143047
Blocks: 138818
*** Bug 144198 has been marked as a duplicate of this bug. ***
Marking as topcrash - the early M1RC2 talkback returns show this crash (deleting 
certs) in the top 20 crashes.
Keywords: topcrash
Summary: Deleted certs still appear in Cert Manager → Deleted certs still appear in Cert Manager M1RC2 [@ nsNSSCertificate::GetCert]
adding back adt1.0.0 nomination for rtm.  Since we can't land this on the trunk,
could you make a test build that QA can try out, or can others apply the patch
and the patch for 138626 and verify that this is fixed?
Keywords: adt1.0.0-adt1.0.0
As requested, I have created test builds, based on this morning's 1.0.0 branch,
having applied the patches from 138626 and 129067 (this one).

I have uploaded Windows and Linux builds to my home directory /u/kaie, file names
  mozilla-win32-1.0.0-branch-138626-129067.zip
  mozilla-linux-1.0.0-branch-138626-129067.tar.gz

John, can you please test whether the problem is fixed with the build?
It looks good on Win32 and Linux. I have no problem deleting any kind of cert.
Previous unreliable patch backed out from trunk.

New patch checked in to trunk.

Marking fixed.
Status: ASSIGNED → RESOLVED
Closed: 22 years ago22 years ago
Resolution: --- → FIXED
Verified on the 5/16 Win32 trunk.
Kai,

I just wanted to confirm that your patch for this bug
requires the fix for NSS bug 138626.
Wan-Teh: Yes, 138626 is required for 129067.
Can we move some of these fixes into the branch?  QA is supposed to be testing
the branch, but so far, I've been using trunk since branch hasn't changed since
beta.

Thanks

- marked verified on trunk.  Add fixed1.0.0 once in the branch
Status: RESOLVED → VERIFIED
Whiteboard: [adt2 RTM] → [adt2 RTM] [Needs a=]
adt1.0.0+ (on ADT's behalf) for approval to checkin to the 1.0 branch, pending
Driver's approval.  After, checking in, please add the fixed1.0 keyword.
Keywords: adt1.0.0adt1.0.0+
changing to adt1.0.1+ for checkin to the 1.0 branch.  Please get drivers
approval before checking in.
Keywords: adt1.0.0+adt1.0.1+
Keywords: mozilla1.0.1
Update: I was about to request approval from drivers, but realized we are still
blocked by 145836 (which includes the fix for 138626 for landing on the branch).
Depends on: 145836
(dummy change, because I interrupted the bugmail when doing the previous change).
please checkin to the 1.0.1 branch. once there, remove the "mozilla1.0.1+"
keyword and add the "fixed1.0.1" keyword.
Attachment #82106 - Flags: approval+
Checked in to branch.
Verified on branch.
*** Bug 149621 has been marked as a duplicate of this bug. ***
This bug exists in Mozilla 1.1b.
Drew: Is it possible that what you see is bug 154040 instead?
Product: PSM → Core
Version: psm2.3 → 1.0 Branch
Crash Signature: [@ nsNSSCertificate::GetCert]
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: