Last Comment Bug 408434 - Crash with PKIX based verify
: Crash with PKIX based verify
Status: RESOLVED FIXED
PKIX NSS312B1
: crash
Product: NSS
Classification: Components
Component: Libraries (show other bugs)
: trunk
: x86 Linux
: P1 normal (vote)
: 3.12
Assigned To: Alexei Volkov
:
:
Mentors:
https://webmail.ug.bilkent.edu.tr/bcc...
Depends on: 395093
Blocks:
  Show dependency treegraph
 
Reported: 2007-12-14 15:56 PST by Kai Engert (:kaie)
Modified: 2008-01-16 15:32 PST (History)
4 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---


Attachments
stack (4.76 KB, text/plain)
2007-12-14 16:00 PST, Kai Engert (:kaie)
no flags Details
Patch v1. (23.40 KB, patch)
2007-12-19 16:18 PST, Alexei Volkov
no flags Details | Diff | Splinter Review
Patch v2 (15.87 KB, patch)
2007-12-19 17:20 PST, Alexei Volkov
no flags Details | Diff | Splinter Review
Patch v3. (14.04 KB, patch)
2008-01-14 11:59 PST, Alexei Volkov
nelson: review+
Details | Diff | Splinter Review

Description Kai Engert (:kaie) 2007-12-14 15:56:53 PST
I was trying to reproduce bug 408432.
I was running firefox with latest trunk NSS and latest trunk NSPR.

I also happened to run with the testing env vars set:

 NSS_ENABLE_PKIX_VERIFY=1 
 NSS_EV_TEST_HACK=USE_PKIX

When I visit above URL I crash (not always, but every 2nd or 3rd attempt).
Comment 1 Kai Engert (:kaie) 2007-12-14 15:57:42 PST
Adding flags to whiteboard. Please remove if you think its wrong.
Comment 2 Kai Engert (:kaie) 2007-12-14 16:00:26 PST
Created attachment 293230 [details]
stack
Comment 3 Alexei Volkov 2007-12-14 21:39:19 PST
Such stacks are an indication that a destructed object(or may be uninitialized) got used. In this particular case it get decrefed. But I didn't see the crash in pkix_pl_aiamanager.c:390 before.
Comment 4 Alexei Volkov 2007-12-19 16:18:56 PST
Created attachment 293964 [details] [diff] [review]
Patch v1.

This patch for testing perposes only.

Proposed patch takes care of the problem related to request and session destruction(file pkix_pl_aiamgr.c) that with use of external to libpkix httpclient can not be destructed by libpkix decref function.

In addition some code clean up in pkix_pl_httpdefaultclient.c. For additional testing, this patch is also combined with patch for bug 395093.
Comment 5 Kai Engert (:kaie) 2007-12-19 17:11:08 PST
I tested this patch and had a chat with Alexei to discuss the results.

(First, I want to clarify, the crash I saw was a "signal 6", sigabrt. I realize the application can still be used in some way, even though one thread aborted. Strange. But I think, this still classifies as a crash.)

After I apply the patch, I no longer crash. This is good news. The patch helps with the crash!


But I was still confused, because when using pkix_verify, the connection does not work. It "stalls". Firefox says "connected", but nothing else happens.


Alexei discovered the problem is with LDAP.
The server cert contains an AIA extension that points to a ldap URL.

Indeed, when Firefox hangs, I can see an open connect to a server with ldap port 389!
Comment 6 Alexei Volkov 2007-12-19 17:20:54 PST
Created attachment 293971 [details] [diff] [review]
Patch v2

Patch has fix for the crash reported in the bug. Also changes ldap client to be blocked with io. The patch also has a cleanup for code in pkix_pl_httpdefaultclient.c.
Comment 7 Nelson Bolyard (seldom reads bugmail) 2008-01-03 20:41:14 PST
Comment on attachment 293971 [details] [diff] [review]
Patch v2

Review questions.  

1.  This looks like a workaround for a (potential) bug.  
Is there a bugzilla bug filed about this, to track it so we don't forget
to undo this later?

>                 /* No, create a connection (and cache it) */
>                 PKIX_CHECK(PKIX_PL_LdapDefaultClient_CreateByName
>                         (domainName,
>-                        PR_INTERVAL_NO_WAIT, /* PR_INTERVAL_NO_TIMEOUT, */
>+                         /* Do not use NBIO until we verify, that
>+                          * it is working */
>+                        PR_INTERVAL_NO_TIMEOUT, /* PR_INTERVAL_NO_WAIT, */
>                         NULL,
>                         &client,
>                         plContext),
>                         PKIX_LDAPDEFAULTCLIENTCREATEBYNAMEFAILED);

2. Should the following Session values (requestSession and serverSession) be 
zeroed after they are passed to their respective free functions?  If not, 
is there any chance of a double free occurring?

>+                /* Session and request cleanup in case of success */
>+                if (aiaMgr->client.hdata.requestSession != NULL) {
>+                    (*hcv1->freeFcn)(aiaMgr->client.hdata.requestSession);
>+                }
>+                if (aiaMgr->client.hdata.serverSession != NULL) {
>+                    (*hcv1->freeSessionFcn)(aiaMgr->client.hdata.serverSession);
>+                }
>+                aiaMgr->client.hdata.httpClient = 0; /* callback fn */
> 
>         } else  {
> 		PKIX_ERROR(PKIX_UNSUPPORTEDVERSIONOFHTTPCLIENT);
> 	}
> 
> cleanup:
>-        if (aiaMgr) {
>-            PKIX_DECREF(aiaMgr->client.hdata.requestSession);
>-            PKIX_DECREF(aiaMgr->client.hdata.serverSession);
>+        /* Session and request cleanup in case of error. Passing through without cleanup
>+         * if interrupted by blocked IO. */
>+        if (PKIX_ERROR_RECEIVED && aiaMgr) {
>+            if (aiaMgr->client.hdata.requestSession != NULL) {
>+                (*hcv1->freeFcn)(aiaMgr->client.hdata.requestSession);
>+            }
>+            if (aiaMgr->client.hdata.serverSession != NULL) {
>+                (*hcv1->freeSessionFcn)(aiaMgr->client.hdata.serverSession);
>+            }
>             aiaMgr->client.hdata.httpClient = 0; /* callback fn */
>         }

3. There are many pkix_pl_ functions (like the following one) that return a 
SECStatus, but do not set any error code (e.g. don't call PORT_SetError).  
How is the caller supposed to know what went wrong if they fail?

> SECStatus
> pkix_pl_HttpDefaultClient_CreateSessionFcn(
>         const char *host,
>         PRUint16 portnum,
>         SEC_HTTP_SERVER_SESSION *pSession)
> {
>         PKIX_Error *err = pkix_pl_HttpDefaultClient_CreateSession
>                 (host, portnum, pSession, plContext);
> 
>-        if (err == NULL) {
>-                return SECSuccess;
>-        } else {
>+        if (err) {
>                 PKIX_PL_Object_DecRef((PKIX_PL_Object *)err, plContext);
>                 return SECFailure;
>         }
>+        return SECSuccess;
> }

4. This patch removes a large section from all.sh.  
I think this will have the effect of removing all testing for PKIX, 
DB upgrades and shared DB, and removes the "cleanup" call.
Did you really intend to commit that change?
Are these things now being done in some other script?

>Index: tests/all.sh

>-# PKIX tests
>-if [ -z "$NSS_TEST_DISABLE_PKIX" ] ; then

>-# upgrade cert dbs to shared db + run tests there
>-if [ -z "$NSS_TEST_DISABLE_UPGRADE_DB" ] ; then

>-# tests for native sharedb support
>-if [ -z "$NSS_TEST_DISABLE_SHARED_DB" ] ; then

>-SCRIPTNAME=all.sh
>-
>-. ${QADIR}/common/cleanup.sh
Comment 8 Alexei Volkov 2008-01-14 11:20:18 PST
(In reply to comment #7)
> (From update of attachment 293971 [details] [diff] [review])
> Review questions.  
> 
> 1.  This looks like a workaround for a (potential) bug.  
> Is there a bugzilla bug filed about this, to track it so we don't forget
> to undo this later?
Right. Bug 412311 is filed to track this change. 

> 2. Should the following Session values (requestSession and serverSession) be 
> zeroed after they are passed to their respective free functions?  If not, 
> is there any chance of a double free occurring?
Those should be set to null. Thanks, I've missed this.

> 3. There are many pkix_pl_ functions (like the following one) that return a 
> SECStatus, but do not set any error code (e.g. don't call PORT_SetError).  
> How is the caller supposed to know what went wrong if they fail?

This issue should be resolved before we try to solve libpkix error code conversion to nss error code. Filed a new bug 412318 to track it.
> 4. This patch removes a large section from all.sh.  
> I think this will have the effect of removing all testing for PKIX, 
> DB upgrades and shared DB, and removes the "cleanup" call.
> Did you really intend to commit that change?
> Are these things now being done in some other script?

This is a mistake to have those changes in the patch. New patch will not have them.
Comment 9 Alexei Volkov 2008-01-14 11:59:04 PST
Created attachment 297020 [details] [diff] [review]
Patch v3.

zero the session pointer after it got freed.
Comment 10 Nelson Bolyard (seldom reads bugmail) 2008-01-16 14:34:30 PST
Comment on attachment 297020 [details] [diff] [review]
Patch v3.

Bug 412311 should be marked with NSS312B2.

r=nelson
Comment 11 Alexei Volkov 2008-01-16 15:32:48 PST
attachment 297020 [details] [diff] [review] is integrated.

Note You need to log in before you can comment on or make changes to this bug.