re-enable full-duplex testing of libSSL in selfserv

NEW
Assigned to

Status

NSS
Tools
P2
enhancement
11 years ago
3 years ago

People

(Reporter: Nelson Bolyard (seldom reads bugmail), Assigned: Julien Pierre)

Tracking

3.21.1
All
Windows XP

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(2 attachments, 4 obsolete attachments)

SSL is not inherently a half-duplex protcol.  It is capable of doing two-way simultaneous (a.k.a. "full duplex") communication, and NSS's 
libSSL is capable of supporting such communication (although that is not its default mode of operation).  

No products are known to use the full duplex feature of libSSL. 
There is no known demand for that feature.

At one time, long ago, NSS's test clients and servers tested libSSL's  full duplex capability.  But then it was discovered that selfserv's design had some serious performance limitations, and work was done to eliminate those limitations.  To hasten the completion of that performance work, the FULL_DUPLEX_CAPABLE macro was put into selfserv.c to temporarily(!) stop building the code whose purpose was to test libSSL's full-duplex capability.  When the high speed testing work was completed, the work to re-enable full duplex testing in selfserv was never done.  

I suspect that the code inside that ifdef no longer builds (if the symbol is defined), or (if it does) that the code is never invoked, or does not work properly.
(Reporter)

Updated

11 years ago
Priority: -- → P4
(Assignee)

Comment 1

3 years ago
It may not be correct that there are no apps using full duplex. Anyway, I will be restoring this full duplex in selfserv.
(Assignee)

Updated

3 years ago
Assignee: nobody → julien.pierre
(Assignee)

Updated

3 years ago
Priority: P4 → P2
(Assignee)

Updated

3 years ago
Target Milestone: --- → 3.21.1
(Assignee)

Comment 2

3 years ago
Created attachment 8671359 [details] [diff] [review]
Initial patch

Implement streaming mode on /streaming URL .
The server will send as much data as the client sent.
Thus, the client must also be sending streaming data through POST in order for this feature to work.

The only additional selfserv command-line option needed to enable this mode is -F .

Lots of code cleanup, also.
(Assignee)

Comment 3

3 years ago
Comment on attachment 8671359 [details] [diff] [review]
Initial patch

>diff -r 1efea96a3c71 cmd/selfserv/selfserv.c
>--- a/cmd/selfserv/selfserv.c	Tue Oct 06 17:18:56 2015 +0200
>+++ b/cmd/selfserv/selfserv.c	Thu Oct 08 06:25:20 2015 -0700
>@@ -63,6 +63,7 @@
> 
> #define DEFAULT_BULK_TEST 16384
> #define MAX_BULK_TEST     1048576 /* 1 MB */
>+static PRBool fullDuplex;
> static PRBool testBulk;
> static PRUint32 testBulkSize       = DEFAULT_BULK_TEST;
> static PRInt32 testBulkTotal;
>@@ -140,7 +141,6 @@
> static PRBool  noDelay;
> static int     requestCert;
> static int	verbose;
>-static SECItem	bigBuf;
> static int configureDHE = -1; /* -1: don't configure, 0 disable, >=1 enable*/
> static int configureReuseECDHE = -1; /* -1: don't configure, 0 refresh, >=1 reuse*/
> static int configureWeakDHE = -1; /* -1: don't configure, 0 disable, >=1 enable*/
>@@ -225,6 +225,7 @@
> "-W override default DHE server weak parameters support, 0: disable, 1: enable\n"
> "-c Restrict ciphers\n"
> "-Y prints cipher values allowed for parameter -c and exits\n"
>+"-F enable full-duplex mode on /streaming URI\n"
>     , stderr);
> }
> 
>@@ -280,10 +281,11 @@
> errWarn(char * funcString)
> {
>     PRErrorCode  perr      = PR_GetError();
>+    PRInt32      oserr     = PR_GetOSError();
>     const char * errString = SECU_Strerror(perr);
> 
>-    fprintf(stderr, "selfserv: %s returned error %d:\n%s\n",
>-            funcString, perr, errString);
>+    fprintf(stderr, "selfserv: %s returned error %d, OS error %d: %s\n",
>+            funcString, perr, oserr, errString);
>     return errString;
> }
> 
>@@ -866,171 +868,6 @@
> 	cipherlist[nciphers++] = (PRUint16)c;
> }
> 
>-
>-#ifdef FULL_DUPLEX_CAPABLE
>-
>-struct lockedVarsStr {
>-    PZLock *	lock;
>-    int		count;
>-    int		waiters;
>-    PZCondVar *	condVar;
>-};
>-
>-typedef struct lockedVarsStr lockedVars;
>-
>-void 
>-lockedVars_Init( lockedVars * lv)
>-{
>-    lv->count   = 0;
>-    lv->waiters = 0;
>-    lv->lock    = PZ_NewLock(nssILockSelfServ);
>-    lv->condVar = PZ_NewCondVar(lv->lock);
>-}
>-
>-void
>-lockedVars_Destroy( lockedVars * lv)
>-{
>-    PZ_DestroyCondVar(lv->condVar);
>-    lv->condVar = NULL;
>-
>-    PZ_DestroyLock(lv->lock);
>-    lv->lock = NULL;
>-}
>-
>-void
>-lockedVars_WaitForDone(lockedVars * lv)
>-{
>-    PZ_Lock(lv->lock);
>-    while (lv->count > 0) {
>-    	PZ_WaitCondVar(lv->condVar, PR_INTERVAL_NO_TIMEOUT);
>-    }
>-    PZ_Unlock(lv->lock);
>-}
>-
>-int	/* returns count */
>-lockedVars_AddToCount(lockedVars * lv, int addend)
>-{
>-    int rv;
>-
>-    PZ_Lock(lv->lock);
>-    rv = lv->count += addend;
>-    if (rv <= 0) {
>-	PZ_NotifyCondVar(lv->condVar);
>-    }
>-    PZ_Unlock(lv->lock);
>-    return rv;
>-}
>-
>-int
>-do_writes(
>-    PRFileDesc *	ssl_sock,
>-    PRFileDesc *	model_sock,
>-    int         	requestCert
>-    )
>-{
>-    int			sent  = 0;
>-    int 		count = 0;
>-    lockedVars *	lv = (lockedVars *)model_sock;
>-
>-    VLOG(("selfserv: do_writes: starting"));
>-    while (sent < bigBuf.len) {
>-
>-	count = PR_Write(ssl_sock, bigBuf.data + sent, bigBuf.len - sent);
>-	if (count < 0) {
>-	    errWarn("PR_Write bigBuf");
>-	    break;
>-	}
>-	FPRINTF(stderr, "selfserv: PR_Write wrote %d bytes from bigBuf\n", count );
>-	sent += count;
>-    }
>-    if (count >= 0) {	/* last write didn't fail. */
>-    	PR_Shutdown(ssl_sock, PR_SHUTDOWN_SEND);
>-    }
>-
>-    /* notify the reader that we're done. */
>-    lockedVars_AddToCount(lv, -1);
>-    FLUSH;
>-    VLOG(("selfserv: do_writes: exiting"));
>-    return (sent < bigBuf.len) ? SECFailure : SECSuccess;
>-}
>-
>-static int 
>-handle_fdx_connection(
>-    PRFileDesc *       tcp_sock,
>-    PRFileDesc *       model_sock,
>-    int                requestCert
>-    )
>-{
>-    PRFileDesc *       ssl_sock	= NULL;
>-    SECStatus          result;
>-    int                firstTime = 1;
>-    lockedVars         lv;
>-    PRSocketOptionData opt;
>-    char               buf[10240];
>-
>-
>-    VLOG(("selfserv: handle_fdx_connection: starting"));
>-    opt.option             = PR_SockOpt_Nonblocking;
>-    opt.value.non_blocking = PR_FALSE;
>-    PR_SetSocketOption(tcp_sock, &opt);
>-
>-    if (useModelSocket && model_sock) {
>-	SECStatus rv;
>-	ssl_sock = SSL_ImportFD(model_sock, tcp_sock);
>-	if (!ssl_sock) {
>-	    errWarn("SSL_ImportFD with model");
>-	    goto cleanup;
>-	}
>-	rv = SSL_ResetHandshake(ssl_sock, /* asServer */ 1);
>-	if (rv != SECSuccess) {
>-	    errWarn("SSL_ResetHandshake");
>-	    goto cleanup;
>-	}
>-    } else {
>-	ssl_sock = tcp_sock;
>-    }
>-
>-    lockedVars_Init(&lv);
>-    lockedVars_AddToCount(&lv, 1);
>-
>-    /* Attempt to launch the writer thread. */
>-    result = launch_thread(do_writes, ssl_sock, (PRFileDesc *)&lv, 
>-                           requestCert);
>-
>-    if (result == SECSuccess) 
>-      do {
>-	/* do reads here. */
>-	int count;
>-	count = PR_Read(ssl_sock, buf, sizeof buf);
>-	if (count < 0) {
>-	    errWarn("FDX PR_Read");
>-	    break;
>-	}
>-	FPRINTF(stderr, "selfserv: FDX PR_Read read %d bytes.\n", count );
>-	if (firstTime) {
>-	    firstTime = 0;
>-	    printSecurityInfo(ssl_sock);
>-	}
>-    } while (lockedVars_AddToCount(&lv, 0) > 0);
>-
>-    /* Wait for writer to finish */
>-    lockedVars_WaitForDone(&lv);
>-    lockedVars_Destroy(&lv);
>-    FLUSH;
>-
>-cleanup:
>-    if (ssl_sock) {
>-	PR_Close(ssl_sock);
>-    } else if (tcp_sock) {
>-	PR_Close(tcp_sock);
>-    }
>-
>-    VLOG(("selfserv: handle_fdx_connection: exiting"));
>-    return SECSuccess;
>-}
>-
>-#endif
>-
> static SECItem *lastLoadedCrl = NULL;
> 
> static SECStatus
>@@ -1252,6 +1089,121 @@
>    }
> }
> 
>+PRIntervalTime maxInterval    = PR_INTERVAL_NO_TIMEOUT;
>+
>+/* used for full-duplex mode only */
>+PRBool send_buf(PRFileDesc* fd, SECItem* buf)
>+{
>+    unsigned int sent = 0;
>+    int count = 0;
>+
>+    FPRINTF(stderr, "strsclnt: sending buffer of size %d .\n", 
>+            buf->len );
>+
>+    while (sent < buf->len) {
>+
>+        count = PR_Send(fd, buf->data + sent, buf->len - sent, 
>+                        0, maxInterval);
>+        if (count < 0) {
>+            errWarn("PR_Send buffer");
>+            break;
>+        }
>+        FPRINTF(stderr, "strsclnt: PR_Send wrote %d bytes from buffer\n", 
>+                count );
>+        sent += count;
>+    }
>+    if (sent != buf->len) {
>+        return PR_FALSE;
>+    } else {
>+        return PR_TRUE;
>+    }
>+}
>+
>+/* used for full-duplex mode only */
>+PRBool send_string(PRFileDesc* fd, const char* str)
>+{
>+    SECItem buf;
>+    buf.len = strlen(str);
>+    PR_ASSERT(buf.len);
>+    buf.data = (void*) str;
>+    return send_buf(fd, &buf);
>+}
>+
>+#define CHUNK_HEADER_FOOTER_SIZE    20
>+
>+void generate_chunk(char* chunk_buf, int chunk_size)
>+{
>+    unsigned int hdrlen = sprintf(chunk_buf, "%X\n", chunk_size);
>+    memset(chunk_buf + hdrlen, 'a', chunk_size);
>+    chunk_buf[hdrlen+chunk_size] = '\n';
>+    chunk_buf[hdrlen+chunk_size+1] = 0;
>+}
>+
>+typedef struct perWriteThreadStr {
>+    PRFileDesc* ssl_sock;
>+    PRInt32 counter;
>+    PRBool done;
>+} perWriteThread ;
>+
>+/* writer thread, used only for full-duplex connections - ie 2. threads per socket .
>+   This will continuously send chunks to the client of size equal to the client request,
>+   until the reader thread reaches EOF . The synchronization is lock-free using only
>+   atomics */
>+void do_writes(void* a)
>+{
>+    perWriteThread* data = (perWriteThread*) a;
>+    PRFileDesc *	ssl_sock	= data->ssl_sock;
>+
>+    /* send HTTP server response header */
>+    const char* defaultHeader =
>+    "HTTP/1.1 200 OK\r\n"
>+    "Server: Generic Web Server\r\n"
>+    "Content-type: text/plain\r\n"
>+    "Connection: Keep-Alive\r\n"
>+    "Transfer-Encoding: chunked\r\n"
>+    "\r\n";
>+
>+    /* send HTTP/1.1 chunks */
>+    const char* last_chunk = "0\n\n";
>+
>+    send_string(ssl_sock, defaultHeader);
>+
>+    unsigned int chunk_buffer_size = 16384;
>+    char* chunk_buf = (char*) malloc(chunk_buffer_size);
>+
>+    /* send chunks of the same size as client data */
>+    do {
>+        PRInt32 toSend = PR_AtomicSet(&data->counter, 0);
>+        if (0 == toSend) {
>+            if (data->done) {
>+                /* reader thread reached EOF, stop sending chunks */
>+                break;
>+            }
>+            /* the client may still send more POST data. Yield to avoid spinning the CPU. */
>+            PR_Sleep(0);
>+            continue;
>+        }
>+        while (toSend) {
>+            PRInt32 this_chunk_size = PR_MIN(toSend, chunk_buffer_size - CHUNK_HEADER_FOOTER_SIZE);
>+            generate_chunk(chunk_buf, this_chunk_size);
>+            if (PR_FALSE == send_string(ssl_sock, chunk_buf)) {
>+                break;
>+            }
>+            toSend -= this_chunk_size;
>+        }
>+        if (toSend) {
>+            /* could not send last chunk, abort */
>+            break;
>+        }
>+    } while (1);
>+
>+    free(chunk_buf);
>+
>+    send_string(ssl_sock, last_chunk);
>+    PR_Shutdown(ssl_sock, PR_SHUTDOWN_SEND);
>+}
>+
>+
> int
> handle_connection( 
>     PRFileDesc *tcp_sock,
>@@ -1261,7 +1213,7 @@
> {
>     PRFileDesc *       ssl_sock = NULL;
>     PRFileDesc *       local_file_fd = NULL;
>-    char  *            post;
>+    char  *            post = NULL;
>     char  *            pBuf;			/* unused space at end of buf */
>     const char *       errString;
>     PRStatus           status;
>@@ -1342,6 +1294,8 @@
> 	}
>     }
> 
>+    PRUint32 received = 0;
>+
>     while (1) {
> 	newln = 0;
> 	reqLen     = 0;
>@@ -1349,13 +1303,14 @@
> 	if (rv == 0 || 
> 	    (rv < 0 && PR_END_OF_FILE_ERROR == PR_GetError())) {
> 	    if (verbose)
>-		errWarn("HDX PR_Read hit EOF");
>+		errWarn("PR_Read hit EOF");
> 	    break;
> 	}
> 	if (rv < 0) {
>-	    errWarn("HDX PR_Read");
>+	    errWarn("PR_Recv");
> 	    goto cleanup;
> 	}
>+        received += rv;
> 	/* NULL termination */
> 	pBuf[rv] = 0;
> 	if (firstTime) {
>@@ -1363,6 +1318,11 @@
> 	    printSecurityInfo(ssl_sock);
> 	}
> 
>+        if (fullDuplex) {
>+            /* print request content */
>+            FPRINTF(stderr, "%s", pBuf);
>+        }
>+
> 	pBuf   += rv;
> 	bufRem -= rv;
> 	bufDat = pBuf - buf;
>@@ -1392,7 +1352,7 @@
> 	 * This parsing is a hack, but ok for SSL test purposes.
> 	 */
> 	post = PORT_Strstr(buf, "POST ");
>-	if (!post || *post != 'P') 
>+	if (!post) 
> 	    break;
> 
> 	/* It's a post, so look for the next and final CR/LF. */
>@@ -1407,6 +1367,56 @@
> 	    break;
>     } /* read loop */
> 
>+    if (fullDuplex) {
>+        /* parse URI*/
>+        char* uri = NULL;
>+        if (post) {
>+            uri = strchr(post, ' ');
>+            if (uri) {
>+                uri++;
>+            }
>+            if (uri) {
>+                char* space = PORT_Strchr(uri, ' ');
>+                if (space) {
>+                    *space = 0; /* NULL termination */
>+                }
>+            }
>+        }
>+
>+        /* turn on full duplex mode on /streaming URI */
>+        if (uri && (0 == strcmp(uri, "/streaming"))) {
>+            FPRINTF(stderr, "Enabling full-duplex mode.\n");
>+            /* launch writer thread */
>+            perWriteThread tdata;
>+            tdata.ssl_sock = ssl_sock;
>+            tdata.counter = received;
>+            tdata.done = PR_FALSE;
>+            PRThread* writer =
>+                PR_CreateThread(PR_USER_THREAD, do_writes, (void*)&tdata,
>+                PR_PRIORITY_NORMAL, PR_GLOBAL_THREAD, PR_JOINABLE_THREAD, 0);
>+            if (!writer) {
>+                goto cleanup;
>+            }
>+
>+            /* read rest of client request until EOF or error */
>+            do {
>+                rv = PR_Read(ssl_sock, buf, sizeof(buf)-1);
>+                if (rv>0) {
>+                    buf[rv] = 0;
>+                    FPRINTF(stderr, "%s", buf);
>+                    PR_AtomicAdd(&tdata.counter, rv);
>+                }
>+            } while (rv > 0);
>+            /* notify writer thread that it can stop sending chunks. */
>+            /* no lock is needed here, the writer thread will test this flag when idle */
>+            tdata.done = PR_TRUE;
>+
>+            PRStatus rc = PR_JoinThread(writer);
>+            PR_ASSERT(PR_SUCCESS == rc);
>+            goto cleanup;
>+        }
>+    }
>+
>     bufDat = pBuf - buf;
>     if (bufDat) do {	/* just close if no data */
> 	/* Have either (a) a complete get, (b) a complete post, (c) EOF */
>@@ -1702,7 +1712,7 @@
> getBoundListenSocket(unsigned short port)
> {
>     PRFileDesc *       listen_sock;
>-    int                listenQueueDepth = 5 + (2 * maxThreads);
>+    int                listenQueueDepth = 1024 + (2 * maxThreads);
>     PRStatus	       prStatus;
>     PRNetAddr          addr;
>     PRSocketOptionData opt;
>@@ -1961,7 +1971,7 @@
> 	}
>     }
> 
>-    if (bigBuf.data) { /* doing FDX */
>+    if (fullDuplex) { /* doing FDX */
> 	rv = SSL_OptionSet(model_sock, SSL_ENABLE_FDX, 1);
> 	if (rv < 0) {
> 	    errExit("SSL_OptionSet SSL_ENABLE_FDX");
>@@ -2024,47 +2034,6 @@
> 
> }
> 
>-SECStatus
>-readBigFile(const char * fileName)
>-{
>-    PRFileInfo  info;
>-    PRStatus	status;
>-    SECStatus	rv	= SECFailure;
>-    int		count;
>-    int		hdrLen;
>-    PRFileDesc *local_file_fd = NULL;
>-
>-    status = PR_GetFileInfo(fileName, &info);
>-
>-    if (status == PR_SUCCESS &&
>-	info.type == PR_FILE_FILE &&
>-	info.size > 0 &&
>-	NULL != (local_file_fd = PR_Open(fileName, PR_RDONLY, 0))) {
>-
>-	hdrLen      = PORT_Strlen(outHeader);
>-	bigBuf.len  = hdrLen + info.size;
>-	bigBuf.data = PORT_Malloc(bigBuf.len + 4095);
>-	if (!bigBuf.data) {
>-	    errWarn("PORT_Malloc");
>-	    goto done;
>-	}
>-
>-	PORT_Memcpy(bigBuf.data, outHeader, hdrLen);
>-
>-	count = PR_Read(local_file_fd, bigBuf.data + hdrLen, info.size);
>-	if (count != info.size) {
>-	    errWarn("PR_Read local file");
>-	    goto done;
>-	}
>-	rv = SECSuccess;
>-done:
>-        if (local_file_fd) {
>-            PR_Close(local_file_fd);
>-        }
>-    }
>-    return rv;
>-}
>-
> int          numChildren;
> PRProcess *  child[MAX_PROCS];
> 
>@@ -2184,7 +2153,6 @@
>     char *               ecNickName   = NULL;
> #endif
>     char *               dsaNickName  = NULL;
>-    const char *         fileName    = NULL;
>     char *               cipherString= NULL;
>     const char *         dir         = ".";
>     char *               passwd      = NULL;
>@@ -2228,12 +2196,10 @@
>     ** numbers, then capital letters, then lower case, alphabetical. 
>     */
>     optstate = PL_CreateOptState(argc, argv, 
>-        "2:A:BC:DEGH:L:M:NP:RS:T:U:V:W:Ya:bc:d:e:f:g:hi:jk:lmn:op:qrst:uvw:xyz");
>+        "A:BC:DEFGH:L:M:NP:RS:T:U:V:W:Ya:bc:d:e:f:g:hi:jk:lmn:op:qrst:uvw:xyz");
>     while ((status = PL_GetNextOpt(optstate)) == PL_OPT_OK) {
> 	++optionsFound;
> 	switch(optstate->option) {
>-	case '2': fileName = optstate->value; break;
>-
> 	case 'A': ocspStaplingCA = PORT_Strdup(optstate->value); break;
> 
> 	case 'B': bypassPKCS11 = PR_TRUE; break;
>@@ -2242,6 +2208,7 @@
> 
> 	case 'D': noDelay = PR_TRUE; break;
> 	case 'E': disableStepDown = PR_TRUE; break;
>+	case 'F': fullDuplex = PR_TRUE; break;
> 	case 'H': configureDHE = (PORT_Atoi(optstate->value) != 0); break;
> 
>         case 'G': enableExtendedMasterSecret = PR_TRUE; break;
>@@ -2512,9 +2479,6 @@
> 
>     lm = PR_NewLogModule("TestCase");
> 
>-    if (fileName)
>-    	readBigFile(fileName);
>-
>     /* set our password function */
>     PK11_SetPasswordFunc(SECU_GetModulePassword);
(Assignee)

Comment 4

3 years ago
Created attachment 8672204 [details] [diff] [review]
Patch update - remove newline at the end
Attachment #8671359 - Attachment is obsolete: true
(Assignee)

Comment 5

3 years ago
I tried to remove the unnecessary newline at the end of the first patch, but that just created a comment and didn't fix the attachment.
I then tried to upload the patch again, still using cut and paste. But it's still showing the error when applying the patch.

I think the copy and paste feature with bugzilla is broken somehow. I copied the patch into the clipboard on my Linux VM, and pasted it into Firefox on my Linux host.
It seems that I will be forced to run Firefox on Linux to update the actual patch file. Hopefully, third time will be the charm.
(Assignee)

Comment 6

3 years ago
Created attachment 8672205 [details] [diff] [review]
Initial patch, minus newline at the end that was causing issues when applying
Attachment #8672204 - Attachment is obsolete: true

Comment 7

3 years ago
I reviewed this patch at:
https://codereview.appspot.com/270090043/
(Assignee)

Comment 8

3 years ago
Eric,
I must say I did not expect 26 comments. Is there a way for me to respond to each one in the review tool you used ?

Comment 9

3 years ago
Yes, there should be a "sign-in" link in the upper-right hand corner. You will need
a Google account to log in. Once you have done that, you can click "reply" in each
comment box.
(Assignee)

Comment 10

3 years ago
Yes, I saw the signin link, but I didn't want to use my personal google for this, and creating another just seems like more trouble. I'd like to be able to sign in with my bugzilla account. I will just respond to your comments here.
(Assignee)

Comment 11

3 years ago
Created attachment 8672897 [details] [diff] [review]
Update from Eric's review.

Here is my response to your review :

1) cmd/selfserv/selfserv.c:1395: if (!post || *post != 'P')
What was this check for?

I believe this was to check if the preceding PORT_Strstr is really doing its job.
I don't believe it is necessary, so I removed it. If we want to check that PORT_Strstr really works, it should not be tested in its own test, not selfserv.

2) I don't understand the next comment - the one that says 
File cmd/selfserv/selfserv.c (right):
Are you asking me to change something, and if so, where ?

3) cmd/selfserv/selfserv.c:228: "-F enable full-duplex mode on /streaming URI\n"
Why on /streaming alone?

This makes it easier to write a test script that can run regular tests as well as streaming tests without having to restart the server.

4) 
cmd/selfserv/selfserv.c:284: PRInt32      oserr     = PR_GetOSError();
Not all errors are os errors. you should probably check first somehow

I just copied the code we had in strsclnt already to make it more consistent.
AFAIK, there is no way to tell for sure if the NSPR error code is also due to an OS error. It could be a stale OS error code, or not. So, I just log it. If you know of a way to make sure that the OS error code is not stale, then I can change it and log the OS error only in that case.

5) 

done.

6) it would be very helpful if you could qualify what you mean by "this code", especially as the line numbers between stsrclnt and selfserv are different, and the review comments in strsclnt do not specify which function they apply to. Line 1140 in selfserv is in generate_chunk, but that function is a little different between strsclnt and selfserv. Also, I started looking at your selfserv review first. I can't really address this comment properly without clarification.

7) I have rewritten the comment, but you could have been more precise as to what you expected, as I'm still not sure the rewritten comment meets your expectations.

8)
cmd/selfserv/selfserv.c:1154: perWriteThread* data = (perWriteThread*) a;
This does not need a cast.

The cast is also not harmful, and this is the existing coding style NSS uses throughout for thread callbacks, including the existing function thread_wrapper in both selfserv and strsclnt .
Leaving the cast also facilitates any eventual migration to C++, or reuse of this code in a C++ program, if needed. So, I would prefer to retain this cast.


9)
cmd/selfserv/selfserv.c:1155: PRFileDesc *	ssl_sock	= data->ssl_sock;
Weird spacing here.

Fixed. Looks like there were tabs in the original legacy code in strsclnt that I copied from strsclnt to selfserv.
I fixed it in both places.

10)
cmd/selfserv/selfserv.c:1169: send_string(ssl_sock, defaultHeader);
Check for success

Fixed.

11)
cmd/selfserv/selfserv.c:1177: if (0 == toSend) {
if (!toSend)

I believe this qualifies as a nit, but I fixed it anyway.

12)
cmd/selfserv/selfserv.c:1181: }
See below: this isn't thread safe.

I believe it is, see below explanation.

13)
Because the claimed behavior is matching chunk size, you should generate an
error if the chunk is too big, rather than break it up like ths.

I have changed the comment to make it match the behavior.

14)
cmd/selfserv/selfserv.c:1196: break;
Print some kind of error here.

Fixed.

15)
cmd/selfserv/selfserv.c:1298: 
Declare variables at the top of the function.

Fixed.

16) 
cmd/selfserv/selfserv.c:1370: if (fullDuplex) {
if (fullDuplex && post) would make the code below simpler.

Agree, fixed.

17)
cmd/selfserv/selfserv.c:1373: if (post) {
vertical whitespace after declaration

Fixed this nit.

18)
cmd/selfserv/selfserv.c:1376: uri++;
Put this inside the next block

Fixed this nit.

19)
cmd/selfserv/selfserv.c:1379: char* space = PORT_Strchr(uri, ' ');
Why are you using strchr above and PORT_Strchr here? I would just use stdlib,
but please be consistent.

Fixed. It looks like the majority of the code in selfserv, with few exceptions, uses PORT_ functions, so I chose to do that as well for both uses here.

20)
cmd/selfserv/selfserv.c:1387: if (uri && (0 == strcmp(uri, "/streaming"))) {
!strcmp()

This is a nit. I fixed it since most of the file uses the style you suggest. But I want to point out that I really prefer this style of coding, especially for strcmp, which,
to me, has confusing return values semantics, and for which I always have to read the man page. The == test really makes it ovious that this is an equality match case, rather
than the (! which I read as "not", which does not seem quite right, given that strcmp actually does not return a boolean value.
I also often use (constant == variable) test to avoid the possibility of variables being overwritten, which can easily happen when an == becomes an = accidentally.
I believe this is a more secure coding style. If you see that style in NSS, then I probably wrote the code.
Of course, for (constant == function()) tests, there is no possibility of overwriting the function.

21) 
cmd/selfserv/selfserv.c:1390: perWriteThread tdata;
Declare at top of block.

Fixed.

22)
cmd/selfserv/selfserv.c:1395: PR_CreateThread(PR_USER_THREAD, do_writes,
(void*)&tdata,
You do not need to cast pointers to (void *)

I refer you to 9) . This cast is not harmful. While the PR_CreateThread call for thread_wrapper does not cast, I found many calls to PR_CreateThread in NSPR and NSS source
code that do the cast. And this facilitates an eventual C++ migration. IMO, the cast here also makes the code more readable as one knows immediately
that this argument is actually the thread variable argument.

23)
cmd/selfserv/selfserv.c:1405: buf[rv] = 0;
Do you do anything with buf? If not, why are you bothering to null terminate.

Yes, print it when selfserv is in verbose mode, on the very next line.

24)
cmd/selfserv/selfserv.c:1407: PR_AtomicAdd(&tdata.counter, rv);
This does not produce the claimed semantics of writing the same size chunks
since it consolidates chunks if there is no intermediate write.

I have corrected the earlier comment to match the behavior. The intent was only to try to match the amount of data between both sides, to ensure maximum concurrency to test the
full duplex locks.

25)
cmd/selfserv/selfserv.c:1412: tdata.done = PR_TRUE;
"I do not believe that this is safe.

Consider what happens when no data has been read.  tdata.counter is 0.
This thread waits on PR_recv() and the other thread busy-waits on the
PR_AtomicSet(). Eventually, PR_recv() returns 0 and then tdata.done is 
written with PR_DONE and read in the other thread with no memory barrier."

I believe PR_Sleep provides an implicit memory barrier. It is not necessary
that the very first inspection of tdata.done by do_writes notice the new status
immediately. All that matters is that it notice the flag status eventually.
I have actually tested this code by commenting the PR_Read and replaced it with rv=0 .
There is no deadlock. I did not try to log how many times do_writes inspected
tdata.done, but it is irrelevant, IMO.

Unless a platform exists on which PR_Sleep does not provide an implicit memory barrier,
I believe the code is safe. I don't see how this could be, however, on an OS that does
pre-emptive multitasking, which could reschedule the thread to run on another core or CPU
after the yield. The PR_Sleep implicitly provides a lock, thus the scheme is not technically
entirely "lock-free" - whenever the writer thread catches up with the reader, it locks, as
part of the PR_Sleep(0). Thus, the purpose of the PR_Sleep(0) is not just to avoid spinning
the CPU, but also to provide a memory barrier and synchronize the cache. I will add a comment
to that effect.

If you think such a platform exists, then the "done" flag itself could be redeclared as PRInt32
and set with an atomic operation as well.

In the reader thread :

            /* notify writer thread that it can stop sending chunks. */
            /* no lock is needed here, the writer thread will test this when idle */
            PR_AtomicIncrement(&tdata.done);

In the writer thread :
        if (!toSend) {
            PRInt32 done = PR_AtomicDecrement(data->done);
            if (0 == data->done) {
                /* reader thread reached EOF, stop sending chunks */
                break;
            }
            PR_ASSERT(-1 == done);
            done = PR_AtomicIncrement(data->done);
            if (1 == data->done) {
                /* reader thread reached EOF, stop sending chunks */
                break;
            }
            /* the client may still send more POST data. Yield to avoid spinning the CPU. */
            PR_Sleep(0);
            continue;
        }

Of course, this is much more code, and I believe the second atomic is simply unnecessary.
Attachment #8672205 - Attachment is obsolete: true
Attachment #8672897 - Flags: review?
(Assignee)

Updated

3 years ago
Attachment #8672897 - Flags: review? → review?(ekr)
(Assignee)

Comment 12

3 years ago
Created attachment 8673536 [details] [diff] [review]
Updated patch

Updated patch. Just copied send_buf, send_string and generate_chunk implementations from the strsclnt patch, per comments in bug 1212386 .
Attachment #8672897 - Attachment is obsolete: true
Attachment #8672897 - Flags: review?(ekr)
Attachment #8673536 - Flags: review?(ekr)
(In reply to Julien Pierre from comment #11)
> Created attachment 8672897 [details] [diff] [review]
> Update from Eric's review.
> 
> Here is my response to your review :

This response is extremely dificult to deal with because it doesn't
come with any context and Rietveld does not report comments
in order. In future, please respond on Rietveld.


> 1) cmd/selfserv/selfserv.c:1395: if (!post || *post != 'P')
> What was this check for?
> 
> I believe this was to check if the preceding PORT_Strstr is really doing its
> job.
> I don't believe it is necessary, so I removed it. If we want to check that
> PORT_Strstr really works, it should not be tested in its own test, not
> selfserv.

All right.


> 2) I don't understand the next comment - the one that says 
> File cmd/selfserv/selfserv.c (right):
> Are you asking me to change something, and if so, where ?

It's hard to tell without the Rietveld context.


> 3) cmd/selfserv/selfserv.c:228: "-F enable full-duplex mode on /streaming
> URI\n"
> Why on /streaming alone?
> 
> This makes it easier to write a test script that can run regular tests as
> well as streaming tests without having to restart the server.

All right.


> 4) 
> cmd/selfserv/selfserv.c:284: PRInt32      oserr     = PR_GetOSError();
> Not all errors are os errors. you should probably check first somehow
> 
> I just copied the code we had in strsclnt already to make it more consistent.
> AFAIK, there is no way to tell for sure if the NSPR error code is also due
> to an OS error. It could be a stale OS error code, or not. So, I just log
> it. If you know of a way to make sure that the OS error code is not stale,
> then I can change it and log the OS error only in that case.

If you can't tell, then it's better not to log the OS error code.


> 5) 
> 
> done.
> 
> 6) it would be very helpful if you could qualify what you mean by "this
> code", especially as the line numbers between stsrclnt and selfserv are
> different, and the review comments in strsclnt do not specify which function
> they apply to. Line 1140 in selfserv is in generate_chunk, but that function
> is a little different between strsclnt and selfserv. Also, I started looking
> at your selfserv review first. I can't really address this comment properly
> without clarification.

send_buf, generate_chunk, and send_str


> 7) I have rewritten the comment, but you could have been more precise as to
> what you expected, as I'm still not sure the rewritten comment meets your
> expectations.

In general it's ungrammatical.

The first sentence doesn't start with a capital letter. "i.e.," is rendered
as "ie". There is a space before a period and then the last sentence doesn't
end in a period. Generally, comments should be grammatical.


> 22)
> cmd/selfserv/selfserv.c:1395: PR_CreateThread(PR_USER_THREAD, do_writes,
> (void*)&tdata,
> You do not need to cast pointers to (void *)
> 
> I refer you to 9) . This cast is not harmful. While the PR_CreateThread call
> for thread_wrapper does not cast, I found many calls to PR_CreateThread in
> NSPR and NSS source
> code that do the cast.

Sure, and all those casts are redundant.


> And this facilitates an eventual C++ migration. IMO,
> the cast here also makes the code more readable as one knows immediately
> that this argument is actually the thread variable argument.

C++ does not need a cast to void * either. It's only the void* -> T*.
direction that needs casting.
See: http://www.cplusplus.com/doc/tutorial/typecasting/



> 23)
> cmd/selfserv/selfserv.c:1405: buf[rv] = 0;
> Do you do anything with buf? If not, why are you bothering to null terminate.
> 
> Yes, print it when selfserv is in verbose mode, on the very next line.

All right.


> 25)
> cmd/selfserv/selfserv.c:1412: tdata.done = PR_TRUE;
> "I do not believe that this is safe.
> 
> Consider what happens when no data has been read.  tdata.counter is 0.
> This thread waits on PR_recv() and the other thread busy-waits on the
> PR_AtomicSet(). Eventually, PR_recv() returns 0 and then tdata.done is 
> written with PR_DONE and read in the other thread with no memory barrier."
> 
> I believe PR_Sleep provides an implicit memory barrier. It is not necessary
> that the very first inspection of tdata.done by do_writes notice the new
> status
> immediately. All that matters is that it notice the flag status eventually.
> I have actually tested this code by commenting the PR_Read and replaced it
> with rv=0 .
> There is no deadlock. I did not try to log how many times do_writes inspected
> tdata.done, but it is irrelevant, IMO.
>
> Unless a platform exists on which PR_Sleep does not provide an implicit
> memory barrier,
> I believe the code is safe. I don't see how this could be, however, on an OS
> that does
> pre-emptive multitasking, which could reschedule the thread to run on
> another core or CPU
> after the yield. The PR_Sleep implicitly provides a lock, thus the scheme is
> not technically
> entirely "lock-free" - whenever the writer thread catches up with the
> reader, it locks, as
> part of the PR_Sleep(0). Thus, the purpose of the PR_Sleep(0) is not just to
> avoid spinning
> the CPU, but also to provide a memory barrier and synchronize the cache. I
> will add a comment
> to that effect.

I don't believe this is correct.

Consider the following sequence:

Reader                           Writer
PR_AtomicAdd()
                                 PR_AtomicSet()
                                 PR_Sleep()
                                 PR_AtomicSet()
PR_Recv()                                 
tdata.done = PR_TRUE             if (data->done)  <--

The lines marked with <-- operate on the same data without any locking.
This is not guaranteed to be safe.

The issue here is not simply a matter of potential deadlock but of
undefined behavior. See:
https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong
(Assignee)

Comment 14

3 years ago
Created attachment 8674082 [details] [diff] [review]
Patch v3


Eric,

>This response is extremely dificult to deal with because it doesn't
>come with any context and Rietveld does not report comments
>in order. In future, please respond on Rietveld.

I am not sold on the requirement to use a google account with Rietveld. I would love to use it if it was integrated with the bugzilla account, instead.
Your review was extremely difficult to deal with, also, because it was in Rietveld rather than bugzilla, and not enumerated. The lack of numbering for
review comments in Rietveld, or any other reference that could be used in a text response that I could find, is a major problem with Rietveld.
Did I somehow miss something in the tool, or can it really only be useful if all the participants are logged in to Rietveld ?

4)
>If you can't tell, then it's better not to log the OS error code.

Actually, the OS error code was very helpful while debugging, and that is why I added it. A large proportion of the calls to errWarn in selfserv is NSPR I/O calls.
I think it belongs. One can easily ignore the OS error code when it is logged in response to an NSS call that does not do I/O. The information
about what the error was in response to is always available. Sure, we could split the logging calls between errWarn_NSPR err_Warn_NSS, but other
programs in NSS such as strsclnt don't do that. IMO, it is unnecessary.

5)
>send_buf, generate_chunk, and send_str

I integrated those in a patch update last night, after I made the changes in strsclnt. These functions are now identical between the two programs. They weren't previously.
They could arguably be moved to the static lib in cmd/lib , but they really don't do much and are not very likely to be reused by other programs.

6) 
>In general it's ungrammatical.

>The first sentence doesn't start with a capital letter. "i.e.," is rendered
>as "ie". There is a space before a period and then the last sentence doesn't
>end in a period. Generally, comments should be grammatical.

I will fix this, but I think you will find that most comments in NSS, including in selfserv, do not start with a capital letter, and also do not end with a period.
This is true even for many comments that are multiple sentences long. IMO, it doesn't make a lot of sense to start enforcing this now for new comments, when so many existing
comments already break the rules. I believe the content of the comment was clear, and it was spelled correctly. Trying to enforce this now when several NSS contributors, past and
present, are not native US english speakers, is a not a productive use of time, IMO.

7)
>C++ does not need a cast to void * either. It's only the void* -> T*.
>direction that needs casting.

Fair enough, I will remove this one.

22) Thanks. It was an interesting read, and I'm actually familiar with many of those potential issues that result in this sort of problem.
However, I don't think any of the situations described in the article actually applies here.
That is because a pointer to the perWrite Thread structure is passed to the external function PR_CreateThread.
The compiler has no way of knowing that PR_CreateThread eventually calls do_writes . And indeed, since the implementation of PR_CreateThread is dynamic,
it may or may not ever do so. What that means is that the compiler must pack the perWriteThread memory structure in a standard fashion, so that PR_CreateThread
can inspect the structure, even though PR_CreateThread itself never looks at its content (but could, if the implementation in libnspr suddenly changed).
It must also not reuse any of the space inside the stack variable tdata after the call to PR_CreateThread .

do_writes is never called directly . It is invoked through a callback only. That means inlining must be disabled for its generated code.
Abnd thus, the generated code for do_writes must also respect the standard packing of the perWriteThread memory structure, as used before a pointer to a
stack structure is passed to PR_CreateThread.

Let's go over the various situations mentioned in the article :
I) the compiler can legally reuse X’s storage for any temporal data inside of some region of code before the store (e.g. to reduce stack frame size)
IMO, this cannot be true here, due to the call to the external function PR_CreateThread, and the fact that a pointer to the structured is used.
II) Consider that the compiler does the same transformation as in the example above (spills some temp value into op_count)
Again, this is not possible here, for the same reason as 1)
III) Now let’s consider that several threads write to a shared variable without proper synchronization:
I believe this case does not apply, as there is only one writer thread, and one reader thread.
IV) But what if we modify shared state with atomic operations but read with plain loads? Like in the following example:
...
This can also break after all. The compiler can legally re-read the stat2 variable after the check, and it yet leads to division by zero exception.
...
You may say that it can’t possibly affect your code, because you have a very simple code along the lines of:
..

This is probably the closest case to the situation here, though I'm not actually using an atomic to modify the shared state.

However, he goes on to write :
Consider that a thread loads ref_count_ = 2 into a register, then another thread increments it to 3, then the first thread stores 2 back into ref_count_. Oops.
So we are back to multiple writer threads, which is not the case here.

Now, that doesn't mean nothing can go wrong with the code I wrote, just that nothing describe in the article appears to apply here.

Let's ask the question "what could possibly go wrong ?"
a) One potential problem would be that the do_writes thread, which does the so-called "racy read", would never notice the changed value.
Ie. this is the memory barrier issue. I believe I ruled that out because of the PR_Sleep(0) . You didn't indicate whether you were convinced or not by this
argument.
b) Another potential problem is that the do_writes thread, which does the "racy read", might see the wrong value. This is unlikely due to the size of the variable here,
but it's conceivable that if structure packing was set to 1, on some CPU, the compiler might decide to locate the variable at a memory address that would require two writes
to update, or two reads. Thus, it's not impossible to imagine a case where the "racy read" might see a bogus value. But this is not relevant, since the value itself is
not used, it is only used a flag. All do_writes want to know is that the variable at this memory address has changed.
c) A last potential problem is ordering, the possibility that the assignment
tdata.done = PR_TRUE;
might actually occur too early, and be reordered by the compiler or CPU, and actually occur before the preceding while loop is complete, and be seen by the do_writes thread
before the while loop is complete. This would imply the assignment actually occurs before the line :
PR_AtomicAdd(&tdata.counter, rv);.
I don't know how likely it is that this will actually be the case, or whether a compiler or CPU would indeed be allowed to do this reordering or not.
Even if this assignment were to actually occur even before the loop, immediately after the PR_CreateThread, this would merely cause the do_writes thread to terminate early, which
would be bad, and might be considered "launch nuclear missile" worst case.
It seems unlikely, since the assignment is explicitly written after the call to PR_AtomicAdd, which is an external function, and that should cause at least some sorts of
optimizations to be disabled, but perhaps not necessarily the reordering here, especially as PR_AtomicAdd is operating on tdata->counter but not tdata->done .
I would be interested to know if this is actually a real possibility or not. If so, it would be a legitimate issue.
In any case, it seems to me that the fix may be simpler than what I suggested in my last comment.
Simply replacing 
tdata.done = PR_TRUE 
 with
PR_AtomicIncrement((PRInt32*)&tdata.done, 1);
should take care of doing an atomic write.
And the in do_writes :
if (data->done)
with
if (PR_AtomicAdd((PRInt32*)&data->done, 0))
I believe the compiler could not reorder the calls to PR_AtomicAdd and PR_Read, since they are externals function implemented in a shared lib.
That leaves the possibility of the CPU interferring, somehow, depending on its store memory model.
A problem might occur if the last AtomicAdd - the one to data->done - actually was seen by do_writes before the last (failed) PR_Read call .
However, this last AtomicAdd is conditioned on the resulting failure of that PR_Read so I'm not sure how this could happen, unless the CPU has some
sort of time machine, in which case, I would like to buy stock in the company.
Attachment #8674082 - Flags: review?(ekr)
You need to log in before you can comment on or make changes to this bug.