Closed Bug 107777 Opened 23 years ago Closed 23 years ago

selfserv needs "log stats" mode

Categories

(NSS :: Tools, defect, P2)

defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: kirk.erickson, Assigned: kirk.erickson)

Details

Attachments

(2 files, 7 obsolete files)

When benchmarking selfserv, recording connections per second based on
an entire run of strsclnt is not telling.  Its the numbers gathered when
the server is pegging the machine under test (100% cpu utilization).

Strsclnt should have a mode similiar to httptest (Julian's iWS driver),
which offers:

-L n          Run tests in infinite loop and report stats every 'n' seconds

n=30 would be a good default period between logging events.  

To benchmark, a user will start selfserv on the machine under test.
Then start clients on machines in the same subnet till vmstat or the like
reports that the server machine is under full load.  Subsequently, waiting
30 seconds  and summing the last value reported by each of the clients
will yield desired benchmarking results.
IMO, the server should report the stats every so often, not the client.
This is because there will be many clients but only one server in the test.
Shifting from strsclnt to selfserv for this change (updated Summary).
This will simplify benchmarking since multiple client stats won't need
to be summed.
Summary: strsclnt needs infinite ops/sec log mode → selfserv needs "log stats" mode
Comment on attachment 56786 [details] [diff] [review]
Added '-L <seconds>' (log ops per second periodically)

Kirk, could you regenerate the patch with "unified diffs", that is,
  % cvs diff -u selfserv.c
Attachment #56786 - Flags: needs-work+
Attached patch diff -u (obsolete) — Splinter Review
Attached patch Diffs with the tip at checkin (obsolete) — Splinter Review
Adds 'L <seconds>' to selfserv command line.
Reviewed by Nelson.
Ended up pulling a fresh version from the tip (merge was complex).
Tested on box (Linux 7.1), and iws-perf (Solaris 8).
Status: NEW → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
Reopened because Wan-Teh made some important observations:

Kirk,

I noticed that you incorporated Nelson's suggested
changes and checked in your patch.

I actually have two comments about your original
patch.  Sorry I did not send them to you eariler.
Here they are.

1. I am concerned about incrementing the 'ops'
    counter without using atomic routines or a
    lock.  This is especially bad because you
    periodically reset 'ops' to 0.  Consider
    this sequence:

    ops has the value of 990

    thread A: add one to ops = 991 but has not
              update ops yet
    thread B: reset ops to 0
    thread A: update ops to 991

    Now ops will continue to count up from 991.

2. I think we are calling PR_IntervalNow() more
    frequently than necessary.  A better way is
    to create a thread that would sleep for
    'logPeriod' seconds, wake up, print the number
    of ops per log period, and go back to sleep.

Wan-Teh

Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Wan-Teh,  please review this change.  I've made a separate logger thread
and we are now using PR_AtomicIncrement(&ops) and PR_AtomicSet(&ops,0)
to address your concerns.

Appreciate your help improving this change.
Attached patch patch as of 2001-1126 (obsolete) — Splinter Review
Kirk,

A few minor suggestions:

1. Move the code that initializes logger_latest and logger_previous from
where it is now, into the logger function, just before the while loop.

2. Move the static variables logger_previousOps, logger_previous, and
logger_latest into function logger.

3. Don't bother to initialize static variables to zero because the c language
guarantees that all static variables that are not explicitly initialized
are implicitly initialized to zero before entering main().

4. Only fetch the value of logger_ops once per loop in logger.	fetch it
once near the top of the loop and store it in a local variable, and then 
use the local variable each time you need it.  

The reason not to fetch it multiple times in the loop is that the value
can change between fetches, leading to inaccurate results.
Comment on attachment 59206 [details] [diff] [review]
patch post Nelson suggested changes

Kirk, this patch looks pretty good.  I have some suggested changes.

>+void 
>+logger(void *arg)

Make this function static.

>+    logger_previous = PR_IntervalNow();
>+    logger_latest = logger_previous;
>+    logger_previousOps = 0;

It is better to replace the last line by
      logger_previousOps = logger_ops;

>+    while(1==1) {

It is simpler to indicate an infinite loop with
      while (1) {
or better yet, with the K&R style
      for (;;) {

>+    	sleep(logPeriod);

Replace this sleep call by a PR_Sleep call as follows:
      PRIntervalTime logPeriodTicks = PR_SecondsToInterval(logPeriod);
      ...
      while(1==1) {
	PR_Sleep(logPeriodTicks);

The sleep function only works when NSPR threads are native
threads, which is not true on all platforms.

>+        if (period < 1) {
>+            period = 1; /* tick */
>+        }

This whole if statement should be deleted.  The only purpose
of it I can come up with is to avoid division by 0.  If so,
you should make that explicit:
	  if (period == 0) {
	      period = 1; /* tick */
	  }
But then I would question the correctness of setting period to
1 when period is 0.  In any case, unless the log period is longer
than five or six hours, 'period' won't wrap around.

(By the way, period (a PRIntervalTime) is a nonnegative integer.
So period < 1 is equivalent to period == 0.)

>+        if (seconds >= logPeriod) {
>+            opsPerSec = (logger_ops_local - logger_previousOps) / seconds;
>+            printf("%.2f op%s/second, %s, %d threads\n",
>+	    	opsPerSec, (opsPerSec>1)?"s":"", NSS_VERSION, threadCount);
>+            logger_previousOps = logger_ops_local;
>+            logger_previous = logger_latest;
>+        }

Here I think it is not necessary to bracket the whole thing with
if (seconds >= logPeriod).

Printing NSS_VERSION can be misleading because it is possible to
drop in an arbitrary version of NSS shared libraries.  NSS_VERSION
is the version of NSS you compiled selfserv against, not the version
of NSS at run time.

Finally, it is a good idea to add a
	      fflush(stdout);
statement after the printf statement because stdout is buffered.

>+    if (rv == SECSuccess && logStats) {
>+	loggerThread = PR_CreateThread(PR_USER_THREAD, 
>+			logger, NULL, PR_PRIORITY_NORMAL, 
>+                        (PR_TRUE==useLocalThreads)?PR_LOCAL_THREAD:PR_GLOBAL_THREAD,
>+                        PR_UNJOINABLE_THREAD, 0);
>+	if (loggerThread == NULL) {
>+	    printf("selfserv: Failed to launch logger thread!\n");
>+	    rv = SECFailure;
>+	} 
>+    }

You may want to use fprintf(stderr) to print error messages.
Follow the current style of selfserv.
Attachment #59206 - Flags: needs-work+
Attachment #59186 - Attachment is obsolete: true
Attached patch post Nelson and Wan-Teh reviews (obsolete) — Splinter Review
I've tested on box (running Linux 7.1).
Attachment #59206 - Attachment is obsolete: true
Comment on attachment 59348 [details] [diff] [review]
post Nelson and Wan-Teh reviews

Kirk, this patch is good!

>+    PRFloat64 secondsPerTick = 1.0 / (PRFloat64)PR_SecondsToInterval(1);

I just remember the PR_TicksPerSecond() function. So this line can
also be written as:
      PRFloat64 secondsPerTick = 1.0 / (PRFloat64)PR_TicksPerSecond();

>+        printf("%.2f op%s/second, %d threads\n", 
>+	       opsPerSec, (opsPerSec>1)?"s":"", threadCount);

Seems like "0.2 ops/second" is grammatically correct, so
it's probably not necessary to do the "opsPerSec>1" test.

>+	loggerThread = PR_CreateThread(PR_USER_THREAD, 
>+			logger, NULL, PR_PRIORITY_NORMAL, 
>+                        (PR_TRUE==useLocalThreads)?PR_LOCAL_THREAD:PR_GLOBAL_THREAD,
>+                        PR_UNJOINABLE_THREAD, 0);

You can replace (PR_TRUE==useLocalThreads) by useLocalThreads.

r=wtc.	You can check this patch in.
Attachment #59348 - Flags: review+
Attachment #59348 - Attachment is obsolete: true
Didn't see any reason to initialize latestTime at start
of logger().  Its being set immediately in the for loop.

Yielding these numbers hitting iws-perf from 5 client machines:

NSS 3.3.1
---------
577.10 ops/second, 25 threads
577.85 ops/second, 25 threads
568.18 ops/second, 25 threads
578.77 ops/second, 25 threads
(3% idle doing restarts only)

tip (2001-1127)
---------------
169.99 ops/second, 25 threads
168.01 ops/second, 25 threads
169.84 ops/second, 25 threads
168.38 ops/second, 25 threads
(4-8% idle doing restarts only)

Indicating the tip has degraded further.
Last weeks tip yielded 226 ops/sec.

I'll attach the 3.3.1 selfserv.c diffs,
and pull the tip as of 2001-1103 for comparison.
Attachment #56786 - Attachment is obsolete: true
Attachment #57074 - Attachment is obsolete: true
Attachment #57539 - Attachment is obsolete: true
Attachment #57808 - Attachment is obsolete: true
Sorry - in my previous note "220" should be "546".
For the record again.  Here were the numbers presented
last week:

	  3.3.1    2001-1120
noreuse   226	   220		   3% drop
reuse	  585	   546		  10% drop

	  3.3.1    2001-1127
reuse	  577	   170

So I'm seeing a huge drop utilizing today's tip.
I'll pull the tip as of Nov 3 next.
Comment on attachment 59365 [details] [diff] [review]
Now using PR_TicksPerSecond(), printing ops/sec always

r=wtc.	This patch is good.
Thanks, Kirk.
Attachment #59365 - Flags: review+
Comment on attachment 59367 [details] [diff] [review]
NSS 3.3.1 selfserv.c patch

>+static void 
>+logger(void *arg)
>+{
>+    PRFloat64 seconds;
>+    PRFloat64 opsPerSec;
>+    PRIntervalTime period;
>+    PRIntervalTime previousTime;
>+    PRIntervalTime latestTime;
>+    PRUint32 previousOps;
>+    PRUint32 ops;
>+    PRIntervalTime logPeriodTicks = PR_SecondsToInterval(logPeriod);
>+    PRFloat64 secondsPerTick = 1.0 / (PRFloat64)PR_TicksPerSecond();
>+
>+    previousTime = PR_IntervalNow();
>+    latestTime = previousTime;
>+    previousOps = loggerOps;

You can delete the assignment statement
      latestTime = previousTime;
too for NSS 3.3.1.

Other than that, this patch is good.
Attachment #59367 - Flags: needs-work+
Target 3.4.  Kirk, you've checked in the fix, right?
Priority: -- → P1
Target Milestone: --- → 3.4
I checked in the patch Wan-Teh shortly after Wan-Teh approved (11/27),
and have had no problems exercising the new functionallity since then.
Closing...
Status: REOPENED → RESOLVED
Closed: 23 years ago23 years ago
Resolution: --- → FIXED
Reopening this.  Selfserv run with '-L 30' was not terminating after
visit to "stop page" (ex. https://soupnazi.red.iplanet.com:12344/stop).
The loggerThread was left running.

Checked in a temporary fix yesterday:

cvs diff ./mozilla/security/nss/cmd/selfserv/selfserv.c
Index: ./mozilla/security/nss/cmd/selfserv/selfserv.c
===================================================================
RCS file: /cvsroot/mozilla/security/nss/cmd/selfserv/selfserv.c,v
retrieving revision 1.35
diff -r1.35 selfserv.c
1639c1639
<   loggerThread = PR_CreateThread(PR_USER_THREAD,
---
>   loggerThread = PR_CreateThread(PR_SYSTEM_THREAD,

This works, but when there's time, we should revert to a USER thread,
and Interrupt before PR_Cleanup() in main():

--- selfserv.c  27 Nov 2001 20:08:57 -0000  1.35
+++ selfserv.c  10 Jan 2002 23:31:42 -0000
@@ -1652,6 +1652,9 @@

     VLOG(("selfserv: server_thread: exiting"));

+    if(logStats) {
+        PR_Interrupt(loggerThread);
+    }
     NSS_Shutdown();
     PR_Cleanup();

Apparently, in addition, we need to tell the thread to exit.
Lowered priority because we have a fix that works.
Status: RESOLVED → REOPENED
Priority: P1 → P2
Resolution: FIXED → ---
The remaining problem was fixed.
I've been using this new feature for several weeks
now with not problems.
Status: REOPENED → RESOLVED
Closed: 23 years ago23 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: