A tail.exe process hangs around after all.sh is run

RESOLVED FIXED in 3.7

Status

NSS
Test
P2
normal
RESOLVED FIXED
17 years ago
16 years ago

People

(Reporter: Wan-Teh Chang, Assigned: Sonja Mirtitsch)

Tracking

x86
Windows 2000

Details

Attachments

(2 attachments, 1 obsolete attachment)

(Reporter)

Description

17 years ago
I found that a tail.exe process is hanging around after
the all.sh script is finished.  The tail.exe process
prevents me from exiting the Command Prompt.
(Reporter)

Comment 1

17 years ago
I am using MKS Toolkit 7.0.  The output of 'uname -a' is
Windows_NT PC2 5 00 586
(Assignee)

Comment 2

17 years ago
this should not happen, it has been gone on SonjaNT for a long time - but that
might be because I have some cygnus utilities in the PATH.
Could you attach the last 100 lines of the output of all.sh please (what I am
looking for is a line that says MKS Special...) , and the output of:
echo $O_CRON
ps | grep tail
and also check if it put the output.log in place?
Status: NEW → ASSIGNED
(Assignee)

Comment 3

17 years ago
never mind my previous comment, I can reproduce this here on one of the machines.
(Reporter)

Comment 4

17 years ago
Assigned the bug to Bishakha.
Assignee: sonja.mirtitsch → bishakhabanerjee
Status: ASSIGNED → NEW
(Reporter)

Comment 5

17 years ago
Changed the QA contact to Bishakha.
QA Contact: sonja.mirtitsch → bishakhabanerjee
(Reporter)

Updated

16 years ago
Priority: -- → P2
Target Milestone: --- → 3.7
(Assignee)

Comment 6

16 years ago
I'll attach the patch
Assignee: bishakhabanerjee → sonja.mirtitsch
(Assignee)

Comment 7

16 years ago
Created attachment 101214
patch1

please review
(Reporter)

Comment 8

16 years ago
Comment on attachment 101214
patch1

Sonja,

Could you explain what the problem was?  That'll help
us review the fix.

>-    if [ -n "$os_name" -a "$os_name" = "Windows" ]
>+    if [ `uname` = "WINNT" -o `uname` = "Windows_NT" -o "$os_name" = "Windows" ]

The original code has a -n "$os_name" test.  Why isn't
that test in the new code?

>-        kill `ps | grep "tail -f ${LOGFILE}" | grep -v grep | 
>-            sed -e "s/^ *//" -e "s/ .*//"`
>+        kill `ps | grep "tail -f .*output.log" | grep -v grep | awk '{print $2}'`

There are two changes in this line.  The first change
replaces ${LOGFILE} by .*output.log.  The second change
replaces the sed command by an awk command.  Could you
explain these two changes?
(Assignee)

Comment 9

16 years ago
> Could you explain what the problem was?  That'll help us review the fix.

I realized that the variable os_name was empty, I might have written the
original code thinking set_environment would have been sourced, or planned to
set this in init.sh. Now if it is (running from the wrapper nssqa) nothing will
change, but it will work as well if it is called alone (all.sh). Additionally
grep got confused by having "/" in $OUTPUT_LOG on certain machines (worked fine
if run from the wrapper and MKS was forced to be the first found grep)

>-    if [ -n "$os_name" -a "$os_name" = "Windows" ]
>+    if [ `uname` = "WINNT" -o `uname` = "Windows_NT" -o "$os_name" = "Windows" ]

> The original code has a -n "$os_name" test.  Why isn't that test in the new code?

Because it was redundant. The original statement in C would have been
if ( osname != NULL && strcmp(osname,"Windows") == 0 ) 
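
The redundancy can be checked directly: an empty string never equals "Windows", so the -n guard does not change the result. A minimal sketch follows; the three function names are hypothetical helpers, not part of all.sh, and is_windows_new takes the uname output as an argument so that it can be exercised on any machine.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; not part of all.sh.

# Original test: non-empty check plus string comparison.
is_windows_old() {
    [ -n "$1" -a "$1" = "Windows" ]
}

# Comparison alone; an empty string can never equal "Windows".
is_windows_plain() {
    [ "$1" = "Windows" ]
}

# Patched test: also accept the uname values named in the patch.
# $1 = uname output, $2 = os_name (may be empty).
is_windows_new() {
    [ "$1" = "WINNT" -o "$1" = "Windows_NT" -o "$2" = "Windows" ]
}

# Compare the old and the plain test for a few os_name values.
for v in "" "Windows" "Linux"; do
    is_windows_old "$v";   old=$?
    is_windows_plain "$v"; plain=$?
    echo "os_name='$v' old=$old plain=$plain"
done
```

Both columns agree for every value, which is what makes the guard removable; the uname checks are what let the test succeed when os_name is unset.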



>-        kill `ps | grep "tail -f ${LOGFILE}" | grep -v grep | 
>-            sed -e "s/^ *//" -e "s/ .*//"`
>+        kill `ps | grep "tail -f .*output.log" | grep -v grep | awk '{print $2}'`

>  The first change replaces ${LOGFILE} by .*output.log.  

depending on the grep used, it would have gotten confused by the whole string
${LOGFILE}, which had several "/" in it - I could also have solved the problem
by quoting the string properly, but since different PCs have different tools in
the PATH I figured it would take a long time to test that it works. If there is
another process on the PC running a tail -f of an output.log it will be killed
as well, but only one instance of QA works on the PC now anyway as far as I know. 

> The second change replaces the sed command by an awk command.

Not necessary, just for readability. I figured since I was already at it, and
the awk is so much clearer than the sed I had used before I'd also fix this.
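
One way to keep the full log path in the match without depending on how a particular grep treats it is to do a literal substring match in awk. This is only a sketch under assumptions: the ps listing is invented, the PID is assumed to be in column 2 (as the awk '{print $2}' in the patch implies), and tail_pids is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: the ps listing below is made up, and the PID is assumed to be
# in column 2, as the awk '{print $2}' in the patch implies.

# Print the PIDs of processes whose command line contains "tail -f <logfile>".
# index() does a literal substring match, so "/" and "." in the path are not
# treated as pattern characters. The string is passed through the environment
# so that awk does not interpret any backslashes in it.
tail_pids() {
    PAT="tail -f $1" awk 'index($0, ENVIRON["PAT"]) { print $2 }'
}

sample_ps='user   101  1    ksh
user   202  101  tail -f /tmp/run1/output.log
user   303  101  tail -f /tmp/run2/output.log
user   404  101  grep tail'

# Only the tail that follows run1's log is selected.
printf '%s\n' "$sample_ps" | tail_pids /tmp/run1/output.log
```

Matching on the full path would leave a second run's tail untouched, unlike the .*output.log pattern, at the cost of needing the testing across tool sets mentioned above.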
(Reporter)

Comment 10

16 years ago
The only change I have reservations about is the
grep "tail -f .*output.log" change.  This means we
can't run two instances of QA on the same PC at
the same time.

Why do we need to kill the tail -f process twice
under MKS?  Why isn't "kill ${TAILPID}" enough?
(Assignee)

Comment 11

16 years ago
> This means we can't run two instances of QA on the same PC at
> the same time.

We never could. I tried that, having tinderbox machine run daily QA, both in a
way that did not do the tail -f, and it had problems, I'll try to dig up some
information / documentation that I had on this


> Why do we need to kill the tail -f process twice
> under MKS?  Why isn't "kill ${TAILPID}" enough?

$! does not work for the MKS shell. The tail -f starts a subshell, which then
starts tail, at least if called from a script, and the PID of this shell is
being returned by $!
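
For comparison, this is how $! usually behaves: a sketch assuming a POSIX-style shell in which $! is the PID of tail itself, so a single kill is enough. The temporary file is a stand-in for the real log, and mktemp is assumed to be available.

```shell
#!/bin/sh
# Sketch of the usual $! behaviour, assuming a shell where $! is the PID of
# tail itself. Per the comment above, MKS returned a subshell's PID instead,
# which is why a second kill was needed there.

LOGFILE=$(mktemp)                    # stand-in for the real log file

tail -f "$LOGFILE" &                 # follow the log in the background
TAILPID=$!                           # PID of the background job

echo "test output" >> "$LOGFILE"
sleep 1                              # give tail a moment to print the line

kill "$TAILPID"                      # stop the follower
wait "$TAILPID" 2>/dev/null || true  # reap it so no process is left behind

# kill -0 sends no signal; it only checks whether the process still exists.
if kill -0 "$TAILPID" 2>/dev/null; then
    echo "tail still running"
else
    echo "tail stopped"
fi
rm -f "$LOGFILE"
```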

(Reporter)

Comment 12

16 years ago
Can we manually run two instances of all.sh on Windows,
each having a different value of PORT?

It seems that instead of redirecting the output of the
scripts to ${LOGFILE} and using tail -f ${LOGFILE}, we
can just pipe the output of the scripts to
tee -a ${LOGFILE}.  Do you think this will work?  This
will remove the tail -f process altogether.
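
A minimal sketch of the tee idea, to make the proposal concrete. run_tests is a hypothetical stand-in for the per-test scripts, and the temporary log file replaces the real ${LOGFILE}; whether tee -a behaves the same under MKS is not shown here.

```shell
#!/bin/sh
# Sketch of the tee idea; run_tests is a hypothetical stand-in for the
# per-test scripts that all.sh runs.

LOGFILE=$(mktemp)

run_tests() {
    echo "ssl test passed"
    echo "warning: slow handshake" >&2   # stderr should reach the log too
}

# 2>&1 merges stderr into stdout before the pipe; tee -a appends to the log
# while still echoing to the terminal. No background process is started, so
# nothing needs to be killed afterwards.
run_tests 2>&1 | tee -a "$LOGFILE"

echo "--- log contents ---"
cat "$LOGFILE"
```

One design point to keep in mind: without further measures the exit status of the pipeline is that of tee, not of the test script.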
(Assignee)

Comment 13

16 years ago
the tail -f has been in there since I first saw the script, I remember having
tried a tee once, which is the most obvious choice, but don't remember why I
went back to the tail -f. Does the tee -a handle the stderr well on NT? Is -a
implemented on all platforms? I remember problems on HP I think, with torn
output, like we have from server and client writing at the same time. I think
that would be a separate bug, an enhancement request, to get rid of the tail
altogether.

PORT is being set differently automatically by the wrapper for tinderbox QA. I
do this on a few Unix machines, they run up to 3 instances of QA, daily, 32 bit
tinderbox and 64 bit tinderbox.

What would be the purpose of running 2 instances of QA on the same PC, without
the wrapper? If you have a reason to do this, I'll try it again, and see what
was wrong with it, but this should be a separate enhancement request too. It has
never worked with the wrapper, and was never tried without, because I never
needed it.
(Reporter)

Comment 14

16 years ago
It is useful for testing the debug and optimized
builds at the same time or testing the WINNT and
WIN95 builds at the same time.
(Assignee)

Comment 15

16 years ago
do you think it will be as fast as running them sequentially? I think it will be
a lot slower, and I also think it was timeouts, that caused the tests to fail.
But I also think you should file an enhancement request on it, or do you want me
to do this? I am kind of busy today, otherwise I'd try and see what went wrong,
but I will go to the lab and see if I can get some quick test on the tinderbox
to run (or fail)
(Reporter)

Comment 16

16 years ago
Created attachment 101330
Alternative patch: use tee -a instead of tail -f

> Does the tee -a handle the stderr well on NT? Is -a
> implemented on all platforms?

I believe the answer to both questions is yes, but the
only way to find out is to try it.  So I am offering
this patch as an alternative proposal.  Note that this
patch always sends the test output to stdout, even if
$O_CRON is ON.  If this is not desirable, this patch
needs to be modified.

If you think replacing tail by tee should be a separate
enhancement request, I can open one.  But it is one way
to fix this bug, so it seems to be fine to discuss it in
this bug.

Are you saying that we can run two instances of QA on
the same Unix machine but not on the same Windows machine?
I don't really need to run two instances of QA on the
same Windows machine (so you don't need to file an
enhancement request on it), but we shouldn't preclude that
either.
(Assignee)

Comment 17

16 years ago
> Are you saying that we can run two instances of QA on
> the same Unix machine but not on the same Windows machine?

yes. Works fine for Unix, and has worked fine for over a year.

> I don't really need to run two instances of QA on the
> same Windows machine (so you don't need to file an
> enhancement request on it), but we shouldn't preclude that
> either.

I ran 2 instances of all.sh on nss-nt a few times just now, and one of them had
certutil prompt for a password from stdin.
It also increased time for the ssl test from an average of 4 minutes 30 seconds
to 28 minutes. 

> even if $O_CRON is ON. If this is not desirable, this patch needs to be 
> modified.

O_CRON indicates that it is being run by the cron, and nothing is supposed to go
to stdout or stderr.
it should be something like 
for i in ${TESTS}
 do
     SCRIPTNAME=${i}.sh
     echo "Running Tests for $i"
-    (cd ${QADIR}/$i ; . ./$SCRIPTNAME all file >> ${LOGFILE} 2>&1)
+    if [ -z "$O_CRON" -o "$O_CRON" != "ON" ]
+    then
+       (cd ${QADIR}/$i ; . ./$SCRIPTNAME all file >> ${LOGFILE} 2>&1)
+    else
+       (cd ${QADIR}/$i ; . ./$SCRIPTNAME all file 2>&1 | tee -a ${LOGFILE})
+    fi

 done



Other than that, the patch looks fine to me, r=sonmi
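
The intent described above (nothing on stdout or stderr under cron, live output otherwise) can be sketched as follows. As posted, the snippet's branches appear to be swapped relative to that intent, so this sketch compares with = and takes the log-only branch when $O_CRON is ON. run_one and log_run are hypothetical names, and run_one stands in for sourcing the real test script.

```shell
#!/bin/sh
# Sketch only; run_one stands in for sourcing ${QADIR}/$i/$SCRIPTNAME.

LOGFILE=$(mktemp)

run_one() {
    echo "Running Tests for $1"
    echo "stderr from $1" >&2
}

# Under cron (O_CRON=ON) write to the log only; otherwise also show the
# output on the terminal via tee -a.
log_run() {
    if [ "$O_CRON" = "ON" ]
    then
        run_one "$1" >> "${LOGFILE}" 2>&1
    else
        run_one "$1" 2>&1 | tee -a "${LOGFILE}"
    fi
}

O_CRON=ON
log_run cert            # prints nothing, logs both lines
O_CRON=
log_run ssl             # prints both lines and logs them
```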

(Reporter)

Comment 18

16 years ago
Created attachment 101342
Alternative patch v2

Note that I simplified the expression for the test commands
and changed != to =.

In jssqa, why do we need to test for $O_WIN?  Was there
some problem running tail on Windows?  Can we omit the test
for $O_WIN in jssqa?
Attachment #101330 - Attachment is obsolete: true
(Assignee)

Comment 19

16 years ago
> Created an attachment (id=101342)
> Alternative patch v2

r=sonmi

> Note that I simplified the expression for the test commands
> and changed != to =.

that's great for readability, I should go through all the scripts and make these
changes!

> In jssqa, why do we need to test for $O_WIN?  Was there
> some problem running tail on Windows?  Can we omit the test
> for $O_WIN in jssqa?

jssqa was some really quick and dirty attempt to use nssqa for jss, and has been
abandoned with jss 3.1.x - Bishakha and Jamie have the jssqa.new, which is in
development, but we used it here for QAing 3.2. It works with Jamie's all.pl
now, and I threw out a lot of unnecessary stuff. It does not work on Windows at
all yet.
(Assignee)

Comment 20

16 years ago
> Can we manually run two instances of all.sh on Windows,
> each having a different value of PORT?
> It is useful for testing the debug and optimized
> builds at the same time or testing the WINNT and
> WIN95 builds at the same time.



I tried it, in 2 different shell windows I typed 
./all.sh;./all.sh;./all.sh;./all.sh;./all.sh;./all.sh;./all.sh;./all.sh;
A few hours later I went back, killed one of the processes because certutil was
prompting for a password, and noticed that the time for the ssl test had gone up
by a factor of 6 or so. I started a new shell window and typed the same command
line. This morning I found the machine crashed: it no longer responded, but there
was no bluescreen. After a reboot the machine did not mount all network drives,
had lock files indicating that some of the nightly QA must have started
erroneously (there was no build present), and the MKS ksh can no longer handle
its shell history and gives error messages when sourcing a file.

> I don't really need to run two instances of QA on the
> same Windows machine (so you don't need to file an
> enhancement request on it), but we shouldn't preclude that
> either.

Since I have no need for running 2 instances of QA on a PC either, and knowing
that it slows down the PC too much to be useful anyway, I will not file a real
enhancement request on this, but maybe it would be a useful reminder if I
filed one and closed it WONTFIX?

(Reporter)

Comment 21

16 years ago
Sonja, it's not necessary to file an enhancement request
for running two instances of QA on the same Windows machine.
My point was that we should not preclude that.
(Reporter)

Comment 22

16 years ago
I checked in the alternative patch (v2).
Status: NEW → RESOLVED
Last Resolved: 16 years ago
Resolution: --- → FIXED