PR_Writes misbehaves when writing to Network drives from NT

RESOLVED FIXED in 4.1.1

Status

NSPR
NSPR
P3
normal
RESOLVED FIXED
17 years ago
17 years ago

People

(Reporter: Sonja Mirtitsch, Assigned: larryh (gone))

Tracking

4.1.1
x86
Windows 2000

Details

Attachments

(5 attachments)

(Reporter)

Description

17 years ago
Error message is: PKCS12 decode not verified

The problem is not seen with the Win95 build, only with the NT build. greasybear
and clio show the same behavior, with both DEBUG and OPT builds; no failures
were seen on SonjaNT.
This seems to be different from the 64-bit failures, since these already occur
with pk12util -o; the error message on the 64-bit failure is "pk12util: add cert
and key failed".

tools.sh: Tools Tests ===============================
tools.sh: Exporting Alice's email cert & key------------------
pk12util -o Alice.p12 -n "Alice" -d ../alicedir -k ../tests.pw.876 \
         -w ../tests.pw.876
Converted from:
6e 73 73  0
Converted to:
 0 6e  0 73  0 73  0  0
Converted from:
6e 73 73  0
Converted to:
 0 6e  0 73  0 73  0  0
Converted from:
6e 73 73  0
Converted to:
 0 6e  0 73  0 73  0  0
pk12util: PKCS12 EXPORT SUCCESSFUL
tools.sh: Importing Alice's email cert & key -----------------
pk12util -i Alice.p12 -d ../tools/copydir -k ../tests.pw.876 -w ../tests.pw.876
Converted from:
6e 73 73  0
Converted to:
 0 6e  0 73  0 73  0  0
pk12util: PKCS12 decode not verified
========================================
the output of pk12util -i should be:
============================================================
tools.sh: Tools Tests ===============================
tools.sh: Exporting Alice's email cert & key------------------
pk12util -o Alice.p12 -n "Alice" -d ../alicedir -k ../tests.pw.936 \
         -w ../tests.pw.936
Converted from:
6e 73 73  0
Converted to:
 0 6e  0 73  0 73  0  0
Converted from:
6e 73 73  0
Converted to:
 0 6e  0 73  0 73  0  0
Converted from:
6e 73 73  0
Converted to:
 0 6e  0 73  0 73  0  0
pk12util: PKCS12 EXPORT SUCCESSFUL
tools.sh: Importing Alice's email cert & key -----------------
pk12util -i Alice.p12 -d ../tools/copydir -k ../tests.pw.936 -w ../tests.pw.936
Converted from:
6e 73 73  0
Converted to:
 0 6e  0 73  0 73  0  0
tools.sh: Create objsign cert -------------------------------
(Reporter)

Updated

17 years ago
Priority: -- → P3
Target Milestone: --- → 3.2.1
(Reporter)

Comment 1

17 years ago
reassigning
Assignee: relyea → nicolson
(Reporter)

Comment 2

17 years ago
The script I am running is tools.sh (nss/tests/tools). It does not show up in
daily QA because we only run the Win95 build on Win2K.
(Reporter)

Comment 3

17 years ago
verified that the problem occurs with the WinNT version of pk12util, no matter
if the certs were generated with the WinNT or the Win95 build of certutil

Comment 4

17 years ago
Actually what's happening is the pk12util -o is producing a bogus PKCS #12 file.
I'll attach the ASN.1 dump of the output. A zero byte is getting inserted after
every legitimate byte--as if it were trying to convert ASCII chars to Unicode? I
watched this in the debugger, and only 1 byte is being passed to PR_Write each
time...SO, it appears to be happening in the NSPR layer.

Sonja mentioned that clio has Chinese character support installed, and perhaps
that was convincing Win2k that it needed to unicode convert things?
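
The pattern described above (a zero byte after every legitimate byte) can be checked mechanically. The following is a minimal Python sketch, assuming the whole file shows the pattern. The function names and the sample bytes are hypothetical and are not part of NSS or NSPR, and the check is only a heuristic, since legitimate data can contain the same pattern.

```python
def looks_zero_interleaved(data: bytes) -> bool:
    """Return True if the length is even and every second byte is zero."""
    if len(data) == 0 or len(data) % 2 != 0:
        return False
    return all(b == 0 for b in data[1::2])


def strip_interleaved_zeros(data: bytes) -> bytes:
    """Recover the presumed original bytes by dropping every second byte."""
    if not looks_zero_interleaved(data):
        raise ValueError("data does not show the byte/zero pattern")
    return data[0::2]


# Hypothetical sample: four bytes, each followed by an inserted zero.
corrupted = bytes([0x30, 0x00, 0x82, 0x00, 0x0B, 0x00, 0xAC, 0x00])
print(looks_zero_interleaved(corrupted))
print(strip_interleaved_zeros(corrupted).hex())
```

Running a check like this over the attached Alice.p12 would only confirm the symptom; it says nothing about which layer inserts the zeros.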

Comment 5

17 years ago
Created attachment 28879 [details]
hex dump of Alice.p12

Comment 6

17 years ago
Created attachment 28880 [details]
Alice.p12, the bogus PKCS #12 file in binary form
(Reporter)

Comment 7

17 years ago
Since yesterday we have a similar failure show up on jordan (AIX)

jordan, qa log 3/28

tools.sh: Tools Tests ===============================
tools.sh: Exporting Alice's email cert & key------------------
pk12util -o Alice.p12 -n "Alice" -d ../alicedir -k ../tests.pw.13396 \
         -w ../tests.pw.13396
Converted from:
6e 73 73  0
Converted to:
 0 6e  0 73  0 73  0  0
Converted from:
6e 73 73  0
Converted to:
 0 6e  0 73  0 73  0  0
Converted from:
6e 73 73  0
Converted to:
 0 6e  0 73  0 73  0  0
pk12util: PKCS12 EXPORT SUCCESSFUL
tools.sh: Importing Alice's email cert & key -----------------
pk12util -i Alice.p12 -d ../tools/copydir -k ../tests.pw.13396 -w ../tests.pw.13396
Converted from:
6e 73 73  0
Converted to:
 0 6e  0 73  0 73  0  0
pk12util: PKCS12 decode not verified
tools.sh: Create objsign cert -------------------------------
(Reporter)

Comment 8

17 years ago
I reran the tests of 3/26 today, and jordan still failed. The tests of this
build that were run on 3/26 passed, today's tests failed.
I am not certain that these are two different problems, so I won't split this bug yet.
(Reporter)

Comment 9

17 years ago
Created attachment 29071 [details]
result.html
(Reporter)

Comment 10

17 years ago
It seems to be an unrelated failure. Bob compared file sizes: the failure on clio
already shows an Alice.p12 file that is twice the size of the other Alice.p12
files (other OSes and the Win95 build), while jordan's p12 file has the same
size as all the others.

Adding Larry since Nelson thinks it might be an NSPR problem
Here's why I think this may be an NSPR problem.

The program in question calls PR_Open and passes in the value 
0660 for the "mode" argument.  

The program runs fine on NT with both the win95 and winNT 
versions of NSPR.
The program runs fine on Win2K with the win95 version of 
NSPR.

But the combination of the winNT version of NSPR and the Win2K OS
is where the failure occurs.

I believe the problem is that Win2k now gives meaning to some bits
in the mode argument to open that it previously ignored.  It may
be the case that the winNT version of NSPR passes the mode bits 
through, but the Win95 version doesn't pass them or alters them.

In any case, I suspect the short term answer is for the program to
stop assuming that 0660 works on all platforms, and have it start
using a mode value on NT that is computed using NT's definitions of
the mode bits.

Comment 12

17 years ago
Sigh,

Looking at the NSPR code, neither the NT nor Win95 versions pay the slightest
attention to the value of 'mode'. The only difference I can see between the two
implementations (which, though formatted differently, do essentially the same
mappings) is that NT adds FILE_FLAG_OVERLAPPED to flag6.

The only difference between pk12util and other working functions is the
PR_TRUNCATE flag is set.

At first I thought maybe the file was operating in a mode where it would
'unicode' certain bytes, but I wondered why it was unicoding some bytes but not
others. Then when I traced the code out last night I found that the offending
bytes were single-byte writes. To hack around this I converted these to
double-byte writes. Now the code writes 2 bytes of value followed by 2 bytes of
'0'. NOTE: I now notice that all writes seem to write a block of zeros following
the raw write of bytes, where that block of zeros is the same size as the
original write.
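
The observation in the NOTE can be modelled in a few lines. This is a rough Python sketch of the symptom only (each write followed by a zero block of the same size), not of anything inside NSPR or Windows. The function name and the sample payload are hypothetical.

```python
def simulate_padded_writes(chunks):
    """Model the observed symptom: each write is followed by as many zeros."""
    out = bytearray()
    for chunk in chunks:
        out += chunk
        out += bytes(len(chunk))  # zero block the same size as the write
    return bytes(out)


payload = b"\x30\x82\x0b\xac"  # hypothetical sample data
one_byte_writes = [payload[i:i + 1] for i in range(len(payload))]
two_byte_writes = [payload[i:i + 2] for i in range(0, len(payload), 2)]

# Single-byte writes give the "zero after every byte" look.
print(simulate_padded_writes(one_byte_writes).hex())
# Two-byte writes give two bytes of value followed by two zero bytes.
print(simulate_padded_writes(two_byte_writes).hex())
```

Under this model the output is always twice the size of the input regardless of the write size, which would explain why switching to double-byte writes changes the pattern but not the corruption.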

bob
Well, to me the major clue is that this problem occurs with the 
NT version of NSPR on Win2k, but not with the win95 version on 
Win2k.  That surely sounds like an NSPR problem to me.  The 
mode bits may or may not be relevant.
(Reporter)

Comment 14

17 years ago
AIX failure was due to a lock file and has been split off this bug, see bug
#74009, QA on Jordan has been rerun and passed

Comment 15

17 years ago
I agree. Larry's off looking at it now. In the meantime I'm going to back out
the hack I put in, as it doesn't solve the problem.

bob
Am I to understand that this funny business with the extra zero bytes
has been seen on Jordan (AIX) _and_ clio (Win2k) ??
(Reporter)

Comment 17

17 years ago
nelson: jordan had a different problem - lock file existed - please see bug
#74009. In my opinion this bug had nothing to do with the problem we are looking
at here.

Comment 18

17 years ago
No, just clio. Jordan was a different problem (a machine config problem).
Then I don't understand Sonja's comments of 2001-03-28 10:41.
Those comments claim to be a Jordan QA log, and they demonstrate
the extra zeros problem.
(Reporter)

Comment 20

17 years ago
Well - that is what jordan did - that is why I thought it was the same problem,
it looked the same to me - but it went away when the lockfile was removed.

Comment 21

17 years ago
Good news. I built Larry's little test program which just does PR_Opens and
PR_Writes and reproduced the problem without any use of NSS!

So far we don't know exactly all the factors, but we do know:
1) NSS is not an issue.
2) The network appears to be a factor. Running the command locally
produces correct results, but running the command and writing to a network drive
does not! (Sonja, could we copy one of our builds locally to clio and see if it
solves the problem?).
3) We don't know if compiler differences affect the outcome.
4) Using WIN95NSPR versus WINNTNSPR affects the problem. A quick look at the code
differences shows that WINNT is setting a threading flag of some sort on open.

I'm going to reassign this bug now to NSPR and Larry. My guess is we'll
eventually find it's a bug in the Windows/2000 OS itself, though.
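
The shape of such a check (open, many small writes, read back, compare) can be sketched as follows. This is a Python illustration under loud assumptions: it is not Larry's test program, it does not call PR_Open or PR_Write, and because it goes through Python's own I/O rather than NSPR it should not be expected to reproduce this bug. The function name and the file name are hypothetical.

```python
import os
import tempfile


def roundtrip_ok(directory, payload, chunk_size=1):
    """Write payload in small chunks, read it back, and compare."""
    path = os.path.join(directory, "writeit_check.bin")  # hypothetical name
    # "wb" truncates the file; buffering=0 issues one OS write per call.
    with open(path, "wb", buffering=0) as f:
        for i in range(0, len(payload), chunk_size):
            f.write(payload[i:i + chunk_size])
    try:
        with open(path, "rb") as f:
            return f.read() == payload
    finally:
        os.remove(path)


payload = bytes(range(256)) * 4
with tempfile.TemporaryDirectory() as local_dir:
    print("local round trip ok:", roundtrip_ok(local_dir, payload))
```

Passing a directory on a network drive instead of the temporary directory gives the local versus network comparison described in point 2.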
Assignee: nicolson → larryh
Component: Tools → NSPR
Product: NSS → NSPR
Summary: pk12util -i fails on Win2K → PR_Writes misbehaves when writing to Network drives from NT
Target Milestone: 3.2.1 → 4.1.1
Version: 3.2 → 4.0
(Reporter)

Comment 22

17 years ago
I'll try to copy it locally and test it.
(Assignee)

Comment 23

17 years ago
Created attachment 29254 [details]
A simple test, writeit.c, to demonstrate the problem
(Assignee)

Updated

17 years ago
Status: NEW → ASSIGNED
(Reporter)

Comment 24

17 years ago
test on clio
============

created certificates with regular QA cert.sh
run tools.sh to verify failure
set PATH to NT build bin and lib
cd to the network directory that contains the certs
export Alice.p12 (Alice.p12 is 5982 bytes long)
import Alice.p12
--- FAILURE
copy certificates to c:/tmp
cd to the local directory that contains the certs
export Alice.p12 (Alice.p12 is 2992 bytes long)
import Alice.p12
--- IMPORT OK

---------------------
cd W:/nss/nss32/builds/20010330.1/blowfish_NT4.0_Win95/mozilla/security/nss/tests/tools
tools.sh
# create certs
# tools test
#     failure on p12 import
PATH=g:/bin;z:/nstools/bin;z:/nstools/perl5;c:/mksnt;C:/WINNT/System32;C:/WINNT;.;Z:/bin;c:/MKS/bin;c:/MKS/bin/x11;c:/winnt/system;c:/VISUAL~1/Common/Tools/WinNT;c:/VISUAL~1/Common/MSDev98/Bin;c:/VISUAL~1/Common/Tools;c:/VISUAL~1/VC98/bin;W:/nss/nss32/builds/20010330.1/blowfish_NT4.0_Win95/mozilla/dist/WINNT5.0_OPT.OBJ/bin;W:/nss/nss32/builds/20010330.1/blowfish_NT4.0_Win95/mozilla/dist/WINNT5.0_OPT.OBJ/lib
cd W:/nss/nss32/builds/20010330.1/blowfish_NT4.0_Win95/mozilla/tests_results/security/clio.1/client
mkdir ../cpdir
pk12util -o Alice.p12 -n "Alice" -d ../alicedir
# ...
# pk12util: PKCS12 EXPORT SUCCESSFUL
pk12util -i Alice.p12 -d ../cpdir
#    ...
# pk12util: PKCS12 decoding failed: security library: improperly formatted DER-encoded message.
# pk12util: PKCS12 decode not verified: security library: improperly formatted DER-encoded message.

-------------------

cd c:/tmp
cp -r $HOSTDIR/alicedir .
mkdir cpdir
cp -r $HOSTDIR/client .
cd client
pk12util -o Alice.p12 -n "Alice" -d ../alicedir
# ...
# pk12util: PKCS12 EXPORT SUCCESSFUL
pk12util -i Alice.p12 -d ../cpdir
# ... - no failure message

ls -l Ali* /tmp/cli*/Al*
# -rw-r--r--   1 6793     everyone     2992 Mar 30 12:11 /tmp/client/Alice.p12
# -rw-r--r--   1 16730    everyone     5982 Mar 30 12:12 Alice.p12
(Assignee)

Comment 25

17 years ago
The test program in the attachment was compiled on Win2K SP1 with MSVC 6.0 SP4.
It runs with correct results when the file created is on a local
filesystem. When the file is on a networked filesystem, the problem
manifests itself. ... Hmmmm.
(Assignee)

Comment 26

17 years ago
Created attachment 32200 [details] [diff] [review]
patch to ntio.c. on the tip of NSPR's tree
(Assignee)

Comment 27

17 years ago
After much work in the debugger trying to catch whatever it is that's 
misbehaving, I give up. I suspect a bug in the underlying operating system 
(Win2k). Wan-Teh suggested a fix: coerce the final SetFilePointer() to use 
FILE_BEGIN, with a calculated offset. This proves to work.

Patch will be checked in to NSPR's tip of tree and to 
NSPRPUB_RELEASE_4_1_BRANCH.
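
The idea behind the workaround (position the file from its start using an offset the writer calculates itself, rather than moving relative to wherever the OS thinks the pointer is) can be illustrated in Python. This is only an analogy under stated assumptions: the actual patch is in attachment 32200 and is not reproduced here, `os.SEEK_SET` stands in for FILE_BEGIN, and the class name is hypothetical.

```python
import os
import tempfile


class AbsoluteOffsetWriter:
    """Track the file position ourselves and always seek from the start."""

    def __init__(self, path):
        flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | getattr(os, "O_BINARY", 0)
        self.fd = os.open(path, flags, 0o660)
        self.offset = 0  # calculated position, counted from the file start

    def write(self, data):
        view = memoryview(data)
        while len(view) > 0:
            # Absolute seek: never rely on the OS's current file pointer.
            os.lseek(self.fd, self.offset, os.SEEK_SET)
            written = os.write(self.fd, view)
            self.offset += written
            view = view[written:]

    def close(self):
        os.close(self.fd)


path = os.path.join(tempfile.mkdtemp(), "out.bin")
writer = AbsoluteOffsetWriter(path)
for piece in (b"\x30", b"\x82", b"\x0b", b"\xac"):
    writer.write(piece)
writer.close()
with open(path, "rb") as f:
    print(f.read().hex())
```

Because every write lands at an offset the writer computed, a stale or misreported file pointer cannot leave gaps between consecutive writes.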
(Assignee)

Comment 28

17 years ago
Parts checked in.
(Reporter)

Comment 29

17 years ago
Could we change the NSS tip build to pull the NSPRPUB_RELEASE_4_1_BRANCH?
(Reporter)

Comment 30

17 years ago
I had not read this email:

> Is it possible to build NSS with NSPR 4.0 but test it with NSPR 4.1.1?

I do not think it would be a good idea to test only with the NSPR that is not
used in the "official" release, because it might not catch some other errors.

Maybe we could build and test with both NSPR releases on Windows, and not report
errors that only occur in the build with NSPR 4.0 and seem connected to this bug?
The change to the QA scripts is a little bit bigger this way, but I think we get
better coverage.

> The reason we need to build NSS with NSPR 4.0 is that the Mozilla
> client is still using NSPR 4.0.  It is a major undertaking to upgrade
> Mozilla to NSPR 4.1.1 because their version of NSPR has diverged from
> NSPR 4.1.1.  So it is not going to happen soon.

Comment 31

17 years ago
The fix is in NSPR 4.1.1.  Marked the bug fixed.
Status: ASSIGNED → RESOLVED
Last Resolved: 17 years ago
Resolution: --- → FIXED