Closed Bug 70765 Opened 24 years ago Closed 24 years ago

PR_Writes misbehaves when writing to Network drives from NT

Categories

(NSPR :: NSPR, defect, P3)

x86
Windows 2000
defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: sonja.mirtitsch, Assigned: larryh)

Attachments

(5 files)

Error message is: PKCS12 decode not verified
Problem not seen on Win95 build, only on NT build. greasybear and clio show the same behavior, both DEBUG and OPT. SonjaNT: no failures seen.
This seems to be different from the 64 bit failures, since these already occur with pk12util -o; the error message on the 64 bit failure is "pk12util: add cert and key failed."
    tools.sh: Tools Tests ===============================
    tools.sh: Exporting Alice's email cert & key------------------
    pk12util -o Alice.p12 -n "Alice" -d ../alicedir -k ../tests.pw.876 \
    -w ../tests.pw.876
    Converted from: 6e 73 73 0
    Converted to: 0 6e 0 73 0 73 0 0
    Converted from: 6e 73 73 0
    Converted to: 0 6e 0 73 0 73 0 0
    Converted from: 6e 73 73 0
    Converted to: 0 6e 0 73 0 73 0 0
    pk12util: PKCS12 EXPORT SUCCESSFUL
    tools.sh: Importing Alice's email cert & key -----------------
    pk12util -i Alice.p12 -d ../tools/copydir -k ../tests.pw.876 -w ../tests.pw.876
    Converted from: 6e 73 73 0
    Converted to: 0 6e 0 73 0 73 0 0
    pk12util: PKCS12 decode not verified
========================================
the output of pk12util -i should be:
============================================================
    tools.sh: Tools Tests ===============================
    tools.sh: Exporting Alice's email cert & key------------------
    pk12util -o Alice.p12 -n "Alice" -d ../alicedir -k ../tests.pw.936 \
    -w ../tests.pw.936
    Converted from: 6e 73 73 0
    Converted to: 0 6e 0 73 0 73 0 0
    Converted from: 6e 73 73 0
    Converted to: 0 6e 0 73 0 73 0 0
    Converted from: 6e 73 73 0
    Converted to: 0 6e 0 73 0 73 0 0
    pk12util: PKCS12 EXPORT SUCCESSFUL
    tools.sh: Importing Alice's email cert & key -----------------
    pk12util -i Alice.p12 -d ../tools/copydir -k ../tests.pw.936 -w ../tests.pw.936
    Converted from: 6e 73 73 0
    Converted to: 0 6e 0 73 0 73 0 0
    tools.sh: Create objsign cert -------------------------------
Priority: -- → P3
Target Milestone: --- → 3.2.1
reassigning
Assignee: relyea → nicolson
The script I am running is tools.sh (nss/tests/tools). It does not show up in daily QA because we only run the Win95 build on Win2k.
Verified that the problem occurs with the WinNT version of pk12util, regardless of whether the certs were generated with the WinNT or the Win95 build of certutil.
Actually what's happening is the pk12util -o is producing a bogus PKCS #12 file. I'll attach the ASN.1 dump of the output. A zero byte is getting inserted after every legitimate byte--as if it were trying to convert ASCII chars to Unicode? I watched this in the debugger, and only 1 byte is being passed to PR_Write each time...SO, it appears to be happening in the NSPR layer. Sonja mentioned that clio has Chinese character support installed, and perhaps that was convincing Win2k that it needed to unicode convert things?
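The symptom described above (a zero byte after every legitimate byte) can be checked for mechanically. The following is a minimal sketch, not part of NSS or NSPR; the function names are hypothetical and chosen only for illustration.

```python
def looks_zero_interleaved(data: bytes) -> bool:
    """Return True if every second byte is zero (b0 00 b1 00 ...).

    Hypothetical helper: it only recognizes the pattern reported in this
    bug and says nothing about why the zeros were inserted.
    """
    if len(data) < 2:
        return False
    padding = data[1::2]   # odd-indexed bytes: the suspected inserted zeros
    payload = data[0::2]   # even-indexed bytes: the presumed real content
    return all(b == 0 for b in padding) and any(b != 0 for b in payload)


def strip_interleaved_zeros(data: bytes) -> bytes:
    """Drop the odd-indexed bytes to recover the presumed original content."""
    return data[0::2]
```

Running the check on a suspect Alice.p12 and on a known-good one would distinguish this corruption from an ordinary encoding problem, assuming the whole file was produced by single-byte writes.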
Attached file hex dump of Alice.p12
Since yesterday we have a similar failure show up on jordan (AIX)
jordan, qa log 3/28
    tools.sh: Tools Tests ===============================
    tools.sh: Exporting Alice's email cert & key------------------
    pk12util -o Alice.p12 -n "Alice" -d ../alicedir -k ../tests.pw.13396 \
    -w ../tests.pw.13396
    Converted from: 6e 73 73 0
    Converted to: 0 6e 0 73 0 73 0 0
    Converted from: 6e 73 73 0
    Converted to: 0 6e 0 73 0 73 0 0
    Converted from: 6e 73 73 0
    Converted to: 0 6e 0 73 0 73 0 0
    pk12util: PKCS12 EXPORT SUCCESSFUL
    tools.sh: Importing Alice's email cert & key -----------------
    pk12util -i Alice.p12 -d ../tools/copydir -k ../tests.pw.13396 -w ../tests.pw.13396
    Converted from: 6e 73 73 0
    Converted to: 0 6e 0 73 0 73 0 0
    pk12util: PKCS12 decode not verified
    tools.sh: Create objsign cert -------------------------------
I reran the tests of 3/26 today, and jordan still failed. The tests of this build that were run on 3/26 passed; today's tests failed. I am not certain that these are 2 different problems, so I am not splitting this bug yet.
Attached file result.html
It seems to be an unrelated failure. Bob compared file sizes: the failure on clio already shows an Alice.p12 file that is twice the size of the other Alice.p12 files (other OS and Win95 build), while jordan's p12 file has the same size as all the others. Adding Larry since Nelson thinks it might be an NSPR problem.
Here's why I think this may be an NSPR problem. The program in question calls PR_Open and passes in the value 0660 for the "mode" argument. The program runs fine on NT with both the win95 and winNT versions of NSPR. The program runs fine on Win2K with the win95 version of NSPR. But the combination of the winNT version of NSPR and the Win2K OS is where the failure occurs. I believe the problem is that Win2k now gives meaning to some bits in the mode argument to open that it previously ignored. It may be the case that the winNT version of NSPR passes the mode bits through, but the Win95 version doesn't pass them or alters them. In any case, I suspect the short term answer is for the program to stop assuming that 0660 works on all platforms, and have it start using a mode value on NT that is computed using NT's definitions of the mode bits.
Sigh. Looking at the NSPR code, neither the NT nor Win95 versions pay the slightest attention to the value of 'mode'. The only difference I can see between the two implementations (which, though formatted differently, do essentially the same mappings) is that NT adds FILE_FLAG_OVERLAPPED to flag6. The only difference between pk12util and other working functions is that the PR_TRUNCATE flag is set. At first I thought maybe the file was operating in a mode where it would 'unicode' certain bytes, but I could not see why it was unicoding some bytes but not others. Then when I traced the code out last night I found that the offending bytes were single byte writes. To hack around this I converted the code to make double byte writes. Now the code writes 2 bytes of value followed by 2 bytes of '0'. NOTE: I now notice that all writes seem to write a block of zeros following the raw write of bytes, where that block of zeros is the same size as the original write. bob
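The observation in the NOTE above (each write followed by a zero block of the same size) accounts for both the single-byte and the double-byte results. Here is a small sketch that models only the observed output pattern; `model_padded_writes` is a hypothetical name and the sample bytes are arbitrary, not taken from a real Alice.p12.

```python
def model_padded_writes(chunks):
    """Model the observed output: each chunk, then as many zero bytes as it has data bytes."""
    out = bytearray()
    for chunk in chunks:
        out += chunk
        out += bytes(len(chunk))  # zero block the same size as the write
    return bytes(out)


# Single-byte writes look like an ASCII-to-Unicode widening of the data.
one_byte = model_padded_writes([b"\x30", b"\x82", b"\x0b"])

# Double-byte writes give 2 bytes of value followed by 2 bytes of zero.
two_byte = model_padded_writes([b"\x30\x82", b"\x0b\xac"])
```

Under this model the "unicode" appearance is a side effect of the write size rather than of any character-set conversion.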
Well, to me the major clue is that this problem occurs with the NT version of NSPR on Win2k, but not with the win95 version on Win2k. That surely sounds like an NSPR problem to me. The mode bits may or may not be relevant.
AIX failure was due to a lock file and has been split off this bug, see bug #74009, QA on Jordan has been rerun and passed
I agree. Larry's off looking at it now. In the meantime I'm going to back out the hack I put in, as it doesn't solve the problem. bob
Am I to understand that this funny business with the extra zero bytes has been seen on Jordan (AIX) _and_ clio (Win2k)?
nelson: jordan had a different problem - lock file existed - please see bug #74009. In my opinion this bug had nothing to do with the problem we are looking at here.
No, just clio. Jordan was a different problem (a machine config problem).
Then I don't understand Sonja's comments of 2001-03-28 10:41. Those comments claim to be a Jordan QA log, and they demonstrate the extra zeros problem.
Well - that is what jordan did - that is why I thought it was the same problem, it looked the same to me - but it went away when the lockfile was removed.
Good news. I built Larry's little test program, which just does PR_Opens and PR_Writes, and reproduced the problem without any use of NSS! So far we don't know exactly all the factors, but we do know:
1) NSS is not an issue.
2) It appears the network is a factor. Running the command locally produces correct results, but running the command and writing to a network drive does not! (Sonja, could we copy one of our builds locally to clio and see if it solves the problem?)
3) We don't know if compiler differences affect the outcome.
4) Using WIN95 NSPR versus WINNT NSPR affects the problem. A quick look at the code differences shows that WINNT is setting a threading flag of some sort on open.
I'm going to reassign this bug now to NSPR and Larry. My guess is we'll eventually find it's a bug in the Windows/2000 OS itself, though.
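The test program itself is only available as an attachment, so the following is not that program. It is a rough sketch of the same kind of check (open with truncate, write one byte at a time, read back and compare), written in Python purely for illustration. Because it goes through Python's `os.write` rather than NSPR's PR_Write, it would not be expected to reproduce this bug; the function name and payload are hypothetical.

```python
import os


def single_byte_roundtrip(path, payload=b"\x30\x82\x0b\xac\x02\x01\x03"):
    """Write payload one byte per write call, then read it back.

    Returns (matches, size_on_disk). The flags mirror the create/truncate
    open described in this bug; 0o660 is the mode value mentioned there.
    """
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | getattr(os, "O_BINARY", 0)
    fd = os.open(path, flags, 0o660)
    try:
        for i in range(len(payload)):
            os.write(fd, payload[i:i + 1])  # one byte per write
    finally:
        os.close(fd)
    with open(path, "rb") as f:
        written = f.read()
    return written == payload, len(written)
```

Calling it once with a path on a local disk and once with a path on a network drive is the comparison described in point 2 above.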
Assignee: nicolson → larryh
Component: Tools → NSPR
Product: NSS → NSPR
Summary: pk12util -i fails on Win2K → PR_Writes misbehaves when writing to Network drives from NT
Target Milestone: 3.2.1 → 4.1.1
Version: 3.2 → 4.0
I'll try to copy it locally and test it.
Status: NEW → ASSIGNED
test on clio
============
created certificates with regular QA cert.sh
run tools.sh to verify failure
set PATH to NT build bin and lib
cd to network directory that contains the certs
export Alice.p12 (Alice.p12 is 2992 bytes long)
import Alice.p12 --- FAILURE
copy certificates to c:/tmp
cd to the local directory that contains the certs
export Alice.p12 (Alice.p12 is 5982 bytes long)
import Alice.p12 --- IMPORT OK
---------------------
    cd W:/nss/nss32/builds/20010330.1/blowfish_NT4.0_Win95/mozilla/security/nss/tests/tools
    tools.sh # create certs # tools test # failure on p12 import
    PATH=g:/bin;z:/nstools/bin;z:/nstools/perl5;c:/mksnt;C:/WINNT/System32;C:/WINNT;.;Z:/bin;c:/MKS/bin;c:/MKS/bin/x11;c:/winnt/system;c:/VISUAL~1/Common/Tools/WinNT;c:/VISUAL~1/Common/MSDev98/Bin;c:/VISUAL~1/Common/Tools;c:/VISUAL~1/VC98/bin;W:/nss/nss32/builds/20010330.1/blowfish_NT4.0_Win95/mozilla/dist/WINNT5.0_OPT.OBJ/bin;W:/nss/nss32/builds/20010330.1/blowfish_NT4.0_Win95/mozilla/dist/WINNT5.0_OPT.OBJ/lib
    cd W:/nss/nss32/builds/20010330.1/blowfish_NT4.0_Win95/mozilla/tests_results/security/clio.1/client
    mkdir ../cpdir
    pk12util -o Alice.p12 -n "Alice" -d ../alicedir
    # ...
    # pk12util: PKCS12 EXPORT SUCCESSFUL
    pk12util -i Alice.p12 -d ../cpdir
    # ...
    # pk12util: PKCS12 decoding failed: security library: improperly formatted DER-encoded message.
    # pk12util: PKCS12 decode not verified: security library: improperly formatted DER-encoded message.
-------------------
    cd c:/tmp
    cp -r $HOSTDIR/alicedir .
    mkdir cpdir
    cp -r $HOSTDIR/client .
    cd client
    pk12util -o Alice.p12 -n "Alice" -d ../alicedir
    # ...
    # pk12util: PKCS12 EXPORT SUCCESSFUL
    pk12util -i Alice.p12 -d ../cpdir
    # ... - no failure message
    ls -l Ali* /tmp/cli*/Al*
    # -rw-r--r-- 1 6793 everyone 2992 Mar 30 12:11 /tmp/client/Alice.p12
    # -rw-r--r-- 1 16730 everyone 5982 Mar 30 12:12 Alice.p12
The test program in the attachment, compiled on Win2K SP1 with MSVC 6.0 SP4, runs correctly and produces correct results when the file it creates is on a local filesystem. When the file is on a networked filesystem, the problem manifests itself. ... Hmmmm.
After much work in the debugger trying to catch whatever it is that's misbehaving, I give up. I suspect a bug in the underlying operating system (Win2k). Wan-Teh suggested a fix: coerce the final SetFilePointer() call to use FILE_BEGIN, with a calculated offset. This proves to work. Patch will be checked in to NSPR's tip of tree and to NSPRPUB_RELEASE_4_1_BRANCH.
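The root cause was never pinned down in this bug, so the following is only a toy model of why an absolute reposition is more robust than a relative one. It ASSUMES, without evidence from the NSPR source, that on the failing configuration the file pointer had already moved by the time the library advanced it again. `ModelFile`, `write_relative`, and `write_absolute` are hypothetical names; nothing here calls Win32 or NSPR.

```python
class ModelFile:
    """Toy file: a byte buffer plus a current position."""

    def __init__(self, os_advances_pointer):
        self.data = bytearray()
        self.pos = 0
        # Assumption under test: does the "OS" move the pointer on write?
        self.os_advances_pointer = os_advances_pointer

    def raw_write(self, chunk):
        end = self.pos + len(chunk)
        if len(self.data) < end:
            self.data.extend(bytes(end - len(self.data)))  # gaps read as zeros
        self.data[self.pos:end] = chunk
        if self.os_advances_pointer:
            self.pos = end


def write_relative(f, chunk):
    f.raw_write(chunk)
    f.pos += len(chunk)          # relative move, in the spirit of FILE_CURRENT


def write_absolute(f, chunk):
    start = f.pos                # remember where this write began
    f.raw_write(chunk)
    f.pos = start + len(chunk)   # absolute move, in the spirit of FILE_BEGIN
```

In this model the relative version leaves a zero-filled gap after each write whenever the pointer has already moved, while the absolute version gives the same result either way. The modeled file ends up at twice the payload size minus the length of the last write, which would fit the 2992 and 5982 byte sizes reported earlier in this bug if the last write was 2 bytes, though that is only an inference.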
Parts checked in.
Could we change the NSS tip build to pull the NSPRPUB_RELEASE_4_1_BRANCH?
I had not read this email:
> Is it possible to build NSS with NSPR 4.0 but test it with NSPR 4.1.1?
I do not think it would be a good idea to test only with the NSPR that is not used in the "official" release, because it might not catch some other errors. Maybe we could build and test with both NSPR releases on Windows, and not report errors that only occur in the build with NSPR 4.0 and seem connected to this bug? The change to the QA scripts is a little bit bigger this way, but I think we get better coverage.
> The reason we need to build NSS with NSPR 4.0 is that the Mozilla
> client is still using NSPR 4.0. It is a major undertaking to upgrade
> Mozilla to NSPR 4.1.1 because their version of NSPR has diverged from
> NSPR 4.1.1. So it is not going to happen soon.
The fix is in NSPR 4.1.1. Marked the bug fixed.
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
See Also: → 1586070