Closed Bug 466745 Opened 16 years ago Closed 15 years ago

random number generator fails on windows ce

Categories

(NSS :: Libraries, defect, P2)

ARM
Windows CE
defect

Tracking

(Not tracked)

RESOLVED FIXED
3.12.3

People

(Reporter: blassey, Assigned: blassey)

References

Details

(Keywords: fixed1.9.1, mobile, Whiteboard: [wm-m1-b+])

Attachments

(7 files, 15 obsolete files)

4.71 KB, text/plain
Details
1.22 KB, patch
Details | Diff | Splinter Review
1.70 KB, text/plain
Details
1.80 KB, patch
nelson
: review+
Details | Diff | Splinter Review
13.87 KB, patch
nelson
: review+
Details | Diff | Splinter Review
1.72 KB, patch
nelson
: review+
Details | Diff | Splinter Review
3.64 KB, patch
wtc
: review+
Details | Diff | Splinter Review
When navigating to any site over https in fennec on windows ce, we get a page load error and a pop up that reads:

SSL experienced a failure of its random number generator.

(Error code: ssl_error_generate_random_failure)
Which one of you Fennec folks want to take this bug?
I suggest going and looking on the old WINCE branch to see how the PRNG 
was made to work there.  I no longer have a WinCE platform on which to try.
It is failing here:

http://mxr.mozilla.org/mozilla-central/source/security/nss/lib/freebl/prng_fips1861.c#506

506     if (rng->seedCount < MIN_SEED_COUNT) {
507         PZ_Unlock(rng->lock);
508         PORT_SetError(SEC_ERROR_NEED_RANDOM);
509         return SECFailure;
510     }


MIN_SEED_COUNT = 1024, rng->seedCount = 64
The PRNG needs to be "seeded" with an adequate amount of unpredictable stuff.
NSS presently requires 1K byte of it.  Depending on the source, 64-bytes is
probably not *nearly* enough.  I remember when I did a WinCE port of NSS about 
5 years ago (WINCE branch), I rewrote the seed/entropy gathering code for 
WinCE.  You can see that at either of these URLs
http://bonsai.mozilla.org/cvsview2.cgi?diff_mode=context&whitespace_mode=show&file=win_rand.c&branch=NSS_WINCE_ALPHA_BRANCH&root=/cvsroot&subdir=mozilla/security/nss/lib/freebl&command=DIFF_FRAMESET&rev1=1.4&rev2=1.4.2.1
http://bonsai.mozilla.org/cvsview2.cgi?diff_mode=context&whitespace_mode=show&file=win_rand.c&branch=&root=/cvsroot&subdir=mozilla/security/nss/lib/freebl&command=DIFF_FRAMESET&rev1=1.4&rev2=1.5

I guess that stuff is in the source you're using now, if your build defines
_WIN32_WCE, but maybe some things have changed, such as directory names, so 
that that code doesn't find the things it found back in 2002.

See also 
http://bonsai.mozilla.org/cvsview2.cgi?diff_mode=context&whitespace_mode=show&file=win_rand.c&branch=NSS_3_11_BRANCH&root=/cvsroot&subdir=mozilla/security/nss/lib/freebl&command=DIFF_FRAMESET&rev1=1.9.28.1&rev2=1.9.28.2
http://bonsai.mozilla.org/cvsview2.cgi?diff_mode=context&whitespace_mode=show&file=win_rand.c&branch=&root=/cvsroot&subdir=mozilla/security/nss/lib/freebl&command=DIFF_FRAMESET&rev1=1.9&rev2=1.10
which works around a bug in some versions of the MS CRT.

Wan-Teh wrote a patch to use the OS's native PRNG as a source of seed for 
NSS's PRNG.  You can see it at 
http://bonsai.mozilla.org/cvsview2.cgi?diff_mode=context&whitespace_mode=show&file=win_rand.c&branch=NSS_3_11_BRANCH&root=/cvsroot&subdir=mozilla/security/nss/lib/freebl&command=DIFF_FRAMESET&rev1=1.9&rev2=1.9.28.1
http://bonsai.mozilla.org/cvsview2.cgi?diff_mode=context&whitespace_mode=show&file=win_rand.c&branch=&root=/cvsroot&subdir=mozilla/security/nss/lib/freebl&command=DIFF_FRAMESET&rev1=1.10&rev2=1.11
If you can get it to build and run on WinCE, that's probably your best bet 
for seeding the PRNG.

I suspect that most of all of those patches are in the trunk code base you're
using.  You might have to tweak some ifdefs though.
The root cause of this problem is a broken file io in nspr. I'll file a seperate bug for that.

However it seems like it would be better to use the same code path on windows ce as we do on windows desktop.  This patch does that as much as possible.

Also, we were seeding with the result of GetTickCount twice, which isn't very random.
Assignee: nobody → blassey
Attachment #350135 - Flags: review?(nelson)
Depends on: 466790
Attached patch just the nss patch (obsolete) — Splinter Review
Sorry, that previous patch had the changes to nspr and build as well
Attachment #350135 - Attachment is obsolete: true
Attachment #350136 - Flags: review?(nelson)
Attachment #350135 - Flags: review?(nelson)
Comment on attachment 350136 [details] [diff] [review]
just the nss patch

Preliminary comment:

We claim that NSS can be built with gcc on windows desktops.
Now, I'm not sure, but I think this L"string" construct used in this patch 
is something that only works with Microsoft's compilers, not with gcc.  
If I'm right, then this patch would have the effect that NSS would no 
longer build with gcc on Windows desktops.  That would be bad.  
If I'm wrong, and you can point to something semi-authoritative that says 
that gcc supports it, please let me know.

>-    hModule = LoadLibrary("advapi32.dll");
>+    hModule = LoadLibraryW(L"advapi32.dll");
Besides the forgoing comment, my only other concern with this patch is 
that I think it is going to get a LOT less entropy than the code it's
replacing did.  The code it's replacing examined files in 3 directories:
"\\Windows\\Temporary Internet Files", "\\Temp", "\\Windows"
(admittedly it used the wrong method to get those path names).
The patch would reduce that to only the content of one directory, 
which I gather is the localized directory name equivalent to \\Windows.  
I seriously doubt there's enough entropy there to be worthwhile.  

The crucial factor in this data gathering it to strive for unpredictability.
Reading in the contents of unchanging files is of no value whatsoever.  
It would be a serious vulnerability on WinCE if the PRNG seeing was not
sufficiently random.  We're looking for ~1k bit of entropy, and are reading
at least 1K byte to try to get it.  

If you have more ideas for gleaning entropy from the system, please share 
them here.
may I suggest we get the entropy from some combination of CSIDL_BITBUCKET, CSIDL_FAVORITES, CSIDL_MYMUSIC, CSIDL_MYPICTURES, CSIDL_APPDATA, CSIDL_RECENT, CSIDL_DESKTOPDIRECTORY, CSIDL_MYVIDEO, and CSIDL_STARTUP.

CSIDL_BITBUCKET is the recycling bin, I think the rest are self explanatory.
Also, I think its important to do the same thing on windows ce as we do on desktop.  So I suggest we check those directories on desktop as well.  The trade off is the time we spend reading these in, so perhaps we stop at some threshold.

The last directory we read should be the system directory, because as you said it will be the most consistent of these directories.
How about contents of the browser's cache directory?  That always struck me
as the best source of entropy in files.
that does seem like a good place to find entropy, but I can't find a language independent way to get that directory.  If we hard code the path, my concern is that on a non-english device we'll be looking in the wrong directory and finding no files.

I think the recycling bin, my music and my pictures would all have a lot of entropy, assuming the user is using the device (I think that's an assumption for all of these though).  Recent documents should be good too.
Attached patch patch v2 (obsolete) — Splinter Review
conditional on __GNUC__, we either use L"" or just "" for the wide char string literal.

Also, this reads in the files in the 10 folders I enumerated before for both windows desktop and windows ce.  Is there anyway to actually measure/test the amount of entropy we're finding here?
Attachment #350136 - Attachment is obsolete: true
Attachment #350287 - Flags: review?(nelson)
Attachment #350136 - Flags: review?(nelson)
> Is there anyway to actually measure/test the amount of entropy we're finding 
> here?

A technique I used in the past, the last time I did significant work on the 
PRNG seeding, was to have the algorithm record all the bits that it would 
have fed into the PRNG (except for the contents of system files) into a file 
instead.  Execute it several times, producing several files of output.  
The compute the Hamming distance between those files, pairwise.  (Xor the 
contents of two of them and count the number of 1 bits in the result.) 
Average the results. This gives you an optimistic upper bound on the amount 
of entropy produced, on average, by the algorithm. 

I'll write more about considering the value of the file contents later.
Attached file hamming distances (obsolete) —
I did what you suggested on both windows mobile (an emulator) and windows xp, generating 10 files back to back, comparing them all.  The results are attached.

There's obviously something wrong with the implementation for desktop, I'll debug that.  Do the seeds for windows ce look sufficient?
Brad, thanks for doing that.  It was done long ago, but has not been 
repeated for a long time.  I am distressed by the results. 
They are unacceptably small (IMO) for Win XP, and seem absurdly high 
for WinCE.  (did you include the file contents in one but not the other?)
what I did was open a file in RNG_RNGInit, append *data to it in prng_RandomUpdate and close it in RNG_RNGShutdown.  I then wrote this test program:

int _tmain(int argc, _TCHAR* argv[])
{

        RNG_RNGInit();
        RNG_SystemInfoForRNG();

        RNG_RNGShutdown();
        return 0;
}

I suspect that on win XP we're not getting the "right" folders to look in.  If so, that should be easy to fix.
> I'll write more about considering the value of the file contents later.

One of the techniques used on some platforms to obtain bits to feed into 
the PRNG is to chooses a subset of files in one or more directories and 
feed the contents of the selected files into the PRNG.   

In a system where the contents of the directory being sampled (both the set
of files and the contents of the respective files) are very dynamic, and 
change frequently, this technique can contribute real entropy.  But in a 
system where the contents of the directories being samples are essentially
static, such as in a directory of system executables and shared libraries,
it is pretty worthless.  In such systems, all the entropy actually comes
from the algorithm by which the files are selected, not from the file 
contents.  Each file effectively offers no more than one bit of entropy,
and the entropy really comes from the question of whether a certain file
is chosen or not, not from its contents.  The number of bits of entropy
in that scheme in no more than n, and *may* be no more than log2(n) where 
n is the number of files selected.  

So, I suspect that the number of bits of entropy gathered from directories
such as Favories, MY vidio/audio/pictures, and especially startup, is 
simply too low to be worth much.  That is why I recommended that file 
contents be omitted from the hamming distance computations above.  
When a system is resorting to file contents for entropy, it's really 
desparate for entropy.   

On all windows systems (XP, CE, and what all) I think our best bet is to 
get a buffer of randomized data from the system's own PRNG, if possible.
So do you suggest we replace the ReadSystemFiles call in RNG_SystemInfoForRNG for windows with a call to CryptGenRandom?  Or just restrict the system files call to temp folders and the bit bucket and supplement it with CryptGenRandom?

I'll remove the system files and re-run the test.
Look at function RNG_SystemRNG.  It checks to see if the local system
possesses the shared library necessary for CryptGenRandom.  That DLL is
not present on all Windows systems.  So, NSS attempts to use it if present
but not if not.  

The use of files should probably be viewed as a fall-back, only as a 
desperate act when CryptGenRandom is not present.
Attached file more data
this file has hamming distances for seeds produced by:
windows xp w/ temp files
windows xp w/o temp files
windows ce emulator w/ temp files
windows ce emulator w/o temp files
windows ce device w/ temp files

the temp files are:
temp dir (returned by GetTempPath())
recycle bin (CSIDL_BITBUCKET)
recent docs (CSIDL_RECENT)
temp internet files (CSIDL_INTERNET_CACHE)
net neighbors (CSIDL_COMPUTERSNEARME)
internet history (CSIDL_HISTORY)

a couple notes:

I fixed a bug in the file reading code for windows desktop, and you can see a much larger distance between the seeds from the first run I posted.

windows xp w/o tmp files, and both emulator runs show about the same distances.  I assume the emulator isn't finding any randomized files.

On the windows ce device runs 1-7 happened in rapid succession.  There was more of a pause between 0 -> 1 and 7-> 8-> 9.  That pause seemed to effect the results.

The recycling bin's path is being returned as "" on both platforms, which I ignore.

Also, windows ce finds files in only 2 of the 5 directories that windows desktop does because the last 3 CSIDL values are defined on windows ce (and we skip the invalid recycling bin).
Attachment #352286 - Attachment is obsolete: true
Comment on attachment 352514 [details]
more data

Nelson, do these numbers look good enough to you?
Attachment #352514 - Flags: review?(nelson)
Brad, none of the numbers below 1K look good enough to me. :(
I'd prefer to see numbers in the 2-4k range, at least
Let me restate comment 23 in a way that doesn't sound like a mere whim.
There is a view that a PRNG should be seeded with an amount of truly random
entropy that exceeds the size of the largest key that will be generated 
from the output of that PRNG.  By that measure, most of the sets of numbers
gathered here don't contain enough seed entropy to generate a mere 1k bit
RSA key, much less a 2k bit key or 3k or 4k bit key.
Nelson, without the system files windows desktop doesn't meet that standard.  Perhaps this needs to be looked at more closely by someone familiar with the PRNG.
Seems like a Windows NSS bug, not just a CE bug as the OS field says. File a blocking bug and please cc: all here on it ("Clone This Bug" link at the bottom of this bug may do the right thing) -- I hope Nelson can recommend someone to take it.

/be
"Someone familiar with the PRNG".  That would be me.  

BTW, we're rewriting the PRNG (see bug 457045) and with it, the minimum 
number of acceptable entropy bits will be 384 bits.  Your set of numbers 
labeled "windows ce device, with temporary files" meets that criteria.  
However, some changes to the code for WinCE will undoubtedly be needed to
work with the new PRNG.
Depends on: drbg
Flags: wanted1.9.1?
Blocks: 475090
Nelson,  right now we crash on windows mobile as soon as we navigate off the first run page due to this, which is blocking our release.  Do you want to take the changes to how we read system files?  Another option is the CeGetRandomSeed function which returns a 64 bit number based on kernel activity.

http://msdn.microsoft.com/en-us/library/aa450794.aspx
Attachment #352514 - Flags: review?(nelson)
Comment on attachment 352514 [details]
more data

The results shown in the table named 

>windows ce device, with temporary files:

are acceptable for Win CE.
Comment on attachment 350287 [details] [diff] [review]
patch v2

Produce a version of this patch without the NSS_L hack
and I think it is likely to pass review.
Attachment #350287 - Flags: review?(nelson) → review-
Attached patch patch without NSS_L (obsolete) — Splinter Review
Attachment #350287 - Attachment is obsolete: true
Attachment #358762 - Flags: review?(nelson)
Brad, please attach the source to your test program, the one with which
you produced the results in attachment 352514 [details].  I want to try to find tune
the set of temporary files searched on Windows desktop so that it produces
adequate but not excessive results.
This patch is Brad's patch converted to a CVS diff.
I also changed the patch to use NSS's indentation style, (indent 4, tab=8),
and I removed all the old Win16 code, because that old code would not 
compile with Brad's patch applied, and we don't support Win16 any more.

Brad, 
please confirm that this patch faithfully incorporates your latest patch.

I would give this patch r+, but since I have contributed to it, I will ask
another NSS owner/peer to review it.
Attachment #358762 - Attachment is obsolete: true
Attachment #358762 - Flags: review?(nelson)
But before I ask for review on my version of this patch, I want to see if
the set of directories being used can be reduced and still produce 
acceptable results.
you can use this to compare two files with "hamming_dist file1 file2" or for a set of files with "hamming_dist -p <pattern>."  The format of pattern is the format used by FindFirstFile, documented here:

http://msdn.microsoft.com/en-us/library/aa364418%28VS.85%29.aspx
Some people in browser land have a benchmark that measures start-up time,
and they're very sensitive to changes that increase that time.  I think
we should ask them to test one or both of these last two patches and see
if they are going to object strenuously to either one.  But I don't know
who to ask.  Maybe one of the folks CC'ed on this bug knows who to ask?
Here are the results from a try server run:

Windows:
      twinopen: 257.42
      ts: 1156.53
      tp: 431.02
      tp_memset: 62.5MB
Linux:
      twinopen: 195.74
      ts: 1183.68
      tp: 420.05
      tp_RSS: 79.4MB
OS X:
      twinopen: 499.63
      ts: 1177.95
      tp: 298.55
      tp_RSS: 300.4MB

Talos didn't pick up my baseline build, but the Windows Ts numbers on the try server for the last 3 days range from 1153.37 - 1206.21, so it looks like this doesn't adversely effect Ts on Windows.
Flags: blocking1.9.1?
Flags: blocking1.9.1? → blocking1.9.1+
Priority: -- → P2
Brad,

Nelson's comment 20 is right on.  Please make sure that
RNG_SystemRNG can call CryptGenRandom successfully on
Windows Mobile.  You'll need to port the code to Windows
Mobile because the DLL's name is coredll.dll and I suspect
the function name CryptAcquireContextA won't work (note
the "A" suffix).  The MSDN page for CryptGenRandom also
mentions CeGenRandom, which you should look into.  I
guess we should try to use CeGenRandom first, and then
CryptGenRandom.  If CeGenRandom always exists, then we
don't need the CryptGenRandom fallback:
http://msdn.microsoft.com/en-us/library/aa923614.aspx
http://msdn.microsoft.com/en-us/library/aa912301.aspx

Your comment 28 mentioned CeGetRandomSeed.  I think it
is not as good as CeGenRandom.
Flags: wanted1.9.1? → wanted1.9.1+
Whiteboard: [wm-m1-b+]
Bob, this is the other bug that is trying to modify the PRNG seeding 
concurrently with your bug 457045
I agree with wtc's comments.  The new code is more dependent on RNG_SystemRNG than the old code was.

I suspect it will be easier to merge my changes into this patch than the other way around. I'm glad to see this patch is so close. It may be best to land the patch as is and then add the CeGenRandom.

We were hoping for FIPS feature complete, which requires the PRNG to land. I think the CeGenRandom can go in before FIPS code complete (hopefully next month).

bob
Comment on attachment 358769 [details] [diff] [review]
Brad's latest patch, converted to CVS diff, NSS coding style, removed Win16 code

r+

I'd like to see the CeGetRandom put in, but I believe this should work right now.

bob
Attachment #358769 - Flags: review+
Attached patch WIP patch - system rng (obsolete) — Splinter Review
This patch successfully gets pointers to the three dynamically loaded functions.  However, when calling pCryptAcquireContext, I get the following error and crash:

Virtual Memory overlap! CopyRegions failed to reserve memory in proc 'xulrunner.exe' (0x804580A0) for module 'rsaenh.dll' @ 0x1F8B0000!

My reading of the msdn docs is that you should attempt to load Crypt* methods and use CeGenRandom as a fallback.  Given this error though, perhaps we should just jump to using CeGenRandom directly.
Comment on attachment 359773 [details] [diff] [review]
WIP patch - system rng

For RNG_SystemRNG, WinMobile is different enough from Windows
Desktop that I think it's better to have separate implementations
of the entire function, rather than using ifdefs inside the
function.

Since Fennec is already linked with coredll.dll, you can use
those Crypt* and CeGenRandom functions directly, rather than
doing LoadLibraryW and GetProcAddress.

Replace the pRtlGenRandom code by CeGenRandom for WinMobile.
If you know CeGenRandom always exists, skip the Crypt* code
for WinMobile.
Attachment #359773 - Attachment is obsolete: true
Attachment #359791 - Flags: review?(nelson)
Attached patch combined patch (obsolete) — Splinter Review
This patch combines the patch that Bob Relyea previously gave r+ with 
Brad's latest patch to use CeGenRandom directly.  

Brad, if this is the patch you want, I'm willing to give it r+ now.
this looks right to me.  

At some point I'd like to sort out why the Crypt* APIs cause a crash, as the msdn docs lead me to believe they will give a "better" random number. But that is more of an enhancement and can be done in a follow up bug.
Comment on attachment 359791 [details] [diff] [review]
RNG_SystemRNG uses CeGenRandom directly

Brad: it's better to move the #ifdef WINCE to the outside
of the RNG_SystemRNG function.  The only code the two
implementations share is the single line:
    size_t bytes = 0;

This discussion thread has some info on the relation between
CeGenRandom and CryptGenRandom:
http://www.pocketpcjunkies.com/Uwe/Forum.aspx/wince-pb/4913/What-seeds-CeGenRandom

Also, page 13 of WinCE's FIPS Security Policy:
http://csrc.nist.gov/groups/STM/cmvp/documents/140-1/140sp/140sp825.pdf
shows that CeGenRandom is exactly the function to be use in
RNG_SystemRNG -- for generating the seeding material for
a RNG.
Version: unspecified → trunk
Comment on attachment 359806 [details] [diff] [review]
combined patch

r=nelson
I plan to commit this patch.  Further refinements can be done in subsequent patches, perhaps in subsequent bugs.
Attachment #359806 - Flags: review+
Comment on attachment 359791 [details] [diff] [review]
RNG_SystemRNG uses CeGenRandom directly

This patch is ok with me, as part of the larger patch.
Attachment #359791 - Flags: review?(nelson) → review+
(In reply to comment #49)
> (From update of attachment 359791 [details] [diff] [review])
> Brad: it's better to move the #ifdef WINCE to the outside
> of the RNG_SystemRNG function.  The only code the two
> implementations share is the single line:
>     size_t bytes = 0;

Done. Moved it into the same #ifdef/#else block as RNG_FileForRNG
 
> This discussion thread has some info on the relation between
> CeGenRandom and CryptGenRandom:
> http://www.pocketpcjunkies.com/Uwe/Forum.aspx/wince-pb/4913/What-seeds-CeGenRandom
> 
> Also, page 13 of WinCE's FIPS Security Policy:
> http://csrc.nist.gov/groups/STM/cmvp/documents/140-1/140sp/140sp825.pdf
> shows that CeGenRandom is exactly the function to be use in
> RNG_SystemRNG -- for generating the seeding material for
> a RNG.

Thanks, looks like there is no need for a follow up bug then.
Attachment #359791 - Attachment is obsolete: true
Attachment #359814 - Flags: review?(nelson)
I'm going to test this, and if it passes my test, I plan to commit it today.
Attachment #358769 - Attachment is obsolete: true
Attachment #358785 - Attachment is obsolete: true
Attachment #359806 - Attachment is obsolete: true
Attachment #360136 - Flags: review?(nelson)
Comment on attachment 359814 [details] [diff] [review]
updated based on wtc's comments

Brad,
Question: in what header file, if any, is CeGenRandom declared?
Is that file included in this version of win_rand.c?
The MSDN page for CeGenRandom
(http://msdn.microsoft.com/en-us/library/aa912301.aspx)
says it's declared in winbase.h.  So we should already
have included that header (via windows.h).
Attachment #360136 - Flags: review?(nelson) → review+
Comment on attachment 360136 [details] [diff] [review]
combined patch including Brad's latest change (checked in)

Committed on trunk.
Checking in config.mk;  new revision: 1.22; previous revision: 1.21
Checking in win_rand.c; new revision: 1.16; previous revision: 1.15
Nelson, can you push to mozilla-central as well?
Remove an unnecessary memset call.  Add a comment
that cites official Microsoft documnentation of
the appropriateness of using CeGenRandom.

Nelson, please review this patch.

Brad:
- Please verify (in the debugger or using printf statements)
  that the CeGenRandom call succeeds.  This is the most
  important call in our entropy collection code.
- Please ask Kai Engert to push a new NSS snapshot to
  mozilla-central.  Nelson doesn't do that.
Attachment #360180 - Flags: review?(nelson)
CeGenRandom is returning 1024 bytes of data.
Kaie, can you push these changes to mozilla-central?
Brad,
The NSS team's policy is that we only support NSS builds made from sources
that match tags that the NSS team has put into its own CVS tree.  
We regard all other trees (than our CVS repository) to be downstream trees.
That includes Mozilla-Central. We do not maintain downstream trees.

Because Mozilla desires that the NSS team will support the versions of NSS
that it uses, Mozilla generally does not "push" individual patches for NSS
to its Hg repositories, but rather pushes snapshots that correspond to tags
in NSS's CVS repository.  

In keeping with those facts, your request is equivalent to asking for a 
tag to be generated in the NSS CVS tree, for the purpose of being the base
for such a snapshot.  We just did a snapshot in the last week or two, IIRC.
More changes are coming soon.  I guess it's up to Kai to decide how 
frequently he wants to make and push new tags/snapshots.
Attachment #360180 - Flags: review?(nelson) → review+
Comment on attachment 360180 [details] [diff] [review]
Cite the WinCE FIPS Security Policy for CeGenRandom

r=nelson for Wan-Teh's change.
Attachment #359814 - Flags: review?(nelson) → review+
Attachment #360136 - Attachment description: combined patch including Brad's latest change → combined patch including Brad's latest change (checked in)
Comment on attachment 360180 [details] [diff] [review]
Cite the WinCE FIPS Security Policy for CeGenRandom

I checked in this patch on the NSS trunk (3.12.3).

Checking in win_rand.c;
/cvsroot/mozilla/security/nss/lib/freebl/win_rand.c,v  <--  win_rand.c
new revision: 1.17; previous revision: 1.16
done
Note that the old code for WinCE, which was just replaced by Brad's patch,
did a RECURSIVE walk through all subdirectories of the directories named
at the top level.  The new patch does not, but only walks through the 
direct contents of the named directories.  This has significant impact on
the entropy gained from some directories, such as the directory of 
"Temporary Internet Files" (CSIDL_INTERNET_CACHE) and history (CSIDL_HISTORY)
because those directories generally contain only one member, a subdirectory,
in which all the interesting files are found.  For example, the directory
found via CSIDL_INTERNET_CACHE contains just one subdirectory named 
Content.IE5 (or similar, depending on version of IE).  

So, this recent patch actually reduces the amount of entropy gathered from 
those directories compared to the presious version of the code for WinCE.

I will leave it up to Brad to determine if the present situation is 
satisfactory for WinCE.  Brad, if you're happy with the code as it now is
in the CVS trunk, please mark this bug resolved/fixed.
I will pursue this for Windows desktop in bug 475090.
Attached patch recurses into directories (obsolete) — Splinter Review
(In reply to comment #64)
> Note that the old code for WinCE, which was just replaced by Brad's patch,
> did a RECURSIVE walk through all subdirectories of the directories named
> at the top level.  The new patch does not, but only walks through the 
> direct contents of the named directories.  This has significant impact on

No reason not to recurse into the subdirectories, this patch will add that back.

> the entropy gained from some directories, such as the directory of 
> "Temporary Internet Files" (CSIDL_INTERNET_CACHE) and history (CSIDL_HISTORY)
> because those directories generally contain only one member, a subdirectory,
> in which all the interesting files are found.  For example, the directory
> found via CSIDL_INTERNET_CACHE contains just one subdirectory named 
> Content.IE5 (or similar, depending on version of IE).  

Note that the CSIDL_INTERNET_CACHE and CSIDL_HISTORY aren't defined on windows ce.  Fenenc uses the temp directory for its cache, which as it turns out is an excellent source of entropy once the browser has been run a bit.

> 
> So, this recent patch actually reduces the amount of entropy gathered from 
> those directories compared to the presious version of the code for WinCE.
> 
> I will leave it up to Brad to determine if the present situation is 
> satisfactory for WinCE.  Brad, if you're happy with the code as it now is
> in the CVS trunk, please mark this bug resolved/fixed.
> I will pursue this for Windows desktop in bug 475090.

My impression is that with the inclusion of the recursion patch we will be better off than we were before in terms of entropy.  With that reviewed and committed, I'll close the bug.
Attachment #360254 - Flags: review?(nelson)
Attached patch recurses into directories v2 (obsolete) — Splinter Review
I edited Brad's patch (attachment 360254 [details] [diff] [review]) and checked it
in on the NSS trunk (NSS 3.12.3).

Checking in win_rand.c;
/cvsroot/mozilla/security/nss/lib/freebl/win_rand.c,v  <--  win_rand.c
new revision: 1.18; previous revision: 1.17
done
Attachment #360254 - Attachment is obsolete: true
Attachment #360254 - Flags: review?(nelson)
Marked the bug fixed.
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Target Milestone: --- → 3.12.3
uh, not yet.   The old code that recursed into subdirectories had some 
features to limit the amount of data it took in from that recursion,
without which it can conceivably take a LONG time to complete.  
Brad's new patch (now committed) doesn't employ any such limit.  
We need such limits, IMO.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Wan-Teh, Tinderbox on goride machine fails on building with this patch, can you please look there ?
Comment on attachment 360655 [details] [diff] [review]
recurses into directories   v2

Sorry that I broke the build.  This patch caused shlibsign
to crash while signing softokn3.dll.  I backed out this patch
(but kept the "shlobj.h" to <shlobj.h> change).

Checking in win_rand.c;
/cvsroot/mozilla/security/nss/lib/freebl/win_rand.c,v  <--  win_rand.c
new revision: 1.19; previous revision: 1.18
done

shlibsign crashed after running for a long time, so I
guess it was an infinite recursion.
Attachment #360655 - Attachment description: recurses into directories (as checked in) → recurses into directories v2
Attachment #360655 - Attachment is obsolete: true
(In reply to comment #60)
> Kaie, can you push these changes to mozilla-central?

If I understand correctly, this bug is not yet fixed, so it's not yet helpful to push a new tag.
Attachment #360783 - Flags: review?(nelson) → review-
Comment on attachment 360783 [details] [diff] [review]
honors return value of func to stop recursion

The real problem with this patch (and with its predecessor) is that it will recurse with values of '.' and '..'.  That's why it went into an apparently infinite loop.  The code must take care not to recurse with those two values.
Also, this patch changes the spacing of this method to match file style
Attachment #360783 - Attachment is obsolete: true
Attachment #360798 - Flags: review?(nelson)
Comment on attachment 360798 [details] [diff] [review]
returns early for path's ending in "\.", "/." or ".."


>+    do {
>+      // pass the full pathname to the callback
>+      _snwprintf(szFileName,_MAX_PATH, L"%s\\%s", szSysDir, fdData.cFileName);
>+      WideCharToMultiByte(CP_ACP, 0, szFileName, -1, 
>+                          narrowFileName,_MAX_PATH,0,0);
>+      if (fdData.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY)
>+        EnumSystemFilesInFolder(func, szFileName);
>+      else
>+        iStatus = (*func)(narrowFileName)?0:FindNextFileW(lFindHandle, &fdData);
>+    } while (iStatus != 0);

In the code path that calls EnumSystemFilesInFolder, iStatus is left 
uninitialized when it reaches the final while statement.  Also, in 
that code path, nothing calls FindNextFileW, so if iStatus happens to 
be non-zero, the value of fdData.cFileName will never change.
Attachment #360798 - Flags: review?(nelson) → review-
Comment on attachment 360136 [details] [diff] [review]
combined patch including Brad's latest change (checked in)

This patch removed a bunch of code that bore comments saying it was Win16 code. But the ifdef's for that code were typically based on _WIN32 not being defined.  I'm worried that OS/2 may have also dependend on that code, as well as Win16. :(
Comment on attachment 360136 [details] [diff] [review]
combined patch including Brad's latest change (checked in)

OS/2 uses os2_rand.c instead.  win_rand.c is only used for
WIN32 and WIN16.
Attached patch patch to test (obsolete) — Splinter Review
I am about to test this patch.  Will request review if/when it passes.

Wan-Teh, That's very good news about OS/2!
Attachment #360798 - Attachment is obsolete: true
Comment on attachment 360834 [details] [diff] [review]
patch to test

Wan-Teh, please review this small patch.
Comment on attachment 360834 [details] [diff] [review]
patch to test

r=wtc.

>+typedef PRInt32 (* handler)(const char *);

The type name "handler" should be capitalized, otherwise it
looks like a variable name.

>+    int                 iSuccess;

This variable could use a better name.  It actually means
whether we should continue in the do-while loop.

>+    if (szSysDir[pathLen - 1] == L'.' &&
>+       (szSysDir[pathLen - 2] == L'\\' || 
>+        szSysDir[pathLen - 2] == L'/' || 
>+        szSysDir[pathLen - 2] == L'.' ))
>+        return;

Please add a comment to say we want to skip "." and "..".
You shouldn't need to check for L'/' because we construct
the pathname using L"%s\\%s".

>     do {
> 	// pass the full pathname to the callback
> 	_snwprintf(szFileName,_MAX_PATH, L"%s\\%s", szSysDir, fdData.cFileName);
> 	WideCharToMultiByte(CP_ACP, 0, szFileName, -1,
> 			    narrowFileName, _MAX_PATH, NULL, NULL);

The WideCharToMultiByte call should be moved to the else
block of the if statement you added, because narrowFileName
is only used in the else block.
Attachment #360834 - Flags: review?(wtc) → review+
Because Nelson pointed me to this bug, I tested NSS trunk on OS/2. It still works as well as before. (At least when built within mozilla-central which is what we mainly care about.)
Comment on attachment 360834 [details] [diff] [review]
patch to test

Two more suggested changes:

1. Use the new "handler" type in:

160 static BOOL
161 EnumSystemFiles(PRInt32 (*func)(const char *))

2. Skip "." and ".." inside the do-while loop here:
      
>@@ -149,13 +162,18 @@ EnumSystemFilesInFolder(PRInt32 (*func)(
>     do {
> 	// pass the full pathname to the callback
> 	_snwprintf(szFileName,_MAX_PATH, L"%s\\%s", szSysDir, fdData.cFileName);

At the beginning of this do while loop, you can do something
like:

      do {
          // skip . and ..
          if (wcscmp(fdData.cFileName, L".") == 0 ||
              wcscmp(fdData.cFileName, L"..") == 0) {
              iSuccess = 1;
              continue;
          }

This avoids the issue with L'/'.

By the way, we should check the return value of the
WideCharToMultiByte call for failure.
nelson, are you going to commit this patch?

kaie, when this change has landed in cvs I'd like to get it tagged and pushed to hg.
The changes Wan-Teh suggested produced an infinite loop, but I liked the
idea so I took it and made it work in this patch.  Wan-Teh, please review.
Attachment #360834 - Attachment is obsolete: true
Attachment #361352 - Flags: review?(wtc)
Attachment #361352 - Flags: review?(wtc) → review+
Comment on attachment 361352 [details] [diff] [review]
patch, incorporate WTC's latest suggestions, and other cleanup

r=wtc.

>+    PRUint32            pathLen = wcslen(szSysDir);

It seems that 'pathLen' is not used and can be removed.
Nelson, after you check this in, I can create the
NSS_3_12_3_BETA3 CVS tag and push it to mozilla-central.
Thanks, Wan-Teh.  I removed pathLen and committed it.
Checking in win_rand.c; new revision: 1.20; previous revision: 1.19
I consider this bug resolved now on the CVS trunk.
Reopen if you disagree.
Status: REOPENED → RESOLVED
Closed: 15 years ago15 years ago
Resolution: --- → FIXED
No longer blocks: 475090
I pushed NSS_3_12_3_BETA3 (tagged this morning) to
mozilla-central in changeset 8334582740d7:
http://hg.mozilla.org/mozilla-central/rev/8334582740d7
This patch needs to be pushed to mozilla-191 as well.  Its still not entirely clear to me if this needs to be done by an NSS peer or not.  If this is the case, could one of the peers do that?  If not, please comment on the bug and I'll push the change.
We can't push an NSS Beta to mozilla-191 because Firefox
3.1 is close to being final.  We need to make sure Firefox
3.1 final ships with an NSS final release.  If we are
committed to releasing NSS 3.12.3 RTM soon for Firefox 3.1,
then we can push a Beta to mozilla-191 first.
The agreement is that the Hg repositories will contain snapshots of NSS
that match CVS tags created by the NSS team for this purpose (e.g. RTM tags
or Beta tags, etc.).  It's not entirely clear whose job it is to actually
push those fixes into the Hg repositories.  The only peers of the NSS 
team who use Hg at all appear to be Wan-Teh and Kai.  Personally, I'd 
prefer that the NSS team not be responsible for updating downstream 
repositories.  In any case, a new tag needs to be produced, and right 
now, we're working on fixing some vulnerabilities.  When we've completed
that work (soon) would be a good time to make another such tag.
Just for planning purposes, what is the time frame to have something that's worthy of being tagged and pushed to 1.9.1?
I would like to remind the NSS team, until bug 474606 is resolved, the Firefox 3.1 drivers probably won't accept a newer NSS snapshot for Mozilla 1.9.1.
In reply to Kai's comment 95,  there will soon be some vulnerability fixes
and I'm pretty sure drivers will want them regardless of bug 474606.
As this is an NSS bug, I've filed bug 481968 to track landing a more recent NSS snapshot into mozilla-1.9.1 (equiv. Firefox 3.1)


(In reply to comment #96)
> In reply to Kai's comment 95,  there will soon be some vulnerability fixes
> and I'm pretty sure drivers will want them regardless of bug 474606.

Let's make sure that the Firefox 3.1 drivers are aware of this conflict (if Firefox wants security fixes, they must accept the regression bug 474606, should a workaround not yet be ready).
I've added a dependency on bug 481968 here. As I understand it from a quick reading, for this to land on mozilla-1.9.1 it will need the NSS Beta. If there's another way (creating an NSS tag with just the fix required by this bug) please let us know, and remove the dependency.
(In reply to comment #68)
> uh, not yet.   The old code that recursed into subdirectories had some 
> features to limit the amount of data it took in from that recursion,
> without which it can conceivably take a LONG time to complete.  
> Brad's new patch (now committed) doesn't employ any such limit.  
> We need such limits, IMO.
Are there any such limits yet? From time to time I was running in to a slow script warning in the login manager code which turned out to be caused by trying to get the pk11tokendb service, which needed to initialise NSS, which was enumerating my IE cache. Clearing my IE cache resolved my problem.
Nail, yes, the code now limits the depth into the directories into which
it will descend looking for files.  Look for "maxDepth" in the code.

Perhaps we also need a limit on the number of files and/or immediate 
subdirectories of a single directory through which we will search.
Is there some special significance to depth here?  Why not just limit on the total number of files processed and then abort the DFS/BFS search when we hit that limit?
(In reply to comment #100)
> Nail, yes, the code now limits the depth into the directories into which
> it will descend looking for files.  Look for "maxDepth" in the code.
> 
> Perhaps we also need a limit on the number of files and/or immediate 
> subdirectories of a single directory through which we will search.

What are the cryptographic properties of these files?
(In reply to comment #100)
> Nail, yes, the code now limits the depth into the directories into which
> it will descend looking for files.  Look for "maxDepth" in the code.
All Temporary Internet Files are stored at the same depth:
%USERPROFILE%\Local Settings\Temporary Internet Files\Content.IE5\<salt>\<file>

For some reason I now see the scan happening four times (as recorded by printf()s I've added to EnumSystemFiles).
Would be nice to see the stacks.  I wonder if NSS is being repeatedly
initialized and uninitialized.
(In reply to comment #104)
> Would be nice to see the stacks.
Seems to be caused by this block of code from pkcs11.c:

    rv = RNG_RNGInit();         /* initialize random number generator */
    if (rv != SECSuccess) {
	crv = CKR_DEVICE_ERROR;
	return crv;
    }
    rv = BL_Init();             /* initialize freebl engine */
    if (rv != SECSuccess) {
	crv = CKR_DEVICE_ERROR;
	return crv;
    }
    RNG_SystemInfoForRNG();

RNG_RNGInit() already calls RNG_SystemInfoForRNG() (via PR_CallOnce no less)!

>freebl3!RNG_SystemInfoForRNG(void)+0x223 [security\nss\lib\freebl\win_rand.c @ 358]
>freebl3!rng_init(void)+0x14c [security\nss\lib\freebl\drbg.c @ 420]
>nspr4!PR_CallOnce(struct PRCallOnceType * once = 0x0b938cf8, <function> * func = 0x0b8e73a0)+0x35 [nsprpub\pr\src\misc\prinit.c @ 808]
>freebl3!RNG_RNGInit(void)+0x13 [security\nss\lib\freebl\drbg.c @ 462]
>softokn3!RNG_RNGInit(void)+0x25 [security\nss\lib\freebl\loader.c @ 835]
>softokn3!nsc_CommonInitialize(void * pReserved = 0x0012d7d4, int isFIPS = 0)+0x57 [security\nss\lib\softoken\pkcs11.c @ 2582]

>freebl3!RNG_SystemInfoForRNG(void)+0x223 [security\nss\lib\freebl\win_rand.c @ 358]
>softokn3!RNG_SystemInfoForRNG(void)+0x22 [security\nss\lib\freebl\loader.c @ 1570]
>softokn3!nsc_CommonInitialize(void * pReserved = 0x0012d7d4, int isFIPS = 0)+0x91 [security\nss\lib\softoken\pkcs11.c @ 2602]
Neil, please file a new bug about this issue.  
This bug is too big, and really is fixed, I believe.

(Oh, and sorry for mistyping your name in comment 100.  
You can call me Nailson, if you wish. :)
Blocks: 489811
Keywords: fixed1.9.1
Depends on: 501605
(In reply to comment #105)
> RNG_RNGInit() already calls RNG_SystemInfoForRNG() (via PR_CallOnce no less)!

BTW, I believe neil filed that as new bug 489811.
You need to log in before you can comment on or make changes to this bug.