Closed Bug 57985 Opened 24 years ago Closed 24 years ago

Mozilla PSM-GLUE does not seed PSM

Categories

(Core Graveyard :: Security: UI, defect, P3)

1.0 Branch
defect

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: dougt, Assigned: javi)

References

Details

(Whiteboard: [rtm++] fix in trunk)

Attachments

(3 files)

psm-glue does not seed PSM!


From Nelson:

The plan was as follows:

 1. The client lib provides a function that the client is supposed to call
 to supply entropy to PSM.  When the client calls it, the client lib appends
 any data supplied by the client to a buffer.

 2. Whenever the client makes a call that necessitates a request-response
 between the client and the PSM server, the request is preceeded by a message
 that transfers the buffered entropy to the PSM server.

 I believe Mark implemented the client library functions described above.
 However, I have NO IDEA whether anyone ever did any work to actually call
 the function described in (1) above to actually provide any entropy from the
 application to the client library.
Severity: normal → blocker
Keywords: dogfood, rtm
OS: Linux → All
Hardware: PC → All
PDT needs to understand something:
Does PSM get *any* randomness, or is it currently deterministic??

If it is deterministic, this is a pull-off-wire bug.
a
Whiteboard: [rtm need info]
In 4.x world, we would get entropy by using the location of the mouse when idle.

Is getting the position of the mouse hard to do in SeaMonkey?
Mouse events can be gotten through the mouselistener interface:

http://lxr.mozilla.org/seamonkey/source/dom/public/coreEvents/nsIDOMMouseListener.h#47

Mouse movement can be gotten through this interface:

http://lxr.mozilla.org/seamonkey/source/dom/public/coreEvents/nsIDOMMouseMotionListener.h#48

this should be a good start.

CC-ing hyatt and vidur who know more about this area that I do.
Re-assigning bug to myself.
Assignee: ddrinan → javi
PSM is not completely deterministic, it initializes the random number generator
with some calls to NSS.  But we do not give any further entropy after
initialization.
*** Bug 57987 has been marked as a duplicate of this bug. ***
Just so I'm sure I understand... there is *some* entropy (I heard some doubts
that it exceeds 100 bits... and I don't know what the real amount is :-/ ).
I would really like a fix that accumulates additional entropy.  We want this
proposal asap, and we'll include it in the RTM release (assuming it comes RSN).
We won't stop declaration of a candidate build... but we want a better build...
and we want this fix in that build.

As I recall, entropy is accumulated to disk, and hence IF we get almost any
entropy added, the product will (each time it is used) become more secure.  Lets
get *something* added asap, and then we can look for another place or two to get
more randomness accumulated.

Thanks,

Jim
I'm working on a solution at the moment.  It goes  something like this:

Place an event on the event queue which merely collects random data for 
additional entropy.  Pass that data to PSM client libraries, which will send 
that random data to PSM next time a message is sent to PSM.  After giving the 
data to PSM, post another event to do the same.

The source of data is different on each platform.  These snippets are based on 
code in 4.x land, ie the way we used to gather entropy.

(Replace RNG_RandomUpdate, with CTM_RandomUpdate)
On Win32, the plan is for something like this:
    dwVal = GetMessagePos();                    // mouse position based
     RNG_RandomUpdate(&dwVal, sizeof(dwVal));
 
     lVal = GetMessageTime();                    // time from startup to last 
message
     RNG_RandomUpdate(&lVal, sizeof(lVal));
 
     lVal = GetTickCount();                      // we are about to go idle so                                                  
//this is probably
     RNG_RandomUpdate(&lVal, sizeof(lVal));      //   different from the                                                                                                                                                        
//one above
 
     hWnd = GetClipboardOwner();                 // 2 or 4 bytes
     RNG_RandomUpdate((void *)&hWnd, sizeof(HWND));
 }

On Mac:
                 EventRecord* theEvent = (EventRecord*)ioParam;
                 RNG_RandomUpdate(theEvent) sizeof(EventRecord));
                 long ticks = ::TickCount();
                 RNG_RandomUpdate(&ticks, sizeof(long));

On other platforms:
#if defined(__sun) && (defined(_svr4) || defined(SVR4)) || defined(sony) || 
defined(sinix)
     (void)gettimeofday(&tv);
 #else
     (void)gettimeofday(&tv, 0);
 #endif
 c = CopyLowBits((char*)buf+n, maxbytes, &tv.tv_usec, sizeof(tv.tv_usec));
 n += c;
 maxbytes -= c;
 c = CopyLowBits((char*)buf+n, maxbytes, &tv.tv_sec, sizeof(tv.tv_sec));
 n += c;
 RNG_RandomUpdate(buf, n);
The solution sounds fine... but don't over-do it.

Most specifically, don't spend too much time gathering entropy.  Putting an 
endles stream of events in the event que could easilly be overkill.  The entropy 
is summed, so you don't need this to happen endlessly.

IT would probably be more than sufficient to do this each time (for example) a 
window was created or destroyed.

FWIW: I recall an old Navigator bug where we *did* spend all our time gather 
entropy... to the detriment of performance.  We eventually calmed down to the 
point (as I recall) where we *only* gathered entropy during startups.  

Security folks should verify that entropy is summed from session to session, by 
saving and reloading the final seed across shutdown/restart of PSM.
Jim,

I'm rather certain that there is no saving of PRNG state from 
one execution to the next, not in C4.x, not in PSM.
I'm also pretty certain that most mouse input is fed into the
PRNG in C4.x.  I'm aware of some of the things that were 
done to reduce the time spent doing that.  That's why PSM
uses the technique I described before, where new data is 
merely (and quickly) appended to a circular buffer in memory
as it is provided. The "crunching" of that data is only 
done when events that (potentially) require entropy are
performed.
After talking with Nelson, the better solution to this problem is to grab the
coordinates of the mouse and pass those along before sending a mouse move event
to get processed.

So I change my proposal to instead create a new interface for an entropy
collector which the nsEventListenerManager can query with the X and Y
coordinates of the mouse just before sending off the mouse to the listeners of
the event.

This will not be a big penalty hit and will provide much better entropy than the
previous proposal.
almost everything looks good.  I am concerned that now layout knows about your
new interface.  Is there a existing callback that you can use?  if not (and the
layout module owner give a thumb up), can you cache the creation of this object?
Adding myself to cc list
Doug, in this design the layout code owns the Entropy interface, so there isn't
a dependency on an extension.

However, I still wonder why we don't just register a mouseMotion listener from
the PSM glue layer.  That seems like the straight-forward solution - or am I
missing something?
Your right, the idl file is in layout, but I think that we are still in
agreement that we should register to get some kind of callback from layout.

Also, how often is this called?  It is call for every mouse movement?  What will
this do to performance?
dougt:

The new interface is exported by the layout engine, so it doesn't depend on
anyone else.  It basically says, "If someone wants entropy, I'll give it to
them."  Otherwise the new code is in essence a no-op.

I'm not sure what you mean by "caching" my object.  I just made the PSM
Component implement the new interface as well.  I believe only one gets created
during the life of the program (you wrote that code, so you probably know just
as much as I do).  Is that what you mean by caching (only having one around
during the life of the program)?
The entropy call happens every time before a MouseMovement event is dispatched
to all mouseMovement listeners.  There are 2 potential performance bottlenecks
that didn't seem to bother Windows.

1) The time required to look up an object implementing the new interface.
2) The time required to call domEvent->GetScreenX and domEvent->GetScreenY

When reading the code, it appeared to me that GetScreend[X|Y] had the
coordinates pre-populated so getting these didn't required going down into some
OS call that was potentially expensive.  So this seemed to be an acceptable
solution.

I didn't register a mouseMotionListener because those need to be attached to a
Window, and finding a window handle from the middle of the air in psm-glue is
not easy.
By cacheing I mean:

+        nsCOMPtr<nsIEntropyCollector> enColl =
do_CreateInstance(NS_ENCTROPYCOLLECTOR_CONTRACTID);
 +        if (enColl) {


Everytime you create a new instance of this interface.  Is there a way that you
can do this once>

Also, with this design, you can only have one impl. receive this kind of data.
Should you be using some other design where multiple listerners could register
to recieve this data?


Lastly, I still agree with Terry, the PSM Glue should just be listening to mouse
movement messages.  This way, you would not have to hack layout.
What's the best/fastest way to listen?  In talking to jst and reading code, I
got the impression that I have to add the listener to every window that I want
to receive movements from.  That seems like a lot of new code.  Do we really
want to do that so close to RTM?

(If I'm under-estimating the complexity, let me know and I'm more than happy to
move the call to entropy collector there.)
Is there a way to register a global mouse movement event listener?  (Ideally
that's what we want, but I can't seem to find that.)

Or do I have to register it for a window?
I'm nervous about the performance impact of this patch.  If you're going to be 
listening to all mouse moves, this code has to be fast.  Lightning fast.
Also, are you listening to mousemoves in the chrome as well as content?  Would 
listening only to the content area be sufficient for you, or do you need to be 
invoked while the user mouses over the chrome as well?
There are some problems with this patch.

(1) Don't createInstance the entropy collector every time.
(2) You only fire an entropy collector if a mouse motion listener is actually 
registered with some content node.  That means if nobody ELSE happens to be 
listening for mouse motion events, you won't hear about any events.
(3) The event listener manager is an object that exists once PER content node 
and for the document and for the window.  By patching this code, you will 
potentially get called 2*n times where, n is the length of the path from the 
node in your document tree up to the root of the chrome window! 
You can't put this patch in this code, since you won't just fire once per mouse 
move; you'll potentially fire throughout the entire event flow of a single mouse 
move through the content model.

You could patch dom/src/base/nsGlobalWindow.cpp instead.  This is the point 
where the event hits the top of the tree and is about to die (assuming no 
mChromeEventHandler).  You can ask the question am I a MouseMove, and then you 
can get the entropy collector as a SERVICE and send the event off to yourself.
Nervous about recording mouse moves?  C4.7 does it.
Maybe we don't really need security.  Whaddaya think?

The code that ultimately gets these just sticks them
into a circular buffer in memory.  
I'm sure it takes longer to find the right object than
what the object does with the data.

And yes, mouse movements over chrome are alse wanted.
Sarcastic much?  

The patch as posted does a createInstance every time and can potentially be 
called dozens of times for a single mouseMove event.  

This would obviously impact performance.  Hopefully even you can see that.  

I have a right to be nervous when the patch produced is going to cause a 
performance hit.
Anyway, back to constructive comments for javi if the peanut gallery is 
finished.

You could cache the entropy collector as a static member of nsGlobalWindow.  You 
could then createInstance it when a static gRefCnt (tracking the # of global 
windows) hits 1 and then release when the gRefCnt hits 0.  Then for the lifetime 
of the app you let the window hold a single entropy collector, e.g., 
gEntropyCollector.  

The function to patch in nsGlobalWindow.cpp is HandleDOMEvent.  You need only 
listen during one phase (i.e., capturing or bubbling), so pick one and put the 
code in there.  This will ensure you only fire once per mouse move.

Also, it might be sufficient for you to listen to mouseout instead of mousemove. 
 Mouseout is fired less often but still occurs every time you move from one node 
in the page to another.  I'm not sure how much granularity you need though.
  
cc'ing jst, in case he has ideas/additional comments.  I think global window is 
a better place for this code than event listener manager though, and it ensures 
you really here the mouse move and that you don't fire many times per single 
mouse move.

err, "hear" the mouse move.
How much harder would it be to also call the entropy collector on chrome mouse
events?
This happens automatically.  They go through the same system.
Excellent...

In coding this up, the only problem I'm not sure how to solve is making access
to the global ref counter and entropy collector thread safe.  Is having a global
lock for access sufficient?  Or is there a better way to do this I'm not aware of?
Look at nsXULElement.cpp.  Check out how it uses gRefCnt.
Comments on the new patch:
(1) I'd just use the name gRefCnt instead of gEntropyColRefCnt, since it's 
likely we'll leverage the count for purposes other than the entropy collector 
in the future.
(2) You don't need the line "gEntropyCollector = nsnull".  The NS_RELEASE 
macros automatically null out the pointer.
(3) You can't just put the code at the top of HandleDOMEvent.  DOM events flow 
through two phases (a capturing phase and bubbling phase), so this function 
gets called twice per event.  You want to do something similar to what's on 
line 508, only your check should be...

if (gEntropyCollector && (NS_EVENT_FLAG_BUBBLE != aFlags) && !
mChromeEventHandler && aEvent->message == NS_MOUSE_MOVE) { your code here }

The first part says that you're only interested in one phase (the capturing 
phase), the second part says you're the outermost window (since you don't want 
to fire as you bubble through framesets), and the last part says you're a mouse 
move event.
 
Finally, I talked to jar today, and he suggested that listening on mouse moves 
was a bit of overkill as well.  What about listening to something a little less 
common (but still frequent) like NS_MOUSE_LEFT_BUTTON_DOWN?  That would get 
called whenever the user manipulated the chrome UI or whenever they clicked on 
a link... 

We listen to mouse movements in 4.x, why is it all of a sudden overkill?  The 

call is extremely fast, so I'm not sure what the concern is.  The more noise we 

collect, the better.

More noise is better... but a certain amount is more than enough (especially if
we keep dribbling in more entropy).

As mentioned early on, PSM does extract some entropy, and I think it was NelsonB
that guessed it was "less than 100 bits."  I would note that it was a guess, and
it was not an assertion that we definately had 100 bits of randomness to start
with (but we didn't start out with anything near zero entropy). I'm sure almost
any (infrequent) sample of mouse position extracts about 10 bits of entropy (you
get less per call if you ask too often!)  We don't need a lot of samples to get
way over the bare minimum (128 bit session keys are generated from this randomness).

In 4.x, effectively hacking in various observers was possible, without incurring
infrastructure cost, since we just built spaghetti code as needed.  All Hyatt is
suggesting is an efficient way to extract entropy, using standard
infrastructure, and without damaging performance.

We all want "sufficient" randomness (and more won't hurt).  We all want better
performance, and need to exercise care that we don't lose what performance we
have in this exercise.
r=hyatt
Ideally, you want to collect as least as much entropy as you consume.
A single SSL connection consumes a minimum of 28 bytes (224 bits)
per connection (for a "restart" connection) and 74 bytes (592 bits)
(for a full RSA SSL handshake) on the client side.  Those are minimums.

Practically, there is little reason to collect less than the ideal.

Consider that a full RSA SSL handshake can be launched with a single
click.

I consider only gathering mouse input at clicks to be too infrequent.
Anyone know of a super reviewer who can look at this?  I sent mail to brendan 
and jband, but I'm not sure how soon they can look at this.
I can be the a= if you can get another r=.
The fix is good. r=ddrinan.
a=hyatt
Patch checked into trunk.

Marking rtm+ for consideration for RTM by PDT
Whiteboard: [rtm need info] → [rtm+]
I'm worried about putting the entropy code into the global window for a couple
of reasons:

1. Couldn't I post fake mouse events from a hidden frame via
DocumentEvent::CreateEvent? If this is so then putting the entropy gatherer in
the DOM event listener is a bad idea since I could stuff it with results of my
choosing.

2. Why is the PSM using Mozilla for it's seed? What's the point of a security
module if it depends on an "insecure" application? It should implement its own
scheme around /dev/random (where it exists) or something like the EGD
(http://www.lothar.com/tech/crypto/) when it doesn't.
Please check in on the trunk ASAP.  We will re-evaluate this tomorrow after some 
people have banged on this.
Thanks, hyatt -- I would have punted to you if you hadn't been all over this as 
reviewer.

Adam makes a good point about using /dev/random where possible, or similar.  
Cc'ing shaver, who knows /dev/random.

/be
Whiteboard: [rtm+] → [rtm+] fix in trunk
/dev/random is not available on all platforms.  The user is a good source of 
entropy, but we need Mozilla to pass along that data.  Stick to getting this 
source "wired in".  We can look at other sources later.
Adding extra known input to the PRNG never hurts.  
One cannot "subtract" entropy from the system that way.
"fake" mouse events don't hurt the security.
As long as we're getting read entropy input, 
fake entropy doesn't hurt.

Most systems do NOT have /dev/random, sadly.
But, NSPR already contains code to use /dev/random on 
some systems where it is available.

The user is still the single greatest source of entropy
we have on all platforms.  

PSM provies an efficient method for any process that 
collects that user entropy to pass it to PSM where it 
will be included in the PRNG in the proper and timely
fashion.  This way, PSM doesn't have to also attempt
to collect that same entropy in parallel, which _would_
hurt system performance.
> /dev/random is not available on all platforms

But is a good source of entropy on those platforms that do implement it well,
and IMO, we should use it (but I won't code it, so my vote has only limited
value :) ).

What about the concern adamlock mentioned that the mouse events may be faked by
an attacker?
I didn't notice Nelson's comment, so please disregard my question.

> This way, PSM doesn't have to also attempt
> to collect that same entropy in parallel, which _would_
> hurt system performance.

I think, he suggested to use it *instead* of mouse events in Mozilla, where
possible. (And that *is* possibly at least on Linux.)
Believe it or not, this subject was investigated at length
quite some time ago.  Larry H investigated /dev/random on Linux 
rather in depth at one point. This is what I recall of that 
investigation.

/dev/random on Linux doesn't provide as much data as you ask for, 
each time, on demand.  The amount of data available from it grows 
in proportion to the amount of system activity.  For example, each 
time the disk moves, the amount of data available in /dev/random 
grows slightly (a few bytes).  The availability of data from 
/dev/random seems to grow rather slowly.  

It is possible to exhaust /dev/random for periods of time.
If you have a function that says "read N bytes from a source, 
and don't return until you've acquired all N bytes", that function
can take a LONG time with /dev/random on an idle system. 

So, taking what data you can get from /dev/random as it is
available is a good _addition_ to other sources of entropy,
but I wouldn't rely on it as the sole source of input as long
as other sources also exist.

The problem of aquiring entropy is pretty well understood.
Somehow, in the division of labor for the implementation of 
Mozilla security between the PSM team and CPD, an important
little piece of it fell through the cracks until today.  
OK. Thanks for the explanation.
As hinted at by NelsonB, Adam's suggestion that a "hacker" could fake events and
induce an "attack" is not well founded.  The system *accumulates* randomness, an
no amount of "non-random" data, added into the system, can make the system less
random.

Dependence on /dev/random is nice, as an addition, but there is no reason to
"bet the farm" on that source when additional entropy can be picked up very
cheaply.  As I've said, if we are hurting performance (re: gathering is not
cheap), then we have made a mistake.  I'm willing to bet that on some platforms
/dev/random will not be as "good" as on other platforms, so there is no reason
to depend upon it exclusively.

I was a little surprised to see Nelson's comment about "Consuming" the
randomness.  Although it is clearly a waste to obtain more randomness than used,
it strikes me that given 128 bits of randomness, I can create an arbitrarilly
long pseudorandom sequence of 128 bit random quantities that are correlated, but
are "hard" to predict.  To be specific, they are as "hard" to predict (i.e.,
find one element of the generated sequence, given all the others) as it is to
break one of our ciphers.  Bottom line: Once you've gotten n bits of randomness,
it is easy to generate random numbers that are as "hard to guess" as it is to
guess the first one, namely 2**n trials and errors.  Maybe this is a wasteful
theoritical argument that I'll get beat in the head for... but I really think
folks are getting into overkill when they propose we need such large amounts of
entropy.   A small entropy trickle, atop a basic 128 random seed, should provide
a full measure of key-generation-security when 128 bit ciphers are in use.

If anything, I'd be more concerned with the size of the seed (randomness
summary) maintained inside PSM, then most of this entropy gathering.  I'd also
of course be concerned that there is no way to calculate the seed (internal
state), given an emmission of a pseudo-random key generated by PSM.  *IF* there
was such a way to calculate the seed (state), then an attack *could* be mounted
that would extract enough samples (multiple distinct SSL connections, made in
rapid succession) that would expose the seed, followed by an immediate
connection to the victim-server (before more randomness could be acquired).
...and then all the long-winded entropy gathering would be for naught.
This bug is in candidate limbo.  We will reconsider this fix once we have a
candidate in hand, but we can't take this fix before then.
rtm++, please checkin ASAP so we can build today.
Whiteboard: [rtm+] fix in trunk → [rtm++] fix in trunk
Fix checked into branch.

This is hard for QA to verify.  The one thing I would tell QA is that there
should be noticeable differences on start-up or when clicking on something.  It
should all behave as before to the user.

If there are no regressions in starting up and clicking, then I'd consider this
bug verified.
Status: NEW → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
Correction, there should *not* be any noticeable differences.

Thanks to thayes for pointing out my omitted word.
> If there are no regressions in starting up and clicking, then I'd consider this
> bug verified.

Is there no way to verify that we seed correctly? In bug 56002, we actually had
the case that we didn't seed correctly on Mac.
You can, it's just extremely hard to do so with black box testing.

You'd have to have a program that gave entropy to PSM and based on that verify
that random numbers generated aren't the same.  I'm not sure if there's an easy
way to do so.
This is a good example of where "design for test" is needed to be able to test a
feature.
There probably should be a way to get PSM to dump the seed during each
depenedent secure operation (i.e., Each time the RNG is called for data).  At
the barest of minimum, this would allow a tester to confirm that the seed is not
a constant (oops... and it can/has happened to folks!).  It would still leave
wiggle room for a bug where the seed was not "random enough," but it would at
least allow folks to see *some* hint of randomness.  It is pretty hard to
test/prove/argue a sequence is random, given only the sequence.  On the other
hand, it is sometimes easy to show that a sequence is not varying much, and that
level of testabliity should be supported.  This won't happen in this release....
but it really should be a feature IMO of the PSM internals.

Thanks,
Jim (always wanting more) Roskind
While I agree that designing for test is usually a good thing, in this case
exposing the internal state of the random number generator would violate FIPS
requirements. Leaking the internal state of the random number generator would
allow an attacker to predict the following values, which are probably being used
for keys and other security parameters.  Providing an interface to do this needs
to be carefully evaluated.
Why not
#ifdef DEBUG_PSM
  printf(...);
#endif
?
Is there a way for nitinp to verify this using some of your tools since nitinp
works in the Security group?
assigning ddrinan@netscape.com as QA. David will do the code review and verify
that the bug is fixed.
QA Contact: nitinp → ddrinan
ddrinan - how does this look? 
Marking this bug verified.
Status: RESOLVED → VERIFIED
Mass changing Security:Crypto to PSM
Component: Security: Crypto → Client Library
Product: Browser → PSM
Version: other → 2.1
Mass changing Security:Crypto to PSM
Product: PSM → Core
Version: psm2.1 → 1.0 Branch
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: