Closed Bug 57985 Opened 24 years ago Closed 24 years ago

Mozilla PSM-GLUE does not seed PSM

Categories

(Core Graveyard :: Security: UI, defect, P3)

1.0 Branch
defect

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: dougt, Assigned: javi)

References

Details

(Whiteboard: [rtm++] fix in trunk)

Attachments

(3 files)

psm-glue does not seed PSM! From Nelson: The plan was as follows: 1. The client lib provides a function that the client is supposed to call to supply entropy to PSM. When the client calls it, the client lib appends any data supplied by the client to a buffer. 2. Whenever the client makes a call that necessitates a request-response between the client and the PSM server, the request is preceeded by a message that transfers the buffered entropy to the PSM server. I believe Mark implemented the client library functions described above. However, I have NO IDEA whether anyone ever did any work to actually call the function described in (1) above to actually provide any entropy from the application to the client library.
Severity: normal → blocker
Keywords: dogfood, rtm
OS: Linux → All
Hardware: PC → All
PDT needs to understand something: Does PSM get *any* randomness, or is it currently deterministic?? If it is deterministic, this is a pull-off-wire bug.
a
Whiteboard: [rtm need info]
In 4.x world, we would get entropy by using the location of the mouse when idle. Is getting the position of the mouse hard to do in SeaMonkey?
Mouse events can be gotten through the mouselistener interface: http://lxr.mozilla.org/seamonkey/source/dom/public/coreEvents/nsIDOMMouseListener.h#47 Mouse movement can be gotten through this interface: http://lxr.mozilla.org/seamonkey/source/dom/public/coreEvents/nsIDOMMouseMotionListener.h#48 this should be a good start. CC-ing hyatt and vidur who know more about this area that I do.
Re-assigning bug to myself.
Assignee: ddrinan → javi
PSM is not completely deterministic, it initializes the random number generator with some calls to NSS. But we do not give any further entropy after initialization.
*** Bug 57987 has been marked as a duplicate of this bug. ***
Just so I'm sure I understand... there is *some* entropy (I heard some doubts that it exceeds 100 bits... and I don't know what the real amount is :-/ ). I would really like a fix that accumulates additional entropy. We want this proposal asap, and we'll include it in the RTM release (assuming it comes RSN). We won't stop declaration of a candidate build... but we want a better build... and we want this fix in that build. As I recall, entropy is accumulated to disk, and hence IF we get almost any entropy added, the product will (each time it is used) become more secure. Lets get *something* added asap, and then we can look for another place or two to get more randomness accumulated. Thanks, Jim
I'm working on a solution at the moment. It goes something like this: Place an event on the event queue which merely collects random data for additional entropy. Pass that data to PSM client libraries, which will send that random data to PSM next time a message is sent to PSM. After giving the data to PSM, post another event to do the same. The source of data is different on each platform. These snippets are based on code in 4.x land, ie the way we used to gather entropy. (Replace RNG_RandomUpdate, with CTM_RandomUpdate) On Win32, the plan is for something like this: dwVal = GetMessagePos(); // mouse position based RNG_RandomUpdate(&dwVal, sizeof(dwVal)); lVal = GetMessageTime(); // time from startup to last message RNG_RandomUpdate(&lVal, sizeof(lVal)); lVal = GetTickCount(); // we are about to go idle so //this is probably RNG_RandomUpdate(&lVal, sizeof(lVal)); // different from the //one above hWnd = GetClipboardOwner(); // 2 or 4 bytes RNG_RandomUpdate((void *)&hWnd, sizeof(HWND)); } On Mac: EventRecord* theEvent = (EventRecord*)ioParam; RNG_RandomUpdate(theEvent) sizeof(EventRecord)); long ticks = ::TickCount(); RNG_RandomUpdate(&ticks, sizeof(long)); On other platforms: #if defined(__sun) && (defined(_svr4) || defined(SVR4)) || defined(sony) || defined(sinix) (void)gettimeofday(&tv); #else (void)gettimeofday(&tv, 0); #endif c = CopyLowBits((char*)buf+n, maxbytes, &tv.tv_usec, sizeof(tv.tv_usec)); n += c; maxbytes -= c; c = CopyLowBits((char*)buf+n, maxbytes, &tv.tv_sec, sizeof(tv.tv_sec)); n += c; RNG_RandomUpdate(buf, n);
The solution sounds fine... but don't over-do it. Most specifically, don't spend too much time gathering entropy. Putting an endles stream of events in the event que could easilly be overkill. The entropy is summed, so you don't need this to happen endlessly. IT would probably be more than sufficient to do this each time (for example) a window was created or destroyed. FWIW: I recall an old Navigator bug where we *did* spend all our time gather entropy... to the detriment of performance. We eventually calmed down to the point (as I recall) where we *only* gathered entropy during startups. Security folks should verify that entropy is summed from session to session, by saving and reloading the final seed across shutdown/restart of PSM.
Jim, I'm rather certain that there is no saving of PRNG state from one execution to the next, not in C4.x, not in PSM. I'm also pretty certain that most mouse input is fed into the PRNG in C4.x. I'm aware of some of the things that were done to reduce the time spent doing that. That's why PSM uses the technique I described before, where new data is merely (and quickly) appended to a circular buffer in memory as it is provided. The "crunching" of that data is only done when events that (potentially) require entropy are performed.
After talking with Nelson, the better solution to this problem is to grab the coordinates of the mouse and pass those along before sending a mouse move event to get processed. So I change my proposal to instead create a new interface for an entropy collector which the nsEventListenerManager can query with the X and Y coordinates of the mouse just before sending off the mouse to the listeners of the event. This will not be a big penalty hit and will provide much better entropy than the previous proposal.
almost everything looks good. I am concerned that now layout knows about your new interface. Is there a existing callback that you can use? if not (and the layout module owner give a thumb up), can you cache the creation of this object?
Adding myself to cc list
Doug, in this design the layout code owns the Entropy interface, so there isn't a dependency on an extension. However, I still wonder why we don't just register a mouseMotion listener from the PSM glue layer. That seems like the straight-forward solution - or am I missing something?
Your right, the idl file is in layout, but I think that we are still in agreement that we should register to get some kind of callback from layout. Also, how often is this called? It is call for every mouse movement? What will this do to performance?
dougt: The new interface is exported by the layout engine, so it doesn't depend on anyone else. It basically says, "If someone wants entropy, I'll give it to them." Otherwise the new code is in essence a no-op. I'm not sure what you mean by "caching" my object. I just made the PSM Component implement the new interface as well. I believe only one gets created during the life of the program (you wrote that code, so you probably know just as much as I do). Is that what you mean by caching (only having one around during the life of the program)?
The entropy call happens every time before a MouseMovement event is dispatched to all mouseMovement listeners. There are 2 potential performance bottlenecks that didn't seem to bother Windows. 1) The time required to look up an object implementing the new interface. 2) The time required to call domEvent->GetScreenX and domEvent->GetScreenY When reading the code, it appeared to me that GetScreend[X|Y] had the coordinates pre-populated so getting these didn't required going down into some OS call that was potentially expensive. So this seemed to be an acceptable solution. I didn't register a mouseMotionListener because those need to be attached to a Window, and finding a window handle from the middle of the air in psm-glue is not easy.
By cacheing I mean: + nsCOMPtr<nsIEntropyCollector> enColl = do_CreateInstance(NS_ENCTROPYCOLLECTOR_CONTRACTID); + if (enColl) { Everytime you create a new instance of this interface. Is there a way that you can do this once> Also, with this design, you can only have one impl. receive this kind of data. Should you be using some other design where multiple listerners could register to recieve this data? Lastly, I still agree with Terry, the PSM Glue should just be listening to mouse movement messages. This way, you would not have to hack layout.
What's the best/fastest way to listen? In talking to jst and reading code, I got the impression that I have to add the listener to every window that I want to receive movements from. That seems like a lot of new code. Do we really want to do that so close to RTM? (If I'm under-estimating the complexity, let me know and I'm more than happy to move the call to entropy collector there.)
Is there a way to register a global mouse movement event listener? (Ideally that's what we want, but I can't seem to find that.) Or do I have to register it for a window?
I'm nervous about the performance impact of this patch. If you're going to be listening to all mouse moves, this code has to be fast. Lightning fast.
Also, are you listening to mousemoves in the chrome as well as content? Would listening only to the content area be sufficient for you, or do you need to be invoked while the user mouses over the chrome as well?
There are some problems with this patch. (1) Don't createInstance the entropy collector every time. (2) You only fire an entropy collector if a mouse motion listener is actually registered with some content node. That means if nobody ELSE happens to be listening for mouse motion events, you won't hear about any events. (3) The event listener manager is an object that exists once PER content node and for the document and for the window. By patching this code, you will potentially get called 2*n times where, n is the length of the path from the node in your document tree up to the root of the chrome window! You can't put this patch in this code, since you won't just fire once per mouse move; you'll potentially fire throughout the entire event flow of a single mouse move through the content model. You could patch dom/src/base/nsGlobalWindow.cpp instead. This is the point where the event hits the top of the tree and is about to die (assuming no mChromeEventHandler). You can ask the question am I a MouseMove, and then you can get the entropy collector as a SERVICE and send the event off to yourself.
Nervous about recording mouse moves? C4.7 does it. Maybe we don't really need security. Whaddaya think? The code that ultimately gets these just sticks them into a circular buffer in memory. I'm sure it takes longer to find the right object than what the object does with the data. And yes, mouse movements over chrome are alse wanted.
Sarcastic much? The patch as posted does a createInstance every time and can potentially be called dozens of times for a single mouseMove event. This would obviously impact performance. Hopefully even you can see that. I have a right to be nervous when the patch produced is going to cause a performance hit.
Anyway, back to constructive comments for javi if the peanut gallery is finished. You could cache the entropy collector as a static member of nsGlobalWindow. You could then createInstance it when a static gRefCnt (tracking the # of global windows) hits 1 and then release when the gRefCnt hits 0. Then for the lifetime of the app you let the window hold a single entropy collector, e.g., gEntropyCollector. The function to patch in nsGlobalWindow.cpp is HandleDOMEvent. You need only listen during one phase (i.e., capturing or bubbling), so pick one and put the code in there. This will ensure you only fire once per mouse move. Also, it might be sufficient for you to listen to mouseout instead of mousemove. Mouseout is fired less often but still occurs every time you move from one node in the page to another. I'm not sure how much granularity you need though. cc'ing jst, in case he has ideas/additional comments. I think global window is a better place for this code than event listener manager though, and it ensures you really here the mouse move and that you don't fire many times per single mouse move.
err, "hear" the mouse move.
How much harder would it be to also call the entropy collector on chrome mouse events?
This happens automatically. They go through the same system.
Excellent... In coding this up, the only problem I'm not sure how to solve is making access to the global ref counter and entropy collector thread safe. Is having a global lock for access sufficient? Or is there a better way to do this I'm not aware of?
Look at nsXULElement.cpp. Check out how it uses gRefCnt.
Comments on the new patch: (1) I'd just use the name gRefCnt instead of gEntropyColRefCnt, since it's likely we'll leverage the count for purposes other than the entropy collector in the future. (2) You don't need the line "gEntropyCollector = nsnull". The NS_RELEASE macros automatically null out the pointer. (3) You can't just put the code at the top of HandleDOMEvent. DOM events flow through two phases (a capturing phase and bubbling phase), so this function gets called twice per event. You want to do something similar to what's on line 508, only your check should be... if (gEntropyCollector && (NS_EVENT_FLAG_BUBBLE != aFlags) && ! mChromeEventHandler && aEvent->message == NS_MOUSE_MOVE) { your code here } The first part says that you're only interested in one phase (the capturing phase), the second part says you're the outermost window (since you don't want to fire as you bubble through framesets), and the last part says you're a mouse move event. Finally, I talked to jar today, and he suggested that listening on mouse moves was a bit of overkill as well. What about listening to something a little less common (but still frequent) like NS_MOUSE_LEFT_BUTTON_DOWN? That would get called whenever the user manipulated the chrome UI or whenever they clicked on a link...
We listen to mouse movements in 4.x, why is it all of a sudden overkill? The call is extremely fast, so I'm not sure what the concern is. The more noise we collect, the better.
More noise is better... but a certain amount is more than enough (especially if we keep dribbling in more entropy). As mentioned early on, PSM does extract some entropy, and I think it was NelsonB that guessed it was "less than 100 bits." I would note that it was a guess, and it was not an assertion that we definately had 100 bits of randomness to start with (but we didn't start out with anything near zero entropy). I'm sure almost any (infrequent) sample of mouse position extracts about 10 bits of entropy (you get less per call if you ask too often!) We don't need a lot of samples to get way over the bare minimum (128 bit session keys are generated from this randomness). In 4.x, effectively hacking in various observers was possible, without incurring infrastructure cost, since we just built spaghetti code as needed. All Hyatt is suggesting is an efficient way to extract entropy, using standard infrastructure, and without damaging performance. We all want "sufficient" randomness (and more won't hurt). We all want better performance, and need to exercise care that we don't lose what performance we have in this exercise.
r=hyatt
Ideally, you want to collect as least as much entropy as you consume. A single SSL connection consumes a minimum of 28 bytes (224 bits) per connection (for a "restart" connection) and 74 bytes (592 bits) (for a full RSA SSL handshake) on the client side. Those are minimums. Practically, there is little reason to collect less than the ideal. Consider that a full RSA SSL handshake can be launched with a single click. I consider only gathering mouse input at clicks to be too infrequent.
Anyone know of a super reviewer who can look at this? I sent mail to brendan and jband, but I'm not sure how soon they can look at this.
I can be the a= if you can get another r=.
The fix is good. r=ddrinan.
a=hyatt
Patch checked into trunk. Marking rtm+ for consideration for RTM by PDT
Whiteboard: [rtm need info] → [rtm+]
I'm worried about putting the entropy code into the global window for a couple of reasons: 1. Couldn't I post fake mouse events from a hidden frame via DocumentEvent::CreateEvent? If this is so then putting the entropy gatherer in the DOM event listener is a bad idea since I could stuff it with results of my choosing. 2. Why is the PSM using Mozilla for it's seed? What's the point of a security module if it depends on an "insecure" application? It should implement its own scheme around /dev/random (where it exists) or something like the EGD (http://www.lothar.com/tech/crypto/) when it doesn't.
Please check in on the trunk ASAP. We will re-evaluate this tomorrow after some people have banged on this.
Thanks, hyatt -- I would have punted to you if you hadn't been all over this as reviewer. Adam makes a good point about using /dev/random where possible, or similar. Cc'ing shaver, who knows /dev/random. /be
Whiteboard: [rtm+] → [rtm+] fix in trunk
/dev/random is not available on all platforms. The user is a good source of entropy, but we need Mozilla to pass along that data. Stick to getting this source "wired in". We can look at other sources later.
Adding extra known input to the PRNG never hurts. One cannot "subtract" entropy from the system that way. "fake" mouse events don't hurt the security. As long as we're getting read entropy input, fake entropy doesn't hurt. Most systems do NOT have /dev/random, sadly. But, NSPR already contains code to use /dev/random on some systems where it is available. The user is still the single greatest source of entropy we have on all platforms. PSM provies an efficient method for any process that collects that user entropy to pass it to PSM where it will be included in the PRNG in the proper and timely fashion. This way, PSM doesn't have to also attempt to collect that same entropy in parallel, which _would_ hurt system performance.
> /dev/random is not available on all platforms But is a good source of entropy on those platforms that do implement it well, and IMO, we should use it (but I won't code it, so my vote has only limited value :) ). What about the concern adamlock mentioned that the mouse events may be faked by an attacker?
I didn't notice Nelson's comment, so please disregard my question. > This way, PSM doesn't have to also attempt > to collect that same entropy in parallel, which _would_ > hurt system performance. I think, he suggested to use it *instead* of mouse events in Mozilla, where possible. (And that *is* possibly at least on Linux.)
Believe it or not, this subject was investigated at length quite some time ago. Larry H investigated /dev/random on Linux rather in depth at one point. This is what I recall of that investigation. /dev/random on Linux doesn't provide as much data as you ask for, each time, on demand. The amount of data available from it grows in proportion to the amount of system activity. For example, each time the disk moves, the amount of data available in /dev/random grows slightly (a few bytes). The availability of data from /dev/random seems to grow rather slowly. It is possible to exhaust /dev/random for periods of time. If you have a function that says "read N bytes from a source, and don't return until you've acquired all N bytes", that function can take a LONG time with /dev/random on an idle system. So, taking what data you can get from /dev/random as it is available is a good _addition_ to other sources of entropy, but I wouldn't rely on it as the sole source of input as long as other sources also exist. The problem of aquiring entropy is pretty well understood. Somehow, in the division of labor for the implementation of Mozilla security between the PSM team and CPD, an important little piece of it fell through the cracks until today.
OK. Thanks for the explanation.
As hinted at by NelsonB, Adam's suggestion that a "hacker" could fake events and induce an "attack" is not well founded. The system *accumulates* randomness, an no amount of "non-random" data, added into the system, can make the system less random. Dependence on /dev/random is nice, as an addition, but there is no reason to "bet the farm" on that source when additional entropy can be picked up very cheaply. As I've said, if we are hurting performance (re: gathering is not cheap), then we have made a mistake. I'm willing to bet that on some platforms /dev/random will not be as "good" as on other platforms, so there is no reason to depend upon it exclusively. I was a little surprised to see Nelson's comment about "Consuming" the randomness. Although it is clearly a waste to obtain more randomness than used, it strikes me that given 128 bits of randomness, I can create an arbitrarilly long pseudorandom sequence of 128 bit random quantities that are correlated, but are "hard" to predict. To be specific, they are as "hard" to predict (i.e., find one element of the generated sequence, given all the others) as it is to break one of our ciphers. Bottom line: Once you've gotten n bits of randomness, it is easy to generate random numbers that are as "hard to guess" as it is to guess the first one, namely 2**n trials and errors. Maybe this is a wasteful theoritical argument that I'll get beat in the head for... but I really think folks are getting into overkill when they propose we need such large amounts of entropy. A small entropy trickle, atop a basic 128 random seed, should provide a full measure of key-generation-security when 128 bit ciphers are in use. If anything, I'd be more concerned with the size of the seed (randomness summary) maintained inside PSM, then most of this entropy gathering. I'd also of course be concerned that there is no way to calculate the seed (internal state), given an emmission of a pseudo-random key generated by PSM. *IF* there was such a way to calculate the seed (state), then an attack *could* be mounted that would extract enough samples (multiple distinct SSL connections, made in rapid succession) that would expose the seed, followed by an immediate connection to the victim-server (before more randomness could be acquired). ...and then all the long-winded entropy gathering would be for naught.
This bug is in candidate limbo. We will reconsider this fix once we have a candidate in hand, but we can't take this fix before then.
rtm++, please checkin ASAP so we can build today.
Whiteboard: [rtm+] fix in trunk → [rtm++] fix in trunk
Fix checked into branch. This is hard for QA to verify. The one thing I would tell QA is that there should be noticeable differences on start-up or when clicking on something. It should all behave as before to the user. If there are no regressions in starting up and clicking, then I'd consider this bug verified.
Status: NEW → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
Correction, there should *not* be any noticeable differences. Thanks to thayes for pointing out my omitted word.
> If there are no regressions in starting up and clicking, then I'd consider this > bug verified. Is there no way to verify that we seed correctly? In bug 56002, we actually had the case that we didn't seed correctly on Mac.
You can, it's just extremely hard to do so with black box testing. You'd have to have a program that gave entropy to PSM and based on that verify that random numbers generated aren't the same. I'm not sure if there's an easy way to do so.
This is a good example of where "design for test" is needed to be able to test a feature. There probably should be a way to get PSM to dump the seed during each depenedent secure operation (i.e., Each time the RNG is called for data). At the barest of minimum, this would allow a tester to confirm that the seed is not a constant (oops... and it can/has happened to folks!). It would still leave wiggle room for a bug where the seed was not "random enough," but it would at least allow folks to see *some* hint of randomness. It is pretty hard to test/prove/argue a sequence is random, given only the sequence. On the other hand, it is sometimes easy to show that a sequence is not varying much, and that level of testabliity should be supported. This won't happen in this release.... but it really should be a feature IMO of the PSM internals. Thanks, Jim (always wanting more) Roskind
While I agree that designing for test is usually a good thing, in this case exposing the internal state of the random number generator would violate FIPS requirements. Leaking the internal state of the random number generator would allow an attacker to predict the following values, which are probably being used for keys and other security parameters. Providing an interface to do this needs to be carefully evaluated.
Why not #ifdef DEBUG_PSM printf(...); #endif ?
Is there a way for nitinp to verify this using some of your tools since nitinp works in the Security group?
assigning ddrinan@netscape.com as QA. David will do the code review and verify that the bug is fixed.
QA Contact: nitinp → ddrinan
ddrinan - how does this look?
Marking this bug verified.
Status: RESOLVED → VERIFIED
Mass changing Security:Crypto to PSM
Component: Security: Crypto → Client Library
Product: Browser → PSM
Version: other → 2.1
Mass changing Security:Crypto to PSM
Product: PSM → Core
Version: psm2.1 → 1.0 Branch
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: