freebl/softoken/pk11wrap RC4 implementation is not thread safe

NEW
Unassigned

Status

NSS
Libraries
--
critical
8 years ago
8 years ago

People

(Reporter: briansmith, Unassigned)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

The cause is the same as the cause of Bug 341127 and Bug 451754. Consider an output buffer that looks like this:

Thread 1 is encrypting into [ciphertext] while Thread 1 is reading from [before]. If [ciphertext] doesn't start on a word boundary, then when the RC4 code encrypts the first word of [ciphertext], Thread 1 will read the end of [before], do some encryption, and then write to the end of [before]. If other threads are writing to/from [before], there will be a race condition:

+--------+------------+--------+
| before | ciphertext | after  |
+--------+------------+--------+
| XXXX00 | AAAAAAAAAA | FFFFFF | Thread 1: RC4 code reads XXXX00AA
+--------+------------+--------+
| XXXX01 | AAAAAAAAAA | FFFFFF | Thread 2: increments [before]
+--------+------------+--------+
| XXXX00 | BBAAAAAAAA | FFFFFF | Thread 1: encrypts AA to BB
+--------+------------+--------+ writes XXXX00BB, undoing 
                                 Thread 2's write!

A similar race condition can occur at the end:

+--------+------------+--------+
| before | ciphertext | after  |
+--------+------------+--------+
| XXXX00 | AAAAAAAAAA | FFXXXX | Thread 1: RC4 code reads AAFFXXXX
+--------+------------+--------+
| XXXX01 | AAAAAAAAAA | 00XXXX | Thread 2: increments [after]
+--------+------------+--------+
| XXXX00 | BBAAAAAAAA | FFXXXX | Thread 1: encrypts AA to BB
+--------+------------+--------+ writes BBFFXXXX, undoing 
                                 Thread 2's write!

AFAICT, this problem doesn't affect current versions of libssl because libssl only encrypts/decrypts into buffers that should be word-aligned/padded on both ends (assuming malloc() returns word-aligned pointers), and those buffers are protected by locks.

This problem was originally reported by Mads Kiilerich in the dev-tech-crypto thread "PK11_CipherOp with RC4 and invalid memory access":

"I can imagine that the current approach can cause real problems if the memory next to the buffers concurrently is modified from other threads. Perhaps that should be mentioned too."
Of course, the last line of that second table should look like this:

+--------+------------+--------+
| XXXX00 | AAAAAAAABB | FFXXXX | Thread 1: encrypts AA to BB
+--------+------------+--------+ writes BBFFXXXX, undoing 
                                 Thread 2's write!
It's not totally inconceivably that this would happen, that two threads would
attempt to separately RC4 encrypt (or decrypt) adjacent buffers whose point 
of adjacency was actually in the middle of a word ... but it's pretty unlikely.

It's extremely unlikely to imagine that two threads could call memory allocation 
routines that return to them separate buffers that happen to be adjacent inside
a single word.  Sane memory management just doesn't work that way.  

Because of the existence of multi-processor systems whose CPUs have non-coherent write-back caches, this problem isn't solved by switching to byte-at-a-time access for words at the beginning and end of buffers.  Solving it would require having some sort of memory lock system that allowed each 
cipher algorithm to lock the word that potentially belongs to multiple 
buffers.  This applies not only to RC4 bot to all algorithms (including 
memcopy).  

The execution time cost of such a system attempting to prevent all possible cases on all possible CPU types would be quite expensive, prohibitely so, IMO.
I guess the question is whether freebl (In reply to comment #2)
> It's not totally inconceivably that this would happen, that two threads would
> attempt to separately RC4 encrypt (or decrypt) adjacent buffers whose point 
> of adjacency was actually in the middle of a word ... but it's pretty 
> unlikely.

Only one thread needs to be doing RC4 encrypt or decrypt. The other thread can be doing any other writing operation in the adjacent buffer.

> It's extremely unlikely to imagine that two threads could call memory
> allocation routines that return to them separate buffers that happen to be 
> adjacent inside a single word.  Sane memory management just doesn't work
> that way.  

The two buffers could be partitions of a single buffer. This will be the case if optimization #2 of bug 576902 is implemented.

> Because of the existence of multi-processor systems whose CPUs have
> non-coherent write-back caches, this problem isn't solved by switching to
> byte-at-a-time access for words at the beginning and end of buffers.  Solving
> it would require having some sort of memory lock system that allowed each 
> cipher algorithm to lock the word that potentially belongs to multiple 
> buffers.  This applies not only to RC4 bot to all algorithms (including 
> memcopy).  
> 
> The execution time cost of such a system attempting to prevent all possible
> cases on all possible CPU types would be quite expensive, prohibitely so, IMO.

I read this argument as saying that, since there are systems without coherent caches that would cause similar issues, it is OK for freebl to also be non-thread-safe on systems with coherent caches. I can see some merit in that argument. But, on the other hand, lots of systems (e.g. a single CPU system with preemptive multitasking) do have coherent caches and it seems wrong to require the application developer to treat them like they don't, for a very minor optimiation. Given that there is practically no performance cost to doing byte-at-a-time only at the very beginning and very ends of the buffer, I think it is better to err on the side of caution.
You need to log in before you can comment on or make changes to this bug.