Closed
Bug 892040
Opened 12 years ago
Closed 12 years ago
Key stretching benchmarking for PICL auth
Categories
(Android Background Services Graveyard :: Crypto, defect)
Tracking
(Not tracked)
VERIFIED
FIXED
1.0
People
(Reporter: rnewman, Assigned: rnewman)
References
Details
(Whiteboard: [qa-])
Our initial key-stretching plan[1] is to
do 20k rounds of PBKDF2-HMAC-SHA256, followed by scrypt at
N/r/p=64k/8/1, then a second 20k of PBKDF2. scrypt at that setting takes
about 60MB of RAM.
On a 1.8GHz Atom, I estimate that should take 130ms+1.3s+130ms= 1.6s .
On a Raspberry Pi (which might be comparable to a mid-end phone) it
should take 7.5s, which is at the far end of my usability target. This
only needs to be done once per device.
The bulk of the stretching value comes from the scrypt. The PBKDF is
just there to protect clients that outsource the scrypt phase to a
helper (which we haven't defined or written yet).
Security is obviously better when we don't use the helper, so I'd like
the client to measure itself at runtime and decide whether it's capable
of doing the scrypt itself. That's a question of speed (the client can
run a few milliseconds of the underlying Salsa20 operation and then
estimate how long the whole thing would take, and bail if it's more than
like 10 seconds). It's also a question of memory, for which maybe the
client can just try to allocate the whole thing and then fall back to
the helper if the malloc fails.
Do you know if memory-allocation failures on android are survivable? On
unix, it'd be pretty painful (you're more likely to go into swap thrash
than to get a failure from malloc).
So maybe a better criteria would be to ask the OS about the total memory
size, and use the helper unless it's greater than like 512MB.
Anyways, the speed-testing app I have in mind would launch, run 20k
rounds of PBKDF2-HMAC-SHA256 and report the time, run a small scrypt
loop (maybe N/r/p = 1k/8/1, which should only use 1MB) and report the
time, then exit. Maybe have it POST the results to a server somewhere
(along with device-identifying values and whatever /proc/cpuinfo we can
get). Then we run it on all the devices we care about, collect the data,
build a spreadsheet, and choose some settings that will work well-enough
everywhere.
Assignee | ||
Updated•12 years ago
|
Component: Android Sync → Crypto
Assignee | ||
Comment 2•12 years ago
|
||
Input to this:
* Android enforces a 24MB limit for each process (16MB on some devices, 32MB on others, 48MB if you're really lucky). If scrypt requires 60MB, we can't do it in Java.
* You can get around that by using the NDK to allocate memory in C.
Even on devices for which scrypt isn't too expensive, we'd need to drop down to C to do it.
Assignee | ||
Comment 3•12 years ago
|
||
Additionally: I misremembered that we did PBKDF2 already in Android Sync. We don't; that was used for the earlier key-stretching version of Sync, which we abandoned before the Android rewrite.
We *do* use RFC 5869 HKDF, so not a terrifying leap.
Assignee | ||
Comment 4•12 years ago
|
||
Working PBKDF2 code (SHA-1).
https://github.com/mozilla-services/android-sync/tree/rnewman/pbkdf2
From informal testing, 20,000 iterations of PBKDF2-with-SHA1 takes 2.7 seconds on my top-of-the-line dual-core HTC One.
More tomorrow.
Assignee | ||
Comment 5•12 years ago
|
||
A direct implementation of PBKDF2-SHA256 -- a la http://stackoverflow.com/questions/9147463/java-pbkdf2-with-hmacsha256-as-the-prf -- takes a ridiculous 166 seconds to perform 40,000 iterations.
It's likely that this can be optimized further (I haven't yet), and it doesn't take as much advantage of the pre-bundled Java crypto code as the SHA-1 version, which is basically a single API call.
Regardless, this might be an indication that this all needs to happen in native code (e.g., by linking to libnss).
More tomorrow.
Comment 6•12 years ago
|
||
The hashing itself should be pretty fast. I suspect that the slowdown is resulting from a whole bunch of context switching, if the individual SHA256() calls are going through some kind of overblown crypto-manager API that pretends to be a smartcard or all happens in kernelspace. I bet it'd be a lot faster if SHA256() were just a library call (keeping everything in userspace) or if the whole PBKDF2-SHA256 operation were a single API call (keeping everything in that other space).
Assignee | ||
Comment 7•12 years ago
|
||
spongycastle 1.47 and up seems to bundle PBKDF2+SHA256, so that's my next avenue.
Assignee | ||
Comment 8•12 years ago
|
||
spongycastle's implementation takes 98 seconds for 40,000 iterations.
The exact same spongycastle implementation, but using SHA-1, is only very slightly slower than my previous implementation of PBKDF2HMACSHA1 using javax.crypto, so there's no obvious misuse of the SC API.
SHA-256 just seems to be very expensive in this environment.
Given that SHA-1's weaknesses don't really impact its use in this situation, one wonders whether either SHA-256 is unnecessary here, or whether we need to do that many rounds of SHA-256. But I'll continue with my benchmarking…
Assignee | ||
Comment 9•12 years ago
|
||
Some reading suggests that the SHA-1 version probably uses OpenSSL under the hood, which would explain the speed disparity.
Assignee | ||
Comment 10•12 years ago
|
||
With OpenSSL, same hardware finishes 80,000 iterations of PBKDF2-SHA-256 in 1.4 seconds, which is hilarious. I need to write more tests to make sure this is correct.
Target Milestone: --- → 1.0
Assignee | ||
Comment 11•12 years ago
|
||
I've requisitioned an ARMv6 device so I can do low-end benchmarking. These two devices alone should give us a good solution space.
Assignee | ||
Comment 12•12 years ago
|
||
The OpenSSL variant passes the ridiculous 16,777,216-iteration test in 152 seconds.
Assignee | ||
Comment 13•12 years ago
|
||
Complete and working code is here:
https://github.com/mozilla-services/android-sync/pull/330
Will verify on ARMv6 device when it arrives.
Updated•12 years ago
|
Whiteboard: [qa-]
Assignee | ||
Comment 14•12 years ago
|
||
Relevant:
https://wiki.mozilla.org/Identity/AttachedServices/Key_Stretching_Performance_Tests
Note that on my HTC One we can do the full key stretch -- 20k PBKDF2, 65k scrypt, 20k PBKDF2 -- in about 7.3 seconds.
Assignee | ||
Comment 15•12 years ago
|
||
Benchmarking done. Resolving.
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•