Closed Bug 638297 Opened 13 years ago Closed 13 years ago

Share buffers between encryption and decryption calls

Categories

(Firefox :: Sync, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
Tracking Status
fennec 2.0+ ---

People

(Reporter: rnewman, Assigned: rnewman)

References

Details

(Whiteboard: [qa-])

Attachments

(4 files, 3 obsolete files)

Every hit on WeaveCrypto.decrypt and .encrypt allocates two new buffers and casts them a few times. That's a lot of allocations.

It would likely be a big memory allocation win to use a limited set of buffers and grow them as necessary, reusing them on each call.
Assignee: nobody → rnewman
Blocks: 636304
tracking-fennec: --- → ?
Allocating two buffers (input and output) once and reusing them on all crypt calls would save us 4*3*N allocations, thus requesting blocking Fennec.
Apart from the four 3*N and 1*N object allocations for the buffers, these are the remaining allocations that are proportional to N:

1*N WeaveCrypto.js:433  ivParam = this.nss.PK11_ParamFromIV(...
1*N WeaveCrypto.js:438  ctx = this.nss.PK11_CreateContextBySymKey(...
1*N WeaveCrypto.js:443  tmpOutputSize
1*N WeaveCrypto.js:445  tmpOutputSize.address()?
1*N WeaveCrypto.js:455  tmpOutputSize2
1*N WeaveCrypto.js:456  tmpOutputSize2.address()?

I'm pretty sure we only have to allocate those integers and their pointers once. Depending on how PK11_ParamFromIV and PK11_CreateContextBySymKey work, we might even be able to only create those once (together with the IV and the sym key, respectively).

For completeness' sake, here are the buffer allocation hits:

3*N WeaveCrypto.js:404  let input = new ctypes.ArrayType(...
3*N WeaveCrypto.js:405  let ints  = ctypes.cast(input, ...
3*N WeaveCrypto.js:408  let outputBuffer = new ctypes.ArrayType(...
3*N WeaveCrypto.js:460  let newOutput = ctypes.cast(output, ...
1*N WeaveCrypto.js:449  output.adressOfElement(actualOutputSize)
The buffer management is getting pretty tangly, so I'm going to resume later tonight and in the morning.
This saves two ctype int allocations per decrypt or encrypt call.
Attachment #516642 - Flags: review?(philipp)
This patch:

* Initializes a pair of shared buffers, one for input and one for output.
* Maintains an int-array pointer where useful, to avoid a cast.
* Adds explicit buffer lengths to commonCrypt.

* Avoids one buffer allocation and one cast in encrypt.
* Avoids two buffer allocations and one cast in decrypt.

We don't replace the other buffer allocation in encrypt, because the current method works better with non-ASCII input, and a correct fix for this would be significant additional complexity.

We also don't replace the cast in the response of commonCrypt, because it is helpful in carrying the actual output size with the response.

Tests pass; running CrossWeave now.
Attachment #516645 - Flags: review?(philipp)
Comment on attachment 516642 [details] [diff] [review]
Part 0: reuse ints. v1

>+    // Avoid allocating new temporary ints on every run of _commonCrypt.
>+    _commonCryptSignedOutputSize: new ctypes.int(),
>+    _commonCryptUnsignedOutputSize: new ctypes.unsigned_int(),

I don't think we want to allocate these on import time and share them between instances of WeaveCrypto. I suggest creating them as null here and initializing them in one the of the init* methods, or upon first use. We also want to store the pointer as that seems to be a separate ctypes object (see comment 2).

r=me with that
Attachment #516642 - Flags: review?(philipp) → review+
Comment on attachment 516645 [details] [diff] [review]
Part 1: reuse buffers. v1

>             this.initNSS();
>             this.initAlgorithmSettings();   // Depends on NSS.
>             this.initIVSECItem();
>+            this.initBuffers(1024);

Make the 1024 a const please.

r=me with that.
Attachment #516645 - Flags: review?(philipp) → review+
Thanks Philipp. Pushed:

http://hg.mozilla.org/services/fx-sync/rev/2f6df1b6f135
http://hg.mozilla.org/services/fx-sync/rev/c20309b83792

Let's see what CrossWeave has to say about this.
Status: NEW → ASSIGNED
Whiteboard: [fixed in fx-sync]
Surprise part 2!
Attachment #516662 - Flags: review?(philipp)
(In reply to comment #9)
> Surprise part 2!

This avoids the allocation of a 16-byte buffer, ctype reference, and address on every IV generation... which is once per uploaded encrypted item. Won't affect download/decrypt, but avoids some churn on upload syncs.

CrossWeave passes, btw.
tracking-fennec: ? → 2.0+
This reorders some tests to verify that changing buffer size downwards still works.
Attachment #516662 - Attachment is obsolete: true
Attachment #516662 - Flags: review?(philipp)
Attachment #516671 - Flags: review?(philipp)
Attachment #516671 - Flags: review?(philipp) → review+
Sharing buffers and populating them by iteration is slow -- 20 times slower than allocating a new buffer. That's just JS.

We noticed that a fresh big sync allocated 700MB less on desktop... but extended sync time, and that reflects in Philipp's graphs.

Unless we're able to fill a buffer from a JS string using a ctypes call, we should back out most of Part 1.

Probably a straight back-out is infeasible, so I'll prepare an amendment patch.
Attached patch Don't zero-pad. v1 (obsolete) — Splinter Review
We no longer do anything that requires zero-padding; we use explicit buffer lengths everywhere. This small patch removes it, which will save some needless array traversals.

Tests pass; haven't run crossweave yet.
Attachment #517056 - Flags: review?(philipp)
(In reply to comment #13)

> We noticed that a fresh big sync allocated 700MB less on desktop... but
> extended sync time, and that reflects in Philipp's graphs.
> 
> Unless we're able to fill a buffer from a JS string using a ctypes call, we
> should back out most of Part 1.
> 
> Probably a straight back-out is infeasible, so I'll prepare an amendment patch.

Subsequent re-measurements revealed that total time is not significantly impacted, but the memory reduction is real. As such a backout is unnecessary.

However, we should consider implementing some of the missing ctypes operations -- readBytes (readString with no UTF-8 decoding) and writeString -- and use these to push work down into C++.
Whiteboard: [fixed in fx-sync] → [fixed in fx-sync][fixed in services]
Whiteboard: [fixed in fx-sync][fixed in services] → [parts landed in services][needs review philipp]
Comment on attachment 517056 [details] [diff] [review]
Don't zero-pad. v1

>   // Fill it too short.
>   cryptoSvc.byteCompressInts("MMM", intData, 8);
>-  for (let i = 0; i < 8; ++i)
>+  for (let i = 0; i < 3; ++i)
>     do_check_eq(intData[i], [77, 77, 77, 0, 0, 0, 0, 0, 0][i]);

This test is now a bit misleading... I would suggest having either only a 3-element array here OR go through the whole buffer, demonstrating that only elements up to 'len' are rewritten.

r=me otherwise.
Attachment #517056 - Flags: review?(philipp) → review+
Attached patch Don't zero-pad. v2 (obsolete) — Splinter Review
This addresses the prior review comment and also adds logic to handle short IVs in a consistent way (by null-padding them up to blockSize). To my knowledge, Sync never uses an IV shorter than (for AES) 16 bytes, but at least the code is now deterministic in that case!

This should be the final patch for this bug, unless I'm able to enhance js-ctypes within the time available today -- if that works, I can get a 20-40x speedup on byteCompressInts, giving us better than the best of both worlds (less garbage and faster runtime).

It's best to operate under the assumption that it won't be ready and reviewed in time, given that we're blocking Fennec, so please behave accordingly.
Attachment #517056 - Attachment is obsolete: true
Attachment #517469 - Flags: review?(philipp)
(In reply to comment #17)
> This addresses the prior review comment and
> also adds logic to handle short IVs in a
> consistent way (by null-padding them up to
> blockSize). To my knowledge, Sync never uses
> an IV shorter than (for AES) 16 bytes, but
> at least the code is now deterministic in
> that case!

Do not pad the IV with zeros. Sync uses randomly generated IVs, and we are relying on the fact that we will generate all 16 byes randomly. The IV has to be 16 bytes. If it isn't, that is an extremely serious error. 

Example: let's say the IV is one byte; then there are only 255 possible (zero-padded) IVs which pretty much defeats the purposes of an IV in the first place.
(In reply to comment #18)
> Do not pad the IV with zeros. Sync uses randomly generated IVs, and we are
> relying on the fact that we will generate all 16 byes randomly. The IV has to
> be 16 bytes. If it isn't, that is an extremely serious error. 

Thanks for pre-empting my review comment :). I agree, we should fail hard if the IV isn't the blocksize.
(In reply to comment #18)
> Do not pad the IV with zeros.

OK.

I don't want to do nothing in the case of a short IV, which is what we were doing, so throw instead?

(Thanks for the drive-by review!)
(In reply to comment #20)
> I don't want to do nothing in the case of a short IV, which is what we were
> doing, so throw instead?

Yes please.
Throw instead.
Attachment #517469 - Attachment is obsolete: true
Attachment #517469 - Flags: review?(philipp)
Attachment #517478 - Flags: review?(philipp)
Attachment #517478 - Flags: review?(philipp) → review+
Thar she blows:

http://hg.mozilla.org/services/fx-sync/rev/90aa2db69b98

Philipp, please merge up to s-c when you get a chance.
Whiteboard: [parts landed in services][needs review philipp] → [parts landed in services]
Filed follow-up:

  Bug 639589
Merged part 3 to services-central:
http://hg.mozilla.org/services/services-central/rev/cf2463e452ba

Here are the services-central URLs for the previous parts:

Part 0: http://hg.mozilla.org/services/services-central/rev/2ef1943bd05b
Part 1: http://hg.mozilla.org/services/services-central/rev/66df10cc265c
Part 2: http://hg.mozilla.org/services/services-central/rev/a5360f16ee60
Whiteboard: [parts landed in services] → [fixed in fx-sync][fixed in services]
Landed:

Part 0: http://hg.mozilla.org/mozilla-central/rev/2ef1943bd05b
Part 1: http://hg.mozilla.org/mozilla-central/rev/66df10cc265c
Part 2: http://hg.mozilla.org/mozilla-central/rev/a5360f16ee60
Part 3: http://hg.mozilla.org/mozilla-central/rev/cf2463e452ba
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Whiteboard: [fixed in fx-sync][fixed in services]
Whiteboard: [qa-]
Component: Firefox Sync: Crypto → Sync
Product: Cloud Services → Firefox
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: