Closed Bug 636309 Opened 13 years ago Closed 12 years ago

Reduce or eliminate Sync's usage of js-ctypes

Categories

(Firefox :: Sync, defect)

defect
Not set
normal

Tracking

()

RESOLVED DUPLICATE of bug 743070

People

(Reporter: philikon, Unassigned)

References

(Depends on 1 open bug)

Details

nsIStreamCipher nearly does everything we need in terms of crypto, except that it doesn't support the padded AES mode. This could be easily added. Once that is done, we could ditch WeaveCrypto (and ctypes which in itself seems to add at least ~2 megs to the memory footprint)
Depends on: 636310
I think we should create a scriptable interface that looks something like this:

[scriptable]
interface nsISyncRecordKey {
    void verifyAndDecrypt(in nsCString base64EncodedMAC, 
                          in nsCString base64EncodedCiphertext,
                          out nsCString plaintext);
    void encryptAndSign(in nsCString plaintext,
                        out nsCString base64EncodedCiphertext,
                        out nsCString base64EncodedMAC);
}

[scriptable]
interface nsISyncMasterKey {
    // TODO: methods for loading/saving sync key from keystore
    void unwrapRecordKey(in nsCString base64EncodedMACForEncryptedKey, 
                         in nsCString base64EncodedEncrytedKey,
                         out nsISyncRecordKey syncRecordKey);
};

Besides (hopefully greatly) reducing the number Javascript objects created, this would make the crypto in sync much easier to verify as correct (e.g. making it obvious that HMACs are always verified before the data is decrypted in order to avoid padding oracle attacks). Also, verifyAndDecrypt should be able to avoid quite a few allocations by doing base 64 decoding and/or decryption in-place.
Summary: Ditch WeaveCrypto and use nsIStreamCipher → Reduce or eliminate Sync's usage of js-ctypes
Mike, what are your thoughts about doing this? I remember that you were advocating for a js-ctypes-based implementation before.
I think my primary motivation was one of avoiding yet another round of code churn.  I'm not opposed in principle, but I'd want to have some clear goals/outcomes tied to any major piece of work.
I don't know if the 2MB of (AFAICT wasted) memory usage is due to *any* usage of js-ctypes (which would merit its own bug), or whether that 2MB increase in memory is specific to Sync's use of js-ctypes, or what.
Whiteboard: [MemShrink]
(In reply to Brian Smith (:bsmith) from comment #4)
> I don't know if the 2MB of (AFAICT wasted) memory usage is due to *any*
> usage of js-ctypes (which would merit its own bug), or whether that 2MB
> increase in memory is specific to Sync's use of js-ctypes, or what.

Back when I was taking measurements (pretty much exactly 1 year ago), just simply doing Cu.import("resource://gre/modules/ctypes.jsm") would raise the memory footprint by 2 MB, presumably because libffi was being loaded. So we'd be saving those 2 MB only so long as no other part of Firefox or an extension used ctypes. Right now that's the case if Sync isn't enabled. Also, the main motivator back then was mobile, which no longer uses the Gecko Sync implementation starting with Android Sync.
> Back when I was taking measurements (pretty much exactly 1 year ago), just
> simply doing Cu.import("resource://gre/modules/ctypes.jsm") would raise the
> memory footprint by 2 MB, presumably because libffi was being loaded.

What's the definition of "footprint"?  Resident Set Size, i.e. physical memory consumption?
Whiteboard: [MemShrink] → [MemShrink:P3]
DOMCrypt work might also be a possibility here.
Depends on: 721193
This isn't that hard to fix and basing the solution on DOMCrypt probably won't be as good w.r.t. performance and memory footprint because the base64 (and similar) encoding would still end up being done on JS instead of native code. If somebody is interested in working on this, I will help them with it. Considering that the js-ctypes-based code is already using NSS, it is actually pretty straightforward to rewrite those parts of the JS code in C++.
No longer depends on: 636310, 721193
Brian: we're looking at rewriting significant parts of Sync in the next few months and I'd definitely like to tackle this properly.

If the first step is coming up with an interface, I'll propose (in language agnostic pseudo-code):

// 1. Serialize native JavaScript object to a JSON string. Produces cleartext.
// 2. Generate IV
// 3. AES256(cleartext, encryptionKey, IV) -> ciphertext
// 4. HMAC256(ciphertext + IV, hmacKey) -> hmac
// 5. Base64 encode {ciphertext, hmac, IV}
// 6. Put results from above into a JSON object and serialize to a string.
int encryptRecordPayload(
      in JSObject obj,
      in byte[] encryptionKey,
      in byte[] hmacKey,
      out string payload);

// This is the above except in reverse.
// 1. Decode JSON string payload into an object/map. Extract ciphertext, iv, hmac
// 2. Base64 decode fields from above
// 3. Perform HMAC verification. Return an error code if that fails.
// 4. Decrypt ciphertext -> cleartext
// 5. JSON deserialize cleartext into JavaScript object
int decryptRecordPayload(
      in string payload,
      in byte[] encryptionKey,
      in byte[] hmacKey,
      out JSObject decoded);

Please note we are considering creating a new storage version to "fix" some warts in the existing crypto. Mainly:

1) Perform HMAC over raw bytes, not the base64 encoded version
2) Include IV in HMAC

I also intend to change the JS implementation so keys, ciphertexts, etc are stored as raw bytes (not baseN encoded variants) as much as possible. We perform needless conversions today because we often store these in encoded variants, only to have to convert them frequently for many of the crypto operations.

Also, while the proposal from comment #1 included an interface for dealing with intermediate key wrapping, I'm not sure we do this enough to warrant a performance win. But, if we can tackle it easily, I think having the crypto all live in the same place would be a win.
(In reply to Gregory Szorc [:gps] from comment #9)
> Please note we are considering creating a new storage version to "fix" some
> warts in the existing crypto. Mainly:
> 
> 1) Perform HMAC over raw bytes, not the base64 encoded version
> 2) Include IV in HMAC

As far as crypto goes, perhaps the usage of HKDF should be changed at the same time.

I believe, but I haven't verified, that there is also some big wins in storage/transmission efficiency by compressing and encrypting+signing multiple records at once, if changing the data format is on the table as you indicated above.

> I also intend to change the JS implementation so keys, ciphertexts, etc are
> stored as raw bytes (not baseN encoded variants) as much as possible. We
> perform needless conversions today because we often store these in encoded
> variants, only to have to convert them frequently for many of the crypto
> operations.
> 
> Also, while the proposal from comment #1 included an interface for dealing
> with intermediate key wrapping, I'm not sure we do this enough to warrant a
> performance win. But, if we can tackle it easily, I think having the crypto
> all live in the same place would be a win.

The proposal in comment #1 is basically an extension of the "keys in raw bytes" optimization you mentioned in the previous paragraph. There is a significant amount of overhead (multiple memory allocations and lock acquisitions) in creating/destroying key objects in NSS over and over again.
(In reply to Brian Smith (:bsmith) from comment #8)
> This isn't that hard to fix and basing the solution on DOMCrypt probably
> won't be as good w.r.t. performance and memory footprint because the base64
> (and similar) encoding would still end up being done on JS instead of native
> code. If somebody is interested in working on this, I will help them with
> it. Considering that the js-ctypes-based code is already using NSS, it is
> actually pretty straightforward to rewrite those parts of the JS code in C++.

See also the old C++ XPCOM implementation: https://mxr.mozilla.org/services-central/source/fx-sync/crypto-obsolete/src/

(In reply to Gregory Szorc [:gps] from comment #9)
> Brian: we're looking at rewriting significant parts of Sync in the next few
> months and I'd definitely like to tackle this properly.
> 
> If the first step is coming up with an interface, I'll propose (in language
> agnostic pseudo-code):
> 
> // 1. Serialize native JavaScript object to a JSON string. Produces
> cleartext.
> // 2. Generate IV
> // 3. AES256(cleartext, encryptionKey, IV) -> ciphertext
> // 4. HMAC256(ciphertext + IV, hmacKey) -> hmac
> // 5. Base64 encode {ciphertext, hmac, IV}
> // 6. Put results from above into a JSON object and serialize to a string.

I approve of this plan.

> int encryptRecordPayload(

What's the 'int'? Do you mean 'nsresult'?

>       in JSObject obj,

s/JSObject/jsval/

Also, now you're going to need to decorate this method with [implicit_jscontext] so you get the JSContext object passed in (which you need for all non-trivial js api functions).

> Please note we are considering creating a new storage version to "fix" some
> warts in the existing crypto.

Unless there are other reasons that happen to align with a new storage format, I'm pretty sure the benefits will not outweigh the pain you will be going through. By a long way.
Depends on: 743070
I interpret this bug to mean "have Sync don't use js-ctypes [because it is/was the only consumer of js-ctypes and the memory footprint of loading js-ctypes was deemed wasteful]." We already have bug 743070 filed to replace WeaveCrypto. This is effectively a superset of this bug. And, it will probably not involve js-ctypes. Although, who knows.

I've heard that other people will be using js-ctypes in Gecko. So, I'm renominating this bug for MemShrink triage. I /think/ this bug can probably be closed WONTFIX or a dupe. But, I would like an answer from the MemShrink people on whether usage of js-ctypes is forbidden in core platform or application code.
Whiteboard: [MemShrink:P3] → [MemShrink]
(In reply to Gregory Szorc [:gps] from comment #12)
> I interpret this bug to mean "have Sync don't use js-ctypes [because it
> is/was the only consumer of js-ctypes and the memory footprint of loading
> js-ctypes was deemed wasteful]." We already have bug 743070 filed to replace
> WeaveCrypto. This is effectively a superset of this bug. And, it will
> probably not involve js-ctypes. Although, who knows.
> 
> I've heard that other people will be using js-ctypes in Gecko. So, I'm
> renominating this bug for MemShrink triage. I /think/ this bug can probably
> be closed WONTFIX or a dupe. But, I would like an answer from the MemShrink
> people on whether usage of js-ctypes is forbidden in core platform or
> application code.

I forbid it!  Bow down before me!

Actually, I don't have that much power :)  And my impression is that js-ctypes is only going to get more use in the future, rather than less.  (bz, does that sound right to you?)

Maybe there's an argument for avoiding the 2MB js-ctypes hit, but I think it's reasonable for Sync to use it.
Whiteboard: [MemShrink]
I have no idea what the plans are for ctypes use, sorry....
(In reply to Nicholas Nethercote [:njn] from comment #13)
> Actually, I don't have that much power :)  And my impression is that
> js-ctypes is only going to get more use in the future, rather than less. 

At least the improved JS file I/O API is making rather heavy usage of js-ctypes.  I imagine that implies that js-ctypes will be loaded virtually all the time...
Right now we're strongly pushing js-ctypes for add-ons which need to interface with binary code. There aren't a whole lot of add-ons that use it right now, but I'd still not be happy about a policy that discourages its use, given how painful the alternatives (binary components, essentially) are.
As previous commenters have pointed out, ctypes is indeed a valid tool for interfacing with C code, and the 2 MB overhead is -- in most cases -- negligible or a price we're already prepared to pay elsewhere.

There are other reasons why it makes sense for Sync to move away from ctypes:

* ctypes is object creation intensive. Every cast, buffer resize, etc., creates at least one new JS object. This is fine for casual use (extensions, file API, etc.). Sync's crypto code, however, gets called in a tight loop several thousand times. That's a lot of overhead and GC pressure, even with the memoization we're currently doing (at the expense of some potential memory leaks, I'm sure).

* ctypes code can be very verbose and yet not resemble the underlying C interface very much. This can be a problem when security-sensitive code like crypto code needs to be peer reviewed by people possibly not familiar with JS, and definitely not familiar with ctypes. In fact, one of the goals for bug 743070 was to put all record crypting and verification in one central place (in C++), so it can be vetted more easily.

Greg, can we dupe this bug or make it depend on bug 743070?
OK. So, ctypes is OK for overall Firefox.

As far as Sync goes, we have bug 743070 on file to replace the current WeaveCrypto (the part that uses ctypes) with something else (which doesn't use ctypes and is more efficient because of it).

So, marking this as a dupe.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → DUPLICATE
Component: Firefox Sync: Crypto → Sync
Product: Cloud Services → Firefox
You need to log in before you can comment on or make changes to this bug.