Closed Bug 1441157 Opened 6 years ago Closed 5 months ago

Switch timer fuzzing over to AES-CTR

Categories

(Core :: DOM: Events, enhancement, P5)

enhancement

Tracking

()

RESOLVED WONTFIX

People

(Reporter: tjr, Unassigned)

References

Details

Attachments

(1 file)

After talking with Franziskus about Bug 1425462 I think there's enough oddness to track down about NSS/PSM/Initialization and such that it would be best to switch over to AES in a subsequent bug rather than land it initially. So we'll aim to do that shortly after the branch to beta.

Security-wise, using a cryptographic hash function is just as good as AES-CTR (although see Bug 1440195 for a security improvement we do need.) Performance-wise AES-CTR is better, although probably not if you don't have AES-NI in which case we may even fall back to ChaCha. (To be tested.)
The attached patch crashes (both gtest and regular browser) with

> Assertion failure: arena, at /Users/tritter/Documents/moz/mozilla-unified-fuzz-4/memory/build/mozjemalloc.cpp:3494

I think this is a result of trying to do crypto really early on in startup before anything is set up; but it doesn't explain why it ever worked. It might be a dumb mistake I made while de-staticing all of the stuff from the previous version in Bug 1425462, which is at https://reviewboard.mozilla.org/r/222206/diff/3 (Although I don't know if that link will persist.)
According to data from 2 months ago, a little over 50% of our users have AES-NI instructions.
As a starter, rebased the patch. It crashes, without symbols despite being a debug bug. Some partial stacks though:


> #01: ???[/home/tom/Documents/moz/mozilla-unified-fuzz-followup/obj-x86_64-pc-linux-gnu/dist/bin/libxul.so +0x8a94dc2]
> #02: ???[/home/tom/Documents/moz/mozilla-unified-fuzz-followup/obj-x86_64-pc-linux-gnu/dist/bin/libxul.so +0x8a1c5d8]
> #03: ???[/lib/x86_64-linux-gnu/libpthread.so.0 +0x13150]
> #04: ???[/home/tom/Documents/moz/mozilla-unified-fuzz-followup/obj-x86_64-pc-linux-gnu/dist/bin/firefox +0xba65]
> #05: ???[/home/tom/Documents/moz/mozilla-unified-fuzz-followup/obj-x86_64-pc-linux-gnu/dist/bin/firefox +0x10637]
> #06: ???[/home/tom/Documents/moz/mozilla-unified-fuzz-followup/obj-x86_64-pc-linux-gnu/dist/bin/firefox +0x100ed]
> #07: ???[/home/tom/Documents/moz/mozilla-unified-fuzz-followup/obj-x86_64-pc-linux-gnu/dist/bin/firefox +0x117e5]
> #08: free[/home/tom/Documents/moz/mozilla-unified-fuzz-followup/obj-x86_64-pc-linux-gnu/dist/bin/firefox +0xd91a]
> #09: PR_Free[/home/tom/Documents/moz/mozilla-unified-fuzz-followup/obj-x86_64-pc-linux-gnu/dist/bin/libnspr4.so +0x1a18d]
> #10: PORT_Free_Util[/home/tom/Documents/moz/mozilla-unified-fuzz-followup/obj-x86_64-pc-linux-gnu/dist/bin/libnssutil3.so +0x1ef11]
> #11: SECITEM_FreeItem_Util[/home/tom/Documents/moz/mozilla-unified-fuzz-followup/obj-x86_64-pc-linux-gnu/dist/bin/libnssutil3.so +0x1d5ad]
> #12: SECITEM_FreeItem[/home/tom/Documents/moz/mozilla-unified-fuzz-followup/obj-x86_64-pc-linux-gnu/dist/bin/libnss3.so +0x49603]
> #13: ???[/home/tom/Documents/moz/mozilla-unified-fuzz-followup/obj-x86_64-pc-linux-gnu/dist/bin/libxul.so +0x101a671]
Comment on attachment 8954034 [details]
Bug 1441157 Switch from using a SHA-256 DRBG to a AES-CTR based one

https://reviewboard.mozilla.org/r/223192/#review233086

::: toolkit/components/resistfingerprinting/nsRFPService.cpp:401
(Diff revision 2)
> -     * even still an HMAC is a perfect fit here, as we're hashing a value
> -     * using a seed that never changes, and an input that does. So why not
> -     * use one?
> -     *
> -     * Basically - we don't need to, it's two invocations of the hash function,
> -     * and speed really counts here.
> +    MOZ_ASSERT(slot.get());
> +
> +    UniquePLArenaPool arena(PORT_NewArena(DER_DEFAULT_CHUNKSIZE));
> +    MOZ_ASSERT(arena.get());
> +
> +    UniqueSECItem secItem(::SECITEM_AllocItem(arena.get(), nullptr, AES_BLOCK_SIZE_BYTES));

This looks like trouble to me and might cause your issue. The crash looks like a double free. You use an arena *and* a unique pointer. So the arena will try to free the `secItem` *and* the unique pointer will free it. Same for the other secitem.
(In reply to Franziskus Kiefer [:fkiefer or :franziskus] from comment #7)
> This looks like trouble to me and might cause your issue. The crash looks
> like a double free. You use an arena *and* a unique pointer. So the arena
> will try to free the `secItem` *and* the unique pointer will free it. Same
> for the other secitem.

Thanks, that was it!
Comment on attachment 8954034 [details]
Bug 1441157 Switch from using a SHA-256 DRBG to a AES-CTR based one

https://reviewboard.mozilla.org/r/223192/#review233200

::: toolkit/components/resistfingerprinting/nsRFPService.cpp:402
(Diff revisions 2 - 3)
>  
>      UniquePLArenaPool arena(PORT_NewArena(DER_DEFAULT_CHUNKSIZE));
>      MOZ_ASSERT(arena.get());
>  
> -    UniqueSECItem secItem(::SECITEM_AllocItem(arena.get(), nullptr, AES_BLOCK_SIZE_BYTES));
> -    MOZ_ASSERT(secItem.get());
> +    SECItem* secItem = ::SECITEM_AllocItem(arena.get(), nullptr, AES_BLOCK_SIZE_BYTES);
> +    MOZ_ASSERT(secItem);

You can check unique pointers directy (`MOZ_ASSERT(secItem)`)

::: toolkit/components/resistfingerprinting/nsRFPService.cpp:411
(Diff revisions 2 - 3)
>                                                PK11_OriginUnwrap, CKA_ENCRYPT,
> -                                              secItem.get(), nullptr));
> +                                              secItem, nullptr));
>      MOZ_ASSERT(symKey.get());
>  
>      // Set up CTR Parameters
> -    UniqueSECItem param(::SECITEM_AllocItem(arena.get(), nullptr, 0));
> +    SECItem* param =::SECITEM_AllocItem(arena.get(), nullptr, 0);

Should have said this in the first comment. I think the preffered solution is to use a scoped secitem and no arena (simply pass in `nullptr`). This way you can avoid raw pointers.

::: toolkit/components/resistfingerprinting/nsRFPService.cpp:419
(Diff revisions 2 - 3)
>      ctrParams.ulCounterBits = AES_BLOCK_SIZE_BITS;
>      // This is the value of the entire IV, normally encompassing 64 bits of nonce and 64
>      // bits of counter. We set the IV to be the clamped time
>      memset(&ctrParams.cb, 0x00, AES_BLOCK_SIZE_BYTES);
>      memcpy(&ctrParams.cb, &extraClampedTime, sizeof(extraClampedTime));
> -    param.get()->data = (unsigned char*) &ctrParams;
> +    param->data = (unsigned char*) &ctrParams;

No need to `get()` anywhere here, except when passing it to PK11.
So I ran 4 SHA vs AES comparisons in perfherder: Windows and Linux with and without AES-NI

I expected to see some difference between AES-NI hardware and no AES-NI hardware.
I also expected to see a performance improvement of using AES at all with or without AES-NI, but especially with AES-NI.


All expectations were dashed.


https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=c1df942a23235ccc8ed0d40eca97f43967bfa9ff&newProject=try&newRevision=a7632a6f77e84a7b3743a6524743f2165e85ad9f&framework=1&showOnlyComparable=1

Linux with AES-NI shows only one certain result: a .4% improvement on motionmark.




https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=c785574977924cdcf54294bb16dc4b585f748188&newProject=try&newRevision=0eca9cab08ea7f8a20f8c86f92f034b3c585f6d2&framework=1&showOnlyComparable=1

Linux WITHOUT AES-NI shos 7 certain results, the largest of which are a 1.5% gain, a 2% gain, and a 2% loss.




https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=ec1b1a69966e1117e9443e44d4a6ccd1b00dfafc&newProject=try&newRevision=d37f9a0c51d7c7ccef49dc2f2540f04b29476eec&framework=1&showOnlyComparable=1

Windows with AES-NI shows one certain result: a 2.8% gain in domaeo.




https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=03e898dd3c5b7d2847b2d02cb19b6a961c9e7687&newProject=try&newRevision=f75185310908ea9a5402d3449a52ac0a4130612a&framework=1&showOnlyComparable=1

Windows without AES-NI shows 2 certain results: a 3% gain on stylebench and a 2.2% loss on displaylist_mutate.



I don't qute know what to make of this.  I did confirm that the Talos jobs ran on the machines I expected. https://wiki.mozilla.org/Buildbot/Talos/Misc
t-linux64-ms-102 for example corresponds (I presume) to the MoonShot machines which have 1585Lv5 cpus with have AES. talos-linux64-ix-058 corresponds (again, presumably) to the IX hardware without AES.

It's possible Bug 1443943 eliminated so many callsites that the crypto just isn't very expensive anymore....



It seems likely; however, that even if the data isn't being exposed by Talos; that of SHA, AES with AES-NI, and without - that there is an objective rank about which one is fastest, so I need to figure out how to measure that and make a decision.
I measured this locally, and AES, even with AES-NI, was much slower than SHA-256. I think this bug is a WONTFIX unless/until we get a PKCS#11-free encryption interface that doesn't get slowed down. I'll leave it open as blocking the other bug.
Assignee: tom → nobody
Depends on: 1441887
Priority: P1 → P5
Severity: normal → S3

I don't expect we will change this.

Status: NEW → RESOLVED
Closed: 5 months ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: