Closed Bug 1057343 Opened 5 years ago Closed 2 years ago

Use Linux getrandom()/getentropy() kernel system call for obtaining entropy

Categories

(NSS :: Libraries, enhancement)

3.16.3
All
Linux
enhancement
Not set

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: kaie, Assigned: kaie)

References

Details

Attachments

(1 file, 4 obsolete files)

The Linux kernel has recently been extended with a new system call "getrandom" to obtain random data. It can be used as a replacement for reading from /dev/urandom.

If it makes sense to use it, then NSS should be extended accordingly.

More information can be found in the commit message for the Linux kernel code, which includes a detailed description:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=c6e9d6f38894798696f23c8084ca7edbf16ee895
Summary: Use "getrandom" kernel system call for obtaining entropy → Use Linux "getrandom" kernel system call for obtaining entropy
See Also: → 995069
We should get this implemented in the near future.

Calling getrandom() has the following advantage: If the system's kernel RNG hasn't been sufficiently seeded yet, the call to getrandom() will block until sufficient seeding has been obtained. This is useful at system boot time, to avoid that an application uses unsafe random data. This is the only time that getrandom() will block. (Thanks to Tomas Mraz for explaining this to me.)

However, instead of using getrandom() directly, it has been suggested by Florian Weimer to use the getentropy() API (which is a wrapper around getrandom() and isn't a cancellation point).

How should this change be implemented? Let's look at today's implementation(s) in NSS.

NSS calls RNG_SystemInfoForRNG() for the default seeding of the RNG. Historically, on Unix/Linux, RNG_SystemInfoForRNG used multiple sources for gathering entropy (including /dev/urandom).

Recently, in bug 1346735, an additional implementation for RNG_SystemInfoForRNG was added, that uses /dev/urandom as the only entropy source. (I say "default", because an application may provide additional input by calling RNG_RandomUpdate().)

This recent additional implementation can be selected by defining the SEED_ONLY_DEV_URANDOM build time symbol. It seems that Firefox has started to make use of this new implementation, because SEED_ONLY_DEV_URANDOM is defined by default when the gyp build system is used.

For Fedora Linux, we should also switch to this approach in the near future (define SEED_ONLY_DEV_URANDOM at build time). So let's start by implementing this change for the newer code path.

However, instead of calling /dev/urandom directly, we'd like to call getentropy, for the reason I've explained above.

The getentropy() call has been available since glibc version 2.25. Let's use a build time check, and only call it if NSS is built against this version or a newer version.

We should also implement a fallback. If a binary attempts to call getentropy(), but the kernel at runtime doesn't support the underlying getrandom() functionality, we should fall back to the classic approach and read from /dev/urandom.

(I don't know yet if the classic NSS implementation (SEED_ONLY_DEV_URANDOM is undefined) also needs to be enhanced to call getentropy instead of reading from /dev/urandom.)
Assignee: nobody → kaie
Summary: Use Linux "getrandom" kernel system call for obtaining entropy → Use Linux getrandom()/getentropy() kernel system call for obtaining entropy
Attached patch 1057343-v1.patch (obsolete) — Splinter Review
Franziskus, what do you think?
Attachment #8968175 - Flags: review?(franziskuskiefer)
Attached patch 1057343-v3.patch (obsolete) — Splinter Review
Attachment #8968177 - Attachment is obsolete: true
Attachment #8968178 - Flags: review?(franziskuskiefer)
The earlier try build had failed, because of an unused variable. That means that Mozilla's Linux builders don't have the new glibc yet, and therefore aren't using the new code. That means the results of the try build aren't that interesting, besides build sanity checking.
Comment on attachment 8968178 [details] [diff] [review]
1057343-v3.patch

Review of attachment 8968178 [details] [diff] [review]:
-----------------------------------------------------------------

Two things:
* I would suggest you add a new compile-time or runtime flag for this. For most applications (like Firefox) we don't want a blocking RNG. It would only cause hangs we don't want. If there isn't enough entropy for applications when the OS is booted up, we're out of luck anyway. I can see that this is useful for applications on startup, but they should then use a different configuration/build.
* This is only the unix_random change. But if you actually want to move to a /dev/urandom or /dev/random you should remove unix_rand.c and everything related.
Attachment #8968178 - Flags: review?(franziskuskiefer)
(In reply to Franziskus Kiefer [:fkiefer or :franziskus] from comment #7)
> Comment on attachment 8968178 [details] [diff] [review]
> 1057343-v3.patch
> 
> Review of attachment 8968178 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Two things:
> * I would suggest you add a new compile-time or runtime flag for this. For
> most applications (like Firefox) we don't want a blocking RNG.

once you get to a desktop, getrandom() will be non blocking, this change has zero impact on Firefox

getrandom() will block only if the urandom pool has not been initialised, once it was initialised (that happens once during whole runtime) getrandom() will never block

ensuring that the urandom pool was initialised before consuming any entropy from the kernel is precisely why getrandom() is used
(In reply to Franziskus Kiefer [:fkiefer or :franziskus] from comment #7)
> * I would suggest you add a new compile-time or runtime flag for this. For
> most applications (like Firefox) we don't want a blocking RNG. It would only
> cause hangs we don't want. If there isn't enough entropy for applications
> when the OS is booted up, we're out of luck anyway. I can see that this is
> useful for applications on startup, but they should then use a different
> configuration/build.


Franziskus, can Hubert's argument convince you, that a compile time flag for Firefox is unnecessary?


> * This is only the unix_random change. But if you actually want to move to a
> /dev/urandom or /dev/random you should remove unix_rand.c and everything
> related.

Why? We could treat it as the legacy approach.

Is it correct that Firefox defines SEED_ONLY_DEV_URANDOM and exclusively uses your recent unix_random.c ?

If modern Linux distributions (e.g. future Fedora) switch to the modern implementation, the classic unix_rand.c could remain unchanged for legacy environments, and could eventually get removed from NSS at a future time. (Legacy environments are those Linux distributions, which rebase to modern Firefox and NSS, but keep old glibc and old kernel.)
> Is it correct that Firefox defines SEED_ONLY_DEV_URANDOM and exclusively uses your recent unix_random.c ?

Yes.

> Why? We could treat it as the legacy approach.

Sure, but I like removing code that's not really necessary. Does NSS really run systems that doesn't have a /dev/urandom?

> can Hubert's argument convince you, that a compile time flag for Firefox is unnecessary?

I think so. I thought it would block if not enough entropy is present. But if I read the man page correctly, and as Hubert said, this only blocks if the entropy pool is not initialised yet. That sounds like it should be fine.
Attached patch 1057343-v4.patch (obsolete) — Splinter Review
Updated patch that adds a missing include for errno.h, and adds a way to define the symbol for classic "make" builds.
Attachment #8968178 - Attachment is obsolete: true
Attachment #8968520 - Flags: review?(franziskuskiefer)
(In reply to Franziskus Kiefer [:fkiefer or :franziskus] from comment #10)
> Sure, but I like removing code that's not really necessary. Does NSS really
> run systems that doesn't have a /dev/urandom?

Probably not. The old code uses /dev/urandom in addition to other seeding sources. I don't know if it's acceptable to remove that old code on legacy OS versions. I'd prefer to do the removal as a separate work step. I'll file a bug for it, and will add it to the "see also" section of this bug, and will ask for Bob's input in the new bug.
See Also: → 1454657
For what it's worth, this won't affect Mozilla's builds of Firefox at all, currently — they're hosted on Debian 7 (bug 1429670), which has glibc 2.13 and kernel 3.2, so anything newer than that that's build-time conditional on definitions in headers won't be used.
> * I would suggest you add a new compile-time or runtime flag for this. For
> most applications (like Firefox) we don't want a blocking RNG.

Firefox should never block, unless you are start firefox up at system boot time. You'll want syscall init in Firefox anyway because you'll want to run from containers, which sometimes cut of access to /dev altogether (including /dev/urandom).

> Sure, but I like removing code that's not really necessary. Does NSS really
> run systems that doesn't have a /dev/urandom?

We had that discusson a while ago. We've already removed the backup to /dev/urandom. If /dev/urandom isn't available, we currently fail to initialize NSS.

What is there isn't so much what is 'necessary' as much as what it 'prudent'. I tend to be a little paranoid about the PRNG. We all should be. The number of 'fails' in the rng code out there is legion. The little bit code code in these functions insulates us from environments that have a broken /dev/urandom (rather than a non-existent one). On those systems we will at least not generate a completely predictable stream of keys and leave our users vulnerable.

bob
Comment on attachment 8968520 [details] [diff] [review]
1057343-v4.patch

Review of attachment 8968520 [details] [diff] [review]:
-----------------------------------------------------------------

> What is there isn't so much what is 'necessary' as much as what it 'prudent'.

With this patch unix_random.c will never be used because `DSEED_ONLY_DEV_URANDOM` is always set.
If you want to have the /dev/urandom fallback, these changes should go into unix_random.c and the `DSEED_ONLY_DEV_URANDOM` must not be set.

::: lib/freebl/unix_urandom.c
@@ +35,5 @@
> +    result = getentropy(dest, maxLen);
> +    if (result == 0) { /* success */
> +        return maxLen;
> +    }
> +    if (result != -1 || errno != ENOSYS) { /* unexpected error */

We don't need `result != -1` here. If `getentropy` really returns something else, something is serously wrong but we should still try to get randomness.
I think the only case we should error out here is `EIO`.
Attachment #8968520 - Flags: review?(franziskuskiefer)
> With this patch unix_random.c will never be used because `DSEED_ONLY_DEV_URANDOM` is always set.
> If you want to have the /dev/urandom fallback, these changes should go into unix_random.c and the `DSEED_ONLY_DEV_URANDOM` must not be 
> set.

It's not a /dev/urandom fallback. It's hedging your bets against a broken /dev/urandom. If you drop that code, you are fully dependent on a properly functioning /dev/urandom.

bob
(In reply to Franziskus Kiefer [:fkiefer or :franziskus] from comment #15)
> With this patch unix_random.c will never be used because
> `DSEED_ONLY_DEV_URANDOM` is always set.
> If you want to have the /dev/urandom fallback, these changes should go into
> unix_random.c and the `DSEED_ONLY_DEV_URANDOM` must not be set.

No file "unix_random.c" exists.

After checking with Franziskus on IRC, he clarified that he was talking about file "unix_rand.c" (the old implementation).
There have been additional discussions outside this bug.

After considering all positions and arguments, I'd like to propose the following:

(a)

The old unix_rand.c (urandom plus other entropy) will be kept around for another while.

It will continue as the compile time default for make builds.

Old Linux distributions, such as RHEL 6 and RHEL 7 will continue to use the old unix_rand.c.

We file a separate reminder bug, that requests to remove the old unix_rand.c code, after RHEL 7 no longer requires it.


(b)

Firefox probably will continue to use the new unix_urandom.c which uses /dev/urandom exclusively.

It should be acceptable for Firefox, to add the suggested enhancement, which uses the getentropy() system call, if available. If not available, we fall back to the current /dev/urandom, exclusively.

For the Fedora Linux distribution, and potential future enterprise Linux distributions that are based on it, we want to use the same approach: Call getentropy(), and fallback to /dev/urandom.

Because it is very unlikely that the future Linux distributions will ever get a failure from the getentropy() syscall, the fallback to /dev/urandom could be seen as unnecessary. However, in order to avoid multiple different implementations, it is fine to keep the fallback in the code.

This allows Firefox and modern Linux distributions to keep sharing the same code, which shall be implemented in unix_urandom.c

This code is already selected by default by Firefox, because it uses the gyp build system, which defines SEED_ONLY_DEV_URANDOM by default.

We need a mechanism to use unix_urandom.c in "make" builds. I suggest that we introduce a new environment variable, prefixed with "NSS_".

If environment variable NSS_SEED_ONLY_DEV_URANDOM is defined, then symbol SEED_ONLY_DEV_URANDOM will by defined for the compiler. This is what patch v4 already suggested.


(c) 

Bob's concern is about the variety of existing Linux environments. There might be some with broken urandom. Because the Firefox binaries provided by Mozilla are used in a wide variety of different Linux environments, it's Mozilla who'd need additional protections most.

So in theory, we could have an additional discussion, if Firefox should switch back to an implementation that uses additional RNG seeding, either like the old unix_rand.c is currently doing it, or in some other ways.

Bob argues it's helpful. Franziskus argued it isn't really helpful and still predictable.

I'm not able to confirm Franziskus position. I'm leaning towards Bob's position. I think having additional input cannot hurt.

However, I would like to keep this discussion separate from this bug, which is strictly about the getentropy() enhancement. I will file another bug, to track the question if Firefox should use additional RNG seending besides urandom.
(In reply to Kai Engert (:kaie:) from comment #18)
> 
> We file a separate reminder bug, that requests to remove the old unix_rand.c
> code, after RHEL 7 no longer requires it.

Bug 1455311
See Also: → 1455311
(In reply to Kai Engert (:kaie:) from comment #18)
> 
> However, I would like to keep this discussion separate from this bug, which
> is strictly about the getentropy() enhancement. I will file another bug, to
> track the question if Firefox should use additional RNG seending besides
> urandom.

Bug 1455317
See Also: → 1455317
Attached patch 1057343-v5.patchSplinter Review
Attachment #8968520 - Attachment is obsolete: true
Attachment #8969330 - Flags: review?(franziskuskiefer)
(In reply to Franziskus Kiefer [:fkiefer or :franziskus] from comment #15)
> I think the only case we should error out here is `EIO`.

EIO means either "requested too much", which is a scenario the new patch v5 handles, so we should never get EIO for that reason.

Or it means the destination isn't writable. This is similar as EFAULT.

With that, I think ENOSYS should be the only scenario in which we try to fall back, because we shouldn't try again to write to a non-writable buffer.
Comment on attachment 8969330 [details] [diff] [review]
1057343-v5.patch

Review of attachment 8969330 [details] [diff] [review]:
-----------------------------------------------------------------

::: lib/freebl/unix_urandom.c
@@ +34,5 @@
>  
> +#if defined(LINUX) && defined(__GLIBC__) && ((__GLIBC__ > 2) || ((__GLIBC__ == 2) && (__GLIBC_MINOR__ >= 25)))
> +    int result;
> +
> +    while (fileBytes < maxLen) {

This is only called from `prng_reseed` and `rng_init` with 55 and 110 bytes. So there's no need for looping here. If we'd ever call this with more than 256 we would fall back to /dev/urandom directly. That should be fine. I'd like to keep the code here simple.
Attachment #8969330 - Flags: review?(franziskuskiefer)
(In reply to Franziskus Kiefer [:fkiefer or :franziskus] from comment #23)
> This is only called from `prng_reseed` and `rng_init` with 55 and 110 bytes.
> So there's no need for looping here.

I think you're mistaken.

Function RNG_SystemInfoForRNG calls RNG_SystemRNG with SYSTEM_RNG_SEED_COUNT, which is 1024.
The API definition of RNG_SystemRNG doesn't define a limit. It only doesn't allow for partial results, it's either all or fail.

I don't like the suggestion to "fall back to read from /dev/urandom" if it's available. We want to switch to getentropy() because it has better properties. Not using it based on the size of requested data seems wrong.
Flags: needinfo?(franziskuskiefer)
> Function RNG_SystemInfoForRNG calls RNG_SystemRNG with SYSTEM_RNG_SEED_COUNT, which is 1024.

RNG_SystemInfoForRNG is a function in unix_rand.c so has nothing to do with this.
Flags: needinfo?(franziskuskiefer)
(In reply to Franziskus Kiefer [:fkiefer or :franziskus] from comment #26)
> RNG_SystemInfoForRNG is a function in unix_rand.c so has nothing to do with
> this.

No, it has an implementation in unix_urandom.c, too.
https://hg.mozilla.org/projects/nss/file/tip/lib/freebl/unix_urandom.c
Comment on attachment 8969330 [details] [diff] [review]
1057343-v5.patch

Review of attachment 8969330 [details] [diff] [review]:
-----------------------------------------------------------------

Oh well, don't mind me... Sorry for all the confusion. Looks good. Land it.
Attachment #8969330 - Flags: review+
Thank you Franziskus! Trees are currently closed, I'll land on Monday.
https://hg.mozilla.org/projects/nss/rev/2371e32a4cb7
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 3.37
You need to log in before you can comment on or make changes to this bug.