MFBT needs a fast way to conduct Bernoulli trials

RESOLVED FIXED in Firefox 44

Status

()

defect
RESOLVED FIXED
4 years ago
4 years ago

People

(Reporter: jimb, Assigned: jimb)

Tracking

unspecified
mozilla44
Points:
---
Dependency tree / graph
Bug Flags:
in-testsuite +

Firefox Tracking Flags

(firefox44 fixed)

Details

Attachments

(1 attachment, 2 obsolete attachments)

The comments in the patch explain everything. At length. Perhaps too much length.
Attachment #8663252 - Flags: review?(jwalden+bmo)
Depends on: 1206356
Blocks: 1206596
Comment on attachment 8663252 [details] [diff] [review]
Add mfbt/FastBernoulliTrial.h, implementing efficient random sampling.

Review of attachment 8663252 [details] [diff] [review]:
-----------------------------------------------------------------

::: mfbt/FastBernoulliTrial.h
@@ +165,5 @@
> +
> + public:
> +  // Construct a fast Bernoulli trial generator. Calls to |trial()| return true
> +  // with probability |aProbability|. Use |aState0| and |aState1| to seed the
> +  // random number generator; both may not be zero.

MOZ_ASSERT this?
(In reply to Nick Fitzgerald [:fitzgen][:nf] from comment #1)
> Comment on attachment 8663252 [details] [diff] [review]
> Add mfbt/FastBernoulliTrial.h, implementing efficient random sampling.
> 
> Review of attachment 8663252 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: mfbt/FastBernoulliTrial.h
> @@ +165,5 @@
> > +
> > + public:
> > +  // Construct a fast Bernoulli trial generator. Calls to |trial()| return true
> > +  // with probability |aProbability|. Use |aState0| and |aState1| to seed the
> > +  // random number generator; both may not be zero.
> 
> MOZ_ASSERT this?

Nevermind, I see it is asserted in the patch in bug 1206356.
Comment on attachment 8663252 [details] [diff] [review]
Add mfbt/FastBernoulliTrial.h, implementing efficient random sampling.

Review of attachment 8663252 [details] [diff] [review]:
-----------------------------------------------------------------

::: mfbt/FastBernoulliTrial.h
@@ +165,5 @@
> +
> + public:
> +  // Construct a fast Bernoulli trial generator. Calls to |trial()| return true
> +  // with probability |aProbability|. Use |aState0| and |aState1| to seed the
> +  // random number generator; both may not be zero.

The constructor for XorShift128PlusRNG asserts this.
Attachment #8663252 - Flags: review?(jwalden+bmo) → review?(nfroyd)
Comment on attachment 8663252 [details] [diff] [review]
Add mfbt/FastBernoulliTrial.h, implementing efficient random sampling.

Review of attachment 8663252 [details] [diff] [review]:
-----------------------------------------------------------------

Cool beans, neat idea, fun comments to read, mirabile lectu.  (I'm gonna pretend that pseudo-Latin is real  :-)  Might well be!)

The undercounting issue with trial(size_t), however, means this definitely isn't ready to go yet, unless I'm entirely misunderstanding what should happen for the N-trial case.  Please do correct me if I'm wrong.

::: mfbt/FastBernoulliTrial.h
@@ +2,5 @@
> +/* vim: set ts=8 sts=2 et sw=2 tw=80: */
> +/* This Source Code Form is subject to the terms of the Mozilla Public
> + * License, v. 2.0. If a copy of the MPL was not distributed with this
> + * file, You can obtain one at http://mozilla.org/MPL/2.0/. */
> +

Add an overview comment like

/* An algorithm for sampling N trials with a desired probability. */

or something.

@@ +15,5 @@
> +
> +namespace mozilla {
> +
> +// class FastBernoulliTrial
> +//

Much comments!  I agree it's a bit overkill, but (like you) I kind of don't want to take a knife to it anywhere, and it's nicely readable and all.  So good enough as-is.

@@ +98,5 @@
> +  // get "rolling the dice" afresh for every trial.
> +  //
> +  // In short, skip counts in Bernoulli trials of probability P obey a geometric
> +  // distribution. If a random variable X is uniformly distributed from [0..1),
> +  // then floor(log X base (1-P)) has the appropriate geometric distribution for

I would descend into LaTeX mode for this, I think, to be completely clear what's said.

@@ +170,5 @@
> +  FastBernoulliTrial(double aProbability, uint64_t aState0, uint64_t aState1)
> +   : mGenerator(aState0, aState1),
> +     mSkipCount(0)
> +  {
> +    setProbability(aProbability);

I like how this also initializes the skipCount -- clever.

@@ +174,5 @@
> +    setProbability(aProbability);
> +  }
> +
> +  // Return true with probability |mProbability|. The lower |mProbability| is,
> +  // the faster this function runs.

Use /** */ comments for all these user-facing documentation comments.

@@ +186,5 @@
> +  }
> +
> +  // Return true with the same likelihood that at least one of the next |aCount|
> +  // calls to |trial()| would. This takes roughly as much time as a single call to
> +  // |trial()|.

So, this is interesting.  Won't this potentially *undersample* events, because it can't return true for a case where multiple trials succeed within the given count of trials?  Like, say we have

  FastBernoulliTrial fbt(0.5, ..., ...);
  if (fbt.trial(42))
    ...; // thing happens once
  for (size_t i = 0; i < 42; i++) {
    if (fbt.trial())
      ...; // thing happens 21ish times
  }

Seems to me like this particular method -- while a nice idea -- needs a different signature and a different algorithm, somehow.  Maybe returning the number of times a trial succeeded?

At that point, you probably could unify these into a single method:

  size_t trial(size_t aCount = 1) {
    ...
  }

because the once-case would be a degenerate form of the many-case.

@@ +218,5 @@
> +  // 1.0. Otherwise, this is not used.
> +  double mLogNotProbability;
> +
> +  // Our random number generator.
> +  mozilla::XorShift128PlusRNG mGenerator;

For what it's worth, this doesn't need mozilla:: as you're in the mozilla namespace.

@@ +237,5 @@
> +
> +    // If the probabilility is zero, |trial| never returns true. Don't bother us
> +    // for a while.
> +    if (mProbability == 0.0) {
> +      mSkipCount = SIZE_MAX;

lol

::: mfbt/tests/TestFastBernoulliTrial.cpp
@@ +119,5 @@
> +
> +  {
> +    size_t count = 0;
> +    for (size_t i = 0; i < 10000; i++)
> +      count += bernoulli.trial(1000);

Given the undercounting flub in trial(size_t), I think we probably should have a test that verifies that N calls to trial(1) has equivalent behavior to 1 call to trial(N) in terms of observations, somehow.

@@ +133,5 @@
> +                                        0x74f298193fe1c5b1ULL);
> +
> +  // Establish a very high skip count.
> +  bernoulli.setProbability(0.0);
> +

Worth a !trial() assert here?
Attachment #8663252 - Flags: review?(nfroyd) → feedback+
(In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #4)
> > +  // In short, skip counts in Bernoulli trials of probability P obey a geometric
> > +  // distribution. If a random variable X is uniformly distributed from [0..1),
> > +  // then floor(log X base (1-P)) has the appropriate geometric distribution for
> 
> I would descend into LaTeX mode for this, I think, to be completely clear
> what's said.

Okay. You mean, like what Wikipedia has? \lfloor \ln(U) / \ln(1-p)\rfloor 

As someone who hasn't used TeX in a very long time, that's not something I'd find helpful. I could write it out in real C++: std::floor(std::log(X) / std::log(1-P)), or JS. Would that serve?

> > +  // Return true with the same likelihood that at least one of the next |aCount|
> > +  // calls to |trial()| would. This takes roughly as much time as a single call to
> > +  // |trial()|.
> 
> So, this is interesting.  Won't this potentially *undersample* events,
> because it can't return true for a case where multiple trials succeed within
> the given count of trials?  Like, say we have
> 
>   FastBernoulliTrial fbt(0.5, ..., ...);
>   if (fbt.trial(42))
>     ...; // thing happens once
>   for (size_t i = 0; i < 42; i++) {
>     if (fbt.trial())
>       ...; // thing happens 21ish times
>   }

The way you'd use it is actually more like this:

   FastBernoulliTrial fbt(0.5, ..., ...);
   if (fbt.trial(42))
     note_event_with_size(42);
   for (size_t i = 0; i < 42; i++) {
     if (fbt.trial())
       note_event_with_size(1);
   }

and the events are going to be displayed visually in some way that presents the larger ones more prominently. This is the technique used by Google's tcmalloc library:

    https://github.com/gperftools/gperftools/blob/master/src/sampler.h#L73-L100

The comments above the class for trial(n) are meant to be a restatement of the comments in the tcmalloc header.

So I don't consider it a "flub"; it was designed rather carefully, and with respect for standard practices.


> ::: mfbt/tests/TestFastBernoulliTrial.cpp
> @@ +119,5 @@
> > +
> > +  {
> > +    size_t count = 0;
> > +    for (size_t i = 0; i < 10000; i++)
> > +      count += bernoulli.trial(1000);
> 
> Given the undercounting flub in trial(size_t), I think we probably should
> have a test that verifies that N calls to trial(1) has equivalent behavior
> to 1 call to trial(N) in terms of observations, somehow.

That's exactly what this test does. Those counts weren't just chosen to match the observed behavior; I computed the "expected" values mathematically, and checked that each count was plausibly close. I've added comments like these; does that help?

  {
    size_t count = 0;
    for (size_t i = 0; i < 10000; i++)
      count += bernoulli.trial(3);

    // Expected value: (1 - (1 - 0.01) ** 3) == 0.0297,
    // 0.0297 * 10000 == 297
    MOZ_RELEASE_ASSERT(count == 304);
  }

  {
    size_t count = 0;
    for (size_t i = 0; i < 10000; i++)
      count += bernoulli.trial(10);

    // Expected value: (1 - (1 - 0.01) ** 10) == 0.0956,
    // 0.0956 * 10000 == 956
    MOZ_RELEASE_ASSERT(count == 936);
  }

  {
    size_t count = 0;
    for (size_t i = 0; i < 10000; i++)
      count += bernoulli.trial(100);

    // Expected value: (1 - (1 - 0.01) ** 100) == 0.6339
    // 0.6339 * 10000 == 6339
    MOZ_RELEASE_ASSERT(count == 6372);
  }


> > +  // Establish a very high skip count.
> > +  bernoulli.setProbability(0.0);
> > +
> 
> Worth a !trial() assert here?

This one was a regression test; I need the state to be exactly as it is after a call to setProbability in order to reproduce the conditions of the original bug. Calling 'trial' changes the state.
(In reply to Jim Blandy :jimb from comment #5)
> So I don't consider it a "flub"; it was designed rather carefully, and with
> respect for standard practices.

Easy, easy!  As I said, 

> unless I'm entirely misunderstanding what should happen for the N-trial case.
> Please do correct me if I'm wrong.

If your comment is accurate, then precisely this would be what had happened.

Anyway.  Looking more at your comment, and thinking more, I think I'm leaning toward concluding that I wasn't thinking hard enough, or was misunderstanding something, or something along those lines.  But I need to think some more on this before I can convince myself that's the case.  Please give me a little more time to convince myself one way or the other.
I'm sorry for being prickly. The other comments all sound good; I'll take care of them now.
(In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #4)
> @@ +170,5 @@
> > +  FastBernoulliTrial(double aProbability, uint64_t aState0, uint64_t aState1)
> > +   : mGenerator(aState0, aState1),
> > +     mSkipCount(0)
> > +  {
> > +    setProbability(aProbability);
> 
> I like how this also initializes the skipCount -- clever.

Actually, since you pointed it out, I think this became unnecessary after the fix for the bug that TestChangeProbability looks for. The constructor always calls setProbability, which always calls chooseSkipCount, which always sets mSkipCount. So I've taken this out.
Comment on attachment 8663252 [details] [diff] [review]
Add mfbt/FastBernoulliTrial.h, implementing efficient random sampling.

Review of attachment 8663252 [details] [diff] [review]:
-----------------------------------------------------------------

::: mfbt/FastBernoulliTrial.h
@@ +204,5 @@
> +  void setProbability(double aProbability) {
> +    MOZ_ASSERT(0.0 <= aProbability && aProbability <= 1.0);
> +    mProbability = aProbability;
> +    if (0.0 < mProbability && mProbability < 1.0)
> +      mLogNotProbability = std::log(1.0 - mProbability);

In IEEE floating point semantics, x < 1.0 does indeed imply that 1.0 - x != 0.0. Getting this right in all cases depends on denormal numbers.

However, not all hardware supports denormals, and sometimes they are turned off for performance. (Look up the FTZ and DAZ flags for Intel CPUs).

There could be values of aProbability that would cause mLogNotProbability to become -infinity. It looks like everything still works as expected in that case.

@@ +215,5 @@
> +  double mProbability;
> +
> +  // The value of std::log(1 - mProbability), if mProbability is neither 0.0 nor
> +  // 1.0. Otherwise, this is not used.
> +  double mLogNotProbability;

Since this value is only used as a divisor, it may be faster to store the reciprocal value: 1/(1-P). Floating point multiplication is faster than division.

@@ +242,5 @@
> +      return false;
> +    }
> +
> +    mSkipCount = std::floor(std::log(mGenerator.nextDouble())
> +                            / mLogNotProbability);

What happens if nextDouble() returns exactly 0.0? log(0.0) returns -infinity. Similarly, there could be rare cases where floor() returns something finite, but larger than SIZE_MAX.

You could clamp the value before converting it size_t:

double x = std::floor(std::log(nextDouble()) * mRecipLogNotProbability);
mSkipCount = x < SIZE_MAX ? x : SIZE_MAX;
(In reply to Jakob Stoklund Olesen [:jolesen] from comment #9)
> Comment on attachment 8663252 [details] [diff] [review]
> Add mfbt/FastBernoulliTrial.h, implementing efficient random sampling.
> 
> Review of attachment 8663252 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: mfbt/FastBernoulliTrial.h
> @@ +204,5 @@
> > +  void setProbability(double aProbability) {
> > +    MOZ_ASSERT(0.0 <= aProbability && aProbability <= 1.0);
> > +    mProbability = aProbability;
> > +    if (0.0 < mProbability && mProbability < 1.0)
> > +      mLogNotProbability = std::log(1.0 - mProbability);
> 
> In IEEE floating point semantics, x < 1.0 does indeed imply that 1.0 - x !=
> 0.0. Getting this right in all cases depends on denormal numbers.
> 
> However, not all hardware supports denormals, and sometimes they are turned
> off for performance. (Look up the FTZ and DAZ flags for Intel CPUs).
> 
> There could be values of aProbability that would cause mLogNotProbability to
> become -infinity. It looks like everything still works as expected in that
> case.

Based on our in-person conversations I've revised some of this code and added
comments (will post after this reply), but I don't think mLogNotProbability can
be -infinity, and I don't think avoiding -infinity requires denormals.

Assume all numbers are exact unless marked otherwise with "~".

- In the 'then' branch, mProbability is (0,1). The gaps below 1 are 2**-53, so
  that range is really (0, 1-2**-53]. Since the gaps near zero are much smaller,
  I don't think it has any problem with this.

- Gaps near 1 are larger than gaps near 0, so for small ε > 0, 1-ε == 1. If
  denormals are not permitted, this is still true. So 1-mProbability is in
  [2**-53, 1].

- log(2**-53) is ~37. So log(1-mProbability) is in [-37, 0].

In chooseSkipCount, we would like to assume that mLogNotProbability is negative,
so we need to catch zero in setProbability.

C++11 has a 'std::nextafter' function to advance doubles by the smallest
possible increment in one direction or another, which makes it possible to
actually test these cases, which is nice.


> @@ +215,5 @@
> > +  double mProbability;
> > +
> > +  // The value of std::log(1 - mProbability), if mProbability is neither 0.0 nor
> > +  // 1.0. Otherwise, this is not used.
> > +  double mLogNotProbability;
> 
> Since this value is only used as a divisor, it may be faster to store the
> reciprocal value: 1/(1-P). Floating point multiplication is faster than
> division.

Interesting. I'll give it a shot.


> @@ +242,5 @@
> > +      return false;
> > +    }
> > +
> > +    mSkipCount = std::floor(std::log(mGenerator.nextDouble())
> > +                            / mLogNotProbability);
> 
> What happens if nextDouble() returns exactly 0.0? log(0.0) returns
> -infinity. Similarly, there could be rare cases where floor() returns
> something finite, but larger than SIZE_MAX.

Yes, this can occur. We need to clamp the value. I've added code for this, and
addressed the SIZE_MAX -> double conversion issues we discussed.

> You could clamp the value before converting it size_t:
> 
> double x = std::floor(std::log(nextDouble()) * mRecipLogNotProbability);
> mSkipCount = x < SIZE_MAX ? x : SIZE_MAX;

I think this won't work: the 'then' and 'else' expressions of a ?: operator are
promoted to the 'wider' type, which would be 'double' in this case; if SIZE_MAX
is 2**64-1, it will be rounded to 2**64, and then converted to size_t, which is
undefined behavior.

The comments in the code explain how I dealt with this.
jimb: I don't suppose you want to modify DMD to use this new class to control its sampling...? :)
Here's the revised patch. I've addressed Waldo's review comments and the numeric issues Jakub raised, and added a bunch more comments to cover the latter; the fixes are reasonable but not obvious.
Attachment #8663252 - Attachment is obsolete: true
Attachment #8666499 - Flags: review?(jwalden+bmo)
(In reply to Nicholas Nethercote [:njn] from comment #11)
> jimb: I don't suppose you want to modify DMD to use this new class to
> control its sampling...? :)

Not in this quarter! :D But yeah, that's hopefully the kind of thing this would serve well.
Try push, including cppunit tests, which includes TestFastBernoulliTrial:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=75789f10fd1d
Comment on attachment 8666499 [details] [diff] [review]
Add mfbt/FastBernoulliTrial.h, implementing efficient random sampling.

Review of attachment 8666499 [details] [diff] [review]:
-----------------------------------------------------------------

(In reply to Jim Blandy :jimb from comment #5)
> As someone who hasn't used TeX in a very long time, that's not something I'd
> find helpful. I could write it out in real C++: std::floor(std::log(X) /
> std::log(1-P)), or JS. Would that serve?

That'd work, yes.  Mostly I think it's that the wording to write out a log of nonstandard base is not the easiest wording to parse, IMO.  If LaTeX doesn't work for you, I'm happy avoiding it.  C++ that parses unambiguously works, too.

> The way you'd use it is actually more like this:
> 
>    FastBernoulliTrial fbt(0.5, ..., ...);
>    if (fbt.trial(42))
>      note_event_with_size(42);
>    for (size_t i = 0; i < 42; i++) {
>      if (fbt.trial())
>        note_event_with_size(1);
>    }
> 
> and the events are going to be displayed visually in some way that presents
> the larger ones more prominently. This is the technique used by Google's
> tcmalloc library:

I see, so it's the responsibility of the "noter" to weight accordingly.  And because P is constant for each event, and we're assuming many events, any discounting for the number of trials that succeed in a multitrial doesn't require knowledge of the decremented number in the FBT -- if desired, you could just use the average as a reasonable enough guess that'd even out over time.

My mind still isn't grokking this scheme, only mostly understanding it, but I guess it's close enough.

> > ::: mfbt/tests/TestFastBernoulliTrial.cpp
> > @@ +119,5 @@
> > > +
> > > +  {
> > > +    size_t count = 0;
> > > +    for (size_t i = 0; i < 10000; i++)
> > > +      count += bernoulli.trial(1000);
> > 
> > Given the undercounting flub in trial(size_t), I think we probably should
> > have a test that verifies that N calls to trial(1) has equivalent behavior
> > to 1 call to trial(N) in terms of observations, somehow.
> 
> That's exactly what this test does. Those counts weren't just chosen to
> match the observed behavior; I computed the "expected" values
> mathematically, and checked that each count was plausibly close. I've added
> comments like these; does that help?

The changes are fine, but mostly I think it's just following up on the responsibility-of-the-noter bit above, not an independently different issue needing much addressing.

> > Worth a !trial() assert here?
> 
> This one was a regression test; I need the state to be exactly as it is
> after a call to setProbability in order to reproduce the conditions of the
> original bug. Calling 'trial' changes the state.

Fine by me.

::: mfbt/FastBernoulliTrial.h
@@ +14,5 @@
> +#include <stdint.h>
> +
> +namespace mozilla {
> +
> +/*

Super-nitpick, but the idea is you use /** */ comments -- two stars after the opening -- for comments that are user docs, so this should have two stars.  The implementation comments -- everything but the comment on FastBernoulliTrial(double aProbability, uint64_t aState0, uint64_t aState1), bool trial(), bool trial(size_t aCount), void setRandomState(uint64_t aState0, uint64_t aState1), and void setProbability(double aProbability) -- can be whatever.

So just one tweak to make now, but the idea's worth belaboring slightly for next time.

::: mfbt/tests/TestFastBernoulliTrial.cpp
@@ +6,5 @@
> +
> +#include "mozilla/Assertions.h"
> +#include "mozilla/FastBernoulliTrial.h"
> +
> +#include <math.h>

If you're using the std:: version of nextafter, use <cmath> here.  Technically FBT.h implies this, but IWYU and all.
Attachment #8666499 - Flags: review?(jwalden+bmo) → review+
Thanks for the review, Waldo. I think this patch implements the changes you requested.

I ran into difficulties earlier misusing cmath, so I've done a build-only try push:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=4d421fde0f42
Attachment #8666499 - Attachment is obsolete: true
(In reply to Jim Blandy :jimb from comment #17)
> I ran into difficulties earlier misusing cmath, so I've done a build-only
> try push:
> 
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=4d421fde0f42

Using <cmath> and std::nextafter is less portable than <math.h> and unqualified nextafter.
Okay, that push looks clean.
https://hg.mozilla.org/integration/mozilla-inbound/rev/0140d126b76b4fe692b22d880f897496370ef480
Bug 1206357: Add mfbt/FastBernoulliTrial.h, implementing efficient random sampling. r=waldo
Flags: in-testsuite+
Target Milestone: --- → mozilla44
https://hg.mozilla.org/mozilla-central/rev/0140d126b76b
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.