Closed Bug 522126 Opened 10 years ago Closed 10 years ago

Implement high-resolution POSIX-clock TimeStamp/TimeDuration backend

Categories

(Core :: XPCOM, enhancement)

x86_64
Linux
enhancement
Not set

Tracking

()

RESOLVED FIXED
mozilla1.9.3a1

People

(Reporter: cjones, Assigned: cjones)

References

Details

Attachments

(3 files, 4 obsolete files)

I need to time events that last less than a millisecond, but NSPR claims to have millisecond-resolution timers on my modern linux system.  Independent tests show that clock_gettime(CLOCK_MONOTONIC) actually has about 5e-7 second resolution on my system.

Rather than changing NSPR, it's probably easier to provide different TimeStamp implementations for platforms with "good" timers.

jduell also mentions the GASNet timer library that we might be able to import and have TimeStamp wrap, which would be best of all.
I don't have time to work on this atm, and this patch satisfies my immediate need.  Anyone taking this bug can use that patch as a "starter kit".

I should say though that using the GASNet timer code seems like a better way to go.

Also, TimeStamp or TimeDuration should be extended with a "static double Resolution()" method returning the resolution of the underlying timer.  GASNet probably already provides this.
(In reply to comment #1)
> Also, TimeStamp or TimeDuration should be extended with a "static double
> Resolution()" method returning the resolution of the underlying timer.  GASNet
> probably already provides this.

What's the need for this? I would be reluctant to provide it. For one thing the resolution can vary over time depending on power management, for example.
(In reply to comment #2)
> (In reply to comment #1)
> > Also, TimeStamp or TimeDuration should be extended with a "static double
> > Resolution()" method returning the resolution of the underlying timer.  GASNet
> > probably already provides this.
> 
> What's the need for this? I would be reluctant to provide it. For one thing the
> resolution can vary over time depending on power management, for example.

Until we have high-resolution timers implemented for all platforms, I want to know about the possible, ahem, 4-order-of-magnitude discrepancy in TimeStamp::Now resolution, so I can just skip the kind of performance testing that motivated this bug on those platforms.  Frequency scaling only affects uncorrected cycle counters, and it's a bad idea for us to use those anyway because of inter-CPU-core discrepancies.

I'm fine with rounding up the empirical resolution to the nearest order of magnitude.  Is that a satisfactory compromise?
Comment on attachment 411372 [details] [diff] [review]
part 3: implement TimeStamp with POSIX timers

Missing part 2, which is the configure.in junk to test for clock_gettime(CLOCK_MONOTONIC).  I tested this patch by hand.
Attachment #411372 - Flags: review?(roc)
(In reply to comment #3)
> Until we have high-resolution timers implemented for all platforms, I want to
> know about the possible, ahem, 4-order-of-magnitude discrepancy in
> TimeStamp::Now resolution, so I can just skip the kind of performance testing
> that motivated this bug on those platforms.  Frequency scaling only affects
> uncorrected cycle counters, and it's a bad idea for us to use those anyway
> because of inter-CPU-core discrepancies.

It's not just about frequency scaling. On some versions of Windows you can vary the frequency of the interrupt that drives its clock, increasing the frequency at the expense of increased power consumption.

> I'm fine with rounding up the empirical resolution to the nearest order of
> magnitude.  Is that a satisfactory compromise?

I agree that this is a reasonable use of the Resolution() method, so just add it.
(In reply to comment #7)
> > I'm fine with rounding up the empirical resolution to the nearest order of
> > magnitude.  Is that a satisfactory compromise?
> 
> I agree that this is a reasonable use of the Resolution() method, so just add
> it.

Done, it went up for your review last night ;).
Comment on attachment 411371 [details] [diff] [review]
part 1: refactor NSPR TimeStamp in preparation for platform-specific ones

+ * It is
+ * assumed that the system-dependent unit is constant, so that values
+ * can be simply added and subtracted.

Rather, the system-dependent unit *must be* constant, or the semantics of this class are completely broken.
Attachment #411371 - Flags: review?(roc) → review+
Assignee: nobody → jones.chris.g
+// Since we won't get 1ns resolution out of clock_gettime(), limit the
+// "precision" of TimeStamp::Now() values by a "reality factor",
+// computed from TimeDuration::Resolution.
+static PRUint64 sRealityFactor;

Why do we bother with this?

+  //  Since
+  // clock_gettime() likely involves a system call on your platform,

Hmm, I'd kinda hope this is not true, but oh well.

+  if (0 == minres)
+    NS_RUNTIMEABORT("the clock resolution is *not* 1ns, something's wrong");

Suppose clock_gettime was just reading a CPU register, and you had a really fast CPU --- couldn't this fire?

+    NS_RUNTIMEABORT("the clock resolution is *not* >=1ms, something's wrong");

If you're really unlucky this could fire. I think these should both just be NS_WARNING.

+  // round the resolution to the nearest power of ten

I don't think we really need to do this.
(In reply to comment #10)
> +// Since we won't get 1ns resolution out of clock_gettime(), limit the
> +// "precision" of TimeStamp::Now() values by a "reality factor",
> +// computed from TimeDuration::Resolution.
> +static PRUint64 sRealityFactor;
> 
> Why do we bother with this?
> 

Otherwise TimeDuration::Resolution() will return doubles with a bunch of insignificant digits.

> +  //  Since
> +  // clock_gettime() likely involves a system call on your platform,
> 
> Hmm, I'd kinda hope this is not true, but oh well.
> 

It is for x86/linux/glibc.  I should probably check Darwin though.

> +  if (0 == minres)
> +    NS_RUNTIMEABORT("the clock resolution is *not* 1ns, something's wrong");
> 
> Suppose clock_gettime was just reading a CPU register, and you had a really
> fast CPU --- couldn't this fire?
> 

Eh ... that would mean the function call, register read, counter math, memory stores or register writes, and call-return completed (let alone the postprocessing math in ClockTimeNs()) in ~3 CPU cycles (1 ns), which I don't believe is possible.

> +    NS_RUNTIMEABORT("the clock resolution is *not* >=1ms, something's wrong");
> 
> If you're really unlucky this could fire. I think these should both just be
> NS_WARNING.
> 

This might be possible, if clock_gettime() isn't a syscall.

> +  // round the resolution to the nearest power of ten
> 
> I don't think we really need to do this.

If we want to chomp insignificant bits/digits, we have to, otherwise millseconds and seconds won't be represented exactly and your unit tests fail ;).
(In reply to comment #11)
> (In reply to comment #10)
> > +// Since we won't get 1ns resolution out of clock_gettime(), limit the
> > +// "precision" of TimeStamp::Now() values by a "reality factor",
> > +// computed from TimeDuration::Resolution.
> > +static PRUint64 sRealityFactor;
> > 
> > Why do we bother with this?
> > 
> 
> Otherwise TimeDuration::Resolution() will return doubles with a bunch of
> insignificant digits.
> 

I should add that this is also closer to the NSPR clock "tick" semantics.
(In reply to comment #11)
> (In reply to comment #10)
> > +    NS_RUNTIMEABORT("the clock resolution is *not* >=1ms, something's wrong");
> > 
> > If you're really unlucky this could fire. I think these should both just be
> > NS_WARNING.
> > 

Jeez, I totally misread this.  I can't think of a case where this could happen, TBH, unless the platform clock is really bad (and in that case we should fall back on NSPR).  Do you have something in mind?
(In reply to comment #11)
> Otherwise TimeDuration::Resolution() will return doubles with a bunch of
> insignificant digits.

Like I said on IRC, I don't think we need to trim insignificant digits for non-human-readable values.

> Eh ... that would mean the function call, register read, counter math, memory
> stores or register writes, and call-return completed (let alone the
> postprocessing math in ClockTimeNs()) in ~3 CPU cycles (1 ns), which I don't
> believe is possible.

Everything could be inlined and take place entirely in registers. Not likely, sure, but I hate to abort in that case.

(In reply to comment #13)
> Jeez, I totally misread this.  I can't think of a case where this could happen,
> TBH, unless the platform clock is really bad (and in that case we should fall
> back on NSPR).  Do you have something in mind?

10 unlucky page faults/context switches?
(In reply to comment #14)
> (In reply to comment #11)
> (In reply to comment #13)
> > Jeez, I totally misread this.  I can't think of a case where this could happen,
> > TBH, unless the platform clock is really bad (and in that case we should fall
> > back on NSPR).  Do you have something in mind?
> 
> 10 unlucky page faults/context switches?

It's academic now based on IRC conversation, but there couldn't be 10 consecutive page faults since the amount of memory access is small and bounded.  Or, if that did occur, the system would be in such severe memory pressure Firefox wouldn't run anyway.  So it'd have to be 10 consecutive context switches at clock_gettime().  Maybe that could possibly happen on a real-time-scheduling kernel, but otherwise it would imply that our timeslice could be on the order of nanoseconds, and Firefox would be unbearably slow.
Attached patch part 3 v2 (obsolete) — Splinter Review
Not reposting the initial patch, the only substantial change is the addition of TimeDuration::ToSecondsSigDigits(), which can be inferred from this patch.
Attachment #411542 - Flags: review?(roc)
Attachment #411372 - Attachment is obsolete: true
Attachment #411372 - Flags: review?(roc)
+  // round the resolution to the nearest power of ten

I still don't think this is necessary.
Attached patch part 3 v3 (obsolete) — Splinter Review
Keep originally computed resolution, shuffle around rounding code.
Attachment #411542 - Attachment is obsolete: true
Attachment #411568 - Flags: review?(roc)
Attachment #411542 - Flags: review?(roc)
+  // find the number of significant digits in sResolution, for the
+  // sake of ToSecondsSigDigits()
+  // NB: we assume here that we don't have a resolution < 1ms, but if
+  // we do, it doesn't break anything, we just lie
+  if (sResolution >= (100*kNsPerUs))      sResolutionSigDigs = 100*kNsPerUs;
+  else if (sResolution >= (10*kNsPerUs))  sResolutionSigDigs = 10*kNsPerUs;
+  else if (sResolution >= (kNsPerUs))     sResolutionSigDigs = kNsPerUs;
+  else if (sResolution >= (100))          sResolutionSigDigs = 100;
+  else if (sResolution >= (10))           sResolutionSigDigs = 10;
+  else                                    sResolutionSigDigs = 1;

Why not just a loop?

  for (sResolutionSigDigs = 1; sResolutionSigDigs < sResolution;
       sResolutionSigDigs *= 10);
(In reply to comment #19)
> +  // find the number of significant digits in sResolution, for the
> +  // sake of ToSecondsSigDigits()
> +  // NB: we assume here that we don't have a resolution < 1ms, but if
> +  // we do, it doesn't break anything, we just lie
> +  if (sResolution >= (100*kNsPerUs))      sResolutionSigDigs = 100*kNsPerUs;
> +  else if (sResolution >= (10*kNsPerUs))  sResolutionSigDigs = 10*kNsPerUs;
> +  else if (sResolution >= (kNsPerUs))     sResolutionSigDigs = kNsPerUs;
> +  else if (sResolution >= (100))          sResolutionSigDigs = 100;
> +  else if (sResolution >= (10))           sResolutionSigDigs = 10;
> +  else                                    sResolutionSigDigs = 1;
> 
> Why not just a loop?
> 

Because I'm dumb.

>   for (sResolutionSigDigs = 1; sResolutionSigDigs < sResolution;
>        sResolutionSigDigs *= 10);

This isn't quite correct ;).
Attached patch part 3 v4Splinter Review
Attachment #411568 - Attachment is obsolete: true
Attachment #411589 - Flags: review?(roc)
Attachment #411568 - Flags: review?(roc)
(In reply to comment #10)
> +  //  Since
> +  // clock_gettime() likely involves a system call on your platform,
> 
> Hmm, I'd kinda hope this is not true, but oh well.
> 

Not sure if it's much consolation, but clock_gettime *isn't* a syscall on linux/x86_64.  Apparently it's a "vsyscall" to the kernel-gate.
Comment on attachment 411608 [details] [diff] [review]
configure test for clock_gettime and build hackery

--- a/toolkit/library/libxul-config.mk
+ifdef HAVE_CLOCK_MONOTONIC
+EXTRA_DSO_LDOPTS += -lrt
+endif

This feels like it ought to be a configure test or at least specified in configure, not hardcoded here. At the very least, can you make this (something)_LIBS and then use that instead of -lrt directly?

r=me with that change.
Attachment #411608 - Flags: review?(ted.mielczarek) → review+
No longer blocks: 519568
Target Milestone: --- → mozilla1.9.3a1
Flags: in-testsuite+
Summary: Use high-resolution platform timers for TimeStamp → Implement high-resolution POSIX-clock TimeStamp/TimeDuration backend
What about those of us with VMs whose clock_getres() is 10000000ns?
Or indeed any system where it is greater than the time between syscalls?
Perhaps if two syscalls return the same time, fall back on clock_getres?
Depends on: 549178
You need to log in before you can comment on or make changes to this bug.