Closed Bug 484921 Opened 16 years ago Closed 16 years ago

Need an interval time measurement API that doesn't roll over

Categories: Core :: XPCOM (defect)
Severity: normal
Status: RESOLVED FIXED
People: Reporter: roc; Assigned: roc
References: Blocks 1 open bug
Keywords: fixed1.9.1
Attachments: 1 file

PRIntervalTime rolls over about every 50 days, or less on Solaris by the look of it. PRTime can't be used as an interval time since it depends on the system time, which can be reset. In bug 475441 (and elsewhere) it seems unwise to pretend that PRIntervalTime does not roll over.

So I think we need a 64-bit interval-time API. I don't think we already have one. It should live in XPCOM. We can implement it fairly easily on top of PRIntervalTime. Something like this:

PRInt64 now() {
  static PRInt64 highWord = 0;
  static PRIntervalTime last;
  locked {
    PRIntervalTime now = PR_IntervalNow();
    if (now < last) {
      highWord += PRInt64(1) << 32;
    }
    last = now;
    return highWord + now;
  }
}

OK, so that will miss a rollover if we avoid calling now() during 50 days of uptime. I can live with that. To make things simpler than PRIntervalTime I'd make the units always be microseconds. It could be nanoseconds, but that would give us a rollover after a mere 292 years of uptime. My main question is: why haven't we already needed this?
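A minimal compilable sketch of the scheme described above, assuming a hypothetical FakeIntervalNow() stand-in for PR_IntervalNow() (so a wraparound can be simulated deterministically) and std::mutex in place of the NSPR lock:

```cpp
#include <cstdint>
#include <mutex>

// Hypothetical stand-in for PR_IntervalNow(): a 32-bit tick counter we can
// advance by hand, so a rollover can be exercised deterministically.
static uint32_t gFakeTicks = 0;
static uint32_t FakeIntervalNow() { return gFakeTicks; }

// 64-bit interval time built on the rolling 32-bit counter: when the new
// sample is smaller than the previous one, the source must have wrapped,
// so bump the high word by 2^32.
int64_t Now64() {
    static std::mutex gLock;
    static int64_t gHighWord = 0;
    static uint32_t gLast = 0;

    std::lock_guard<std::mutex> guard(gLock);
    uint32_t now = FakeIntervalNow();
    if (now < gLast) {
        gHighWord += int64_t(1) << 32;  // 32-bit source rolled over
    }
    gLast = now;
    return gHighWord + now;
}
```

As noted above, this misses a rollover only if Now64() goes uncalled for an entire wrap period.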
Yes, your algorithm is what's implemented in Chromium: http://src.chromium.org/viewvc/chrome/trunk/src/base/time_win.cc?revision=10791&view=markup Search for "rollover_" in that file, which is equivalent to "highWord" in your pseudocode, and "last_seen_" is equivalent to your "last".
Attached patch patch — — Splinter Review
Attachment #369599 - Flags: review?
Attachment #369599 - Flags: review? → review?(benjamin)
It occurred to me while writing this that it could be a bit simpler and more efficient if we had 64-bit compare-and-swap. You'd basically store the last returned timestamp in a 64-bit slot. You'd call PR_IntervalNow(), load the last returned timestamp, compute the new timestamp, then do a compare-and-swap to store the new timestamp in the slot only if the last timestamp has not been changed by another thread. The cool part is that if the last timestamp has been changed by another thread, you don't need a retry loop; you can just use the last timestamp value that the other thread computed, since it must be "fresh enough". But at least some ARM platforms we care about don't support 64-bit CAS, so never mind.
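A sketch of that 64-bit-CAS variant, assuming std::atomic<uint64_t> is lock-free on the target; the hypothetical sample32 parameter stands in for a PR_IntervalNow() reading taken by the caller, so the logic can be driven deterministically. On CAS failure there is no retry: the stamp the other thread just stored is used as-is.

```cpp
#include <atomic>
#include <cstdint>

// Last returned 64-bit timestamp: high 32 bits = rollover count,
// low 32 bits = last 32-bit counter sample.
static std::atomic<uint64_t> gLastStamp{0};

// sample32 plays the role of a fresh PR_IntervalNow() sample.
uint64_t Now64CAS(uint32_t sample32) {
    uint64_t last = gLastStamp.load();
    uint64_t high = last & 0xFFFFFFFF00000000ull;
    if (sample32 < static_cast<uint32_t>(last)) {
        high += uint64_t(1) << 32;  // counter wrapped since the last sample
    }
    uint64_t candidate = high | sample32;
    if (gLastStamp.compare_exchange_strong(last, candidate)) {
        return candidate;
    }
    // Another thread won the race; compare_exchange_strong reloaded its
    // freshly stored stamp into `last`, which is "fresh enough" to return.
    return last;
}
```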
Comment on attachment 369599 [details] [diff] [review]
patch

roc: You can add a new NSPR function PR_IntervalNow64() using the code in nsTimeStamp.cpp. Whatever you decide, you should check in the current patch first.

>+static PRUint32 gLastNow;

It'd be better to use the PRIntervalTime type (a synonym of PRUint32) for gLastNow (and the 'now' local variable in TimeStamp::Now()).

In TimeStamp::Startup(), it seems better to initialize gLastNow to PR_IntervalNow(). But it also seems that the initial value of gLastNow doesn't matter.

In TimeStamp::Now(), to make the critical section as short as possible, call PR_IntervalNow (which may require a system call) before you call PR_Lock.
(In reply to comment #4)
> You can add a new NSPR function PR_IntervalNow64() using the
> code in nsTimeStamp.cpp. Whatever you decide, you should check
> in the current patch first.

We could, although I think it would make more sense to focus on converting our code that currently uses PRIntervalTime to TimeStamp and TimeDuration. With PRIntervalTime it's too easy to mix up "moments" and "durations" since they have the same type.

> >+static PRUint32 gLastNow;
>
> It'd be better to use the PRIntervalTime type (a synonym of
> PRUint32) for gLastNow (and the 'now' local variable in
> TimeStamp::Now()).

OK.

> In TimeStamp::Startup(), it seems better to initialize gLastNow
> to PR_IntervalNow(). But it also seems that the initial value
> of gLastNow doesn't matter.

Yeah, it seems slightly simpler and infinitesimally more efficient to just use 0.

> In TimeStamp::Now(), to make the critical section as short as
> possible, call PR_IntervalNow (which may require a system call)
> before you call PR_Lock.

That's a good idea.
> +#include "pratom.h"

Doesn't seem to be used; left over from an earlier version?
Should TimeStamps be monotonically increasing? Seems reasonable, but this isn't explicit in the class's description.

The attached implementation preserves monotonicity. However, moving PR_IntervalNow() outside the mutex, as suggested in comment 3, does not. This leads to two (potential) problems: (i) returning non-monotonic TimeStamps; and (ii) allowing a race condition in which a thread spuriously detects 32-bit overflow (I can present examples if necessary). The suggested DCAS implementation of comment 3, if I understand it correctly, does not suffer from these problems. Spurious overflows aren't an issue a priori, per TimeStamp's loose specification, but it seems odd to allow them.

I don't know how these TimeStamps will be used, so the mutex implementation may be sufficient. However, if monotonicity is desired, and potentially sleeping on a mutex is not desired, then we need DCAS. It can be implemented using 32-bit CAS for this case, but that is complicated. Please advise.
(In reply to comment #6)
> > +#include "pratom.h"
>
> Doesn't seem to be used, left over from an earlier version?

Yes. Will fix.

(In reply to comment #7)
> Should TimeStamps be monotonically increasing? Seems reasonable, but this
> isn't explicit in the class's description.

Yes, I should add a comment to that effect.

> The attached implementation preserves monotonicity. However, moving
> PR_IntervalNow() outside the mutex, as suggested in comment 3, does not. This
> leads to two (potential) problems: (i) returning non-monotonic TimeStamps; and
> (ii) allowing a race condition in which a thread spuriously detects 32 bit
> overflow (I can present examples if necessary).

You're right, I was too eager to agree :-).

> I don't know how these TimeStamps will be used, so the mutex implementation
> may be sufficient. However, if monotonicity is desired, and potentially
> sleeping on a mutex is not desired, then we need to use DCAS. This can be
> implemented using 32-bit CAS for this case, but is complicated. Please
> advise.

It's not obvious to me how 32-bit CAS would work here, so I'm curious to hear your proposal, but we don't need it AFAIK, so don't bother spending time on it if you haven't got an answer already worked out :-).
Chris, you are right! The PR_IntervalNow() calls used to update gLastNow have to be serialized by the lock to ensure monotonicity of gLastNow.
> > I don't know how these TimeStamps will be used, so the mutex implementation
> > may be sufficient. However, if monotonicity is desired, and potentially
> > sleeping on a mutex is not desired, then we need to use DCAS. This can be
> > implemented using 32-bit CAS for this case, but is complicated. Please
> > advise.
>
> It's not obvious to me how 32-bit CAS would work here, so I'm curious to hear
> your proposal, but we don't need it AFAIK so don't bother spending time on it
> if you haven't got an answer already worked out :-).

I rechecked my idea from last night, and it doesn't work ;). This is a fun problem, and I'm going to play with it some more. I have another idea that might work out.

But in the meantime, a practical point: it turns out that the x86 ISA (Pentium and later) has an 8-byte CAS instruction (cmpxchg8b)! I didn't know that. So for x86 at least, simulating DCAS is a moot point: we can just use cmpxchg8b if desired.
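A quick way to check whether the toolchain actually maps 8-byte atomics onto a hardware CAS (cmpxchg8b on 32-bit x86) rather than falling back on a lock, assuming C++11's std::atomic is available:

```cpp
#include <atomic>
#include <cstdint>

// True if 64-bit atomics (and hence 64-bit compare-and-swap) are handled
// directly by the hardware on this build target, false if the runtime
// emulates them with a lock.
bool Has64BitCAS() {
    std::atomic<uint64_t> probe{0};
    return probe.is_lock_free();
}
```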
For future reference/closure, here's a Now() implementation using only CAS that I'm pretty sure works. If anyone's interested in using it, I can run it through a model checker. Just to be clear, this algorithm doesn't simulate DCAS, but only implements a 64-bit atomic increment with 32-bit CAS. It also assumes that overflow on the 32 bits will happen *very* infrequently (every 50 days or so). If overflow is much more frequent, the algorithm is incorrect. In the common case (no contention), this algorithm should require the same number of atomic memory instructions as a mutex (two).

class TimeStamp:
  int32 gLow
  int32 gHigh

  int64 Now():
    while 1:                            # Note (1)
      tlow = gLow
      thigh = gHigh
      nowLow = PR_IntervalNow()         # Note (2)
      if CAS(gLow, tlow, nowLow):
        nowHigh = thigh
        if tlow > nowLow:
          nowHigh = nowHigh + 1
        if CAS(gHigh, thigh, nowHigh):  # Note (3)
          return <nowHigh, nowLow>

Note (1): this implementation spins: each thread must update the global "now" before returning. This is how the thread ensures that its value is "consistent." Although the algorithm is "nonblocking" (which loosely means that some thread always makes progress), I would prefer the shortcut suggested in comment 3. However, I suspect (but haven't proven) that it's impossible to correctly implement that idea with only CAS.

Note (2): although the system call is made in the inner loop, it is made without holding any locks, so other threads can progress. Hoisting the system call from the loop would create potential livelocks.

Note (3): this is the trickiest part of the algorithm. Ideally we would like to only write gHigh when overflow is detected, but in that case there's a race condition that could cause a thread to return an inconsistent value. This is the part of the code that relies on the assumption of infrequent (~50 days) overflow.
Actually, the implementation in comment 11 is incorrect. The order of updating the high and low words needs to be reversed. Like I said, if anyone cares to use this, I'll run it through a model checker ;). The common case is still no worse than that in comment 11.

class TimeStamp:
  int32 gLow
  int32 gHigh

  int64 Now():
    overflow = 0
    while 1:
      tlow = gLow
      newHigh = thigh = gHigh
      newLow = PR_IntervalNow()
      if !overflow && tlow > newLow:
        overflow = 1
        newHigh = newHigh + 1
      if CAS(gHigh, thigh, newHigh):
        if CAS(gLow, tlow, newLow):
          return <newHigh, newLow>

(The reason for the new |overflow| variable is that if the first CAS ever fails, then some other thread has already detected overflow and adjusted |gHigh| accordingly. No other thread needs to worry about it from then on (for 50 days, at least): if the first CAS fails, we assume PR_IntervalNow() will always return a "small" interval (i.e., less than the first tlow that caused the thread to first notice overflow). So whenever a thread finally updates gLow, it will be to a value consistent with gHigh.)
This problem just popped back into my head. Turns out that the code in comment 12 is also wrong :(. I give up on CAS for the time being. Unless we can figure out the CAS version, I vote for this flavor of implementation:

class TimeStamp:
  int32 gLow
  int32 gHigh
  enum mode { FAST, SLOW }
  Lock lock

  # define SLOW_TRIGGER INT32_MAX - [2 hours?]
  # define FAST_TRIGGER [1 hour?]

  int64 Now():
    tlow = gLow
    thigh = gHigh
    if likely(FAST == mode):
      do:
        tlow = gLow
        now = PR_IntervalNow()
      while !CAS(gLow, tlow, now)
      if unlikely(now > SLOW_TRIGGER):
        mode = SLOW
    else:  # SLOW mode
      Lock(lock)
      tlow = gLow
      thigh = gHigh
      now = PR_IntervalNow()
      if tlow > now:  # overflow
        gHigh = gHigh + 1
      gLow = now
      if now > FAST_TRIGGER:
        mode = FAST
    return <thigh, tlow>

The basic idea is to switch between a fast path (an atomic increment of the low word) and a slow path (a mutex), based on the value of the counter. When it gets "close enough" to overflow, we switch to the slow path, and switch back once we can assume all threads have exited the Now() method. This is a generally unsafe heuristic, but assuming that intervals are taken reasonably frequently (say, once an hour), the FAST/SLOW_TRIGGER values can be set safely. And amortized, it should be faster than both the (incorrect) proposals above and the mutex version.
Probably purely academic at this point, but I think I finally found a 32-bit CAS implementation. Sorry for the flailing about above.

class TimeStamp:
  uint32 gLow
  uint32 gHigh
  bool overflow

  # define INCREMENT_PHASE (UINT32_MAX / 4)
  # define OVERFLOW_PHASE ((3 * UINT32_MAX) / 4)

  int64 Now():
    tlow = gLow
    newHigh = thigh = gHigh
    newLow = PR_IntervalNow()
    if INCREMENT_PHASE <= newLow && newLow < OVERFLOW_PHASE:  # fast path
      overflow = 0
      CAS(gLow, tlow, newLow)
      return <newHigh, newLow>
    else:
      while 1:
        tlow = gLow
        newHigh = thigh = gHigh
        newLow = PR_IntervalNow()
        if !overflow && tlow > newLow:
          overflow = 1
          newHigh = newHigh + 1
        if CAS(gHigh, thigh, newHigh):
          if CAS(gLow, tlow, newLow):
            return <newHigh, newLow>

This is a synthesis of the two incorrect attempts from comment 11 and comment 12. The basic idea is that updates occur in two "phases." Since the counter will overflow every ~50 days, it's convenient to speak of the phases in terms of days. (If we switch to microsecond resolution, the times are scaled by 1.2hrs/50days.)

During days 12.5-37.5 (probably should be days 0-37.5 before the counter first overflows), we do a simple 32-bit increment on the low word. Here I've made the optimizations suggested in comment 3: if the CAS fails, we don't care, since some other thread was "close enough." However, this is purely optional.

The more complicated phase is the "overflow" phase, days 37.5-50 and 0-12.5, when we get "close" to an overflow. The problem with the last two tries was races around detecting overflow. This implementation avoids that by having the first thread to detect overflow set a global flag. Other threads may also concurrently detect overflow and set the flag, but this is OK: only one of those threads will successfully CAS and update the high word (and the others, after failing, will never see overflow again). In other words, there's a race around setting the overflow flag, but the CAS(gHigh, ...) protects against corruption.
And since all threads return the register values <newHigh, newLow> only if both CASes succeed, every thread returns a consistent value.

The race around clearing the overflow flag is eliminated through "time": it's only cleared when we're "safely" far enough away from the possibility of overflow. The threads in the "increment" phase do this. So the millisecond-resolution version of this code is correct and detects every overflow when a thread calls Now() at least once every 12.5 days, and the microsecond-resolution version once every 18 minutes.

The code is incorrect only if overflow is detected and no thread calls Now() until after the start of a successive overflow phase. This causes the overflow flag not to be cleared, and bad things can potentially happen (non-monotonic time stamps). Otherwise the code is correct, but may miss overflows (not a big deal).

So again, this needs model checking before use, but I'm pretty confident that it's correct. (BTW, the phase times are heuristics and can be tuned based on assumptions about thread scheduling/contention etc. The current values are meant to be as safe as possible.)
It's a nice homework problem but we should go with the mutex for now --- sorry for hijacking your brain :-). Some day we'll be able to depend on 64-bit CAS (Arm6K has it), then this will be easy.
(In reply to comment #15)
> It's a nice homework problem but we should go with the mutex for now --- sorry
> for hijacking your brain :-).

No problem. It was fun. I checked my code in SPIN this morning, and save for one small bug (now fixed), it's correct (modulo the assumptions of comment 14).

> Some day we'll be able to depend on 64-bit CAS
> (Arm6K has it), then this will be easy.

Well, we also have cmpxchg8b on (>Pentium) x86. If modern ARM and x86 are pretty much all we care about, we can go ahead with the 64-bit CAS approach.
Unfortunately we currently care about ARM5, I'm told.
Sorry, I didn't state my point clearly enough --- if modern ARM and x86 comprise the majority of the machines we support, it may be worth going ahead with the 64-bit CAS approach for them. (Assuming, of course, that this code is performance critical.) It should hopefully be easy enough to fall back on the mutex version for other architectures (like ARM v5), or the 32-bit CAS version.
This isn't likely to be a performance problem.
Comment on attachment 369599 [details] [diff] [review]
patch

>diff --git a/xpcom/ds/nsTimeStamp.h b/xpcom/ds/nsTimeStamp.h
>+#ifndef NSTIMESTAMP_H_
>+#define NSTIMESTAMP_H_

nit, I prefer mixed-case nsTimeStamp_h here
Attachment #369599 - Flags: review?(benjamin) → review+
Status: NEW → RESOLVED
Closed: 16 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Comment on attachment 369599 [details] [diff] [review]
patch

I think we actually need this in 1.9.1 so the media code, at least, doesn't have weird bugs when PRIntervalTime wraps around.
Attachment #369599 - Flags: approval1.9.1? → approval1.9.1+
Whiteboard: [needs 191 landing]
Blocks: 1216714