Open
Bug 1204823
Opened 9 years ago
Updated 1 year ago
TimeStamp::NowLowRes() behaves differently over computer sleep over platforms
Categories
(Core :: mozglue, defect, P4)
Core
mozglue
Tracking
()
NEW
Tracking | Status | |
---|---|---|
firefox43 | --- | affected |
People
(Reporter: gfritzsche, Unassigned)
References
(Blocks 2 open bugs)
Details
(Whiteboard: [measurement:client])
Attachments
(1 obsolete file)
We have TimeStamp::NowLowRes() etc. for monotonic clock usage, as e.g. used by nsITelemetry.msSinceProcessStart(): https://dxr.mozilla.org/mozilla-central/rev/9ed17db42e3e46f1c712e4dffd62d54e915e0fac/toolkit/components/telemetry/Telemetry.cpp#3565 However, we noticed that this behaves differently over platforms when a computer goes to sleep, so it's not useful for any cross-platform time measurements. On Windows this comes down to GetTickCount64(), which does increase over sleep periods: https://msdn.microsoft.com/en-us/library/windows/desktop/ms724411%28v=vs.85%29.aspx https://dxr.mozilla.org/mozilla-central/rev/9ed17db42e3e46f1c712e4dffd62d54e915e0fac/mozglue/misc/TimeStamp_windows.cpp#530 OS X comes down to mach_absolute_time(), which does not increase over sleep periods: https://dxr.mozilla.org/mozilla-central/rev/9ed17db42e3e46f1c712e4dffd62d54e915e0fac/mozglue/misc/TimeStamp_darwin.cpp#49 POSIX comes down to clock_gettime(CLOCK_MONOTONIC, &ts), which should not increase over sleep time either: https://dxr.mozilla.org/mozilla-central/rev/9ed17db42e3e46f1c712e4dffd62d54e915e0fac/mozglue/misc/TimeStamp_posix.cpp#77
Reporter | ||
Updated•9 years ago
|
Whiteboard: [unifiedTelemetry]
Reporter | ||
Updated•9 years ago
|
Priority: -- → P2
Reporter | ||
Comment 1•9 years ago
|
||
Benjamin, do you have an opinion on the expected behavior here or know who might have one? I'm leaning towards the Windows behavior of actual time elapsed, not "time spent in working state".
Flags: needinfo?(benjamin)
Comment 2•9 years ago
|
||
I don't have an opinion. You'd need to survey existing callers to figure out whether they expect one thing or the other. Nathan may know more. But rather than asking about TimeStamp specifically, can you just articulate the behavior you actually want and we can decide where that implementation should live?
Flags: needinfo?(benjamin) → needinfo?(nfroyd)
Reporter | ||
Comment 3•9 years ago
|
||
(In reply to Benjamin Smedberg [:bsmedberg] from comment #2) > But rather than asking about TimeStamp specifically, can you just articulate > the behavior you actually want and we can decide where that implementation > should live? There are two issues here: * the TimeStamp implementation behaves inconsistent across platforms * what behavior our affected use should have (moving that needinfo to another bug)
Comment 4•9 years ago
|
||
(In reply to Georg Fritzsche [:gfritzsche] [away Sep 5 - 13] from comment #3) > * the TimeStamp implementation behaves inconsistent across platforms Is this true for high resolution timing as well? I can't tell straightaway from reading the Windows code. I can see arguments both ways for what people prefer. For telemetry measurements, I think you'd typically want actual time spent working. But I'm not entirely sure about that--would have to think about it a little more. The answer might depend on the particular measurement, too...
Flags: needinfo?(nfroyd)
Reporter | ||
Comment 5•9 years ago
|
||
From a casual check on DXR, TimeStamp seems to be used both ways: (1) for checking how long some operation takes (including sleep time probably undesired) (2) for checking how much time passed since a certain event (sleep time should probably be included) I'm thinking defaulting to including sleep time might be the safer approach with less issues triggered. Giving that is the effective behavior on Windows, features should already work ok with that. E.g. the slow script detection works around sleep time jumps somewhat awkwardly: https://dxr.mozilla.org/mozilla-central/rev/9ed17db42e3e46f1c712e4dffd62d54e915e0fac/js/xpconnect/src/XPCJSRuntime.cpp#1345
Reporter | ||
Comment 6•9 years ago
|
||
(In reply to Nathan Froyd [:froydnj] from comment #4) > (In reply to Georg Fritzsche [:gfritzsche] [away Sep 5 - 13] from comment #3) > > * the TimeStamp implementation behaves inconsistent across platforms > > Is this true for high resolution timing as well? I can't tell straightaway > from reading the Windows code. From this: http://www.janbambas.cz/usage-of-queryperformancecounter-in-gecko/ ... and a quick look over this: https://dxr.mozilla.org/mozilla-central/rev/9ed17db42e3e46f1c712e4dffd62d54e915e0fac/mozglue/misc/TimeStamp_windows.cpp#317 ... my understanding is that high-resolution timing on Windows should just fall back to GetTickCount64() values after sleep periods and thus behave the same. Honza, can you confirm?
Flags: needinfo?(honzab.moz)
Updated•9 years ago
|
Points: --- → 1
Whiteboard: [unifiedTelemetry] → [unifiedTelemetry][measurement:client]
Updated•9 years ago
|
Priority: P2 → P1
Comment 7•9 years ago
|
||
>... my understanding is that high-resolution timing on Windows should just fall > back to GetTickCount64() values after sleep periods Yes, the mechanism is described in my blog post. It's rather complicated and we often work with "difference of a difference". But gives good results. > and thus behave the same Yes. If you use NowLoRes(), it always only works with GTC which (AFAIK) reflects the sleep time as "working time". I don't see any problem with NowLoRes() on win according sleep. On Windows, I'm not sure which of the timers [1] is "stopped" during sleep/hibernation. I would have to check on all of them. Also, it can differ from hardware to hardware (as QPC does). The problem with sleep, GetTickCount and QueryPerformanceCoounter is that they perceive the sleep period time differently. But when you are using NowLoRes() you don't need to worry ;) It only uses GTC. Also, the implementation as is ensures "monotonicity" and constant "calibration" according to GTC values. GTC is considered (also according my findings) a very stable timer, tho with low usually 15.6ms resolution. [1] https://bugzilla.mozilla.org/show_bug.cgi?id=676349#c0
Flags: needinfo?(honzab.moz)
Reporter | ||
Comment 8•9 years ago
|
||
Nathan, what do you think? Do you think staying consistently with incrementing over sleep times sounds reasonable, given the last 3 comments? The approach chosen here affects how we solve this in Telemetry.
Flags: needinfo?(nfroyd)
Comment 9•9 years ago
|
||
(In reply to Georg Fritzsche [:gfritzsche] from comment #8) > Nathan, what do you think? > Do you think staying consistently with incrementing over sleep times sounds > reasonable, given the last 3 comments? > > The approach chosen here affects how we solve this in Telemetry. Is it possible to address this in telemetry by always explicitly separating out things by platform (on the server/dashboard side) or by applying some sort of compensation for time spent asleep (possibly recorded in a separate histogram)? It's a good point that things already work this way on Windows *ought* to make changing it on other platforms work out just fine. But I'm a little leery of the consequences; support for monotonic suspend-aware clocks on Linux is not universal (>= 2.6.39, according to the manpage, which may not be reliable; I don't know how our kernel version and/or kernel support for this on Android looks), and it's certainly not portable across POSIX platforms. Mac seems to have support for this, though: http://stackoverflow.com/questions/12488481/getting-ios-system-uptime-that-doesnt-pause-when-asleep/12490414#12490414 Also worth noting that somebody was at least thinking about suspend when they wrote the TimeStamp code: http://mxr.mozilla.org/mozilla-central/source/mozglue/misc/TimeStamp.h#363 whether or not all the users of TimeStamp were aware of that is another question... I am cautiously in favor, but I'd like to know if there are other alternatives we can pursue first.
Flags: needinfo?(nfroyd)
Reporter | ||
Comment 10•9 years ago
|
||
(In reply to Nathan Froyd [:froydnj] from comment #9) > (In reply to Georg Fritzsche [:gfritzsche] from comment #8) > > Nathan, what do you think? > > Do you think staying consistently with incrementing over sleep times sounds > > reasonable, given the last 3 comments? > > > > The approach chosen here affects how we solve this in Telemetry. > > Is it possible to address this in telemetry by always explicitly separating > out things by platform (on the server/dashboard side) or by applying some > sort of compensation for time spent asleep (possibly recorded in a separate > histogram)? We have a chance to submit proper values now and we want to specifically measure time including sleep (and later also time excluding sleep etc.). So we should fix i now and avoid accumulating weird server-side special cases. If the decision for this bug is that it's ok for `TimeStamp`, then i will implement something specifically for Telemetry that gives us what we need - i just need to know :) > It's a good point that things already work this way on Windows *ought* to > make changing it on other platforms work out just fine. But I'm a little > leery of the consequences; support for monotonic suspend-aware clocks on > Linux is not universal (>= 2.6.39, according to the manpage, which may not > be reliable; I don't know how our kernel version and/or kernel support for > this on Android looks), and it's certainly not portable across POSIX > platforms. Mac seems to have support for this, though: > http://stackoverflow.com/questions/12488481/getting-ios-system-uptime-that- > doesnt-pause-when-asleep/12490414#12490414 Ok, so Linux/Android are the problematic parts. Do we have anyone with good knowledge there we could ping?
Reporter | ||
Updated•9 years ago
|
Flags: needinfo?(nfroyd)
Comment 11•9 years ago
|
||
(In reply to Georg Fritzsche [:gfritzsche] from comment #10) > (In reply to Nathan Froyd [:froydnj] from comment #9) > > (In reply to Georg Fritzsche [:gfritzsche] from comment #8) > > > Nathan, what do you think? > > > Do you think staying consistently with incrementing over sleep times sounds > > > reasonable, given the last 3 comments? > > > > > > The approach chosen here affects how we solve this in Telemetry. > > > > Is it possible to address this in telemetry by always explicitly separating > > out things by platform (on the server/dashboard side) or by applying some > > sort of compensation for time spent asleep (possibly recorded in a separate > > histogram)? > > We have a chance to submit proper values now and we want to specifically > measure time including sleep (and later also time excluding sleep etc.). So > we should fix i now and avoid accumulating weird server-side special cases. > If the decision for this bug is that it's ok for `TimeStamp`, then i will > implement something specifically for Telemetry that gives us what we need - > i just need to know :) You want to measure time including and excluding sleep for everything? This paragraph confuses me. Did you mean "...is that it's *not* ok for TimeStamp"? > > It's a good point that things already work this way on Windows *ought* to > > make changing it on other platforms work out just fine. But I'm a little > > leery of the consequences; support for monotonic suspend-aware clocks on > > Linux is not universal (>= 2.6.39, according to the manpage, which may not > > be reliable; I don't know how our kernel version and/or kernel support for > > this on Android looks), and it's certainly not portable across POSIX > > platforms. Mac seems to have support for this, though: > > http://stackoverflow.com/questions/12488481/getting-ios-system-uptime-that- > > doesnt-pause-when-asleep/12490414#12490414 > > Ok, so Linux/Android are the problematic parts. Do we have anyone with good > knowledge there we could ping? I don't know if we have minimum kernel versions required for either Linux or Android, or whether our minimum required Android version implies a minimum kernel version. Maybe :snorp would know more about the Android situation?
Flags: needinfo?(nfroyd) → needinfo?(snorp)
Comment 12•9 years ago
|
||
Oh, :jld might know something about our Linux kernel support, since he's groveled around in places that care about that. Jed, see comment 11?
Flags: needinfo?(jld)
Reporter | ||
Comment 13•9 years ago
|
||
(In reply to Nathan Froyd [:froydnj] from comment #11) > (In reply to Georg Fritzsche [:gfritzsche] from comment #10) > > (In reply to Nathan Froyd [:froydnj] from comment #9) > > > (In reply to Georg Fritzsche [:gfritzsche] from comment #8) > > > > Nathan, what do you think? > > > > Do you think staying consistently with incrementing over sleep times sounds > > > > reasonable, given the last 3 comments? > > > > > > > > The approach chosen here affects how we solve this in Telemetry. > > > > > > Is it possible to address this in telemetry by always explicitly separating > > > out things by platform (on the server/dashboard side) or by applying some > > > sort of compensation for time spent asleep (possibly recorded in a separate > > > histogram)? > > > > We have a chance to submit proper values now and we want to specifically > > measure time including sleep (and later also time excluding sleep etc.). So > > we should fix i now and avoid accumulating weird server-side special cases. > > If the decision for this bug is that it's ok for `TimeStamp`, then i will > > implement something specifically for Telemetry that gives us what we need - > > i just need to know :) > > You want to measure time including and excluding sleep for everything? Yes - starting with excluding sleep here. Browser activity/usage/... times are very important to our analysis needs and we need consistent data here. > This paragraph confuses me. Did you mean "...is that it's *not* ok for > TimeStamp"? I mean, if the decision here is to leave TimeStamp as is, then i will move on. I just need to know either way, the current situation doesn't work for us.
Comment 14•9 years ago
|
||
One thing I know is that B2G ICS emulator is 2.6.29 but all the physical B2G ICS devices are 3.0.x. I'm a little fuzzy on the larger Android ecosystem (specifically, I'm not sure if Android ICS shipped 2.6.29 on actual devices or just the emulator), but we still support Gingerbread for Fennec so I don't think we can require 2.6.39 there. But... if I understand correctly, Android/B2G devices don't actually sleep like that, so if that's the case they could just use CLOCK_MONOTONIC. On desktop, we support (and build on!) CentOS 6, which is on a kernel that calls itself 2.6.32 but has a lot of patches; I don't know offhand whether it has CLOCK_BOOTTIME. If some kind of graceful degradation is possible and we're wondering what percentage of desktop Linux would be affected, FHR would have data on that but I'm pretty sure it's <5%. Also maybe useful to know: Linux desktop automated tests other than gtests currently run on 3.2 (gtests run on the build host), and all tests will run on 3.13 kernels under TaskCluster. This may have implications for test coverage.
Flags: needinfo?(jld)
Comment 15•9 years ago
|
||
kicked to p2 until we can pack into a sprint (likely next)
Priority: P1 → P2
As Jed said, the Android minimum kernel is pretty low. Eugen did some work around this recently, though, so maybe he has some more insight.
Flags: needinfo?(snorp) → needinfo?(esawin)
Comment 17•8 years ago
|
||
Android 3 (Honeycomb) is based on 2.6.36, Android 4 (ICS) on 3.0.1 [1]. If monotonicity is not a hard requirement, then you could use CPU time [2] instead, which should give consistent results including "sleep times" across platforms. [1] https://en.wikipedia.org/wiki/Android_version_history [2] bug 1193821
Flags: needinfo?(esawin)
Reporter | ||
Updated•8 years ago
|
Priority: P2 → P3
Reporter | ||
Updated•7 years ago
|
Whiteboard: [unifiedTelemetry][measurement:client] → [measurement:client]
Reporter | ||
Updated•7 years ago
|
Priority: P3 → P4
Comment 18•6 years ago
|
||
One thing I came across while investigating bug 1359403 relating to this is that if TimeStamp *were* to use CLOCK_BOOTTIME on Linux, the sleeping code that the timer thread uses is based on CLOCK_MONOTONIC (via pthreads), and it doesn't appear that that code can use CLOCK_BOOTTIME. That code would likely need to use different code for sleeping. Also, CLOCK_BOOTTIME came out in Linux 2.6.39, which came out about 7 years ago now - are Android kernels still that far behind?
Comment 19•3 years ago
|
||
TimeStamp::NowLoRes now always returns timestamps that increment during the
device is suspended.
Depends on D98317
Updated•3 years ago
|
Attachment #9190615 -
Attachment is obsolete: true
Updated•1 year ago
|
Severity: normal → S3
You need to log in
before you can comment on or make changes to this bug.
Description
•