Consider using QueryPerformancyCounter directly for PRMJ_Now instead via GetSystemTimePreciseAsFileTime
Categories
(Core :: JavaScript Engine, task, P2)
Tracking
()
People
(Reporter: jrmuizel, Unassigned)
References
(Depends on 1 open bug, Blocks 1 open bug)
Details
(Whiteboard: [sp3])
It seems like GetSystemTimePreciseAsFileTime is adding a lot of overhead and I'm not sure it's helpful. V8 implements Date.now() on top QueryPerformanceCounter directly.
| Reporter | ||
Comment 1•2 years ago
|
||
FWIW, this shows up in the backbone.js SP3 test under the debounce/debounced< function.
| Reporter | ||
Updated•2 years ago
|
Comment 2•2 years ago
|
||
We used to have a pretty complicated implementation of this that I replaced years ago with GetSystemTimePreciseAsFileTime for Windows 8+. A few months ago I removed the old implementation in this commit when we dropped support for Windows < 10.
I vaguely remember the old code either being slower or having similar performance back then, but that was a long time ago on Windows 8. It's probably worth checking the V8 code to see if they have a nicer version.
Updated•2 years ago
|
Updated•2 years ago
|
Updated•2 years ago
|
Comment 3•2 years ago
|
||
Based on an initial skim of the V8 code, they have a fast version that uses QueryPerformanceCounter directly and also a slower implementation they use in some cases. Maybe we could use GetSystemTimePreciseAsFileTime as our slow path and carve out a fast path using QueryPerformanceCounter.
NI myself to look into this more.
Comment 4•2 years ago
|
||
V8 uses QueryPerformanceCounter after checking CPUID for non-stop TSC and checking that QueryPerformanceFrequency returns a frequency > 0 (I think that's guaranteed to be true since XP).
They sync this timestamp with the system time every 60 seconds (kMaxTimeToAvoidDrift) in subtle::TimeNowIgnoringOverride. This uses GetSystemTimeAsFileTime as "base" wall clock.
I noticed we already have similar QPC timestamp code in mozglue/misc/TimeStamp_windows.cpp. For performance.now(), we use TimeStamp::Now() (in Performance::NowUnclamped).
One option is to use our TimeStamp code with periodic sync iff sHasStableTSC. If sHasStableTSC is false, we'd fall back to the current GetSystemTimePreciseAsFileTime.
Comment 5•2 years ago
|
||
Hey Jeff, which Windows version was this on? It's not a VM right?
I have a WIP patch to use TimeStamp::Now() and sync it if > 20 seconds have elapsed. On my laptop (Intel i7 8th gen, Windows 11) it's ~20% slower in the JS shell for a Date.now() micro-benchmark. It does use the HasStableTSC branch.
The numbers I get:
- Chrome Canary: 495 ms
- FF Try PGO build JS shell, without patch: 581 ms
- FF Try PGO build JS shell, with patch: 702 ms
- FF Nightly: 870 ms
I think a lot of the browser overhead is from timer precision/fingerprinting mitigations.
| Reporter | ||
Comment 6•2 years ago
|
||
(In reply to Jan de Mooij [:jandem] from comment #5)
Hey Jeff, which Windows version was this on? It's not a VM right?
This was win10 on AMD Ryzen hardware.
Can you get before and after profiles using https://github.com/jrmuizel/etw-profiler/?
| Reporter | ||
Comment 7•2 years ago
|
||
Comment 8•2 years ago
|
||
(In reply to Jeff Muizelaar [:jrmuizel] from comment #6)
Can you get before and after profiles using https://github.com/jrmuizel/etw-profiler/?
Before: https://share.firefox.dev/42fg19V
After: https://share.firefox.dev/42el4Yn
In the Before profile, we spend relatively more time under GetSystemTimePreciseAsFileTime in RtlQueryPerformanceCounter than in your profile. Maybe they optimized this or maybe it's because I'm calling it in a tight loop?
In the After profile, the critical section in mozilla::PerformanceCounter is pretty expensive. I'm not sure if that code to handle jitter is still necessary, but even if I comment that out, we're still slower. I haven't investigated the remaining gap yet.
| Reporter | ||
Comment 9•2 years ago
|
||
Bug 1515155 is about how to fixing the critical section locking.(In reply to Jan de Mooij [:jandem] from comment #8)
(In reply to Jeff Muizelaar [:jrmuizel] from comment #6)
Can you get before and after profiles using https://github.com/jrmuizel/etw-profiler/?
Before: https://share.firefox.dev/42fg19V
After: https://share.firefox.dev/42el4YnIn the Before profile, we spend relatively more time under
GetSystemTimePreciseAsFileTimeinRtlQueryPerformanceCounterthan in your profile. Maybe they optimized this or maybe it's because I'm calling it in a tight loop?
That could be system/CPU differences or perhaps a caching thing because my profile doesn't have the tight loop.
In the After profile, the critical section in
mozilla::PerformanceCounteris pretty expensive. I'm not sure if that code to handle jitter is still necessary, but even if I comment that out, we're still slower. I haven't investigated the remaining gap yet.
Bug 1515155 is about how to remove that locking in the common case.
Comment 10•9 months ago
|
||
Clearing the NI. I'm not currently working on this and there's related discussion happening in bug 1515155.
Description
•