Consider using QueryPerformancyCounter directly for PRMJ_Now instead via GetSystemTimePreciseAsFileTime
Categories
(Core :: JavaScript Engine, task, P2)
Tracking
()
People
(Reporter: jrmuizel, Unassigned, NeedInfo)
References
(Depends on 1 open bug, Blocks 1 open bug)
Details
(Whiteboard: [sp3])
It seems like GetSystemTimePreciseAsFileTime is adding a lot of overhead and I'm not sure it's helpful. V8 implements Date.now() on top QueryPerformanceCounter directly.
Reporter | ||
Comment 1•1 year ago
|
||
FWIW, this shows up in the backbone.js SP3 test under the debounce/debounced<
function.
Reporter | ||
Updated•1 year ago
|
Comment 2•1 year ago
|
||
We used to have a pretty complicated implementation of this that I replaced years ago with GetSystemTimePreciseAsFileTime
for Windows 8+. A few months ago I removed the old implementation in this commit when we dropped support for Windows < 10.
I vaguely remember the old code either being slower or having similar performance back then, but that was a long time ago on Windows 8. It's probably worth checking the V8 code to see if they have a nicer version.
Updated•1 year ago
|
Updated•1 year ago
|
Updated•1 year ago
|
Comment 3•1 year ago
|
||
Based on an initial skim of the V8 code, they have a fast version that uses QueryPerformanceCounter
directly and also a slower implementation they use in some cases. Maybe we could use GetSystemTimePreciseAsFileTime
as our slow path and carve out a fast path using QueryPerformanceCounter
.
NI myself to look into this more.
Comment 4•1 year ago
|
||
V8 uses QueryPerformanceCounter
after checking CPUID for non-stop TSC and checking that QueryPerformanceFrequency
returns a frequency > 0 (I think that's guaranteed to be true since XP).
They sync this timestamp with the system time every 60 seconds (kMaxTimeToAvoidDrift
) in subtle::TimeNowIgnoringOverride
. This uses GetSystemTimeAsFileTime
as "base" wall clock.
I noticed we already have similar QPC timestamp code in mozglue/misc/TimeStamp_windows.cpp
. For performance.now()
, we use TimeStamp::Now()
(in Performance::NowUnclamped
).
One option is to use our TimeStamp
code with periodic sync iff sHasStableTSC
. If sHasStableTSC
is false
, we'd fall back to the current GetSystemTimePreciseAsFileTime
.
Comment 5•1 year ago
|
||
Hey Jeff, which Windows version was this on? It's not a VM right?
I have a WIP patch to use TimeStamp::Now()
and sync it if > 20 seconds have elapsed. On my laptop (Intel i7 8th gen, Windows 11) it's ~20% slower in the JS shell for a Date.now()
micro-benchmark. It does use the HasStableTSC
branch.
The numbers I get:
- Chrome Canary: 495 ms
- FF Try PGO build JS shell, without patch: 581 ms
- FF Try PGO build JS shell, with patch: 702 ms
- FF Nightly: 870 ms
I think a lot of the browser overhead is from timer precision/fingerprinting mitigations.
Reporter | ||
Comment 6•1 year ago
|
||
(In reply to Jan de Mooij [:jandem] from comment #5)
Hey Jeff, which Windows version was this on? It's not a VM right?
This was win10 on AMD Ryzen hardware.
Can you get before and after profiles using https://github.com/jrmuizel/etw-profiler/?
Reporter | ||
Comment 7•1 year ago
|
||
Comment 8•1 year ago
|
||
(In reply to Jeff Muizelaar [:jrmuizel] from comment #6)
Can you get before and after profiles using https://github.com/jrmuizel/etw-profiler/?
Before: https://share.firefox.dev/42fg19V
After: https://share.firefox.dev/42el4Yn
In the Before profile, we spend relatively more time under GetSystemTimePreciseAsFileTime
in RtlQueryPerformanceCounter
than in your profile. Maybe they optimized this or maybe it's because I'm calling it in a tight loop?
In the After profile, the critical section in mozilla::PerformanceCounter
is pretty expensive. I'm not sure if that code to handle jitter is still necessary, but even if I comment that out, we're still slower. I haven't investigated the remaining gap yet.
Reporter | ||
Comment 9•1 year ago
|
||
Bug 1515155 is about how to fixing the critical section locking.(In reply to Jan de Mooij [:jandem] from comment #8)
(In reply to Jeff Muizelaar [:jrmuizel] from comment #6)
Can you get before and after profiles using https://github.com/jrmuizel/etw-profiler/?
Before: https://share.firefox.dev/42fg19V
After: https://share.firefox.dev/42el4YnIn the Before profile, we spend relatively more time under
GetSystemTimePreciseAsFileTime
inRtlQueryPerformanceCounter
than in your profile. Maybe they optimized this or maybe it's because I'm calling it in a tight loop?
That could be system/CPU differences or perhaps a caching thing because my profile doesn't have the tight loop.
In the After profile, the critical section in
mozilla::PerformanceCounter
is pretty expensive. I'm not sure if that code to handle jitter is still necessary, but even if I comment that out, we're still slower. I haven't investigated the remaining gap yet.
Bug 1515155 is about how to remove that locking in the common case.
Description
•