Open Bug 1440863 Opened 6 years ago Updated 2 years ago

Unanticipated security/usability degradation from precision-lowering of performance.now() to 2ms

Categories

(Core :: DOM: Core & HTML, defect, P3)

60 Branch
defect

Tracking


People

(Reporter: mark, Unassigned)

References


Details

(Keywords: regression)

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36

Steps to reproduce:

Note: I am an Invited Expert in the W3C Web Platform Working Group, relating to requestAnimationFrame on high-Hz displays, with one commit.

Hello,

Firstly, Spectre and Meltdown are indeed extremely serious security issues.
Many vendors are, rightfully, making emergency fixes, including temporarily reducing timer precision.

However, I have grave concerns about Mozilla/Firefox's decision to reduce performance.now() precision to 2ms -- there are unanticipated major side effects, including one that will reduce security.


Actual results:

MAJOR SIDE EFFECT #1:

The precision degradation is so severe (even at 60 Hz), and affects fullscreen WebGL and WebVR so badly, that I have heard from a friend of a site that plans to automatically detect the 2ms imprecision and prompt the user with step-by-step instructions to restore precision for their site.  Because the precision reduction is a global Firefox setting, this is problematic from a security perspective: sites will be coaching users into weakening a global setting.

I think the safer solution is:

"In an absolute emergency: Personally if there's a duress to reduce timer precision, then a temporary permissions API (cameras and fullscreen have more security/privacy fears) to prompt the user for permission to stay at 5us precision.  At least during the duration of security studying, until being reverted to maximized precision."

MAJOR SIDE EFFECT #2:

As a past game developer (most standards writers aren't game developers), let me tell you:

Accurate gametimes are necessary for accurate calculations of 3D object positions, especially in full-screen WebGL games.

Even at sub-refresh-cycle intervals, animations still need sub-millisecond accuracy to prevent jank when calculating time-based animations (e.g. calculating object positions based on time).  That is how 3D graphics compensate for fluctuating frame rates: they calculate object world positions from the gametime.

That means a ball moving at 1000 inches per second (~60 mph) can be mispositioned by 2 inches if the gametime is off by 2 milliseconds.  That can mean the difference between goal and no-goal.  Fastballs, baseballs, and hockey pucks can travel much faster than this.  Low-precision gametimes will ruin a lot of HTML5 and WebGL game programming.

Gametimes need sub-millisecond accuracy, especially in full-screen WebGL games, because real-world 3D object positions are calculated from the gametime.

That way everything looks correct regardless of framerate.  Framerates fluctuate constantly, so video games using 3D graphics derive the position of everything from the gametime.  A wrong gametime makes everything jank badly in full-screen WebGL games.
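
To make this concrete, here is a minimal sketch of the time-based animation pattern described above (BALL_SPEED and drawBallAt are illustrative names, not from any shipped game):

    // Illustrative stand-in for a real WebGL draw call.
    function drawBallAt(xInches) { /* render ball at xInches */ }

    // Time-based animation: position is derived from the timestamp,
    // not from a per-frame increment, so fluctuating frame rates do
    // not change apparent object speed -- but any timestamp error
    // becomes a position error directly.
    const BALL_SPEED = 1000; // inches per second (~60 mph, as above)

    function frame(now) {         // rAF passes a performance.now()-style timestamp
      const t = now / 1000;       // elapsed seconds
      const x = BALL_SPEED * t;   // inches travelled since t = 0
      // With a 2 ms clamp, `now` can be off by up to 2 ms, so `x` can
      // be off by up to 1000 * 0.002 = 2 inches -- the goal/no-goal
      // error described above.
      drawBallAt(x);
      requestAnimationFrame(frame);
    }
    requestAnimationFrame(frame);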

Again, this is for 60 Hz.  I'm not even talking about higher refresh rates.  1ms gametime errors create noticeable motion-fluidity flaws in 60Hz full-screen WebGL games!

Animation janks badly even with 5ms and 8ms gametime errors on a 60Hz display.  Put yourself in the driver's seat of a 300mph car in a racing game: a 2ms gametime error becomes large even on a 60 Hz display.  Jank/jerkiness/stutter can be huge even with sub-refresh-cycle errors.

Also, don't forget that there are advantages to frame rates above refresh rates in reducing latency: https://www.blurbusters.com/faq/benefits-of-frame-rate-above-refresh-rate/

My prediction is that Firefox will get a tsunami of complaints from game developers suddenly hit by a major reduction in their ability to calculate gameworld positions.

MAJOR SIDE EFFECT #3:

Depending on how timer precision is degraded, it potentially eliminates educational motion tests in web browsers.  Many scientific tests, such as www.testufo.com/ghosting and www.testufo.com/frameskipping (display overclocking), are heavily dependent on perfect frame-rate synchronization, and the site is able to detect whenever Chrome misses a frame.
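
For illustration, a minimal sketch of frame-skip detection via requestAnimationFrame timestamp deltas (not TestUFO's actual source; EXPECTED_INTERVAL assumes a 60 Hz display):

    // Detect skipped frames by comparing successive rAF timestamps
    // against the expected refresh interval. A delta much larger than
    // one interval means at least one frame was missed.
    const EXPECTED_INTERVAL = 1000 / 60; // ms, assuming 60 Hz
    let last = null;

    function checkFrame(now) {
      if (last !== null) {
        const delta = now - last;
        if (delta > EXPECTED_INTERVAL * 1.5) {
          console.warn('missed ~' + (Math.round(delta / EXPECTED_INTERVAL) - 1) + ' frame(s)');
        }
      }
      last = now;
      requestAnimationFrame(checkFrame);
    }
    requestAnimationFrame(checkFrame);

Once timestamps are coarsened to 2 ms, each measured delta can be off by up to 4 ms, and at high refresh rates the quantization step approaches the refresh interval itself, so this kind of detection stops being trustworthy.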

Peer-reviewed papers have already been written based on browser motion tests (including www.blurbusters.com/motion-tests/pursuit-camera-paper), thanks to the browser's ability to achieve perfect refresh-rate synchronization.

Even my own site, TestUFO, may be forced to do something similar to the popup in "MAJOR SIDE EFFECT #1" for specific kinds of peer-reviewed scientific motion testing, depending on how badly the site is degraded in upcoming Firefox A/B tests (new versus old browser).  Is there a way for me to tell users how to whitelist only one site (TestUFO.com) for high-precision timers, without it being a global setting?  (Thought exercise for the W3C.)

The TestUFO website already automatically displays a message telling users to switch web browsers away from IE/Edge for 120Hz testing, due to IE/Edge's continued violation of Section 7.1.4.2 of HTML 5.2.  More than 50% of TestUFO visitors use a refresh rate other than 60 Hz.

MAJOR SIDE EFFECT #4:

It makes WebVR useless.  

Its premise (immersion without nausea/headaches) is heavily dependent on accurate gametimes for accurate positions of 3D objects (see MAJOR SIDE EFFECT #2 above).

MAJOR SIDE EFFECTS #5-100

I have a much longer list, but for brevity, I am shortening this report to underscore the gravity of Firefox's decision.


Expected results:

CONCLUSION

The Firefox team's approach should be refined.

If continued short-term emergency mitigation is absolutely critical, it should include a Permissions API (much like full-screen mode and WebRTC camera permissions) that asks the user, case by case, for permission to use high-precision timers (allowing 5us or 1us precision).

If absolutely necessary, this could even be limited to Organization Validation HTTPS sites, combined with exclusive same-origin use, triggered only by a click, and granted only after the user confirms via a Permissions API popup (as with WebRTC) -- only then would 5us/1us precision be granted to that particular site.
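
As a thought-experiment sketch only: no browser implements a high-precision-timer permission, so the permission name below is purely hypothetical, but the proposal could reuse the existing navigator.permissions.query() surface:

    // Purely hypothetical: "high-resolution-time" is not a real
    // permission name in any browser; this only sketches the shape
    // of the proposal on top of the real Permissions API.
    function startPrecisionTests() { /* hypothetical: run 5us-precision tests */ }
    function showReducedPrecisionWarning() { /* hypothetical: warn the user */ }

    async function requestPreciseTimers() {
      try {
        const status = await navigator.permissions.query({
          name: 'high-resolution-time', // hypothetical permission name
        });
        if (status.state === 'granted') {
          startPrecisionTests();
          return;
        }
      } catch (e) {
        // Today this always throws: the name is not recognized.
      }
      showReducedPrecisionWarning();
    }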

Long-term, these restrictions should be removed once the underlying causes of Spectre/Meltdown are solved.  However, a massive precision reduction that forces web developers to instruct visitors on how to reconfigure their web browsers is the less secure solution.

Thanks,
Mark Rejhon
Founder, Blur Busters / TestUFO
(Past Invited Expert, W3C Web Platform Working Group for HTML 5.2)
Mozilla needs to fix Firefox, because their mitigation is flawed...

MICROSOFT: "we are removing support for SharedArrayBuffer from Microsoft Edge (originally introduced in the Windows 10 Fall Creators Update), and reducing the resolution of performance.now() in Microsoft Edge and Internet Explorer from 5 microseconds to 20 microseconds, with variable jitter of up to an additional 20 microseconds".  [Source: https://blogs.windows.com/msedgedev/2018/01/03/speculative-execution-mitigations-microsoft-edge-internet-explorer/]

Testing confirms performance.now() in IE post Meltdown/Spectre is now 20us, with jitter.

GOOGLE: "Chrome has disabled SharedArrayBuffer on Chrome 63 starting on Jan 5th, and will modify the behavior of other APIs such as performance.now, to help reduce the efficacy of speculative side-channel attacks" [Source: https://www.chromium.org/Home/chromium-security/ssca]  Source code changes (TimeClamper.cpp / TimeClamper.h) reveal that Chrome changed performance.now() to 100us accuracy, with jitter.

Testing confirms performance.now() in Chrome post Meltdown/Spectre is now 100us, with jitter.

MOZILLA: The Meltdown/Spectre mitigation that Mozilla implemented appears flawed.  Firefox's performance.now() is truncated to 2ms [Source: https://developer.mozilla.org/en-US/docs/Web/API/Performance/now#Reduced_time_precision].  The flaw is that this truncation produces a hard clock edge, and it is well known how to recover high-resolution time from a clock with hard edges (https://blog.mozilla.org/security/2018/01/03/mitigations-landing-new-class-timing-attack/ points to https://gruss.cc/files/fantastictimers.pdf, which describes the process).
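
To illustrate why a hard edge is recoverable, here is a sketch of the edge-interpolation idea from that paper (a busy-wait counter, not the paper's exact code):

    // Spin until the coarse clock ticks over; the number of loop
    // iterations completed between ticks acts as a sub-resolution
    // clock, defeating a jitter-free 2 ms truncation.
    function ticksPerClockEdge() {
      const start = performance.now();
      let count = 0;
      while (performance.now() === start) {
        count++; // busy-wait; counts "work units" within one 2 ms step
      }
      return count; // calibrates work units per clock step
    }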

Testing confirms performance.now() in Firefox post Meltdown/Spectre is now 2000us, with NO jitter -- a hard clock edge.  Firefox source code (ClippedTime in jsdate.cpp) shows a simple "now = floor(now / sResolutionUsec) * sResolutionUsec", with no jitter adjustment seen.
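
For contrast, a sketch of the two clamping strategies as described above (this mirrors the written descriptions, not the actual jsdate.cpp or TimeClamper.cpp code; the jitter hash is a toy stand-in):

    // Hard-edge truncation (the Firefox behavior described above):
    // edges are deterministic, so edge interpolation works.
    function clampHardEdge(nowUs, resolutionUs) {
      return Math.floor(nowUs / resolutionUs) * resolutionUs;
    }

    // Toy deterministic hash -> [0, 1); a real implementation would
    // derive this from a secret key so the jitter is unpredictable.
    function jitterForInterval(i) {
      const x = Math.sin(i * 12.9898) * 43758.5453;
      return x - Math.floor(x);
    }

    // Clamp plus per-interval jitter (the Chrome/Edge idea described
    // above): the reported edge lands at an unpredictable offset
    // inside each interval, defeating naive edge interpolation.
    function clampWithJitter(nowUs, resolutionUs) {
      const interval = Math.floor(nowUs / resolutionUs);
      const threshold = jitterForInterval(interval) * resolutionUs;
      return (nowUs - interval * resolutionUs) < threshold
        ? interval * resolutionUs
        : (interval + 1) * resolutionUs;
    }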

Mozilla should fix its mitigation (change 2ms to 20us-100us and add jitter) -- and by doing so, also address most of what Mark has raised.
Component: Untriaged → JavaScript Engine
Keywords: regression
Product: Firefox → Core
Component: JavaScript Engine → DOM
Truncating performance.now to 2ms is probably going to break or regress a bunch of shipped code. Stuff that previously ran at 60fps is going to run at some value varying between 55 and 62 frames per second, which is going to cause noticeable variations in gameplay speed, juddery animation, and (for multiplayer games) potentially straight-up breakage. I've definitely shipped code that relies on performance.now having some reasonable amount of precision. This is an unacceptable mitigation.
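To spell out the arithmetic behind the 55-62 fps claim (a sketch, not shipped code):

    // A true 60 Hz frame delta is ~16.67 ms. Truncated to 2 ms steps,
    // successive measured deltas read as 16 or 18 ms depending on
    // phase: 1000/18 ~ 55.6 fps, 1000/16 = 62.5 fps.
    const trueInterval = 1000 / 60;
    let t = 0;
    for (let i = 0; i < 5; i++) {
      const a = Math.floor(t / 2) * 2;
      const b = Math.floor((t + trueInterval) / 2) * 2;
      console.log('measured delta: ' + (b - a) + ' ms'); // 16 or 18
      t += trueInterval;
    }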
I won't restate all the implications this has for HTML5 web games that Mark already stated quite succinctly. I will just add that, as an HTML5/WebGL game developer, this 2ms resolution change will at best make games 'feel' off on Firefox, and at worst will cause game-breaking bugs and inaccuracies. Please reconsider this, or I will have to add warnings to my web game preloaders telling prospective players to download and play on Chrome instead if they are on Firefox -- and I assure you I will not be alone in this.
Given the "see also bugs", is this bug still current?
Flags: needinfo?(tom)
We've moved to 1ms with jitter; after conferring with our WebVR group, we haven't seen any particular cause for concern.  That said, from speaking with Mark in other bugs, he has said that even 100us is too coarse a precision for certain purposes.

When we feel comfortable with our security posture based on ongoing development plans, we intend to revisit Spectre mitigations; until that point I am inclined to close this as WONTFIX, but it could just as easily be left open to remind us to revisit it (and this bug does contain good info we would want to re-read).
Flags: needinfo?(tom)
Leaving open in order not to lose good info.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Priority: -- → P3
Component: DOM → DOM: Core & HTML
Severity: normal → S3