Open Bug 1640085 Opened 4 years ago Updated 2 years ago

[Meta] [Gamepad] Increase privacy and security of GamePad API

Categories

(Core :: DOM: Device Interfaces, defect, P3)

ASSIGNED

People

(Reporter: marcosc, Assigned: cmartin)

References

(Blocks 1 open bug)

Details

(Keywords: meta)

Over the last few months we've been coordinating with the Google folks on trying to address some privacy and security issues with the GamePad API.

The tl;dr is: a significant number of websites are using the API in both insecure and third-party contexts [1] so we need to figure out how we roll out the proposed mitigations while limiting breakage:

  1. Only expose in SecureContext
  2. Require a gesture on the gamepad itself to activate it (we may already support this, Kip do you know?)
  3. Don't allow the API to be used if "the document is not visible"
  4. Add a feature policy: 'self' + "gamepad", which will break all third-party frames
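
As a rough illustration of how the four mitigations compose, the TypeScript sketch below models them as a single gate. The `ExposureContext` shape and the functions `firstBlockingMitigation` and `mayExposeGamepads` are hypothetical names made up for this example, not Gecko or spec identifiers, and the policy check is a simplified stand-in for real feature policy evaluation.

```typescript
// Hypothetical model of the state a browser would consult. All names here
// are assumptions made for illustration, not Gecko or spec identifiers.
interface ExposureContext {
  isSecureContext: boolean;    // (1) document is a secure context
  hasGamepadGesture: boolean;  // (2) a button or axis was used on the gamepad
  isDocumentVisible: boolean;  // (3) document is currently visible
  frameOrigin: string;         // origin of the frame calling the API
  topLevelOrigin: string;      // origin of the top-level document
  delegatedOrigins: string[];  // (4) origins the embedder allowed "gamepad" for
}

type BlockingMitigation =
  | "secure-context"
  | "gamepad-gesture"
  | "visibility"
  | "feature-policy";

// Returns the first mitigation that blocks exposure, or null if none does.
function firstBlockingMitigation(ctx: ExposureContext): BlockingMitigation | null {
  if (!ctx.isSecureContext) return "secure-context";
  if (!ctx.hasGamepadGesture) return "gamepad-gesture";
  if (!ctx.isDocumentVisible) return "visibility";
  // Simplified 'self' default: same-origin frames pass, third-party frames
  // pass only if the embedder explicitly delegated the feature to them.
  const sameOrigin = ctx.frameOrigin === ctx.topLevelOrigin;
  const delegated = ctx.delegatedOrigins.indexOf(ctx.frameOrigin) !== -1;
  if (!sameOrigin && !delegated) return "feature-policy";
  return null;
}

function mayExposeGamepads(ctx: ExposureContext): boolean {
  return firstBlockingMitigation(ctx) === null;
}
```

Returning which mitigation blocked the call, rather than a bare boolean, would make it easier to log breakage per mitigation during a staggered rollout.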

The above are all technically easy to add, but because the API is about a decade old, it has a lot of usage... it also got a significant uptick in usage thanks to Covid (again, see [1]).

I spoke to @mt about how we might enable the above mitigations while minimizing breakage. Martin suggested a phased rollout, coordinated with the developer community (e.g., a blog post) and with other browser vendors, so that we all ship at roughly the same time and minimize developer confusion.

We should then do a staggered release... incrementally making our way through 1-4 of the above.

Thoughts?

[1] usage stats - cross origin sub-frame: https://www.chromestatus.com/metrics/feature/timeline/popularity/3054

Flags: needinfo?(kgilbert)
Depends on: 1591329
Depends on: 1640086

We (WebKit) already do #2.

I think Chrome and Firefox do, also. In fact I'd be flabbergasted if you didn't. It was a critical anti-fingerprinting measure in early iterations of the API.

I think we'd be happy to do 1, 3, and 4 also.

And I think we're probably less concerned about breakage than you are.

For 3, I think that "Don't allow the API to be used" is probably something slightly different. I don't quite understand the model, but it seems like the Gamepad object should remain fixed at the state it was in when the window loses focus and that no further events are fired until focus returns. It would also be OK to have all buttons released prior to loss of focus if that leads to better outcomes for pages.
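
One way to read the model described here is sketched below in TypeScript: the page keeps seeing the last state observed while focused, and hardware updates are dropped until focus returns. `PadSnapshot`, `FocusGatedPad`, and the `releaseOnBlur` option are hypothetical names for this illustration only. Dropping updates while unfocused (rather than queueing them) is an assumption of the sketch, not something the comment specifies.

```typescript
// Hypothetical snapshot of one gamepad's state as seen by a page.
interface PadSnapshot {
  buttons: boolean[]; // true = pressed
  axes: number[];     // each in [-1, 1]
}

function copySnapshot(s: PadSnapshot): PadSnapshot {
  return { buttons: s.buttons.slice(), axes: s.axes.slice() };
}

// Holds the state a page may observe. While the window is unfocused the
// page-visible state stays fixed and no update reaches the page.
class FocusGatedPad {
  private pageView: PadSnapshot;
  private focused = true;
  private releaseOnBlur: boolean;

  constructor(initial: PadSnapshot, releaseOnBlur = false) {
    this.pageView = copySnapshot(initial);
    this.releaseOnBlur = releaseOnBlur;
  }

  setFocused(focused: boolean): void {
    // Optional variant: report every button as released when focus is lost.
    if (this.focused && !focused && this.releaseOnBlur) {
      this.pageView.buttons = this.pageView.buttons.map(() => false);
    }
    this.focused = focused;
  }

  // Returns true if the update reached the page (i.e. an event could fire).
  applyHardwareUpdate(next: PadSnapshot): boolean {
    if (!this.focused) return false;
    this.pageView = copySnapshot(next);
    return true;
  }

  read(): PadSnapshot {
    return copySnapshot(this.pageView);
  }
}
```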

(In reply to Martin Thomson [:mt:] from comment #2)

For 3, I think that "Don't allow the API to be used" is probably something slightly different. I don't quite understand the model, but it seems like the Gamepad object should remain fixed at the state it was in when the window loses focus and that no further events are fired until focus returns. It would also be OK to have all buttons released prior to loss of focus if that leads to better outcomes for pages.

We already have (2) for the first intent. That means users have to press a button, move an axis, or even change a pose (IIRC) before an HTML document can see a gamepad. We still need to define a threshold (0.1f) for axis movement, though.
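
A minimal TypeScript sketch of such a first-intent check follows. The names `PadInputSample`, `AXIS_INTENT_THRESHOLD`, and `showsFirstIntent` are hypothetical, and measuring axis movement relative to the previous sample is an assumption; the threshold could equally be applied to absolute deflection from the rest position.

```typescript
// Threshold mentioned in the comment above; how it is applied is an assumption.
const AXIS_INTENT_THRESHOLD = 0.1;

// Hypothetical raw input sample for one gamepad.
interface PadInputSample {
  buttons: boolean[]; // true = pressed
  axes: number[];     // each in [-1, 1]
}

// Returns true if `next` shows a deliberate input compared to `prev`:
// a newly pressed button, or an axis that moved by more than `threshold`.
function showsFirstIntent(
  prev: PadInputSample,
  next: PadInputSample,
  threshold: number = AXIS_INTENT_THRESHOLD
): boolean {
  const buttonPressed = next.buttons.some(
    (pressed, i) => pressed && !prev.buttons[i]
  );
  if (buttonPressed) return true;
  // Axes missing from `prev` are treated as resting at 0.
  return next.axes.some(
    (value, i) => Math.abs(value - (prev.axes[i] ?? 0)) > threshold
  );
}
```

Small stick drift below the threshold is ignored, so a gamepad lying untouched on a desk would not count as activated in this model.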

(1) We partially have this for the recently implemented Gamepad APIs (GamepadLightIndicator and GamepadTouch).
(3) Does this mean that if the nsIDocument is not visible, we should not return anything from navigator.getGamepads() to users?
(4) I don't quite understand this one. Does it refer to cross-origin frames? We have a very basic implementation in test_gamepad_multitouch_crossorigin_iframe.html for GamepadTouch.[1] We can make Gamepad itself have it as well.

[1] https://searchfox.org/mozilla-central/rev/fc91a093e40dde71d10ad219946b8ae775aca9eb/dom/gamepad/GamepadManager.cpp#568

Blocks: 1643833
  2. Require a gesture on the gamepad itself to activate it (we may already support this, Kip do you know?)

For WebVR and WebXR, however, we require a regular (non-gamepad) user gesture to start an immersive VR session. This session is a prerequisite for accessing VR controllers as Gamepad objects.

In the VR case, the kind of gesture to start a VR session depends on whether the user is on a desktop with a VR headset attached as a peripheral or whether the VR headset is an all-in-one device that contains the computer itself. In the first case, the user would most often activate with the mouse by clicking in a regular 2d web page to start the session. In the second case, the VR controller is already active and the user is inside the headset viewing the 2d page before starting the immersive VR session controlled by the content in WebVR/WebXR. In this second case, the page sees an emulated mouse and keyboard; however, these are actually virtual objects the user is controlling with the VR controllers / gamepads.

If the user is visiting a 2d web page inside an all-in-one VR headset, the gesture on the gamepad itself may have been due to the user intending to interact as a mouse or by selecting characters from a virtual keyboard. Perhaps malicious sites may attempt to hijack these gestures to access the gamepad API if the gesture on the controller was the only protection.

Essentially, the controllers become modal -- sometimes acting as a mouse+keyboard, and other times acting as regular gamepads.

Perhaps @daosheng is familiar enough with the all-in-one Firefox Reality implementation to know if we enumerate the VR controllers as gamepads outside of WebVR/WebXR use cases?

Flags: needinfo?(kgilbert) → needinfo?(dmu)

For #4 - add feature policy: 'self' + "gamepad" - which will break all third-party frames...

This would not likely impact updated VR sites that have already shifted to the WebXR API, as that API already requires a feature policy, "xr-spatial-tracking". Unlike WebVR, WebXR returns "gamepad" objects for VR controllers, but enumerated through its own WebXR-specific function.

IMHO, a feature policy for the Gamepad API sounds like a good idea. If this is added, I would suggest that it apply to the Gamepad API's own enumeration function, with text written to be inclusive of VR controllers emulating keyboard and mouse input prior to activation. I suspect that this use case would be very similar to the use of gamepads for accessible input (e.g., Xbox accessible controllers driving keyboard macros).

In both of these cases, there is the notion of modal behavior for gamepads.

Perhaps it's also worth indicating to the user clearly that the gamepad is in use and/or providing an escape mechanism that can be performed on the controller itself? I imagine in cases where the sole input method is a gamepad / VR controller, that pages could trap users or attempt various phishing schemes.

For gamepad use cases outside WebXR/VR, we start enumerating gamepads when a page calls navigator.getGamepads(). Only after the user performs their first intent do we return the gamepad object to the page.
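
Reading this comment as "a gamepad is handed to the page only once the user has shown first intent on it", the behavior can be sketched in TypeScript as a simple filter. `EnumeratedPad` and `padsVisibleToPage` are hypothetical names, and returning `null` for slots that are not yet activated is an assumption of this sketch rather than a description of the Gecko implementation.

```typescript
// Hypothetical record the browser keeps for each connected gamepad.
interface EnumeratedPad {
  index: number;
  id: string;
  activated: boolean; // true once the user has shown first intent on this pad
}

// What a getGamepads()-style call would hand to the page in this model:
// connected pads are tracked internally, but only activated ones are exposed.
function padsVisibleToPage(
  connected: EnumeratedPad[]
): (EnumeratedPad | null)[] {
  return connected.map((pad) => (pad.activated ? pad : null));
}
```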

Supporting a feature policy is a good idea and I like it. If no UAs are against it, we have no reason not to do it.

Flags: needinfo?(dmu)
Assignee: nobody → cmartin
Severity: -- → S3
Status: NEW → ASSIGNED
Priority: -- → P3