1692609 - resist-fingerprinting makes it hard to have smooth (60hz) animations

Reporter

Description

•

4 years ago

We know some of why this is (performance.now being truncated), but I'd like to know more about where we want the trade-offs to be, and where we are along that line.

This is a problem because people obviously want to resist fingerprinting, but they definitely don't realize it'll have all the adverse effects they run into.

What we don't want is people turning on RFP and then ditching Firefox because they end up with bad experiences around the web.

Kelsey Gilbert [:jgilbert]

Reporter

Comment 1

•

4 years ago

Raised most recently via https://github.com/aframevr/aframe/issues/4793

Kelsey Gilbert [:jgilbert]

Reporter

Comment 2

•

4 years ago

I'm thinking that RFP should truncate to the RAF interval, which would still be pretty good, without causing this class of issue.

RFP either already restricts RAF to 60hz, or it should be made to to prevent fingerprinting on RAF_count/second?

Flags: needinfo?(tom)

Tom Ritter [:tjr]

Assignee

Comment 3

•

4 years ago

I think there's two questions here: What should we do for Firefox, and what should Tor do. Typically, we've taken the opinion that RFP exists for Tor, so RFP should do what Tor wants it to do. I'm not strictly opposed to doing something different in Firefox via some mechanism (e.g. through an additional pref Tor sets) but it seems like we should first figure out what Tor wants and ideally we could keep thing uniform.

Bug 1577243 is where we started applying reduced precision to the timestamp in rAF; Bug 1585589 is where we exempted the System Principal context. Bug 1662277 refactored this code to where it lives today.

As background. RFP is the acronym for Reduce Timer Precision and refers to the current-default mode of Firefox were we clamp+jitter to 1ms. RFP refers to Resist Fingerprinting. We have four types of Timer Clamping+Jittering:

Absolutely none - this is used for browser internal stuff.
Unconditional: this clamps to 20 ns and is in effect for (today) COOP/COEP documents and was always the case (pre-spectre) for performance.now. It is also used in Animation Timelines because it was hard to make RTP work there. No jitter. In the code this can be identified by RFPOnly mode.
RTP - the default for Firefox: 1ms + jitter
RFP - 100ms + jitter

My understanding from re-reading Bug Bug 1577243 is that the rAF timestamp wasn't originally being affected by any timer clamping not for RFP, not even to do the unconditional 20ns clamping. Animation Timelines however were being affected by Unconditional. To make Animation Timelines match up with rAF and resolve the test failures, we applied (unconditional) Timer Clamping to rAF which also make rAF do RFP clamping when RFP is enabled.

We're focusing on changing behavior for RFP mode here. We would want to keep Animation Timelines and rAF in sync, so if we change how rAF behaves we will also change how Animation Timelines behave.

We could switch Animation Timelines and rAF to Unconditional Clamping in RFP mode. (This would be a 'fifth' type of Timer Clamping+Jittering but what's one more?) I think the argument for this would be: rAF gets called on a regular interval, 16.666666666666667 ms (or 60 times a second) so even thought the timer in the function argument lies to you, you always know the time delta between function calls anyway. Additionally, rAF yields execution to the browser, so you can't set a global variable in an rAF callback and then immediately use it in another JS function and keep the precision.

I think the argument against this is that rAF only gets called on a regular 60 times a second when operating at high speed. It'll never go faster than that. It will go slower. So if rAF is called, hypothetically, 30 times a second because your computer is slow (either by hardware specs or because an attacker is slowing it down) the calls to rAF will get a high-resolution timer that is disclosing useful information. It seems feasible that an attacker could queue a lot of work in a rAF callback that pegs the system, and then the next rAF callback, look at the amount of work completed, and be able to glean useful information about the time deltas between the callbacks. I don't know if this attack would work. Jeff, what do you think about this concern?

Given that using timing attacks to glean information from a browser hasn't been seen in practice (I think?) in the wild, I am sympathetic to the argument that Tor's level of timer precision is hurting usability with minimal security gain. Maybe a revisit of the precision for all of RFP is in order, and we could get a big gain of usability by lowering it to 10ms. Or 20ms. or 50ms. (Whatever the highest value is that makes things work well?)

Flags: needinfo?(tom)

Flags: needinfo?(sysrqb)

Flags: needinfo?(jgilbert)

Flags: needinfo?(gk)

Kelsey Gilbert [:jgilbert]

Reporter

Comment 4

•

4 years ago

Fwiw bug 1666160 is "RFP can surprise users by being painful", which I see as one way to help this problem for most users affected by it. We'll keep the discussion here about "assuming a user wants the RFP tradeoffs, how can they be less painful".

In order for things to go mostly smoothly, I think we need our RAF target interval to be the basis for clamping, or at least related to it. I could see this being 16.67ms clamping, and mandating RAF operate at 60hz maximum under RFP. I think that's a reasonable tradeoff here. Thoughts?

(Note that even the clamping to 1ms for RTP would probably be smoother if it were 1000/60/16 = 1.041667 clamping, rather than one in three consecutive raf frames commonly having 1/16.67 = 6% variance from true presentation time)

It seems feasible that an attacker could queue a lot of work in a rAF callback that pegs the system, and then the next rAF callback, look at the amount of work completed, and be able to glean useful information about the time deltas between the callbacks. I don't know if this attack would work. Jeff, what do you think about this concern?

One of the mitigation options here is to only update "is it done" flags at frame boundaries, or some other acceptable/regular timer. (tjr:) Do we handle (via timer-gating?) this case for events to/from a worker?

Flags: needinfo?(jgilbert) → needinfo?(tom)

Updated

•

4 years ago

Comment 5

•

4 years ago

(That said, I love that the truncation to 1.00ms makes all timing code have pretty round numbers usually!)

Tom Ritter [:tjr]

Assignee

Comment 6

•

4 years ago

(In reply to Jeff Gilbert [:jgilbert] from comment #4)

In order for things to go mostly smoothly, I think we need our RAF target interval to be the basis for clamping, or at least related to it. I could see this being 16.67ms clamping, and mandating RAF operate at 60hz maximum under RFP. I think that's a reasonable tradeoff here. Thoughts?

16.67 ms seems like a good starting point for consideration (kind of a PITA to write tests for a non-terminating decimal...)

60hz maximum seems reasonable; is that something easy to implement?

It seems feasible that an attacker could queue a lot of work in a rAF callback that pegs the system, and then the next rAF callback, look at the amount of work completed, and be able to glean useful information about the time deltas between the callbacks. I don't know if this attack would work. Jeff, what do you think about this concern?

One of the mitigation options here is to only update "is it done" flags at frame boundaries, or some other acceptable/regular timer. (tjr:) Do we handle (via timer-gating?) this case for events to/from a worker?

If I understand your question, it's roughly "Do we do anything that affects the behavior of <anything> based on how we reduce time precision?" and the answer is not directly. An inability to make assertions about Animations working as before if RTP was applied to them was one example of deciding not to do this.

Fuzzyfox (whose code is still in-tree but will be ripped out at some point soonish) actually delays browser operations - it will sleep the main thread and other threads will await events firing from there; but our timer mitigations today typically munge a time value directly before it is exposed to JS.

In the past we had instances where we were storing munged values and it led to bugs - so I'm not aware of situations where we store a munged value and then base logic on it, but it's not out of the question. But in general, gating browser behavior based on timer clamping/jittering is probably hard, and probably why Fuzzyfox was designed the way it was. I could be wrong about all this, but that's my educated guesswork.

Flags: needinfo?(tom)

Kelsey Gilbert [:jgilbert]

Reporter

Comment 7

•

4 years ago

RAF rate is controlled via layout.frame_rate, where -1 is 'use vsync' and e.g. 50 means 50Hz. RFP should set that to 60 if it doesn't already. (this will negatively affect e.g. 90hz displays) 30hz is another option that works fine for 90hz, but is a little chunky-feeling.

16.67 is indeed pretty annoying, but we're supposed to avoid magic numbers anyways, eh? :) const MS_60HZ= 1000 / 60;

I think about the best we can do is to use RAF interval as our time-atom. For example, it looks like CSS Animations (for example) only update once a tick, which is good.

Kelsey Gilbert [:jgilbert]

Reporter

Comment 8

•

4 years ago

Attached file Bug 1692609 - Force 60hz for RFP and use that as the time-atom. — Details

Phabricator Automation

Updated

•

4 years ago

Assignee: nobody → jgilbert

Status: NEW → ASSIGNED

Tom Ritter [:tjr]

Assignee

Comment 9

•

4 years ago

FYI, I gave this a r+ so we can move it forward and experience it on Nightly. Tor Browser for Android is based off release (not ESR), but the old behavior can be reverted to in TB by specifying privacy.resistFingerprinting.reduceTimerPrecision.microseconds = 100000. (Well, except for the locking-the-display-to-60Hz aspect of the patch, but that's an improvement to things anyway.)

Georg Koppen

Updated

•

4 years ago

Flags: needinfo?(gk)

BugBot [:suhaib / :marco/ :calixte]

Comment 10

•

4 years ago

There's a r+ patch which didn't land and no activity in this bug for 2 weeks.
:jgilbert, could you have a look please?
For more information, please visit auto_nag documentation.

Flags: needinfo?(jmuizelaar)

Flags: needinfo?(jgilbert)

Jeff Muizelaar [:jrmuizel]

Updated

•

4 years ago

Flags: needinfo?(jmuizelaar)

Gian-Carlo Pascutto [:gcp]

Comment 11

•

4 years ago

Bad bot.

Flags: needinfo?(jgilbert)

Tom Ritter [:tjr]

Assignee

Comment 12

•

4 years ago

Attached file Bug 1692609 - Force 60hz for RFP and use that as the time-atom r?tjr — Details

Tom Ritter [:tjr]

Assignee

Comment 13

•

4 years ago

Attached file Bug 1692609 - Update tests for a higher-precision RFP r?jgilbert (obsolete) — Details

Depends on D122045

Tom Ritter [:tjr]

Assignee

Comment 14

•

4 years ago

Jeff - I took a look at the test failures for this patch and think I resolved them. There were some failures on try but they seem unrelated and there's always a host of failures for a really full run.

https://treeherder.mozilla.org/jobs?repo=try&revision=5ed43b930879a7cc1344c35c9619313b0ebe0dde
https://treeherder.mozilla.org/push-health/push?revision=5ed43b930879a7cc1344c35c9619313b0ebe0dde&repo=try&testGroup=pr&selectedTest=dommediatesttestmediarecorderplaybackcanrepeathtml&selectedTaskId=&regressionsGroupBy=path&selectedJobName=dom%2Fmedia%2Ftest%2Ftest_mediarecorder_pause_resume_video.html+test-windows10-32-qr%2Fopt-mochitest-media-fis-gli-e10s-3

The most convincing failure is dom/media/test/test_mediarecorder_playback_can_repeat.html - there are several failures on mediarecorder on Windows so that might possible be something...

Kelsey Gilbert [:jgilbert]

Reporter

Comment 15

•

4 years ago

I worry that you're missing changes that I had made locally, but hadn't posted yet, but I'm happy to have to chip away at this.
Ping the assignee next time before taking over patches, please!

Assignee: jgilbert → tom

Kelsey Gilbert [:jgilbert]

Reporter

Comment 16

•

4 years ago

(I also would have preferred that you left the initial changes in my name, so we know who to blame :) )

Phabricator Automation

Updated

•

4 years ago

Attachment #9235229 - Attachment description: Bug 1692609 - Force 60hz for RFP and use that as the time-atom r?jgilbert → Bug 1692609 - Force 60hz for RFP and use that as the time-atom r?tjr

Pulsebot

Comment 17

•

4 years ago

Pushed by tritter@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/d36f5f4ece1e Force 60hz for RFP and use that as the time-atom r=tjr https://hg.mozilla.org/integration/autoland/rev/ce4b4f786c4d Update tests for a higher-precision RFP r=jgilbert

Atila Butkovits

Comment 18

•

4 years ago

Backed out for causing failures at test_getUserMedia_addTrackRemoveTrack.

Backout link: https://hg.mozilla.org/integration/autoland/rev/36974366b0fbc17ac7e2ee7e6a39ce6cead207fa

Push with failures: https://treeherder.mozilla.org/jobs?repo=autoland&selectedTaskRun=QsyZqR5bRXKbKkeerfvALw.0&resultStatus=testfailed%2Cbusted%2Cexception%2Cretry%2Cusercancel&revision=ce4b4f786c4d5d90da86ebaacd8e445be50d2ee2

Failure log:

Flags: needinfo?(tom)

Tom Ritter [:tjr]

Assignee

Comment 19

•

4 years ago

After a bunch of bisections on try; I eventually figured out that the test failures are due to this code:

+  Preferences::RegisterCallback(
+      FrameRatePrefChanged,
+      nsDependentCString(
+          StaticPrefs::GetPrefName_privacy_resistFingerprinting()));

I couldn't reproduce any of these failures locally so I'm not sure why this would be causing a problem. Jeff, would you have a guess? If not, you can pass back the NI and I'll do a bunch of MOZ_LOG debugging...

Flags: needinfo?(tom) → needinfo?(jgilbert)

Thorin [:thorin]

Comment 20

•

4 years ago

Hows does this fit in with Bug 1442863 ? Does it solve it?

at the very least animations should be 30fps (33ms) or preferably 60fps (17ms)

Thorin [:thorin]

Comment 21

•

4 years ago

FYI, Bug 1589060 is another one that could benefit from 60fps (if you wanted that)

Kelsey Gilbert [:jgilbert]

Reporter

Comment 22

•

4 years ago

(In reply to Tom Ritter [:tjr] (OOTO until Aug 23rd) from comment #19)

After a bunch of bisections on try; I eventually figured out that the test failures are due to this code:
+  Preferences::RegisterCallback(
+      FrameRatePrefChanged,
+      nsDependentCString(
+          StaticPrefs::GetPrefName_privacy_resistFingerprinting()));
I couldn't reproduce any of these failures locally so I'm not sure why this would be causing a problem. Jeff, would you have a guess? If not, you can pass back the NI and I'll do a bunch of MOZ_LOG debugging...

Oh! We probably just need to dispatch to the main thread for FrameRatePrefChanged, that's probably our threading issue then.

Flags: needinfo?(jgilbert)

Tom Ritter [:tjr]

Assignee

Comment 23

•

4 years ago

Is this what you had in mind? It doesn't seem to have fixed it.

Flags: needinfo?(jgilbert)

Kelsey Gilbert [:jgilbert]

Reporter

Comment 27

•

4 years ago

Better: https://treeherder.mozilla.org/jobs?repo=try&revision=d5f3c3e70eeb71e69cb077fe26c3bd299ec66ec0
But not as green as reference: https://treeherder.mozilla.org/jobs?repo=try&revision=a41b91b1026ffb0ee6cf5ca56b589c54b6483457

Flags: needinfo?(jgilbert)

Tom Ritter [:tjr]

Assignee

Comment 28

•

4 years ago

Looked at this again rebasing and running your patches. Confirmed that the issue is the listening for the RFP pref change. Remove that and it's green. I wondered if moving back and forth from a framterate of 60 for RFP and -1 from layout.frame_rate was causing an issue, so I set layout.frame_rate to 55 but it didn't make a difference.

At this point I'm wondering what the consequence of not listening for the pref change would be? It wouldn't affect people who have RFP enabled all the time. I don't know how often users toggle RFP on and off in a single session (I do it occasionally myself for sure though...) but would the consequence be significant?

Thorin [:thorin]

Comment 29

•

4 years ago

I don't know how often users toggle RFP on and off in a single session (I do it occasionally myself for sure though...) but would the consequence be significant?

Only per eTLD+1, where e.g. workers can retain the original state (even new instances), but the top level document doesn't. My understanding is that when we add site exceptions, it would behave much like Brave's Shields, where toggling the brave mode reloads the page (in our case it might be a message to say reload the page for changes to take affect). With multiple tabs from the same domain open, IDK what happens there (with Brave or what your thoughts are). If/when we give RFP a UI front, toggling it can also show the same message (reload web pages to take affect).

So I guess you really need to decide how RFP should work

Hsin-Yi Tsai (she/her) [:hsinyi]

Updated

•

3 years ago

Severity: -- → S3

Tom Ritter [:tjr]

Assignee

Comment 30

•

3 years ago

I rebased and sent in a try run which doesn't seem to reproduce the intermittents anymore (possibly after the changes in Bug 1765399 ....)

https://treeherder.mozilla.org/jobs?repo=try&revision=4bfcc119125da296149cdf03ea3141f541cbe124

Flags: needinfo?(jgilbert)

Comment 31

•

3 years ago

:D

Flags: needinfo?(jgilbert)

Kelsey Gilbert [:jgilbert]

Reporter

Comment 32

•

3 years ago

Let's land it then!

BugBot [:suhaib / :marco/ :calixte]

Comment 33

•

3 years ago

The severity field for this bug is relatively low, S3. However, the bug has 3 duplicates.
:tjr, could you consider increasing the bug severity?

For more information, please visit auto_nag documentation.

Flags: needinfo?(tom)

Tom Ritter [:tjr]

Assignee

Comment 34

•

3 years ago

Okay; I've sent in a full try run here: https://treeherder.mozilla.org/#/jobs?repo=try&revision=b61436faa9fd28ac40a138ce2ef3feec185b7cd4 and if that looks good I'll post two new patches for review.

Flags: needinfo?(tom)

Flags: needinfo?(sysrqb)

Tom Ritter [:tjr]

Assignee

Comment 35

•

3 years ago

Push health has a bunch of failures, of course... these ones stood out to me:

gfx/layers/apz/test/mochitest/test_group_keyboard.html fails on all (I think?) Windows platforms - but this test has a bunch of different intermittent bugs: Bug 1761669, Bug 1771835, Bug 1771836, Bug 1659729....

browser/base/content/test/performance/browser_startup_flicker.js seems like a relevant test to this change; and it fails consistently on one flavor of platform: x64 opt devedition-qr swr fis - so it makes me think it might be okay if it's not failing on any other platform?

browser/base/content/test/performance/browser_vsync_accessibility.js also seems relevant - it fails on x64 opt linux qr swr fis only.

I'm going to post the patches and let you take a look at the results - let me know what you think we should do.

Tom Ritter [:tjr]

Assignee

Comment 36

•

3 years ago

Attached file Bug 1692609 - Force 60hz for RFP and use that as the time-atom r?jgilbert — Details

Tom Ritter [:tjr]

Assignee

Comment 37

•

3 years ago

Attached file Bug 1692609 - Update tests for a higher-precision RFP r?jgilbert — Details

Depends on D148122

Pulsebot

Comment 38

•

3 years ago

Pushed by tritter@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/ca7e6cc96539 Force 60hz for RFP and use that as the time-atom r=jgilbert https://hg.mozilla.org/integration/autoland/rev/25dd05e29a4b Update tests for a higher-precision RFP r=jgilbert

Treeherder Bug Filer

Updated

•

3 years ago

Regressions: 1772418

Norisz Fay [:noriszfay]

Comment 39

•

3 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/ca7e6cc96539
https://hg.mozilla.org/mozilla-central/rev/25dd05e29a4b

Status: ASSIGNED → RESOLVED

Closed: 3 years ago

status-firefox103: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → 103 Branch

Thorin [:thorin]

Comment 40

•

3 years ago

tested: clearly a lot smoother: https://ciechanow.ski/mechanical-watch/ - I played them side by side in nightly vs beta with RFP

Tom Ritter [:tjr]

Assignee

Comment 41

•

3 years ago

[Tracking Requested - why for this release]: It would be great if we could uplift this to 102 so Tor Browser can get it in the ESR without them needing to track the patch themselves.

tracking-firefox102: --- → ?

Pascal Chevrel:pascalc

Comment 42

•

3 years ago

Tom, could you make an uplift request? Thanks

status-firefox102: --- → affected

status-thunderbird_esr102: --- → affected

tracking-firefox102: ? → +

Flags: needinfo?(tom)

Tom Ritter [:tjr]

Assignee

Comment 43

•

3 years ago

Comment on attachment 9279416 [details]
Bug 1692609 - Force 60hz for RFP and use that as the time-atom r?jgilbert

Beta/Release Uplift Approval Request

User impact if declined: Tor will have to carry an additional patchset
Is this code covered by automated tests?: Yes
Has the fix been verified in Nightly?: Yes
Needs manual test from QE?: No
If yes, steps to reproduce:
List of other uplifts needed: None
Risk to taking this patch: Low
Why is the change risky/not risky? (and alternatives if risky): Only affects Resist Fingerprinting Mode
String changes made/needed:
Is Android affected?: Yes

Flags: needinfo?(tom)

Attachment #9279416 - Flags: approval-mozilla-beta?

Tom Ritter [:tjr]

Assignee

Updated

•

3 years ago

Attachment #9279417 - Flags: approval-mozilla-beta?

Ryan VanderMeulen [:RyanVM]

Updated

•

3 years ago

status-thunderbird_esr102: affected → ---

Pascal Chevrel:pascalc

Comment 44

•

3 years ago

Comment on attachment 9279416 [details]
Bug 1692609 - Force 60hz for RFP and use that as the time-atom r?jgilbert

Approved for 102 beta 6, thanks.

Attachment #9279416 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

Pascal Chevrel:pascalc

Comment 45

•

3 years ago

Comment on attachment 9279417 [details]
Bug 1692609 - Update tests for a higher-precision RFP r?jgilbert

Approved for 102 beta 6, thanks.

Attachment #9279417 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

Pascal Chevrel:pascalc

Comment 46

•

3 years ago

bugherder uplift

https://hg.mozilla.org/releases/mozilla-beta/rev/1bd5a00e2dec
https://hg.mozilla.org/releases/mozilla-beta/rev/c0a6e25511f8

status-firefox102: affected → fixed

Thorin [:thorin]

Updated

•

3 years ago

Regressed by: 1773560

Thorin [:thorin]

Updated

•

3 years ago

No longer regressed by: 1773560

Regressions: 1773560

Tom Ritter [:tjr]

Assignee

Updated

•

3 years ago

Regressions: 1774247

Tom Ritter [:tjr]

Assignee

Updated

•

2 years ago

Duplicate of this bug: 1589060

Phabricator Automation

Updated

•

7 months ago

Attachment #9235230 - Attachment is obsolete: true

Bug 1692609 - Force 60hz for RFP and use that as the time-atom. 4 years ago Kelsey Gilbert [:jgilbert] 48 bytes, text/x-phabricator-request		Details \| Review
Bug 1692609 - Force 60hz for RFP and use that as the time-atom r?tjr 4 years ago Tom Ritter [:tjr] 48 bytes, text/x-phabricator-request		Details \| Review
Bug 1692609 - Update tests for a higher-precision RFP r?jgilbert 4 years ago Tom Ritter [:tjr] 48 bytes, text/x-phabricator-request		Details \| Review
Bug 1692609 - Force 60hz for RFP and use that as the time-atom r?jgilbert 3 years ago Tom Ritter [:tjr] 48 bytes, text/x-phabricator-request	pascalc : approval-mozilla-beta+	Details \| Review
Bug 1692609 - Update tests for a higher-precision RFP r?jgilbert 3 years ago Tom Ritter [:tjr] 48 bytes, text/x-phabricator-request	pascalc : approval-mozilla-beta+	Details \| Review