Closed Bug 853198 Opened 7 years ago Closed 7 years ago

Defend against bad system clocks

Categories

(Toolkit :: Application Update, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla25

People

(Reporter: gps, Assigned: robert.strong.bugs)

References

Details

Attachments

(1 file, 2 obsolete files)

A number of important Firefox features rely on the accurate recording of absolute clock time. For example, everything tied off nsIUpdateTimerManager (blocklist ping, update ping, Places maintaineance, ...) uses preferences to store the last wall time a timer fired. This is used to calculate the next time the timer should fire. This is all fine in an ideal world.

Unfortunately, we live in the real world. System clocks are wrong and they behave erratically. They drift over time. They may jump around by hours or years at a time. They may not be accurate when resuming from sleep. It's quite amazing how often clocks fail in the real world. Yet here we are, assuming the system clock can be trusted.

Currently, nsIUpdateTimerManager (and a number of other product features storing absolute times) rely on the consistent behavior of system clocks to function properly. If the system clock misbehaves, bad things happen.

For example, say the system clock reports the current year is 2020. While it's reporting so, we send a blocklist ping. nsIUpdateTimerManager now says the last blocklist ping occurred in 2020 and schedules the next ping also for the year 2020. The blocklist ping never fires again. That's not good.

I believe we use the same mechanism for app update pings. This scenario could cause clients to never automatically check for updates, leading to poor adoption of new product versions. Sad panda.

*Anything that relies on system clocks needs to defend against bad system clocks.*

We have a number of options available to us. Here are some in the tree today:

1) Firefox Health Report detects when the next scheduled upload is more than a few days out from the currently-reported time and it readjusts the scheduled time accordingly (bug 809954).
2) Android Product Announcements uses the time of the build as a lower bound for expected times (bug 838416). If times are seen before the binary was built, they are ignored.

I think a good place to start is to fix nsIUpdateTimerManager to better handle bad system clocks. It should definitely handle the scenario where events are scheduled an unreasonable amount of time in the future. e.g. if the next scheduled time is greater than 2-3x the regular firing interval, it should move the schedule forward.

Long term, we may consider a product feature that alerts users when their system clocks are wrong. Bug 818339 was kinda filed on this. No clue if there will be movement on it. Doing that right is hard. But, even if we fix system clocks, they can still jump around. There's no getting around having to handle that.

We may also consider a "clock santity enforcing service." We could have preferences register themselves as "I contain a clock value that should be accurate." The "clock sanity enforcing service" could periodically look for badness and correct it.

Finally, there's a chance that nsIUpdateTimerManager is properly handling forward skewed clocks. But, I don't see any obvious code doing it, any comments about the issue, or any tests covering the scenario. So, I'm inclined to believe it does not. All it takes is the timer firing once with a way-in-the-future date and the timer never fires again. Sadness.
This fell off my todo list due to other work a while ago and wholeheartedly agree. Sanitizing the pref values during profile-after-change should be doable to fix this.
In the broad sense of this bug, see also Bug 851380 — we shouldn't be using timestamps where some other form of sequencing is possible.

And theoretically, these alternative clocks can be used to trigger "pseudo-time-based" activity — for example, we could use the history vector clock to trigger history pruning, rather than an idle daily timer. We could even use that same clock as a fallback secondary trigger for the update service — it indicates the passage of time in an active browser, and the occasional duplicate update check is cheap insurance.

The product announcements code in Bug 838416 actually employs a range of logic to defend against bad clocks, because it has some conception of valid values and how its own events work. Those strategies can be layered elsewhere in the tree, I imagine.
Agreed as long as the complexity doesn't add additional risk (e.g. the new clock mechanism breaks and thereby breaks auto-updates for a channel) or more bugs if a simple fix as outlined in comment #1 fixes this bug.
Assignee: nobody → robert.bugzilla
Status: NEW → ASSIGNED
Attachment #769416 - Flags: review?(netzen)
Comment on attachment 769416 [details] [diff] [review]
patch rev1 - reset time last update time if it is greater than current time

Missed consumers that use registerTimer
Attachment #769416 - Attachment is obsolete: true
Attachment #769416 - Flags: review?(netzen)
Comment on attachment 769653 [details] [diff] [review]
patch rev2

Review of attachment 769653 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/mozapps/update/nsUpdateTimerManager.js
@@ +215,5 @@
>        let timerData = this._timers[timerID];
> +      // If the last update time is greater than the current time then reset
> +      // it to 0 and the timer manager will correct the value when it fires
> +      // next for this consumer.
> +      if (timerData.lastUpdateTime > now) {

nit: Why is this one "> now" but the other checks are ">= now"?
I think all should either be > or >=
Attachment #769653 - Flags: review?(netzen) → review+
(In reply to Brian R. Bondy [:bbondy] from comment #7)
> Comment on attachment 769653 [details] [diff] [review]
> patch rev2
> 
> Review of attachment 769653 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: toolkit/mozapps/update/nsUpdateTimerManager.js
> @@ +215,5 @@
> >        let timerData = this._timers[timerID];
> > +      // If the last update time is greater than the current time then reset
> > +      // it to 0 and the timer manager will correct the value when it fires
> > +      // next for this consumer.
> > +      if (timerData.lastUpdateTime > now) {
> 
> nit: Why is this one "> now" but the other checks are ">= now"?
> I think all should either be > or >=

Late night silly mistake.
Carrying forward r+
Attachment #769653 - Attachment is obsolete: true
Attachment #770575 - Flags: review+
Pushed to mozilla-inbound
https://hg.mozilla.org/integration/mozilla-inbound/rev/657f8910c112
Flags: in-testsuite+
Target Milestone: --- → mozilla25
https://hg.mozilla.org/mozilla-central/rev/657f8910c112
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.