Closed Bug 1040741 Opened 6 years ago Closed 3 years ago

Track and record clock skew client side for Telemetry

Categories

(Toolkit :: Telemetry, defect, P4)

RESOLVED WONTFIX

People

(Reporter: gerv, Unassigned, Mentored)

Details

(Whiteboard: [measurement:client])

(Hope this is the right place for this bug.)

I'd like to get some info from telemetry on OS clock skew. Ideally, I'd like to plot a chart with skew along the bottom and frequency of occurrence up the side, so we can say things like "98% of clients have a clock skew of less than 2 hours".
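As a rough illustration of the statistic described above, here is a minimal Python sketch that computes the share of clients whose absolute skew falls under a limit. The function name and the sample values are made up for illustration; real input would come from whatever skew measurement Telemetry ends up recording.

```python
def fraction_within(skews_seconds, limit_seconds):
    """Return the fraction of samples whose absolute skew is below the limit."""
    if not skews_seconds:
        raise ValueError("no skew samples")
    within = sum(1 for s in skews_seconds if abs(s) < limit_seconds)
    return within / len(skews_seconds)


# Made-up sample skews in seconds (positive = client clock ahead of the server).
samples = [3, -12, 45, 7300, -86400, 0, 1, -2, 600, 30]

two_hours = 2 * 60 * 60
print(f"{fraction_within(samples, two_hours):.0%} of clients are within 2 hours")
```

Evaluating the same function over a range of limits would give the points for the cumulative chart.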

Due to lack of knowledge about how the telemetry system works, I don't know if it's possible to work this out without requiring Firefox to obtain a canonical source of time (which would require shipping an NTP client, or accessing a webservice, or something), or whether we could just submit the time the client thinks it is in a ping and let the server work it out, or what. Suggestions welcomed.

There's no need for sub-second accuracy here.

Gerv
Flags: needinfo?(benjamin)
Similar to bug 818339. The client should send its submission time along with the ping, maybe in an HTTP header rather than in the payload itself, since we don't always send the payload at the same time as we write it. The server should record its (presumably accurate) time and add both of those to the payload, from which we can calculate the delta.
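A minimal sketch of the server-side half of this idea, in Python. It assumes the client sends its submission time as an ISO 8601 string in some request header; the function name and the three payload field names are hypothetical, not part of any existing Telemetry server code.

```python
from datetime import datetime, timezone


def annotate_with_skew(payload, client_time_header, server_now=None):
    """Return a copy of the payload with both timestamps and their delta added.

    client_time_header is assumed to be an ISO 8601 timestamp sent by the
    client at submission time (hypothetical format).
    """
    server_now = server_now or datetime.now(timezone.utc)
    client_time = datetime.fromisoformat(client_time_header)
    if client_time.tzinfo is None:
        # Assumption: a timestamp without an offset is treated as UTC.
        client_time = client_time.replace(tzinfo=timezone.utc)

    annotated = dict(payload)
    annotated["clientSubmissionTime"] = client_time.isoformat()
    annotated["serverReceiptTime"] = server_now.isoformat()
    # Positive delta means the client clock is ahead of the server clock.
    annotated["clockSkewSeconds"] = round((client_time - server_now).total_seconds())
    return annotated


server_time = datetime(2014, 7, 18, 12, 0, 0, tzinfo=timezone.utc)
result = annotate_with_skew({"reason": "saved-session"},
                            "2014-07-18T12:05:00+00:00", server_time)
print(result["clockSkewSeconds"])
```

The delta also includes network transit time, which should not matter at the accuracy asked for here.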

This is not trivial on the client or the server, so I am currently not prioritizing it, but I'd take patches.
Flags: needinfo?(benjamin) → needinfo?(mreid)
I think we should track clock skew as "just another histogram" on the client.

Section 14.18 of http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html defines a "Date" header that should be included in all server responses. The client can examine the value of this header after each payload is submitted, calculate the local clock skew, and update a histogram.

We would then need to make sure that the telemetry server is sending this header and that its clock is synchronized with ntp or similar.

This seems simpler to me - we must already have a reliable way to parse HTTP dates in the client.
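The approach above could be sketched as follows. This is Python rather than the client's actual language, and the bucket limits and function names are assumptions for illustration only; `email.utils.parsedate_to_datetime` is the standard-library parser for this date format.

```python
from collections import Counter
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Hypothetical bucket upper bounds, in seconds of absolute skew.
BUCKET_LIMITS = [60, 600, 3600, 7200, 86400]


def skew_seconds(date_header, local_now):
    """Local clock minus the server's Date header; positive = client ahead."""
    server_time = parsedate_to_datetime(date_header)
    return (local_now - server_time).total_seconds()


def bucket_for(skew):
    """Map a skew value to the label of the first bucket that contains it."""
    magnitude = abs(skew)
    for limit in BUCKET_LIMITS:
        if magnitude < limit:
            return f"<{limit}s"
    return f">={BUCKET_LIMITS[-1]}s"


histogram = Counter()

# Example: the server says 12:00:00 GMT while the local clock reads 14:30:00 UTC.
local_now = datetime(2014, 7, 18, 14, 30, 0, tzinfo=timezone.utc)
skew = skew_seconds("Fri, 18 Jul 2014 12:00:00 GMT", local_now)
histogram[bucket_for(skew)] += 1
print(skew, dict(histogram))
```

The Date header only has one-second resolution and the measurement includes response latency, which is acceptable given that sub-second accuracy is not needed.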
Flags: needinfo?(mreid)
mreid: that sounds like a good idea indeed. Are you in a position to write the client side patch? (We can file a bug to get whoever maintains the server to speak about its clock accuracy or lack thereof, and perhaps start ntpd.)

Gerv
No, but I can take care of the server part :)
Depends on: 1040858
Mentor: rnewman
See Also: → 818339
OS: Linux → All
Hardware: x86 → All
Priority: -- → P3
Whiteboard: [measurement:client]
Duplicate of this bug: 818339
Priority: P3 → P4
Note that in bug 1264914 (with context bug 712612) we are already collecting clock skew information based on server-side responses in a pref (services.kinto.clock_skew_seconds).

We could consider submitting that if present.
See Also: → 1264914
Yes, can we hook the value of that pref up to telemetry? Surely that can't be hard...

rbarnes: is there someone you can task with this patch?

Gerv
Flags: needinfo?(rlb)
Mark / Tarek?  Should be a quick patch to add telemetry collection.  Would also be nice to collect the gap since PREF_KINTO_LAST_UPDATE if it's easy.

Gerv: FYI, this will only give you a bounded notion of clock skew.  This query relies on the Kinto ping succeeding, so if the clock skew is so bad that it causes that ping to fail, then we won't get telemetry on it.
Flags: needinfo?(tarek)
Flags: needinfo?(rlb)
Flags: needinfo?(mgoodwin)
We already will get clock skew data in Telemetry from bug 1144778.
Submitting the pref value surely is not hard, but making sure that this is a robust measure potentially is. The pref value mentioned serves a different use-case than this bug and has some data quality constraints.

If it's useful to specifically submit the kinto clock skew value for that team, let's do that in a separate bug.
(In reply to Georg Fritzsche [:gfritzsche] from comment #9)
> We already will get clock skew data in Telemetry from bug 1144778.

Ah, super! So this bug is fixed, then? Where can we see a chart?

Gerv
Morphing this bug's title to reflect comment 2 et al.

Bug 1144778 has not landed yet, let's take further discussion there.
Summary: Get telemetry on OS clock skew → Track and record clock skew client side for Telemetry
See also: bug 712612

We also get a timestamp as part of the TLS error reports. Since I'm working on bug 1253545 this week, we'll see how clock skew looks for our TLS error reporting population too.
Flags: needinfo?(mgoodwin)
Flags: needinfo?(tarek)
We have had bug 1144778 in place for a while, sending the client date with Telemetry requests.
That is available for clock skew analysis in re:dash etc.

I don't think there is any direct need for this bug now and I'm inclined to close this.
mreid, any objections?
Flags: needinfo?(mreid)
It would be nice to close the loop with some visualization of the distribution of clock skew, but I think we have all the raw materials we need.

I'm OK to close this.
Flags: needinfo?(mreid)
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → WONTFIX