Uptake reported age seems to be wrong in some situations
Categories
(Firefox :: Remote Settings Client, defect, P1)
People
(Reporter: leplatrem, Unassigned)
Details
(Whiteboard: telescope poucave delivery-checks prod remotesettings-uptake-release/max-age)
Attachments
(1 file: 63.15 KB, image/png)
We report the age of obtained data when changes are pulled from the server.
For scheduled or startup synchronizations, the reported age increases as time passes (until a new change is published). In the uptake telemetry, we can observe this through the «sawtooth» shape of the graph (switchback?).
For broadcast synchronizations, however, the reported age should be roughly stable and should not follow the pattern described above.
Yet it does, which means we may have an issue in the code that reports the age.
One possibility is that client reconnections are handled as "broadcast", which would skew the values reported for genuine real-time broadcasts.
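To make the expected difference concrete, here is a minimal sketch of the two patterns. It is not the client code: the function names, the time units (hours) and the fixed `delivery_delay` are all assumptions made for illustration.

```python
from bisect import bisect_right


def reported_age(sync_time, publish_times):
    """Age of the newest published change at `sync_time`.

    `publish_times` must be sorted in ascending order.
    Returns None if nothing was published before `sync_time`.
    """
    idx = bisect_right(publish_times, sync_time)
    if idx == 0:
        return None
    return sync_time - publish_times[idx - 1]


def scheduled_ages(publish_times, sync_times):
    """Ages reported by clients that sync on their own schedule (hypothetical model)."""
    ages = [reported_age(t, publish_times) for t in sync_times]
    return [age for age in ages if age is not None]


def broadcast_ages(publish_times, delivery_delay):
    """Ages reported by clients notified `delivery_delay` after each publication."""
    return [reported_age(p + delivery_delay, publish_times) for p in publish_times]


# Changes published at hour 0 and hour 24 (made-up values).
publish_times = [0, 24]

# Scheduled syncs: the age grows until the next publication, then drops.
print(scheduled_ages(publish_times, [6, 12, 18, 30]))  # [6, 12, 18, 6]

# Broadcast syncs: the age stays close to the delivery delay.
print(broadcast_ages(publish_times, 1))  # [1, 1]
```

If reconnections were reported with the broadcast source, the second series would pick up values from the first, which is the suspicion described above.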
Reporter
Updated•6 years ago
Comment 1•5 years ago
I spent some time today trying to figure out if this could be due to client reconnections. I haven't done any actual tests using a live browser, but from looking at the code, it doesn't seem like reconnections should be a factor. The code is contained in https://dxr.mozilla.org/mozilla-central/source/dom/push/PushServiceWebSocket.jsm. We trigger reconnects at https://dxr.mozilla.org/mozilla-central/source/dom/push/PushServiceWebSocket.jsm#247-251. Reconnecting means tearing down the socket and "starting over", https://dxr.mozilla.org/mozilla-central/source/dom/push/PushServiceWebSocket.jsm#346-350, https://dxr.mozilla.org/mozilla-central/source/dom/push/PushServiceWebSocket.jsm#183. We don't keep any previous state around when we handle handshake replies, https://dxr.mozilla.org/mozilla-central/source/dom/push/PushServiceWebSocket.jsm#612-617, so if that's what we're getting, then we should still be seeing "hello" as the context.
Reporter
Comment 2•5 years ago
Thanks for digging into this!
This is the query that I used to look at the data: https://sql.telemetry.mozilla.org/queries/67038/source#169792
Although there are some spikes, it's true that when we get rid of noise (ignoring periods where too few events are received), we see much less of this sawtooth pattern.
So it could well be related to our query in Telemetry rather than to the client code.
Reporter
Comment 3•5 years ago
It can vary a lot by channel, which could mean it comes from Megaphone.
Reporter
Updated•5 years ago
Reporter
Updated•4 years ago
Reporter
Comment 4•3 years ago
We improved the Telemetry queries to remove noise, and ignore periods of time where few clients were reporting values. This seems to have fixed the problem.
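As a rough illustration of that kind of filtering (not the actual Telemetry query), the sketch below drops periods with too few reported values before aggregating. The bucket layout, the `min_events` threshold and the use of a median are assumptions chosen for the example.

```python
from statistics import median


def filter_sparse_periods(buckets, min_events):
    """Keep only the periods that received at least `min_events` reported ages."""
    return {
        period: ages
        for period, ages in buckets.items()
        if len(ages) >= min_events
    }


def median_age_per_period(buckets, min_events):
    """Median reported age per period, ignoring sparsely populated periods."""
    kept = filter_sparse_periods(buckets, min_events)
    return {period: median(ages) for period, ages in kept.items()}


# Made-up ages in seconds, grouped by hour; 01:00 has a single outlier report.
buckets = {
    "00:00": [60, 62, 58, 61],
    "01:00": [3000],
    "02:00": [59, 63, 60],
}

print(median_age_per_period(buckets, min_events=3))
# {'00:00': 60.5, '02:00': 60}
```

Without the threshold, the single report at 01:00 would show up as a spike in the graph even though it says little about the population as a whole.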