Odd spike in Linux clients on nightlies since 2020-06-02
Categories
(Data Platform and Tools :: General, defect, P1)
Tracking
(Not tracked)
People
(Reporter: chutten, Assigned: wlach)
Details
(Whiteboard: [dataquality])
Attachments
(1 file)
Spike in proportion of Linux clients on Nightly: https://sql.telemetry.mozilla.org/queries/71714/source#180020
Spike attributed to a configuration with CPU Model 63, 10 cores, 256GB of RAM, Linux 3.14.32-xxxx-grs-ipv6-64: https://sql.telemetry.mozilla.org/queries/71721/source
Apparently the locale is French
Comment 1•4 years ago
10-core machines are relatively uncommon. Removing this model type reduces the count of 10-core machines from 2154 to 78, so this specific configuration represents roughly a 26x increase. For reference, counts for other core counts:
- single core machines: ~900
- 2 and 4 core: > 20K
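The arithmetic behind the "26x" figure can be checked with a short sketch. It uses only the two counts quoted above; the variable names are illustrative and not taken from the linked queries.

```python
# Counts of 10-core machines quoted in this comment.
ten_core_with_config = 2154     # including the suspicious configuration
ten_core_without_config = 78    # after removing that configuration

# Clients attributable to the one configuration.
attributed = ten_core_with_config - ten_core_without_config

# Increase relative to the baseline of "ordinary" 10-core machines.
relative_increase = attributed / ten_core_without_config

print(attributed)
print(round(relative_increase, 1))
```

In other words, the single configuration adds about 26 times the baseline, so the total is about 27 times the baseline count.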
Reporter
Comment 2•4 years ago
That kernel version seems to be a custom kernel from the French server provider OVH.
Comment 3•4 years ago
This looks like a bot running from an OVH datacenter, as these pings come mostly from IPs belonging to the OVH SAS ISP (https://sql.telemetry.mozilla.org/queries/71768/source).
I'm not sure there are any good next steps to take about this particular case, but we should do some analysis of prevalence of datacenter IPs/ISPs as they may be inflating some internal metrics.
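As a rough illustration of what such a prevalence analysis could look like, here is a minimal sketch that computes each ISP's share of pings. It assumes the ISP name has already been resolved for every ping and is available as a plain list of strings; the function name `isp_shares` and the sample ISP names other than "OVH SAS" are hypothetical and do not reflect any actual telemetry schema.

```python
from collections import Counter


def isp_shares(isps):
    """Return each ISP's share of the given pings as a fraction of the total."""
    counts = Counter(isps)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {isp: n / total for isp, n in counts.items()}


# Made-up sample: one ISP name per ping.
sample = ["OVH SAS", "OVH SAS", "Example Residential ISP", "OVH SAS"]
shares = isp_shares(sample)
print(shares["OVH SAS"])
```

Comparing these shares against a list of known datacenter ISPs would give a first estimate of how much a metric could be inflated by automated clients.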
Assignee
Comment 4•4 years ago
(In reply to Arkadiusz Komarzewski [:akomar] from comment #3)
> This looks like a bot running from OVH datacenter as these pings come mostly from IPs from OVH SAS ISP (https://sql.telemetry.mozilla.org/queries/71768/source).
> I'm not sure there are any good next steps to take about this particular case, but we should do some analysis of prevalence of datacenter IPs/ISPs as they may be inflating some internal metrics.
Thanks :akomar, on further investigation it looks like most of these are actually "headless" (at least judging by the value of environment.system.gfx.headless): https://sql.telemetry.mozilla.org/queries/71778/source
Seems like an easy workaround might be to just adjust queries to filter out headless clients?
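A minimal sketch of that kind of filter, assuming each ping is available as a nested dict that mirrors the `environment.system.gfx.headless` path mentioned above. The helpers `get_path` and `drop_headless` are hypothetical names for illustration, not part of any Mozilla tooling, and the real queries would express the same condition in SQL.

```python
def get_path(record, path):
    """Walk a dotted path through nested dicts; return None if any step is missing."""
    node = record
    for key in path.split("."):
        if not isinstance(node, dict):
            return None
        node = node.get(key)
    return node


def drop_headless(pings):
    """Keep pings unless their headless flag is explicitly True."""
    return [
        p for p in pings
        if get_path(p, "environment.system.gfx.headless") is not True
    ]


# Made-up sample pings; "id" is an illustrative field only.
pings = [
    {"id": "a", "environment": {"system": {"gfx": {"headless": True}}}},
    {"id": "b", "environment": {"system": {"gfx": {"headless": False}}}},
    {"id": "c", "environment": {"system": {}}},
    {"id": "d"},
]
kept_ids = [p["id"] for p in drop_headless(pings)]
print(kept_ids)
```

Pings with a missing flag are kept here on purpose, so that older clients that do not report the field are not silently dropped; whether that is the right default is a judgment call.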
Comment 5•4 years ago
Assignee
Comment 6•4 years ago
We discussed this in #data-day-to-day (https://mozilla.slack.com/archives/CKFKC7Y1Y/p1591367498415600), which resulted in the above PR. I think this situation is rare and unlikely to happen on any channel other than Nightly, where the number of client pings is quite small and thus vulnerable to skew from an automated system. As such, we're leaning towards just adding some documentation and calling it a day.