Closed Bug 1305043 Opened 8 years ago Closed 8 years ago

Duplicate entries (by client_id) exist in longitudinal telemetry data.

Categories

(Cloud Services Graveyard :: Metrics: Pipeline, defect, P1)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mlopatka, Assigned: rvitillo)

Attachments

(1 file)

Some client_ids are returned in multiple rows when querying the longitudinal telemetry data. This should not be the case. A minimal working example that reproduces the issue can be found here: https://gist.github.com/mlopatka/bfd70c494d38aa18f0b30455299a8be9 The current dataset appears to contain 81147 duplicate entries, which may be affecting analyses that use this data.
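For readers without access to the gist, the kind of check described above can be sketched roughly as follows. This is not the code from the gist: the function name `find_duplicate_client_ids` and the toy rows are hypothetical, and the sketch assumes each row is a mapping with a `client_id` key.

```python
from collections import Counter


def find_duplicate_client_ids(rows):
    """Return {client_id: row_count} for every client_id seen in more than one row.

    Hypothetical helper for illustration; `rows` is any iterable of
    mappings that carry a "client_id" key.
    """
    counts = Counter(row["client_id"] for row in rows)
    return {cid: n for cid, n in counts.items() if n > 1}


# Toy stand-in for rows of the longitudinal dataset (made-up ids).
rows = [
    {"client_id": "a"},
    {"client_id": "b"},
    {"client_id": "a"},
    {"client_id": "c"},
    {"client_id": "b"},
    {"client_id": "a"},
]

duplicates = find_duplicate_client_ids(rows)

# Number of surplus rows, i.e. rows beyond the first one for each client_id.
extra_rows = sum(n - 1 for n in duplicates.values())
```

If no client_id appears in more than one row, `duplicates` is empty and `extra_rows` is 0. Whether the 81147 figure above counts affected client_ids or surplus rows is not stated in this report.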
Attached file patch
This issue is caused by a bug in the trimming logic for clients with very long histories.
Attachment #8794198 - Flags: review?(mreid)
Assignee: nobody → rvitillo
Points: --- → 1
Priority: -- → P1
Blocks: 1255755
Attachment #8794198 - Flags: review?(mreid) → review+
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Product: Cloud Services → Cloud Services Graveyard