Closed Bug 1129928 Opened 9 years ago Closed 9 years ago

Analyze pairs of url data to extract related domains

Categories

(Content Services Graveyard :: Tiles: Data Processing, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mzhilyaev, Unassigned)

References

Details

(Whiteboard: [story])

Analyze the url-pairs data collected via Bug #1110506.
The analysis will include:
1. frequency filtering 
 the match is described here: https://docs.google.com/a/mozilla.com/document/d/1o5DB-OFABV0Ze9ye9ve3gyBs-VHIsaLZtFDG58MQoKg/edit#heading=h.xwjr9eu9xn5c

 - high-frequency sites are excluded based on the random assumption
 - site pairs are ordered by how unlikely their observed co-occurrence is
 - verify that top pairs are meaningful
 - build clusters of "recommending" sites, whereby if any cluster site is in a user's history, any other site in the same cluster can be recommended

2. We may need to rerun the telemetry experiment to identify impressions coming from a single user
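
The steps in item 1 could be sketched roughly as below. This is only an illustration, not the math from the linked document: it assumes the input is a dict of observed co-occurrence counts per site pair, uses a simple log(observed / expected) score under an independence assumption, and forms clusters as connected components of high-scoring pairs. The names `score_pairs`, `build_clusters`, `recommend`, and the thresholds `max_site_share` and `min_score` are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def site_totals(pair_counts):
    """Sum, per site, the counts of every pair the site appears in."""
    totals = Counter()
    for (a, b), n in pair_counts.items():
        totals[a] += n
        totals[b] += n
    return totals


def score_pairs(pair_counts, max_site_share=0.5):
    """Drop high-frequency sites, then rank the remaining pairs.

    pair_counts maps (site_a, site_b) to an observed count (assumed > 0).
    Returns (score, site_a, site_b) tuples, most surprising pair first.
    """
    total = sum(pair_counts.values())
    # Frequency filter (assumed rule): a site present in more than
    # max_site_share of all observed pairs is treated as high-frequency.
    frequent = {s for s, n in site_totals(pair_counts).items()
                if n / total > max_site_share}
    kept = {p: n for p, n in pair_counts.items() if not (set(p) & frequent)}
    kept_total = sum(kept.values())
    marginals = site_totals(kept)
    scored = []
    for (a, b), n in kept.items():
        # Count expected if the two sites co-occurred independently.
        expected = marginals[a] * marginals[b] / kept_total
        scored.append((math.log(n / expected), a, b))
    scored.sort(reverse=True)
    return scored


def build_clusters(scored, min_score=0.0):
    """Group sites linked by pairs scoring at least min_score (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for score, a, b in scored:
        if score >= min_score:
            parent[find(a)] = find(b)
    clusters = defaultdict(set)
    for site in list(parent):
        clusters[find(site)].add(site)
    return list(clusters.values())


def recommend(history, clusters):
    """Return cluster sites not yet in history, for clusters the user touched."""
    history = set(history)
    recs = set()
    for cluster in clusters:
        if cluster & history:
            recs |= cluster - history
    return recs
```

Connected components are the loosest possible clustering: one spurious high-scoring pair can merge two unrelated groups, so the "verify that top pairs are meaningful" step would need to happen before clusters are built.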
Depends on: 1129932
Depends on: 1129934
Depends on: 1129938
Status: NEW → RESOLVED
Points: 13 → ---
Closed: 9 years ago
Resolution: --- → FIXED
Whiteboard: .? → [story]