Closed Bug 1378977 Opened 8 years ago Closed 5 years ago

Add document deduplication to topline_summary

Categories

(Data Platform and Tools :: General, enhancement, P3)

x86
macOS
enhancement
Points: 2

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: amiyaguchi, Unassigned)

Document deduplication was removed in bug 1377730 due to high overhead. It would be nice to have it inside topline_summary to prevent inflated numbers. Two things should be measured first, since either may be significant: the rate of document duplication and the overhead that `.drop_duplicates(columns)` adds.
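
For illustration only, here is a rough sketch of how the two quantities mentioned above could be estimated on a small sample. It is not the topline_summary code: it uses a plain-Python stand-in for a DataFrame's `drop_duplicates`, and the `document_id` and `hours` fields, the sample rows, and both helper functions are assumptions made up for the example.

```python
import time


def drop_duplicates(rows, columns):
    """Plain-Python stand-in: keep the first row for each distinct
    combination of values in `columns` (hypothetical helper)."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row[c] for c in columns)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def duplicate_rate(rows, columns):
    """Fraction of rows that deduplicating on `columns` would remove."""
    if not rows:
        return 0.0
    return 1.0 - len(drop_duplicates(rows, columns)) / len(rows)


# Hypothetical sample: four pings, one of which repeats a document_id.
pings = [
    {"document_id": "a", "hours": 1.0},
    {"document_id": "b", "hours": 2.0},
    {"document_id": "a", "hours": 1.0},
    {"document_id": "c", "hours": 0.5},
]

rate = duplicate_rate(pings, ["document_id"])

# Time the deduplication step on its own to get a feel for its overhead.
start = time.perf_counter()
deduped = drop_duplicates(pings, ["document_id"])
elapsed = time.perf_counter() - start

# Compare an aggregate with and without duplicates to see the inflation.
inflated_hours = sum(p["hours"] for p in pings)
deduped_hours = sum(p["hours"] for p in deduped)

print(f"duplicate rate: {rate:.2%}, dedup time: {elapsed:.6f}s")
print(f"hours with duplicates: {inflated_hours}, without: {deduped_hours}")
```

On a real dataset the timing would need to be taken on the actual DataFrame and cluster, since the cost of a distributed deduplication is dominated by factors this sketch does not model.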
Points: --- → 2
Depends on: 1329844
Priority: -- → P3

The topline_summary dataset is deprecated.

Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → WONTFIX
Component: Datasets: General → General