Closed Bug 1225140 Opened 8 years ago Closed 8 years ago

Further parallelize executive report

Categories

(Cloud Services Graveyard :: Metrics: Pipeline, defect, P1)

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: mreid, Assigned: mreid)

Currently the executive report needs to ingest data in order, one day at a time.

This makes it very time-consuming to run the report over a large date range, since the work can't easily be parallelized across analysis machines (beyond parallelizing by reporting period).

I would like to be able to run each day's report to produce a cuckoo filter containing that day's summary, then combine these summaries into a report.

This would mean that all the days could be run at the same time in parallel, hopefully speeding things up a lot.
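A minimal sketch of the idea, not the actual report code: each day is summarized independently, and the summaries are combined in a separate step. Everything here is an assumption for illustration. `RECORDS`, `summarize_day`, `combine`, and `run_report` are hypothetical names, a plain Python set stands in for the cuckoo filter, and a thread pool stands in for running days on separate analysis machines.

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import date, timedelta

# Hypothetical input: (day, client_id) records. A real run would read each
# day's data from the pipeline instead of an in-memory list.
RECORDS = [
    (date(2015, 11, 1), "a"),
    (date(2015, 11, 1), "b"),
    (date(2015, 11, 2), "b"),
    (date(2015, 11, 3), "c"),
]


def summarize_day(day):
    """Build one day's summary without looking at any other day.

    A plain set stands in for the cuckoo filter (assumption).
    """
    return day, {client for d, client in RECORDS if d == day}


def combine(summaries):
    """Merge per-day summaries, in date order, into report rows."""
    seen = set()
    rows = []
    for day, clients in sorted(summaries, key=lambda s: s[0]):
        new = clients - seen  # clients not present in any earlier day
        rows.append((day, len(clients), len(new)))
        seen |= new
    return rows, len(seen)


def run_report(start, num_days, workers=4):
    """Summarize all days concurrently, then combine once at the end."""
    days = [start + timedelta(days=i) for i in range(num_days)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        summaries = list(pool.map(summarize_day, days))
    return combine(summaries)


if __name__ == "__main__":
    rows, total = run_report(date(2015, 11, 1), 3)
    for day, active, new in rows:
        print(day, active, new)
    print("total clients:", total)
```

Only `combine` depends on date order, so the expensive per-day step can run in any order and on any machine.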
Blocks: 1175583
Assignee: nobody → mtrinkala
I provided mreid with a custom one-off version of the cuckoo filter. However, custom one-offs will not scale well, so mreid is testing longitudinal analysis using the Redshift databases instead.
Assignee: mtrinkala → nobody
Priority: -- → P1
Assignee: nobody → mreid
The change to use Redshift / SQL to power the exec report renders this obsolete.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WONTFIX
Product: Cloud Services → Cloud Services Graveyard