ADI counts vastly over-counted

RESOLVED FIXED

Status

RESOLVED FIXED
3 years ago
3 years ago

People

(Reporter: peterbe, Assigned: peterbe)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

(Assignee)

Description

3 years ago
This morning (March 31st, crontabber job running *for* March 30th), the fetch-adi-from-hive job failed to complete because the database password for the secondary destination was incorrect. I corrected that this morning. 

But the job had failed 11 times since I caught this. Each time, it would successfully run the Hive query, then insert into raw_adi and raw_adi_logs on prod but when it came to writing into stage's DB it failed and the whole app thus failed. For each re-try of the whole crontabber app it would accumulate more and more counts on the prod DB.
(Assignee)

Comment 1

3 years ago
To undo this, I wipe away the numbers first: 

breakpad=> begin;
BEGIN
breakpad=> delete from raw_adi_logs where report_date >= '2016-03-30'; delete from raw_adi where date >= '2016-03-30'; delete from product_adu where adu_date >= '2016-03-30';
DELETE 2702260
DELETE 2695070
DELETE 149
breakpad=> commit;
COMMIT
(Assignee)

Comment 2

3 years ago
Next, I wind back the clock on the crontabber state:

breakpad=> begin;
BEGIN
breakpad=> update crontabber set last_success = last_success - interval '1 day', next_run = now() where app_name = 'fetch-adi-from-hive';
UPDATE 1
breakpad=> commit;
COMMIT
(Assignee)

Comment 3

3 years ago
Now raw_adi and raw_adi_logs look much more normal:

breakpad=> select sum(count) from raw_adi_logs where report_date >= '2016-03-30';
    sum
-----------
 150251605
(1 row)

breakpad=> select sum(count) from raw_adi_logs where report_date >= '2016-03-29' and report_date < '2016-03-30';
    sum
-----------
 150999725
(1 row)
(Assignee)

Comment 4

3 years ago
And...

breakpad=> select sum(adi_count) from raw_adi where date >= '2016-03-30';
    sum
-----------
 150014271
(1 row)

breakpad=> select sum(adi_count) from raw_adi where date >= '2016-03-29' and date < '2016-03-30';
    sum
-----------
 150760431
(1 row)
(Assignee)

Comment 5

3 years ago
Next, we need to rewind so the adu-matview (which populates product_adu from the raw_adi table):

breakpad=> begin;
BEGIN
breakpad=> update crontabber set last_success = last_success - interval '1 day', next_run=now() where app_name='adu-matview';
UPDATE 1
breakpad=> commit;
COMMIT
(Assignee)

Comment 6

3 years ago
And build-adu-matview:

breakpad=> begin;
BEGIN
breakpad=> update crontabber set last_success = last_success - interval '1 day', next_run=now() where app_name='build-adu-matview';
UPDATE 1
breakpad=> commit;
COMMIT
(Assignee)

Comment 7

3 years ago
Now the product_adu numbers look much better:


breakpad=> select sum(adu_count) from product_adu where adu_date >= '2016-03-30' and adu_date < '2016-03-31';
    sum
-----------
 121287666
(1 row)
breakpad=> select sum(adu_count) from product_adu where adu_date >= '2016-03-29' and adu_date < '2016-03-30';
    sum
-----------
 122065764
(1 row)
(Assignee)

Comment 8

3 years ago
Going to run full backfill for march 30th.
(Assignee)

Comment 9

3 years ago
Backfill complete for Mar 30
Status: NEW → RESOLVED
Last Resolved: 3 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.