Closed
Bug 1023867
Opened 10 years ago
Closed 10 years ago
Failure on 'integer out of range' on duplicates checker - causing most matviews to fail (reports-clean depends on this)
Categories
(Socorro :: Database, task)
Socorro
Database
Tracking
(Not tracked)
RESOLVED
FIXED
89
People
(Reporter: selenamarie, Assigned: selenamarie)
Details
(Whiteboard: [DB change])
Attachments
(1 file)
Woke to this alert:
05:42 < nagios-phx1> | Wed 05:42:31 PDT [1398] sp-admin01.phx1.mozilla.com:Socorro Admin - crontab is CRITICAL: CRITICAL - duplicates
(DuplicatesCronApp) (http://m.mozilla.org/Socorro+Admin+-+crontab)
Had been occurring since:
22:55 < nagios-phx1> | Tue 22:55:53 PDT [1135] sp-admin01.phx1.mozilla.com:Socorro Admin - crontab is CRITICAL: CRITICAL - duplicates
(DuplicatesCronApp) (http://m.mozilla.org/Socorro+Admin+-+crontab)
Initial triage indicates a previously un-checked input field, although unsure exactly which one at this point:
2014-06-11 06:35:07,186 DEBUG - MainThread - error when running <class 'socorro.cron.jobs.matviews.DuplicatesCronApp'> on None
Traceback (most recent call last):
File "/data/socorro/socorro-virtualenv/lib/python2.6/site-packages/crontabber/app.py", line 973, in _run_one
for last_success in self._run_job(job_class, config, info):
File "/data/socorro/socorro-virtualenv/lib/python2.6/site-packages/crontabber/base.py", line 199, in main
function(when)
File "/data/socorro/socorro-virtualenv/lib/python2.6/site-packages/crontabber/mixins.py", line 150, in _run_proxy
**kwargs
File "/data/socorro/socorro-virtualenv/lib/python2.6/site-packages/crontabber/transaction_executor.py", line 46, in __call__
result = function(connection, *args, **kwargs)
File "/data/socorro/application/socorro/cron/jobs/matviews.py", line 174, in run
self.run_proc(connection, [start_time, end_time])
File "/data/socorro/application/socorro/cron/jobs/matviews.py", line 28, in run_proc
cursor.callproc(self.get_proc_name(), signature)
DataError: integer out of range
CONTEXT: SQL statement "create temporary table new_reports_duplicates
on commit drop
as
select follower.uuid as uuid,
leader.uuid as duplicate_of,
follower.date_processed
from
(
select uuid,
install_age,
uptime,
client_crash_date,
date_processed,
first_value(uuid)
over ( partition by
product,
version,
build,
signature,
cpu_name,
cpu_info,
os_name,
os_version,
address,
topmost_filenames,
reason,
app_notes,
url
order by
client_crash_date,
uuid
) as leader_uuid
from reports
where date_processed BETWEEN start_time AND end_time
) as follower
JOIN
( select uuid, install_age, uptime, client_crash_date
FROM reports
where date_processed BETWEEN start_time AND end_time ) as leader
ON follower.leader_uuid = leader.uuid
WHERE ( same_time_fuzzy(leader.client_crash_date, follower.client_crash_date,
leader.uptime, follower.uptime)
OR follower.uptime < 60
)
AND
same_time_fuzzy(leader.client_crash_date, follower.client_crash_date,
leader.install_age, follower.install_age)
AND follower.uuid <> leader.uuid"
PL/pgSQL function update_reports_duplicates(timestamp with time zone,timestamp with time zone) line 10 at SQL statement
Assignee | ||
Updated•10 years ago
|
Assignee: nobody → sdeckelmann
Assignee | ||
Comment 1•10 years ago
|
||
Specific query that started the failure is:
SELECT * FROM update_reports_duplicates('2014-06-11T01:28:19.183999+00:00'::timestamptz,'2014-06-11T02:28:19.183999+00:00'::timestamptz)
Assignee | ||
Comment 2•10 years ago
|
||
Found it! We're allowing subtraction on large negative and positive values in the 'install_age' fields. We should detect the out of bounds error *before* the subtraction to fix this. :)
Here's how I found the error:
create table selena_new_reports_duplicates
as
select follower.uuid as uuid,
leader.uuid as duplicate_of,
follower.date_processed
from
(
select uuid,
install_age,
uptime,
client_crash_date,
date_processed,
first_value(uuid)
over ( partition by
product,
version,
build,
signature,
cpu_name,
cpu_info,
os_name,
os_version,
address,
topmost_filenames,
reason,
app_notes,
url
order by
client_crash_date,
uuid
) as leader_uuid
from reports
where date_processed BETWEEN 2014-06-11T01:28:19.183999+00:00'::timestamptz AND 2014-06-11T02:28:19.183999+00:00'::timestamptz
) as follower
JOIN
( select uuid, install_age, uptime, client_crash_date
FROM reports
where date_processed BETWEEN 2014-06-11T01:28:19.183999+00:00'::timestamptz AND 2014-06-11T02:28:19.183999+00:00'::timestamptz ) as leader
ON follower.leader_uuid = leader.uuid
WHERE ( same_time_fuzzy(leader.client_crash_date, follower.client_crash_date,
leader.uptime, follower.uptime)
OR follower.uptime < 60
)
AND
same_time_fuzzy(leader.client_crash_date, follower.client_crash_date,
leader.install_age, follower.install_age)
AND follower.uuid <> leader.uuid
DO language plpgsql $$
DECLARE
r record;
temp boolean;
BEGIN
for r in select * from selena_new_reports_duplicates
LOOP
RAISE NOTICE 'install ages: leader (%), follower (%)', r.linstall_age, r.finstall_age;
SELECT same_time_fuzzy(r.ldate_processed, r.fdate_processed, r.linstall_age, r.finstall_age) INTO temp;
END LOOP;
END$$;
Assignee | ||
Comment 3•10 years ago
|
||
Attachment #8438474 -
Flags: review+
Comment 4•10 years ago
|
||
Commits pushed to master at https://github.com/mozilla/socorro
https://github.com/mozilla/socorro/commit/ff5bddd62b565d107e7dadb5c6dad4761de45fe5
Fixes bug 1023867 - stop treating negative install_age as valid
https://github.com/mozilla/socorro/commit/f0579ebf28141df9aef69496572a3c864e2a29b3
Migration for bug 1023867
https://github.com/mozilla/socorro/commit/0f1d3886a70162ad933701e5d86cabb8381438cc
Merge pull request #2104 from selenamarie/bug1023867-duplicates-integer-problem
Fixes bug 1023867 - stop treating negative install_age as valid
Updated•10 years ago
|
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Assignee | ||
Updated•10 years ago
|
Target Milestone: --- → 89
Assignee | ||
Updated•10 years ago
|
Whiteboard: [DB change]
You need to log in
before you can comment on or make changes to this bug.
Description
•