Closed
Bug 545035
Opened 14 years ago
Closed 14 years ago
prod Socorro /200912/adu/byday web service not aggregating oses properly
Categories
(Socorro :: General, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: ozten, Assigned: griswolf)
References
Details
Calls to /200912/adu/byday have results that vary over time. I've traced the call and see that multiple 'Windows' rows are returned, but the python code maps this into a response without aggregating these various Windows rows.
Reporter | ||
Comment 1•14 years ago
|
||
Given select CAST(ceil(EXTRACT(EPOCH FROM (window_end - timestamp without time zone '2010-01-24 00:00:00' - interval '8 hours')) / 86400) AS INT) * interval '24 hours' + timestamp without time zone '2010-01-24 00:00:00' as day, case when os.os_name = 'Windows NT' then 'Windows' when os.os_name = 'Mac OS X' then 'Mac' else os.os_name end as os_name, sum(count) from top_crashes_by_signature tcbs join osdims os on tcbs.osdims_id = os.id and os.os_name in ('Windows','Mac OS X','Linux','Windows NT') where (timestamp without time zone '2010-01-24 00:00:00' - interval '8 hours') < window_end and window_end <= (timestamp without time zone '2010-02-07 00:00:00' - interval '8 hours') and productdims_id = 171 group by day, os_name order by 1, 2; Results: 2010-01-24 00:00:00 Linux 20 2010-01-24 00:00:00 Mac 8295 2010-01-24 00:00:00 Windows 61472 2010-01-24 00:00:00 Windows 12 2010-01-25 00:00:00 Linux 46 2010-01-25 00:00:00 Mac 16335 2010-01-25 00:00:00 Windows 8 2010-01-25 00:00:00 Windows 109567 2010-01-26 00:00:00 Linux 67 2010-01-26 00:00:00 Mac 16655 2010-01-26 00:00:00 Windows 114005 2010-01-26 00:00:00 Windows 5 2010-01-27 00:00:00 Linux 74 2010-01-27 00:00:00 Mac 13657 2010-01-27 00:00:00 Windows 98353 2010-01-28 00:00:00 Linux 70 2010-01-28 00:00:00 Mac 16042 2010-01-28 00:00:00 Windows 4 2010-01-28 00:00:00 Windows 109888 2010-01-29 00:00:00 Linux 39 2010-01-29 00:00:00 Mac 11302 2010-01-29 00:00:00 Windows 80232 2010-01-29 00:00:00 Windows 6 2010-01-30 00:00:00 Linux 22 2010-01-30 00:00:00 Mac 6694 2010-01-30 00:00:00 Windows 49526 2010-01-30 00:00:00 Windows 2 2010-01-31 00:00:00 Linux 17 2010-01-31 00:00:00 Mac 6854 2010-01-31 00:00:00 Windows 2 2010-01-31 00:00:00 Windows 46473 2010-02-01 00:00:00 Linux 13 2010-02-01 00:00:00 Mac 4756 2010-02-01 00:00:00 Windows 33247 2010-02-01 00:00:00 Windows 2 2010-02-02 00:00:00 Linux 7 2010-02-02 00:00:00 Mac 1461 2010-02-02 00:00:00 Windows 11158 2010-02-03 00:00:00 Linux 35 2010-02-03 00:00:00 Mac 10395 2010-02-03 00:00:00 Windows 1 2010-02-03 00:00:00 Windows 65585 2010-02-04 00:00:00 Linux 47 2010-02-04 00:00:00 Mac 14806 2010-02-04 00:00:00 Windows 92339 2010-02-04 00:00:00 Windows 3 2010-02-05 00:00:00 Linux 45 2010-02-05 00:00:00 Mac 15010 2010-02-05 00:00:00 Windows 97549 2010-02-05 00:00:00 Windows 5 2010-02-06 00:00:00 Linux 55 2010-02-06 00:00:00 Mac 15235 2010-02-06 00:00:00 Windows 96039 2010-02-06 00:00:00 Windows 5 2010-02-07 00:00:00 Linux 11 2010-02-07 00:00:00 Mac 6737 2010-02-07 00:00:00 Windows 39244 2010-02-07 00:00:00 Windows 1 JSON response is: { 'product': 'Firefox', start_date': '2010-01-24 00:00:00', 'end_date': ' 2010-02-07 00:00:00', ... [{'date': '2010-01-24', 'os: 'Windows', 'users: '57434300', 'crashes: '12'}, ... },{'date: '2010-01-25', 'os: 'Windows', 'users: ' 67752231', 'crashes: ' 109567'} ... } Either we should look at how osdims is used or we should aggregate os numbers before formatting response.
Comment 2•14 years ago
|
||
fascinating. I think we're seeing something very very subtle in the SQL or it's a bug in PostgreSQL. I can correct the behavior of this query by simply changing the name of the second column: select CAST(ceil(EXTRACT(EPOCH FROM (window_end - timestamp without time zone '2010-01-24 00:00:00' - interval '8 hours')) / 86400) AS INT) * interval '24 hours' + timestamp without time zone '2010-01-24 00:00:00' as day, case when os.os_name = 'Windows NT' then 'Windows' when os.os_name = 'Mac OS X' then 'Mac' else os.os_name end as bloop, sum(count) from top_crashes_by_signature tcbs join osdims os on tcbs.osdims_id = os.id and os.os_name in ('Windows','Mac OS X','Linux','Windows NT') where (timestamp without time zone '2010-01-24 00:00:00' - interval '8 hours') < window_end and window_end <= (timestamp without time zone '2010-02-07 00:00:00' - interval '8 hours') and productdims_id = 171 group by day, bloop order by 1, 2; day | bloop | sum ---------------------+---------+-------- 2010-01-24 00:00:00 | Linux | 20 2010-01-24 00:00:00 | Mac | 8295 2010-01-24 00:00:00 | Windows | 61484 2010-01-25 00:00:00 | Linux | 46 2010-01-25 00:00:00 | Mac | 16335 2010-01-25 00:00:00 | Windows | 109575 2010-01-26 00:00:00 | Linux | 67 2010-01-26 00:00:00 | Mac | 16655 2010-01-26 00:00:00 | Windows | 114010 2010-01-27 00:00:00 | Linux | 74 2010-01-27 00:00:00 | Mac | 13657 2010-01-27 00:00:00 | Windows | 98353 2010-01-28 00:00:00 | Linux | 70 2010-01-28 00:00:00 | Mac | 16042 2010-01-28 00:00:00 | Windows | 109892 2010-01-29 00:00:00 | Linux | 39 2010-01-29 00:00:00 | Mac | 11302 2010-01-29 00:00:00 | Windows | 80238 ...
Reporter | ||
Comment 3•14 years ago
|
||
(In reply to comment #2) Excellent! Thanks lars!
Assignee | ||
Comment 4•14 years ago
|
||
Indeed, lose 'bloop' and get 'osname' and it works as expeccted
Comment 5•14 years ago
|
||
After digging through the SQL standard, I finally understand what's going on here. Select list aliases are not understood in the group by clause in standard SQL. PostgreSQL, however, allows it as an extension. In the original SQL statement, 'os_name' matched a existing column name, so PostgreSQL took that existing column name as the group by parameter. PostgreSQL followed the standard even though 'os_name' was not qualified with a tag. My replacement from Comment #1 worked because it invoked the PostgreSQL extension behavior. The correct standard SQL should be: select CAST(ceil(EXTRACT(EPOCH FROM (window_end - timestamp without time zone '2010-01-24 00:00:00' - interval '8 hours')) / 86400) AS INT) * interval '24 hours' + timestamp without time zone '2010-01-24 00:00:00' as day, case when os.os_name = 'Windows NT' then 'Windows' when os.os_name = 'Mac OS X' then 'Mac' else os.os_name end as os_name, sum(count) from top_crashes_by_signature tcbs join osdims os on tcbs.osdims_id = os.id and os.os_name in ('Windows','Mac OS X','Linux','Windows NT') where (timestamp without time zone '2010-01-24 00:00:00' - interval '8 hours') < window_end and window_end <= (timestamp without time zone '2010-02-07 00:00:00' - interval '8 hours') and productdims_id = 171 group by 1, 2 order by 1, 2; While I have the solution, I should not be the one that actually fixes this bug: I'm not on the payroll right now. This bug needs to be taken by someone. Griswolf?
Assignee | ||
Comment 7•14 years ago
|
||
Fixed in revision 1772
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Reporter | ||
Comment 8•14 years ago
|
||
(In reply to comment #7) Awesome, thanks for the quick turnaround! Can you track staging this. Once it's staged and verified, I'll track pushing to prod. Thanks.
Assignee | ||
Comment 9•14 years ago
|
||
re-opening to track resolution of bug #546523
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee | ||
Comment 10•14 years ago
|
||
for some reason that I don't understand, IT didn't how / where to run this on stage, so it has been delayed.
Reporter | ||
Comment 11•14 years ago
|
||
This shipped with 1.5.
Status: REOPENED → RESOLVED
Closed: 14 years ago → 14 years ago
Resolution: --- → FIXED
Updated•12 years ago
|
Component: Socorro → General
Product: Webtools → Socorro
You need to log in
before you can comment on or make changes to this bug.
Description
•