Closed Bug 426389 Opened 17 years ago Closed 17 years ago

Fix scripts that populate aggregate reporting tables (topcrash is broken)

Tracking

(Not tracked)

Status:

RESOLVED FIXED

Milestone:

0.6

People

(Reporter: justin.gallardo, Assigned: ozten)

References

Details

Attachments

(2 files, 3 obsolete files)

First attempt at the script 17 years ago Justin Gallardo 3.48 KB, text/plain		Details
Second try. 17 years ago Justin Gallardo 3.73 KB, text/plain		Details
Patch to fix counters for off by 1 errors 17 years ago Austin King [:ozten] 3.98 KB, patch		Details \| Diff \| Splinter Review
Incorporating Lars feedback into new patch 17 years ago Austin King [:ozten] 4.01 KB, patch		Details \| Diff \| Splinter Review
Incorporating Lars feedback Comment #22 into new patch 17 years ago Austin King [:ozten] 4.03 KB, patch		Details \| Diff \| Splinter Review

Justin Gallardo

Reporter

Description

•

17 years ago

User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13 Build Identifier: Trunk There needs to be some scripts written that will be executed by cron to populate and update the data in the aggregate reporting data. This allows large queries such as top 100 crashers, and top crashers of all time to be accessed much quicker than the current setup. Reproducible: Didn't try

Michael Morgan [:morgamic]

Updated

•

17 years ago

Assignee: nobody → justin.gallardo

Status: UNCONFIRMED → NEW

Ever confirmed: true

Michael Morgan [:morgamic]

Updated

•

17 years ago

Priority: -- → P1

Target Milestone: --- → 0.5

Justin Gallardo

Reporter

Updated

•

17 years ago

Blocks: 426410

Justin Gallardo

Reporter

Comment 1

•

17 years ago

Attached file First attempt at the script (obsolete) — Details

I attached my first attempt at a script to populate the summary table. The only thing it leaves out is the trend, as I am not 100% sure how/what we want to store there.

Michael Morgan [:morgamic]

Updated

•

17 years ago

Attachment #313231 - Attachment mime type: text/x-python → text/plain

Justin Gallardo

Reporter

Updated

•

17 years ago

Status: NEW → ASSIGNED

Michael Morgan [:morgamic]

Comment 2

•

17 years ago

Comment on attachment 313231 [details] First attempt at the script Hey, good so far - comments: s/aggreate/aggregate/ Would like to see either a separate config file or passed args from command line -- probably separate file. We might be able to use one of the existing configs -- see http://code.google.com/p/socorro/source/browse/trunk/socorro/lib/config.py.dist -- we could piggyback off that pretty easily. See a couple of things that worry me as far as scaling. SELECT * FROM reports and the for loop over that result set will probably not work very well. So -- consider how can we make this do updates for only date ranges that the script knows are new and I think we solve a lot of that.

Justin Gallardo

Reporter

Comment 3

•

17 years ago

Attached file Second try. — Details

This version of the script is using the socorro.lib.config configuration file now, and also has gotten rid of the ugly 'SELECT * FROM reports' query. Now, it checks the summary table for its latest update, and then queries reports for all records new since that time. Thoughts?

Attachment #313231 - Attachment is obsolete: true

Michael Morgan [:morgamic]

Comment 4

•

17 years ago

Comment on attachment 313238 [details] Second try. Can you add these as text/plain?

Attachment #313238 - Attachment mime type: text/x-python → text/plain

Michael Morgan [:morgamic]

Comment 5

•

17 years ago

Could join the last_updated query and the reports query into one by using a subquery. Would like better exception handling -- what if data is in an unexpected format? What if crashes is empty? That said, looks ok functionally but think of how you would write tests for these methods, and how we should handle any failures in data getting, mutating or setting by using try/except statements. For example, what would we output to cron in the event of a catastrophic failure? This looks a little strange: if row[16] == "mac" or row[16] == "Mac OS X" or row[16] == "#3C3": I don't think the value is going to be 'mac' or '#3C3' ever. First element in the platforms list is the value to key off of, so: row[16] == "Mac OS X"

Michael Morgan [:morgamic]

Comment 6

•

17 years ago

Comment on attachment 313238 [details] Second try. Lars, could you give Justin some feedback?

Attachment #313238 - Flags: review?(lars)

K Lars Lohn [:lars] [:klohn]

Comment 7

•

17 years ago

Which socorro is being used, the old one or the refactored new one? In the new one, socorro.lib.config is deprecated in favor of each script having its config file. See socorro/monitor/config.py and socorro/processor/config.py. Since monitor and processor both now allow command line parameters, I think that other scripts based on the socorro code should follow suit. See .../scripts/startMonitor.py for an example how the class ConfigurationManager coupled with the form of the new style config files makes command line stuff trivial. line 28: map(itemgetter(1), ranks) - makes a list of the items drawn from ranks and then does nothing with it. line 39: ..."dbname='" + config.db_name + "' user='" + config.db_user + "' host='" + config.db_host + "' password='" + config.db_pass + "'"... is fine PHP, but inefficient in Python because each plus sign results in another complete copy of the string being created. A more “Pythonic” solution that results in only one string is: “dbname='%s' user='%s' host='%s' password='%s'” % (dbname, user, host, password). Though, the new configuration files offer the entire databaseDSN as one string... line 41: since this program runs in cron, logging error messages is more valuable than printing to stdout. See .../scripts/startMonitor.py for an example of how to set it up. In lines such as 55: key = (row[3], row[4], row[6]), a comment that indicates which corresponding columns you're shooting for would be helpful to facilitate some random maintainer in the future understand the code more quickly. This is one of those annoyances that hits all programs that use the Python DBAPI. line 56: the call to .has_key is deprecated in future versions of Python. The correct way to do it now is: if key not in summary_crashes: line 75: summary_crashes[key]['uptime'] = (row[10] + summary_crashes[key]['uptime']) / crash_count[key]; not sure I understand what's happening here. Is summary_crashes[key]['uptime'] supposed to hold an average? Say we had four crashes all with an up time of 10. On processing the first one we'd have a value of 10. On the processing the second we'd get (10 + 10) / 2.0 = 10. On processing the third, we'd get (10 + 10) /3.0 = 6.6667 On the fourth we'd get (10 + 6.6667) / 4.0 = 4.16667 lines 88,89: not necessary to return and rebind the dictionary summary_crashes. The dictionary is effectively passed by reference into the functions. Modifications of the dictionaries within the functions will persist outside the functions. lines 95,96 & 104,105: since these lines execute sql queries within a loop, you want the queries to use positional rather than absolute parameters. That way the server will plan and compile the queries only once rather than each time the queries are submitted. Replace all substitution parameters with %s with no quotes regardless of the type. Move the sql definition outside the loop and then move the parameters tuples into the cur.execute call in this manner: sql = "SELECT rank FROM topcrashes WHERE product=%s AND branch=%s AND signature=%s ORDER BY last_updated DESC LIMIT 1" cur.execute(sql, (data['product'], data['version'], data['signature']))

Justin Gallardo

Reporter

Comment 8

•

17 years ago

Lars: Thanks for the input! I was able to improve the script quite a bit using the libs from socorro. Thanks for the pointers! I just created /socorro/cron in svn, with topcrashes.py and topcrashes_config.py.dist, with the latest version of the script that I have. I was able to implement logging, configuration options, command line switches, and added some more error handling into the script. I added the 'initMode' flag as well. With this flag, it will perform a full select of the reports table, as oppose to grabbing everything since the last update. Useful for doing the first run of data. We may want to make this only grab two weeks of data or the like, just to prevent a massively huge dataset being calculated.

Michael Morgan [:morgamic]

Comment 9

•

17 years ago

Comment on attachment 313238 [details] Second try. Thanks for the comments, Lars.

Attachment #313238 - Flags: review?(lars)

Justin Gallardo

Reporter

Comment 10

•

17 years ago

Merged into trunk in r433. Still waiting to setup on staging for testing.

Michael Morgan [:morgamic]

Updated

•

17 years ago

Target Milestone: 0.5 → ---

Michael Morgan [:morgamic]

Updated

•

17 years ago

Target Milestone: --- → 0.6

Michael Morgan [:morgamic]

Updated

•

17 years ago

Assignee: justin.gallardo → nobody

Michael Morgan [:morgamic]

Comment 11

•

17 years ago

This was setup on staging and tested, however we are seeing inconsistency in data. Next step is to understand why our lists differ from what is seen in a simple query on reports.*

Michael Morgan [:morgamic]

Updated

•

17 years ago

Target Milestone: 0.6 → ---

(not currently active) Ted Mielczarek

Updated

•

17 years ago

Summary: Need scripts to populate aggregate reporting tables. → Fix scripts that populate aggregate reporting tables (topcrash is broken)

Michael Morgan [:morgamic]

Updated

•

17 years ago

Assignee: nobody → aking

Target Milestone: --- → 0.6

Michael Morgan [:morgamic]

Comment 12

•

17 years ago

If we do find out that the database is correct we just need to correct the reporter query.

Austin King [:ozten]

Assignee

Comment 13

•

17 years ago

Attached patch Patch to fix counters for off by 1 errors (obsolete) — Details — Splinter Review

Looking at the data, the aggregate numbers in topcrashes didn't match the individual numbers in reports table. I found two seperate off by 1 errors in code. Both are fixed by eliminating if / else which initialize state.

Attachment #342657 - Flags: review?

Austin King [:ozten]

Assignee

Updated

•

17 years ago

Attachment #342657 - Flags: review?(morgamic)

Attachment #342657 - Flags: review?(lars)

Attachment #342657 - Flags: review?

Austin King [:ozten]

Assignee

Comment 14

•

17 years ago

Comment on attachment 342657 [details] [diff] [review] Patch to fix counters for off by 1 errors Making reviewers specific lars or morgamic.

Samuel Sidler (old account; do not CC)

Comment 15

•

17 years ago

I could be wrong, but I think you need to patch the PHP file. Since the rewrite from Python to PHP, I think all the web-app Python code is dead. Maybe this? http://code.google.com/p/socorro/source/browse/trunk/webapp-php/application/controllers/topcrasher.php

(not currently active) Ted Mielczarek

Comment 16

•

17 years ago

There's a Python script run from cron that fills a table for the webapp to use: http://code.google.com/p/socorro/source/browse/#svn/trunk/socorro/cron

Samuel Sidler (old account; do not CC)

Comment 17

•

17 years ago

Ah, wow. Well I stand corrected. :)

K Lars Lohn [:lars] [:klohn]

Comment 18

•

17 years ago

On reviewing the code: Your first change on lines 78-81 involves testing to see if row has anything in it before bothering to proceed. The old code will simply raise an exception that is caught in main. Your patch, tests and if finding a problem, reports the problem. However, it then lets the program proceed, where The next sessas it fails and raises an exception on the next line. A more 'Pythonic' solution would be like this: try: start_time = row[0] except IndexError: logger.warn("topcrashers table is empty...") raise The next section of your changes looks fine, flattening 'if' hierarchies is always a nice thing to do. However, one thing looks suspicious: lines 130:132. In the case where the signature is not in summary_crashes[key], you set crash_count[fullKey] = 1.0. Immediately thereafter, it is incremented to 2. Is this correct?

Austin King [:ozten]

Assignee

Comment 19

•

17 years ago

Attached patch Incorporating Lars feedback into new patch (obsolete) — Details — Splinter Review

Good points Lars. I have incorporated your two suggestions. Initalization to 0: summary_crashes[key][signature]['users'] = 0 crash_count[fullKey] = 0 crash_count[fullKey] += 1 and using raise: if row: start_time = row[0] else: msg = "topcrashers table is empty, use -I to run in initializing mode" logger.warn(msg) raise msg

Attachment #342657 - Attachment is obsolete: true

Attachment #342937 - Flags: review?(lars)

Attachment #342657 - Flags: review?(morgamic)

Attachment #342657 - Flags: review?(lars)

K Lars Lohn [:lars] [:klohn]

Comment 20

•

17 years ago

actually, raising strings as exceptions is deprecated in Python. Only instances of classes derived from BaseException can be raised in future versions. In v2.5, Python will issue a deprecation warning. In v2.6, I believe it is an error. To avoid issues in the the future when we inevitably migrate, you can change the code to: raise Exception(msg)

Austin King [:ozten]

Assignee

Comment 21

•

17 years ago

What is a good Exception type? It looks like TypeError(msg) is the default. In Java I would use IllegalStateException or something like that... What do you suggest for Python?

K Lars Lohn [:lars] [:klohn]

Comment 22

•

17 years ago

It's perfectly fine just to use Exception(msg) as a generic error. Typically most modules define their own exception hierarchy. However, in this case that's probably overkill since the calling program doesn't bother to respond to exceptions. If you were to go that route, however, it would look like this: class NotInitializedException(Exception): def __init__(self): super (NotInitializedException, self).__init__("topcrashers table is empty,...") On continuing the review, I see that you changed the crash_count from a floating point value to an integer. I suspect that it was originally set a floating point value to coerce floating point rather than integer division on line 25. You may want to check to see if integer division is acceptable for that calculation.

Austin King [:ozten]

Assignee

Comment 23

•

17 years ago

Attached patch Incorporating Lars feedback Comment #22 into new patch — Details — Splinter Review

Using Exception(msg) and reverted to initialing count as a float.

Attachment #342937 - Attachment is obsolete: true

Attachment #342954 - Flags: review?(lars)

Attachment #342937 - Flags: review?(lars)

K Lars Lohn [:lars] [:klohn]

Comment 24

•

17 years ago

it looks fine to me -

Austin King [:ozten]

Assignee

Comment 25

•

17 years ago

Checked in r626

Status: ASSIGNED → RESOLVED

Closed: 17 years ago

Resolution: --- → FIXED

(not currently active) Ted Mielczarek

Comment 26

•

17 years ago

Did this get deployed in production? If not, bug 460704 is a dupe of this. If so, then see that bug which says it's still not correct.

Michael Morgan [:morgamic]

Updated

•

17 years ago

Keywords: push-needed

Samuel Sidler (old account; do not CC)

Updated

•

17 years ago

Keywords: push-needed

Henrik Skupin [:whimboo][⌚️UTC+1]

Comment 28

•

17 years ago

Removing the push-needed keyword means the fix was pushed to production? The topcrasher list still reports wrong numbers (as what I reported in bug 460704). When the correct ones will be shown?

Austin King [:ozten]

Assignee

Comment 29

•

17 years ago

Sorry for the shuffle of bug #s. There is a patch under review attached to 461977, release TBD.

Wayne Mery (:wsmwk)

Updated

•

17 years ago

Depends on: 466549

Henrik Skupin [:whimboo][⌚️UTC+1]

Updated

•

17 years ago

Depends on: 461977

Austin King [:ozten]

Assignee

Updated

•

15 years ago

Attachment #342954 - Flags: review?(lars)

Nobody; OK to take it and work on it

Updated

•

14 years ago

Component: Socorro → General

Product: Webtools → Socorro

You need to log in before you can comment on or make changes to this bug.