Closed Bug 455080 Opened 16 years ago Closed 16 years ago

Create cron script in socorro.cron for populating server status

Categories

(Socorro :: General, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: morgamic, Assigned: ozten)

Details

Attachments

(1 file, 1 obsolete file)

We need to set up a cron job to query the pending jobs and either write a static HTML page or update a simple key->value database table for later use in the reporter.
Assignee: ted.mielczarek → aking
So this new cron will make the php code behind http://crash-stats.mozilla.com/status obsolete, right?
Yeah, the cron should basically do the same queries as that php page, but store the results in the db for the new status page to pull out.
I am looking for feedback on the SQL I have attached. It is a single INSERT statement that does all of the processing inside of Postgres to populate a server_status table. This reduces number of round trips and load on DB, but leaves Python layer out of the picture. Does this sound like a good approach? Other ideas?

Attachment has more comments on meaning and type of each piece of data.
Attachment #343464 - Flags: review?(lars)
I see no problems in skipping a python layer, but would like to know how the cron job is submitting the query.  Psql?

An INSERT statement eschews a column list is brittle if we ever change the schema in the future.  I would suggest that it be added.  It should leave off the 'id' column completely as its value is assigned automatically.

To make several of these subqueries faster, I have added an index to the jobs table on (completeddatetime, queueddatetime).

We've seen that count() can be absurdly slow in PostgreSQL.  Do we have any stats to tell us what the average size is for that table?
I would still have the SQL called from Python via psycopg2.

I will add the list of columns.

Thanks for adding indexes.

I don't know about size of prod jobs table. Will need access to staging server.
I've now got access to the database on the staging server:  I created your table, and ran the insert query.  It appears to have a snapshot of the database from about 2008-09-21.  There are about 5 million jobs in the 'jobs' table of which it appears that 1.2M are not complete.  I don't understand why there are jobs as old as 2008-07-09 in that table.

 id |  date_recently_completed  |   date_oldest_job_queued   | avg_process_sec | avg_wait_sec | waiting_job_count | processors_count |        date_created        
----+---------------------------+----------------------------+-----------------+--------------+-------------------+------------------+----------------------------
  1 | 2008-09-21 00:04:33.40608 | 2008-07-09 19:31:23.643028 |        0.596183 |   4.0827e+06 |           1220246 |                7 | 2008-10-17 11:29:08.819012
New socorro/sql/server_status_sql.txt includes SQL for indexes suggested by lars.

cron will process stats for last 5 minutes of activity in jobs table, or configurable via processingInterval param.

It logs last entry and how long it took to process INSERT for troubleshooting.

Example:
./startServerStatus.py -D true --processingInterval "00:10:00"
Attachment #343464 - Attachment is obsolete: true
Attachment #344218 - Flags: review?(morgamic)
Attachment #344218 - Flags: review?(lars)
Attachment #343464 - Flags: review?(lars)
Checked in code r633.
Tested against a subset of staging with dates updated to current times to make testing possible. Looks good so far.
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
cursory examination of this code looks fine.  I cannot speak as to whether is does exactly the right thing.  However, nothing in the code sets off any of my intuitive alarm bells.

The following is a legacy issue as well as an editorial comment.  It does not have to be acted on:

Why does it have a manual init mode?  I would think that if it is running and detects that it has not be initialized, that it should just go ahead and initialize itself.  Why force a person to manually perform the act?  I guess I've just got this old school objected oriented mindset of encapsulation: construction (creating tables), intialization (the init mode) and the function should be localized in one neat and tidy package.
Attachment #344218 - Flags: review?(morgamic)
Attachment #344218 - Flags: review?(lars)
Attachment #344218 - Flags: review+
Component: Socorro → General
Product: Webtools → Socorro
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: