Closed Bug 559880 Opened 14 years ago Closed 14 years ago

Generate wait time reports from schedulerdb

Categories

(Release Engineering :: General, defect, P3)

x86
Linux
defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: catlee, Assigned: astoica)

References

Details

(Whiteboard: [automation][reporting][q3goal])

Attachments

(5 files, 8 obsolete files)

1.10 MB, patch (catlee: review+)
591.56 KB, text/plain (catlee: review+)
23.93 KB, patch (catlee: review+, checked-in+)
1.68 KB, patch (catlee: review+, checked-in+)
2.06 KB, patch (catlee: review+, checked-in+)

The wait time reports should be generated using the schedulerdb as its data source.

The nightly e-mail reports should still be generated.

I'd also like to have a web app that the e-mail reports can link to where you can drill down and find out what caused delays.
Whiteboard: [automation][reporting][q2goal]
Assignee: nobody → astoica
The web app page displaying the wait times is located at URL:
http://localhost:5000/waittimes/[{poolname}][?starttime=12345][&endtime=12345][&format=json]

where poolname := buildpool | trybuildpool; the default is buildpool if no poolname is specified.

Both starttime and endtime are optional, and are expressed in seconds since the epoch, in UTC. If only starttime is specified, endtime will be starttime plus 24 hours. If only endtime is specified, starttime will be endtime minus 24 hours.

For JSON output, add the format=json GET parameter.

Example:
http://localhost:5000/waittimes/buildpool?endtime=1277415290 => wait times between Wed Jun 23 14:34:50 2010 and Thu Jun 24 14:34:50 2010
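
The defaulting rules for starttime and endtime described above could be sketched roughly as follows. This is only an illustration, not the code in the patch: the function name resolve_time_window and the `now` argument are hypothetical, and falling back to the current time when neither bound is given is an assumption.

```python
import time

DAY_SECONDS = 24 * 60 * 60


def resolve_time_window(starttime=None, endtime=None, now=None):
    """Return (starttime, endtime) in seconds since the epoch (UTC)."""
    if starttime is None and endtime is None:
        # Assumption: with neither bound given, the window ends "now".
        endtime = int(time.time()) if now is None else now
    if endtime is None:
        # Only starttime given: the window covers the following 24 hours.
        endtime = starttime + DAY_SECONDS
    if starttime is None:
        # Only endtime given: the window covers the preceding 24 hours.
        starttime = endtime - DAY_SECONDS
    return starttime, endtime
```

For the example URL, resolve_time_window(endtime=1277415290) gives a window starting exactly 24 hours earlier.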

Patch contains:
* controllers/waittimes.py - new controller for wait times
* templates/waittimes.mako - template HTML renderer displaying the wait times
* model/query.py:
- PLATFORMS_BUILDERNAME - dictionary mapping each platform to a list of compiled regexes that are matched against the buildrequests.buildername column
- PLATFORMS_BUILDERNAME_EXCLUDE - list of excluded platforms, matched against buildrequests.buildername
- BUILDSET_REASON_SQL_EXCLUDE - list of reasons to exclude from the buildsets.reason column, like force builds and rebuilds (used when creating the SQL query)
- BUILDPOOL_MASTERS - lists the master names in each pool, as they are found in the buildrequests.claimed_by_name column

- GetWaitTimes(pool, minutes_per_block=15, starttime=None, endtime=None)

The wait times are computed as the difference between:
builds.start_time (end time of interest) - changes.when_timestamp (start time of interest) for each build request in the buildrequests table.
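
As a rough sketch of the computation above: each wait time is the build start time minus the change timestamp, and the results can then be counted per block. Grouping into blocks of minutes_per_block minutes is my reading of that parameter, not something confirmed here, and the helper names are hypothetical.

```python
def wait_time_block(start_time, when_timestamp, minutes_per_block=15):
    """Return the lower bound (in minutes) of the block a wait time falls in."""
    wait_seconds = start_time - when_timestamp
    wait_minutes = wait_seconds // 60
    return (wait_minutes // minutes_per_block) * minutes_per_block


def count_by_block(requests, minutes_per_block=15):
    """Count build requests per block.

    requests is an iterable of (builds.start_time, changes.when_timestamp)
    pairs, both in seconds since the epoch.
    """
    counts = {}
    for start_time, when_timestamp in requests:
        block = wait_time_block(start_time, when_timestamp, minutes_per_block)
        counts[block] = counts.get(block, 0) + 1
    return counts
```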

TO DO:
1/ the script that sends e-mails with the wait times is also done, but needs testing. I made it as an independent script which calls http://localhost:5000/waittimes/{poolname}?format=json to get the wait times.

2/ the script that periodically calls the emailing script (at 1/)
In addition to the previous patch, this one contains the script which mails the wait times. This script is located in buildapi.scripts.mailwaittimes.py.

I made it as an independent script which calls
http://localhost:5000/waittimes/{poolname}?format=json to get the wait times. 

The wait times mailing script can be run either on its own, or by calling a function in the module.

If run on its own, the call should look like:

python mailwaittimes.py -S smtp.mozilla.org -a youremail@mozilla.com -W http://domain/waittimes -p buildpool -e 1277415290

where -e is the endtime and -s is the starttime; if neither is specified, the current time is used as the end time, and the start time will be endtime-24h.

Therefore, the nightly mailer should call:
python mailwaittimes.py -S smtp.mozilla.org -a youremail@mozilla.com -W http://domain/waittimes -p buildpool

If called from within Python, the following function can be called, which accomplishes the same thing:

def mail_wait_times(server=SMTP_SERVER_DEFAULT, sender=SMTP_SENDER_DEFAULT, receivers=WT_RECEIVERS_DEFAULT, wt_service=WT_SERVICE_DEFAULT, pool=WT_POOL_DEFAULT, starttime=WT_STARTTIME_DEFAULT, endtime=WT_ENDTIME_DEFAULT, minutes_per_block=WT_MPB_DEFAULT)

As for the nightly job scheduler, are we using something like cron? Or a different system? What else should I add to the patch in order to make them work together?

Besides this, I think the patch is ready for review.
Attachment #455206 - Attachment is obsolete: true
Comment on attachment 455559 [details] [diff] [review]
Wait time reports generated using the schedulerdb and web app pages AND script for nightly e-mail reports

Looking good!  Just a few comments on my first read-through:

Your calls to int/float(request.GET(...)) need to be in try/except clauses in case non-numeric arguments are passed in.
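
Something along these lines is what I have in mind. It is a sketch only: `params` is a plain dict standing in for request.GET, and parse_number and BadParameter are hypothetical names, not part of the patch.

```python
class BadParameter(ValueError):
    """Raised when a query parameter cannot be converted to a number."""


def parse_number(params, name, convert=int, default=None):
    """Read params[name] and convert it, or return default if it is absent."""
    raw = params.get(name)
    if raw is None:
        return default
    try:
        return convert(raw)
    except (TypeError, ValueError):
        # The controller can catch this and return an error response
        # instead of failing with an unhandled exception.
        raise BadParameter("%s must be numeric, got %r" % (name, raw))
```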

What's the purpose of self.template?  What if self.template isn't set and we're returning html?

I think you're going to be counting builds multiple times if they have multiple changes.  Also, some builds will have 0 changes, so they'll be missed by the query as well I think.

Would it make sense to rename 'excludedplatforms' to 'unknownbuilders'?

Please use urllib.urlencode to encode parameters in wtservice_get_full_url
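
For example, roughly like this. The function name and signature below are made up for illustration (I'm not reproducing wtservice_get_full_url itself), and the import is the Python 3 spelling; on Python 2 the same function is urllib.urlencode.

```python
from urllib.parse import urlencode  # urllib.urlencode on Python 2


def build_wait_times_url(wt_service, pool, starttime=None, endtime=None):
    """Build the wait times service URL with properly encoded parameters."""
    params = [("format", "json")]
    if starttime is not None:
        params.append(("starttime", starttime))
    if endtime is not None:
        params.append(("endtime", endtime))
    # urlencode handles quoting and joins the pairs with '&'.
    return "%s/%s?%s" % (wt_service.rstrip("/"), pool, urlencode(params))
```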

add a Date header to email message

is mail_wait_times return value 'True' if no recipients were rejected?  I think that 'not {}' is true.
Hi Chris! I updated the patch to reflect your previous comments. Thanks!

The changes contain:
- added try/except clauses in the WaittimesController to catch int/float malformed parameters for starttime, endtime and minutes per block (mpb).

- removed self.template from WaittimesController; it now renders the waittimes.mako template directly by calling render('/waittimes.mako') if the requested format is HTML

- after our conversation last week, I left it as it was, counting a build multiple times when it has multiple changes, BUT added the builds that have no changes (like nightly builds) by making an outer join
	* the SQL Query I tried to model the sqlalchemy code after is:
SELECT br.buildername, b.start_time, br.submitted_at, c.when_timestamp
FROM builds b, buildrequests br, buildsets bs, sourcestamps s
  LEFT JOIN sourcestamp_changes sch ON (s.id=sch.sourcestampid)
  LEFT JOIN changes c ON (sch.changeid=c.changeid)
WHERE br.id=b.brid AND br.buildsetid=bs.id AND bs.sourcestampid=s.id
  AND br.complete=1 AND b.start_time<=1276941742
  AND (br.submitted_at>=1276855342 OR c.when_timestamp>=1276855342)

, plus a few more restrictions on buildsets.reason (to exclude force builds and rebuilds) and on buildrequests.claimed_by_name (to select only the masters of interest)

	* uses submitted_at time if changes are missing (when_timestamp is null/none)

- added a field with the number of builds with no changes (in our case being just the nightly builds, rebuilds/force builds being excluded from the stats) -- wait_times['no_changes']

- changed wait times JSON parameter from 'excludedplatforms' to 'unknownbuilders', does sound more appropriate

- used urllib.urlencode to encode parameters in wtservice_get_full_url, oops I forgot about that

- added the Date header to the wait time email like:
    headers.append("Date: %s" % email.utils.formatdate(localtime=True))    -- hope it's OK like this

- changed the return type of mail_wait_times to a dictionary containing the status of the operation: success or error, and some additional information. In case of success, it looks like: {status: 'success', refused: refused_rcv, msg:'info'}, where refused_rcv is a dictionary, with one entry for each recipient that was refused, thus empty if all recipients received the email.
In case of error/failure, the result looks like: {status: 'error', msg: 'reason'}, if all of the receivers were refused or an exception was raised.

I changed it because the previous response was misleading, and other users of this method could easily have fallen into the same trap. The success response ({} - empty dictionary) could easily be confused with the failure (None) in conditional statements like if not resp: is_error(). So, the response used to be:
	* {} - if all recipients receive the e-mails --> complete success
	* { rejected_recipients_dictionary } - dictionary, with one entry for each recipient that was refused; at least one recipient received the e-mail, but not all
	* None - if all of the receivers were refused or an exception was raised
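
A minimal sketch of the new return shape, to show why it avoids the trap. Here `send` is a hypothetical callable standing in for the actual SMTP call, which returns the dictionary of refused recipients or raises if sending fails; the message text is made up.

```python
def mail_with_status(send):
    """Call send() and wrap the outcome in an explicit status dictionary."""
    try:
        refused = send()
    except Exception as exc:
        # All receivers refused, or some other failure while sending.
        return {"status": "error", "msg": str(exc)}
    return {
        "status": "success",
        "refused": refused,
        "msg": "%d recipient(s) refused" % len(refused),
    }


# The old responses were ambiguous in boolean tests: both are falsy.
assert not {}      # complete success
assert not None    # complete failure
```

Callers can then check resp['status'] instead of relying on the truthiness of the response.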

New in the Web app UI:
- input fields for start date/end date. Used jQuery's DatePicker
- quick links to buildpool | trybuildpool

NOTE: I think the date/times are still messed up. 
- What should the time zone be for each of the: web app, mail, query parameters? 
- How about the database?
Attachment #455559 - Attachment is obsolete: true
Ok, looks good.  I'll get this committed and we can make future iterations on this.
- time zone conversion fixed in JS (by using the AnyTime JS library) and on the server side. Now all times should be in Pacific time, both the ones generated on the server in Python and the ones generated on the client side in JS. The start/end interval can be selected in any timezone by using AnyTime's datetime widget

- Android7, Maemo4, /5, /5gtk are now listed under the *linux* platform

- fixed buildpool|trybuildpool urls in UI
Attachment #457246 - Attachment is obsolete: true
Whiteboard: [automation][reporting][q2goal] → [automation][reporting][q3goal]
Attached patch Wait time report without charts (obsolete) — Splinter Review
- maxb - maximum block value for wait times GET parameter
- wait time statistics break down by interval
- changed query for wait times, to look only if job start time is between starttime and endtime parameters:
changed:    q = q.where(b.c.start_time<=endtime)
to:         q = q.where(or_(c.c.when_timestamp<=endtime, br.c.submitted_at<=endtime))
Attachment #458843 - Attachment is obsolete: true
Comment on attachment 461694 [details] [diff] [review]
Wait time report without charts

Having the timezone defined in two places is going to lead us into trouble.  Can we use the pytz library for this?
Comment on attachment 461694 [details] [diff] [review]
Wait time report without charts

Is this also missing the charts controller?
yes, it's missing everything about charts
Wait Times Charts, Pushes Reports and Charts, Wait Times report using class container
Attachment #461694 - Attachment is obsolete: true
Attachment #464622 - Flags: review?(catlee)
Attachment #464622 - Flags: review?(catlee)
Attachment #464622 - Flags: review+
Attachment #464622 - Flags: checked-in+
Attached patch Pushes Reports (obsolete) — Splinter Review
Pushes Reports
Attachment #464622 - Attachment is obsolete: true
Attachment #467474 - Flags: review?(catlee)
Attachment #467474 - Flags: review?(catlee) → review+
Attachment #467474 - Flags: checked-in+
Attached patch end to end times and pushes fix (obsolete) — Splinter Review
End to end times: all buildruns within a timeframe for a branch and individual buildrequests for one buildrun by sourcestamp revision number

Pushes
Fixes pushes count by grouping build requests by changes.when_timestamp and sourcestamps.branch, and ignoring talos branches
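
The grouping can be sketched like this, in plain Python rather than the actual query. The rows are (changes.when_timestamp, sourcestamps.branch) pairs, one per build request; detecting talos branches by looking for "talos" in the branch name is an assumption made for the example, and count_pushes is a hypothetical name.

```python
def count_pushes(rows):
    """Count pushes per branch from one (when_timestamp, branch) row per build request."""
    pushes = set()
    for when_timestamp, branch in rows:
        if "talos" in branch.lower():
            # Assumption: talos branches are recognizable by name.
            continue
        # Build requests sharing a timestamp and branch belong to one push.
        pushes.add((when_timestamp, branch))

    counts = {}
    for _, branch in pushes:
        counts[branch] = counts.get(branch, 0) + 1
    return counts
```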

UI
Removed AnyTime picker and replaced it by jQuery UI Datepicker
Added jQuery DataTables-1.7.1 (new version)

Templates
Refactored buildrun, endtoend, pushes, waittimes mako templates to inherit a base template (report.mako)
Created mako functions for menus and datepicker.
Attachment #467474 - Attachment is obsolete: true
Attachment #469356 - Flags: review?(catlee)
Comment on attachment 469356 [details] [diff] [review]
end to end times and pushes fix

For the endtoend and endtoend_revision controller methods, can you move the @beaker_cache call down to cache just the calculation of the report, not the whole method?
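
Roughly what I mean, sketched with functools.lru_cache from the standard library standing in for beaker_cache (this is not beaker's API, and compute_report and the report contents are placeholders): the expensive calculation is cached, while the controller method still runs on every request.

```python
import functools
import json

CALLS = {"report": 0}  # only here to show how often the calculation runs


@functools.lru_cache(maxsize=128)
def compute_report(branch, starttime, endtime):
    """Stand-in for the expensive report calculation (the part worth caching)."""
    CALLS["report"] += 1
    return (("branch", branch), ("starttime", starttime), ("endtime", endtime))


def endtoend(branch, starttime, endtime, fmt="html"):
    """Controller-style method: not cached itself, it only formats the result."""
    report = dict(compute_report(branch, starttime, endtime))
    if fmt == "json":
        return json.dumps(report, sort_keys=True)
    return "<h1>End to end times for %s</h1>" % report["branch"]
```

With this split, an HTML request and a JSON request for the same branch and time frame share one calculation.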

model/endtoend.py also needs a lot more comments explaining what's going on, especially some of the big queries (like BuildRunsQuery).

Also, we already have dataTables-1.6.  Do you need dataTables-1.7?  If so, please remove 1.6 and update any existing code that points to it.
Attachment #469356 - Flags: review?(catlee) → review-
updated previous patch following review in Comment 14 (https://bugzilla.mozilla.org/show_bug.cgi?id=559880#c14)

Re dataTables-1.7: I switched to the new version as I wanted to use FixedHeader plug-in for the large tables (though the current version of the patch doesn't use it yet)

The patch also contains in addition:
- colored green/orange/red rows depending on the build request or build run buildrequests.results value and a few extra column values.
Attachment #469356 - Attachment is obsolete: true
Attachment #470687 - Flags: review?(catlee)
Attachment #470687 - Flags: review?(catlee) → review+
Removed datatables-1.6, anytime library, templates/sourcestamps, templates/waittimes.mako
Attachment #470898 - Flags: review?(catlee)
Attachment #470898 - Flags: review?(catlee) → review+
Improvements on End-to-End times:

- Avg time / build run within a time frame := SUM(duration)/no_build_runs ; e2e times page
- Make build run report per branch; add branch in URL and query for build runs: /reports/revision/{branch_name}/{revision}
- Merge try/tryserver branches
- Delete addontester branch / both on pushes and e2e times
- Fixed branch menu + links :: created branch_name as sourcestamps.branch dictionary keys
- Added a Complete yes/no column per build run
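
The average in the first item above amounts to the following small sketch. The function name is hypothetical, and returning 0.0 when there are no build runs in the time frame is an assumption.

```python
def average_build_run_time(durations):
    """Average duration per build run: SUM(duration) / no_build_runs."""
    if not durations:
        # Assumption: report 0.0 when the time frame has no build runs.
        return 0.0
    return sum(durations) / float(len(durations))
```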
Attachment #472714 - Flags: review?(catlee)
applied against version:
d4bc0b863541 2010-08-31 18:01 -0400	Chris AtLee - Don't need to do anything with databases on setup.
Attachment #472818 - Flags: review?(catlee)
Attachment #472818 - Flags: review?(catlee) → review+
Attachment #472714 - Flags: review?(catlee) → review+
Attachment #472714 - Flags: checked-in+
Attachment #472818 - Flags: checked-in+
The Pushes report broke because the branch field of the buildapi.model.pushes.Push class was renamed to branch_name and not all of the code was updated to reflect the change.
Attachment #473249 - Flags: review?(catlee)
Attachment #473249 - Flags: review?(catlee)
Attachment #473249 - Flags: review+
Attachment #473249 - Flags: checked-in+
Also contains patches for Pushes Report and End to End Times Report 

(For more on E2E Times report see: https://bugzilla.mozilla.org/show_bug.cgi?id=559885).
See Also: → 559885
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Depends on: 594496
No longer depends on: 539588
Product: mozilla.org → Release Engineering