Bug 674267
Opened 13 years ago
Closed 13 years ago
Recent jobs page is not loading up
Categories
(Release Engineering :: General, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: armenzg, Assigned: catlee)
References
Details
(Keywords: buildapi, Whiteboard: [buildapi][reporting])
Attachments
(1 file, 1 obsolete file)
6.43 KB, patch | nthomas: review+ | catlee: checked-in+
https://build.mozilla.org/buildapi/recent
##################################
Proxy Error
The proxy server received an invalid response from an upstream server.
The proxy server could not handle the request GET /buildapi/recent.
Reason: Error reading from remote server
##################################
Comment 2•13 years ago
This was due to some slow MySQL queries. Mpressman had a look at this today, and optimized several tables, to the point where the slow queries started using the appropriate indexes. The indexes were being ignored because their cardinality indicated that a table scan would be faster.
Matthew's going to try to patch buildapi to generate a more efficient query, so I'm assigning this to him.
Assignee: nobody → server-ops-releng
Component: Release Engineering → Server Operations: RelEng
QA Contact: release → zandr
Comment 3•13 years ago
It's probably also worth optimizing these on a regular basis from a crontab. Let me know if you set that up, and I'll update the buildbot documentation on the wiki.
Comment 4•13 years ago
From IRC:
<justdave> dustin: I lied.
# Analyze the buildbot_schedulers tables to keep the indexes performant
export HOME=/root
mysql buildbot_schedulers -e "analyze table `mysql buildbot_schedulers --skip-column-names -e 'show tables' | tr '\n' ',' | sed 's/,$//'`"
was looking in the wrong place
it's there
<justdave> it's in cron.weekly, so it'll run around 4:20am every Sunday morning
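The cron job quoted above builds a single ANALYZE TABLE statement covering every table in buildbot_schedulers. A minimal Python sketch of the same string-building step, assuming the table names have already been fetched; the function name and the backtick quoting are illustrative additions, not part of the cron script:

```python
def build_analyze_statement(tables):
    """Return one ANALYZE TABLE statement covering all given tables.

    Mirrors the shell pipeline, which joins the output of SHOW TABLES
    with commas. The backtick quoting is an addition in this sketch.
    """
    if not tables:
        raise ValueError("no tables to analyze")
    # Escape any backtick inside a name by doubling it, then quote the name.
    quoted = ["`%s`" % name.replace("`", "``") for name in tables]
    return "ANALYZE TABLE " + ", ".join(quoted)
```

The resulting string would be passed to whatever MySQL client is in use; running it weekly, as the cron job does, keeps the index statistics fresh.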
Getting back to the .../recent query: I bumped the Apache timeout in cruncher:/etc/httpd/conf/httpd.conf from 120 to 180 seconds and it loads for me now, albeit slowly. This report hits the status db (aka buildbot), but others like the endtoend hit schedulerdb (buildbot_schedulers) and are also timing out. Perhaps we're just filling up the tables so it takes longer to get the data we want, or it could be that other load on the db servers is affecting the RelEng dbs (setting a dep on bug 674298 to try to find out).
Depends on: 674298
Comment 5•13 years ago
Let's see if we can get this moving. The /recent page is still timing out for me, although particular-slave recent pages are not, e.g.,
https://build.mozilla.org/buildapi/recent/talos-r3-w7-011
I'm not sure what to look at next, here - would the new redis host help?
Updated•13 years ago
Assignee: server-ops-releng → dustin
Comment 6•13 years ago
Catlee, what can we do here?
Comment 7•13 years ago (Assignee)
I don't know. The query itself is pretty simple:
SELECT builds.id, builders.name AS buildname, builds.buildnumber,
       builds.starttime, builds.endtime, builds.result,
       slaves.name AS slavename, masters.name AS master
FROM builds, builders, slaves, masters
WHERE builds.slave_id = slaves.id
  AND builds.builder_id = builders.id
  AND builds.master_id = masters.id
  AND builds.result IS NOT NULL
ORDER BY builds.id DESC
LIMIT 20;
explain has this to say:
+----+-------------+----------+--------+--------------------------------------------------------------------+--------------------+---------+----------------------------+------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+--------+--------------------------------------------------------------------+--------------------+---------+----------------------------+------+---------------------------------+
| 1 | SIMPLE | slaves | ALL | PRIMARY | NULL | NULL | NULL | 1346 | Using temporary; Using filesort |
| 1 | SIMPLE | builds | ref | master_id,ix_builds_slave_id,ix_builds_builder_id,ix_builds_result | ix_builds_slave_id | 4 | buildbot.slaves.id | 1060 | Using where |
| 1 | SIMPLE | builders | eq_ref | PRIMARY | PRIMARY | 4 | buildbot.builds.builder_id | 1 | |
| 1 | SIMPLE | masters | eq_ref | PRIMARY | PRIMARY | 4 | buildbot.builds.master_id | 1 | |
+----+-------------+----------+--------+--------------------------------------------------------------------+--------------------+---------+----------------------------+------+---------------------------------+
and yet the query takes a long time to run (66 seconds on my first try)
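The EXPLAIN output above starts from a full scan of slaves and needs a temporary table plus a filesort before the LIMIT can apply. One common rewrite (not necessarily what the patch attached to this bug does) is to pick the newest rows from builds first and join the lookup tables afterwards. A minimal sketch using Python's sqlite3 with a toy schema; the schema, helper names, and demo data are assumptions, and SQLite's planner is not MySQL's, so this only illustrates the query shape:

```python
import sqlite3

# Toy schema: only the columns used by the query quoted above.
SCHEMA = """
CREATE TABLE builders (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE slaves   (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE masters  (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE builds (
    id INTEGER PRIMARY KEY,
    builder_id INTEGER, slave_id INTEGER, master_id INTEGER,
    buildnumber INTEGER, starttime INTEGER, endtime INTEGER, result INTEGER
);
"""

# Limit the builds table first, then join the small lookup tables.
RECENT_SQL = """
SELECT b.id, builders.name AS buildname, b.buildnumber, b.starttime,
       b.endtime, b.result, slaves.name AS slavename, masters.name AS master
FROM (SELECT * FROM builds
      WHERE result IS NOT NULL
      ORDER BY id DESC LIMIT ?) AS b
JOIN builders ON b.builder_id = builders.id
JOIN slaves   ON b.slave_id   = slaves.id
JOIN masters  ON b.master_id  = masters.id
ORDER BY b.id DESC
"""


def recent_builds(conn, count=20):
    """Return the newest `count` finished builds, newest first."""
    return conn.execute(RECENT_SQL, (count,)).fetchall()


def make_demo_db(n_builds=100):
    """Build an in-memory database; every 10th build is unfinished."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO builders VALUES (1, 'demo-builder')")
    conn.execute("INSERT INTO slaves VALUES (1, 'demo-slave')")
    conn.execute("INSERT INTO masters VALUES (1, 'demo-master')")
    for i in range(1, n_builds + 1):
        result = None if i % 10 == 0 else 0
        conn.execute(
            "INSERT INTO builds VALUES (?, 1, 1, 1, ?, ?, ?, ?)",
            (i, i, 1000 + i, 2000 + i, result),
        )
    return conn
```

One difference from the original query: because the LIMIT is applied before the joins, a build whose slave, builder, or master row is missing would shrink the result below the requested count instead of being skipped in favour of an older build.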
Comment 8•13 years ago
I'm moving this back to release engineering for the moment, since it looks like this is a query-optimization problem, and rather low priority. We have a new DBA starting in December, so this may be worth discussing with her at that point.
Assignee: dustin → nobody
Component: Server Operations: RelEng → Release Engineering
QA Contact: zandr → release
Updated•13 years ago
Assignee: nobody → catlee
Comment 9•13 years ago (Assignee)
http://cruncher.build.mozilla.org/~catlee/wsgi/reports/waittimes
http://cruncher.build.mozilla.org/~catlee/wsgi/recent?count=200
Attachment #575274 -
Flags: review?(nrthomas)
Comment 10•13 years ago
Comment on attachment 575274 [details] [diff] [review]
faster waittime and queries
>diff --git a/buildapi/model/waittimes.py b/buildapi/model/waittimes.py
I compared the waittimes for 2011-11-13 through 2011-11-17 on your instance and the production one (buildpool), and some new long-wait builds showed up, e.g.:
http://cruncher.build.mozilla.org/buildapi/reports/waittimes?starttime=1321268400&endtime=1321354800
http://cruncher.build.mozilla.org/~catlee/wsgi/reports/waittimes?starttime=1321268400&endtime=1321354800
Silly question, but does your repo have the changes from bug 674057 in it? Any idea what's going on here?
We could remove this wonky routing from buildapi/buildapi/config/routing.py too:
map.connect('/recent/{slave}/{count}', controller='recent', action='index')
so that all queries take the same ?count=N syntax.
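With the path-based count route removed, the controller would read the count from the query string only. A minimal stdlib sketch of that parsing step; `parse_count`, the default of 20 (taken from the LIMIT in the query quoted earlier in this bug), and the upper bound are hypothetical and not taken from buildapi:

```python
from urllib.parse import parse_qs

DEFAULT_COUNT = 20   # same value as the LIMIT in the query quoted in this bug
MAX_COUNT = 1000     # hypothetical upper bound, not taken from buildapi


def parse_count(query_string, default=DEFAULT_COUNT, maximum=MAX_COUNT):
    """Extract ?count=N from a query string, falling back to a default."""
    values = parse_qs(query_string).get("count")
    if not values:
        return default
    try:
        count = int(values[0])
    except ValueError:
        return default
    if count < 1:
        return default
    # Cap the count so a single request cannot ask for an unbounded result.
    return min(count, maximum)
```

Under this scheme both /recent?count=200 and /recent/talos-r3-w7-011?count=200 would use the same syntax.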
Comment 11•13 years ago
Otherwise that patch looks great.
Comment 12•13 years ago (Assignee)
Changed max(builds.start_time) to min(builds.start_time) as a better indication of wait times. This also makes the results match the original.
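The patch's SQL is not shown here, so the following only illustrates why the earliest start time is presumably the better indicator: if a request ends up with more than one build, max() measures the delay until the last build started, while min() measures how long the request actually waited. The function name and the sample numbers are hypothetical:

```python
def wait_time(submitted_at, build_start_times):
    """Seconds between a request being submitted and its first build starting."""
    if not build_start_times:
        return None  # nothing has started yet for this request
    return min(build_start_times) - submitted_at


# A request submitted at t=100 whose builds started at t=160 and t=400:
# using min() gives a 60 second wait, using max() would report 300 seconds.
example_wait = wait_time(100, [160, 400])
```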
Attachment #575274 -
Attachment is obsolete: true
Attachment #575274 -
Flags: review?(nrthomas)
Attachment #575345 -
Flags: review?(nrthomas)
Comment 13•13 years ago
Comment on attachment 575345 [details] [diff] [review]
faster waittime and queries
Nice one.
Attachment #575345 -
Flags: review?(nrthomas) → review+
Updated•13 years ago (Assignee)
Attachment #575345 -
Flags: checked-in+
Updated•13 years ago (Assignee)
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Updated•11 years ago
Product: mozilla.org → Release Engineering