Talos machines not reporting performance numbers

RESOLVED FIXED

Status

Product: Release Engineering
Component: General
Priority: P1
Severity: critical
Status: RESOLVED FIXED
Opened: 11 years ago
Last modified: 5 years ago

People

(Reporter: Robert Sayre, Assigned: mrz)


Details

(Reporter)

Description

11 years ago
Not seeing any data on Talos machines. Just "L C".

Could be related to bug 413305. Won't open the tree until perf numbers are back, but understand this may not be an ops issue.
Assignee: server-ops → aravind
* The tinderbox server is set to scrape the log correctly

* There are no RETURN or TinderPrint lines at the end of the full log (for the runs with no perf data). The former send data to the graph server; the latter drive the display on the tinderbox waterfall.

Seems like an unlikely time for something to change in the talos code, but the 1.8 boxes are also affected so it can't be a trunk checkin.
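The log check described above can be sketched roughly as follows. This is only an illustration, not the tinderbox scraper itself: it assumes the reporting lines start with the literal tokens `RETURN` and `TinderPrint`, and the function names and the `tail_lines` parameter are hypothetical.

```python
def find_report_lines(log_text, tail_lines=200):
    """Return (return_lines, tinderprint_lines) found near the end of a log.

    Assumption: reporting lines begin with the tokens "RETURN" or
    "TinderPrint" once leading whitespace is stripped.
    """
    tail = log_text.splitlines()[-tail_lines:]
    returns = [line for line in tail if line.lstrip().startswith("RETURN")]
    prints = [line for line in tail if line.lstrip().startswith("TinderPrint")]
    return returns, prints


def has_perf_data(log_text):
    """True only if both kinds of reporting line are present in the tail."""
    returns, prints = find_report_lines(log_text)
    return bool(returns) and bool(prints)
```

A run whose log fails `has_perf_data` would show up on the waterfall without numbers, which matches the symptom reported here.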
graphs.m.o seems to be running okay.  Machine seems to be getting some hits from QA boxes.  Not sure what else we can do here, sending this on to folks that know more.
Assignee: aravind → nobody
Component: Server Operations: Tinderbox Maintenance → Testing
Product: mozilla.org → Core
QA Contact: justin → testing
Version: other → unspecified
Priority: -- → P1
Hardware: PC → All
Filing another IT request to ask for more details on the graph server. The talos machines appear to be running fine and should be reporting data, but the graph server hasn't seen anything new since 1800, Jan 21.

Once confirmed that graphs is ok, I'll try rebooting the talos farm.
Sorry, meant to say 1800, Jan 20. That is the same time the tinderbox stopped reporting digits.
Added the --debug option to the talos builders in buildbot. We should start seeing debug output around the data reporting step on the next runs.

Talking with alice in IRC, she confirmed my suspicion that this is the graph server. Note that the talos builders reporting to MozillaTest and graphs-stage are still posting scores, and they're running the same builds.
Status: NEW → ASSIGNED
What's the bug # on the other IT request? Is there a reason it is private? This is keeping the tree closed. :(
The tail of http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1200939480.1200947376.26190.gz&fulltext=1

shows that a table in the graph server's database is full. Looks like we need to add more storage?

DEBUG: process_Request line:   File "/data/graphs/bulk.cgi", line 131, in ?
DEBUG: process_Request line:     db.execute("INSERT INTO dataset_branchinfo (dataset_id, time, branchid) VALUES (?,?,?)", (setid, timeval, branchid))
DEBUG: process_Request line:   File "/data/graphs/utils/../databases/mysql.py", line 10, in execute
DEBUG: process_Request line:     result = cur.execute(query,args)
DEBUG: process_Request line:   File "/data/graphs/utils/../databases/mysql.py", line 16, in execute
DEBUG: process_Request line:     return MySQLdb.cursors.Cursor.execute(self, query, args)
DEBUG: process_Request line:   File "/usr/lib/python2.4/site-packages/MySQLdb/cursors.py", line 163, in execute
DEBUG: process_Request line:     self.errorhandler(self, exc, value)
DEBUG: process_Request line:   File "/usr/lib/python2.4/site-packages/MySQLdb/connections.py", line 35, in defaulterrorhandler
DEBUG: process_Request line:     raise errorclass, errorvalue
DEBUG: process_Request line: OperationalError: (1114, "The table 'dataset_branchinfo' is full")
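
For anyone scanning other logs for the same failure, a minimal sketch that pulls the MySQL error code and message out of debug lines like the last one above. The regex and the function name are hypothetical and assume the error is printed in exactly this `OperationalError: (code, "message")` form.

```python
import re

# Matches e.g.: OperationalError: (1114, "The table 'dataset_branchinfo' is full")
MYSQL_ERROR_RE = re.compile(r'OperationalError: \((\d+), "([^"]*)"\)')


def extract_mysql_errors(log_text):
    """Return a list of (error_code, message) tuples found in the log text."""
    return [(int(code), msg) for code, msg in MYSQL_ERROR_RE.findall(log_text)]
```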
(Assignee)

Updated

11 years ago
Assignee: nobody → mrz
Severity: blocker → critical
Status: ASSIGNED → NEW
Depends on: 413344
Space added. We were hitting MySQL's default table size limit.

We'll need to figure out how we're going to handle these databases in the future. Aravind suggests adding monitoring.
Status: NEW → RESOLVED
Last Resolved: 11 years ago
Resolution: --- → FIXED
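
One possible shape for the monitoring suggested above, as a rough sketch rather than what was actually deployed. It assumes the rows come from MySQL's `SHOW TABLE STATUS` as dictionaries with the `Name`, `Data_length` and `Max_data_length` columns, and that the storage engine reports a meaningful `Max_data_length` (rows reporting 0 or NULL are skipped). The function name and the 0.8 threshold are hypothetical.

```python
def tables_near_limit(status_rows, threshold=0.8):
    """Return (table_name, fraction_used) for tables close to their size limit.

    status_rows: iterable of dicts shaped like SHOW TABLE STATUS rows.
    threshold:   fraction of Max_data_length at which to warn (assumed value).
    """
    warnings = []
    for row in status_rows:
        max_len = row.get("Max_data_length") or 0
        if max_len <= 0:
            continue  # no usable limit reported for this table
        used = (row.get("Data_length") or 0) / float(max_len)
        if used >= threshold:
            warnings.append((row["Name"], used))
    return warnings
```

Run periodically against the graph server database, a check like this would flag a table such as dataset_branchinfo before inserts start failing with error 1114.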

Comment 9

11 years ago
(In reply to comment #8)
> space added. We were blowing out mysql's default table size.
> 
> We'll need to figure out how we're going to handle these databases in the
> future. Aravind suggests adding monitoring.
> 

Can we get a bug on file for that?
(In reply to comment #9)
> Can we get a bug on file for that?
> 

Filed bug 413401
Mass move of Core:Testing bugs to mozilla.org:ReleaseEngineering. Filter on RelEngMassMove to ignore.
Component: Testing → Release Engineering
Product: Core → mozilla.org
QA Contact: testing → release
Version: unspecified → other
Product: mozilla.org → Release Engineering