Closed
Bug 1076623
Opened 11 years ago
Closed 10 years ago
Aggregate db exceptions in emails from masters
Categories
(Release Engineering :: General, defect, P2)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: nthomas, Assigned: coop)
Details
Attachments
(1 file)
2.53 KB, patch | Callek: review+ | coop: checked-in+
During a full buildbot db reboot we experienced a lot of tracebacks on buildbot masters, and the exception watcher hit this problem:
Traceback (most recent call last):
  File "/builds/buildbot/build1/tools/buildfarm/maintenance/watch_twistd_log.py", line 247, in <module>
    hostname, exceptions, options.name)
  File "/builds/buildbot/build1/tools/buildfarm/maintenance/watch_twistd_log.py", line 117, in send_msg
    s.sendmail(fromaddr, [addr], m.as_string())
  File "/tools/python27/lib/python2.7/smtplib.py", line 722, in sendmail
    raise SMTPSenderRefused(code, resp, from_addr)
smtplib.SMTPSenderRefused: (552, '5.3.4 Message size exceeds fixed limit', 'cltbld@buildbot-master86.srv.releng.scl3.mozilla.com')
The main culprit was:
2014-10-01 19:34:52-0700 [-] Unhandled Error
Traceback (most recent call last):
  File "/builds/buildbot/build1/lib/python2.7/site-packages/twisted/internet/base.py", line 1165, in run
    self.mainLoop()
  File "/builds/buildbot/build1/lib/python2.7/site-packages/twisted/internet/base.py", line 1174, in mainLoop
    self.runUntilCurrent()
  File "/builds/buildbot/build1/lib/python2.7/site-packages/twisted/internet/base.py", line 796, in runUntilCurrent
    call.func(*call.args, **call.kw)
  File "/builds/buildbot/build1/lib/python2.7/site-packages/buildbot-0.8.2_hg_a52601db35c3_production_0.8-py2.7.egg/buildbot/util/loop.py", line 146, in _loop_start
    self._remaining = list(self.get_processors())
--- <exception caught here> ---
  File "/builds/buildbot/build1/lib/python2.7/site-packages/buildbot-0.8.2_hg_a52601db35c3_production_0.8-py2.7.egg/buildbot/master.py", line 153, in _get_processors
    builders = sorter(self.parent, builders)
  File "/builds/buildbot/build1/master/master_common.py", line 153, in prioritizeBuilders
    (time.time() - 3600, buildmaster.master_name, buildmaster.master_incarnation))
  File "/builds/buildbot/build1/lib/python2.7/site-packages/buildbot-0.8.2_hg_a52601db35c3_production_0.8-py2.7.egg/buildbot/db/connector.py", line 182, in runQueryNow
    return self.runInteractionNow(self._runQuery, *args, **kwargs)
  File "/builds/buildbot/build1/lib/python2.7/site-packages/buildbot-0.8.2_hg_a52601db35c3_production_0.8-py2.7.egg/buildbot/db/connector.py", line 212, in runInteractionNow
    return self._runInteractionNow(interaction, *args, **kwargs)
  File "/builds/buildbot/build1/lib/python2.7/site-packages/buildbot-0.8.2_hg_a52601db35c3_production_0.8-py2.7.egg/buildbot/db/connector.py", line 234, in _runInteractionNow
    conn = self.get_sync_connection()
  File "/builds/buildbot/build1/lib/python2.7/site-packages/buildbot-0.8.2_hg_a52601db35c3_production_0.8-py2.7.egg/buildbot/db/connector.py", line 228, in get_sync_connection
    self._nonpool = self._spec.get_sync_connection()
  File "/builds/buildbot/build1/lib/python2.7/site-packages/buildbot-0.8.2_hg_a52601db35c3_production_0.8-py2.7.egg/buildbot/db/dbspec.py", line 250, in get_sync_connection
    conn = dbapi.connect(*self.connargs, **connkw)
  File "/builds/buildbot/build1/lib/python2.7/site-packages/MySQLdb/__init__.py", line 81, in Connect
    return Connection(*args, **kwargs)
  File "/builds/buildbot/build1/lib/python2.7/site-packages/MySQLdb/connections.py", line 187, in __init__
    super(Connection, self).__init__(*args, **kwargs2)
_mysql_exceptions.OperationalError: (2013, "Lost connection to MySQL server at 'reading initial communication packet', system error: 0")
Aggregation was originally implemented in bug 623594.
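One way to keep a flood of identical tracebacks like the one above from inflating a single email is to group them by their final line, which names the exception. The following is only a minimal sketch of that idea, not the code in watch_twistd_log.py; the function names exception_signature and count_by_signature are hypothetical.

```python
from collections import Counter


def exception_signature(traceback_text):
    """Return the last non-blank line of a traceback, stripped.

    For a Python traceback this is the 'ExceptionType: message' line,
    which serves here as the grouping key (an assumption of this sketch).
    """
    lines = [line.strip() for line in traceback_text.splitlines() if line.strip()]
    return lines[-1] if lines else ""


def count_by_signature(tracebacks):
    """Count how many tracebacks share each signature."""
    return Counter(exception_signature(tb) for tb in tracebacks)
```

Grouping on the last line alone treats tracebacks with different call stacks but the same final exception as one entry, which is the trade-off that keeps the report short.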
Comment 1•11 years ago
I wonder if we could hook these up to Sentry in some way? It might be too hard to do in Buildbot itself, but the twistd log watcher could probably send them...
Assignee
Updated•11 years ago
Assignee: nobody → coop
Status: NEW → ASSIGNED
Priority: -- → P2
Assignee
Comment 2•10 years ago
This patch yields exception entries formatted like this:
--------------------------------------------------------------------------------
Count: 2, Exception: Failure: twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion.
First instance: 2015-02-24 05:07:47-0800, Most recent instance: 2015-02-24 05:07:47-0800
Example:
Exception in bm70/twistd.log.2:
2015-02-24 05:07:47-0800 [HTTPPageGetter,client] Unhandled Error
Traceback (most recent call last):
Failure: twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion.
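An entry in the format above could be produced roughly as follows. This is a hedged sketch, not the attached patch: it assumes each exception has already been parsed into a (timestamp, signature, example text) tuple in log order, and the names aggregate and render are hypothetical.

```python
from collections import OrderedDict


def aggregate(records):
    """Group (timestamp, signature, example) tuples by signature.

    Records are assumed to arrive in log order, so the first one seen
    supplies 'first' and the example, and each later one updates 'last'.
    """
    groups = OrderedDict()
    for timestamp, signature, example in records:
        entry = groups.get(signature)
        if entry is None:
            groups[signature] = {
                "count": 1,
                "first": timestamp,
                "last": timestamp,
                "example": example,
            }
        else:
            entry["count"] += 1
            entry["last"] = timestamp
    return groups


def render(groups):
    """Format the aggregated groups as plain text, one block per signature."""
    parts = []
    for signature, entry in groups.items():
        parts.append("-" * 80)
        parts.append("Count: %d, Exception: %s" % (entry["count"], signature))
        parts.append("First instance: %s, Most recent instance: %s"
                     % (entry["first"], entry["last"]))
        parts.append("Example:")
        parts.append(entry["example"])
    return "\n".join(parts)
```

Keeping a single example per signature means the message grows with the number of distinct exceptions rather than the number of occurrences.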
Attachment #8569363 - Flags: review?(bugspam.Callek)
Comment 3•10 years ago
Comment on attachment 8569363
Aggregate all exceptions in master twistd.logs
Review of attachment 8569363:
-----------------------------------------------------------------
stamp
Attachment #8569363 - Flags: review?(bugspam.Callek) → review+
Assignee
Comment 4•10 years ago
Comment on attachment 8569363
Aggregate all exceptions in master twistd.logs
Review of attachment 8569363:
-----------------------------------------------------------------
https://hg.mozilla.org/build/tools/rev/aef4138c6baa
Attachment #8569363 - Flags: checked-in+
Assignee
Comment 5•10 years ago
In production.
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Updated•7 years ago
Component: General Automation → General