Closed Bug 861692 Opened 11 years ago Closed 11 years ago

Try mail not being sent from buildbot-master5{1..4}

Categories: Release Engineering :: Release Automation: Other
Type: defect
Hardware/OS: x86, All
Priority: Not set
Severity: normal
Tracking: Not tracked
Status: RESOLVED FIXED
Reporter: nthomas
Assigned: rail
Attachments: 1 file

The try push http://hg.mozilla.org/try/rev/d7a810265c5a sets -e in the trychooser message, so we try to send email for all jobs. The jobs fail while running try_mailer.py and end up as dead items that nagios complains about, e.g.
 buildbot-master53.srv.releng.usw2.mozilla.com:Command Queue is CRITICAL: 4 dead items

If you look at one of those logs it has:
Running [u'/builds/buildbot/tests1-linux/bin/python', u'/builds/buildbot/tests1-linux/lib/python2.7/site-packages/buildbotcustom/bin/try_mailer.py', u'--log-url', u'http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/yura.zenevich@gmail.com-d7a810265c5a/try-linux64/try_ubuntu64_vm_test-mochitest-5-bm53-tests1-linux-build29.txt.gz', u'-f', u'tryserver@build.mozilla.org', u'--to-author', u'/builds/buildbot/tests1-linux/master/try_ubuntu64_vm_test-mochitest-5', u'29']
Killing with 2
Killing with 15

Result: -15

That's a 'this took too long, killing it' type of message, so something must be wrong with the mail setup on these masters. That isn't too surprising given they're on new puppet and this is probably the first time we tried to email. I see we swapped from sendmail to smarthost in the move from CentOS 5 to 6.
I've moved the failing jobs away to /dev/shm/queue/commands/bug861692/ to clear nagios alerts.
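
For reference, the "Result: -15" in the log above reads like a subprocess-style return code, where a negative value means the child was terminated by that signal number (2 is SIGINT, 15 is SIGTERM). Below is a minimal Python 3 sketch of that convention and of a timeout followed by signal escalation. The names describe_result and run_with_escalation are hypothetical, and this is an illustration under those assumptions, not the command runner's actual code.

```python
import signal
import subprocess


def describe_result(returncode):
    """Translate a subprocess-style return code into a readable description."""
    if returncode < 0:
        # Negative codes mean "terminated by signal N" (POSIX only).
        return "killed by " + signal.Signals(-returncode).name
    return "exited with status %d" % returncode


def run_with_escalation(argv, timeout, grace=1.0):
    """Run argv; on timeout send SIGINT (2), then SIGTERM (15), then SIGKILL."""
    proc = subprocess.Popen(argv)
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        pass  # took too long, start killing it
    for sig in (signal.SIGINT, signal.SIGTERM):
        proc.send_signal(sig)
        try:
            return proc.wait(timeout=grace)
        except subprocess.TimeoutExpired:
            continue  # still alive, try the next signal
    proc.kill()  # last resort
    return proc.wait()
```

With this convention, describe_result(-15) reports a SIGTERM kill, which matches the "Killing with 15" line preceding the result.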
Assignee: nobody → rail
Apr 14 03:53:46 buildbot-master52 postfix/smtpd[1959]: fatal: open database /etc/aliases.db: No such file or directory

I ran the following commands on all masters to make postfix start:

# /usr/sbin/postalias /etc/aliases
# service postfix restart

Puppet tries to use newaliases to generate aliases.db, but on these hosts newaliases belongs to ssmtp, which does nothing and still exits 0:

# newaliases
newaliases: Aliases are not used in sSMTP
# echo $?
0
# rpm -qf `which newaliases`
ssmtp-2.61-15.el6.x86_64
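
Because newaliases exits 0 here without creating anything, its exit status cannot be used to tell whether aliases.db was actually built. A minimal Python sketch of a more direct check follows: it looks at the database file itself. The function name aliases_db_is_current is hypothetical, and the mtime comparison is an assumed heuristic rather than anything puppet or postfix does.

```python
import os


def aliases_db_is_current(aliases_path="/etc/aliases"):
    """Return True if <aliases_path>.db exists and is at least as new as its source."""
    db_path = aliases_path + ".db"
    try:
        return os.path.getmtime(db_path) >= os.path.getmtime(aliases_path)
    except OSError:
        # Either file is missing or unreadable: treat the database as not current.
        return False
```

On the affected masters this would have returned False, since /etc/aliases.db did not exist until postalias was run by hand.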

Also, when I tried to use "mail" to send a test message, I got:
Apr 14 18:50:58 buildbot-master52 sSMTP[763]: Unable to locate mail
Apr 14 18:50:58 buildbot-master52 sSMTP[763]: Cannot open mail:25

After I removed ssmtp (which provides the sendmail binary):
$ echo 123 | mail meee@mozilla.com
/usr/sbin/sendmail: No such file or directory
"/root/dead.letter" 8/197
. . . message not sent.

To fix this I did the following:
# ln -s /usr/lib/sendmail /usr/sbin/sendmail
(/usr/lib/sendmail is provided by postfix)
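
To see which binary a sendmail path really ends up at (or whether the symlink is dangling, as it was after ssmtp was removed), something like the following Python sketch could be used. The function name resolve_mta is hypothetical, and the default path is just the one from the error message above.

```python
import os


def resolve_mta(path="/usr/sbin/sendmail"):
    """Return the real file behind path, or None if it is missing or a dangling symlink."""
    real = os.path.realpath(path)  # follows the whole symlink chain
    return real if os.path.exists(real) else None
```

A None result corresponds to the "/usr/sbin/sendmail: No such file or directory" error shown by mail.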

Also, postfix complains about file ownership:
Apr 14 18:58:21 buildbot-master52 postfix/postfix-script[979]: warning: not owned by root: /etc/postfix/main.cf
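
The ownership warning can be reproduced outside postfix by comparing each file's owner to root (uid 0). A small Python sketch, with the hypothetical helper name not_owned_by; which files postfix actually checks is not covered here.

```python
import os


def not_owned_by(paths, uid=0):
    """Return the subset of paths whose owner differs from uid (root by default)."""
    return [p for p in paths if os.stat(p).st_uid != uid]
```

For example, not_owned_by(["/etc/postfix/main.cf"]) returns a non-empty list on a host that logs the warning above.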


I'll prepare a puppet patch to fix the issue
$ cp bug861692/1365954452-1-15896jBvFKV.5 new/1365954452-1-15896jBvFKV

Apr 14 19:16:21 buildbot-master54 postfix/smtpd[4891]: warning: dict_nis_init: NIS domain name not set - NIS lookups disabled
Apr 14 19:16:21 buildbot-master54 postfix/smtpd[4891]: connect from localhost.localdomain[127.0.0.1]
Apr 14 19:16:21 buildbot-master54 postfix/smtpd[4891]: 7B28741423: client=localhost.localdomain[127.0.0.1]
Apr 14 19:16:21 buildbot-master54 postfix/cleanup[4894]: 7B28741423: message-id=<20130415021621.7B28741423@buildbot-master54>
Apr 14 19:16:21 buildbot-master54 postfix/qmgr[3990]: 7B28741423: from=<tryserver@build.mozilla.org>, size=1107, nrcpt=1 (queue active)
Apr 14 19:16:21 buildbot-master54 postfix/smtpd[4891]: disconnect from localhost.localdomain[127.0.0.1]
Apr 14 19:16:23 buildbot-master54 postfix/smtp[4895]: 7B28741423: to=<xxxxxxxxxxx@gmail.com>, relay=gmail-smtp-in.l.google.com[74.125.130.27]:25, delay=1.8, delays=0.03/0.02/0.89/0.88, dsn=2.0.0, status=sent (250 2.0.0 OK 1365992183 v67si18108499yhm.58 - gsmtp)
Apr 14 19:16:23 buildbot-master54 postfix/qmgr[3990]: 7B28741423: removed
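
The line that confirms delivery in the log above is the postfix/smtp one with dsn=2.0.0 and status=sent. A rough Python sketch for pulling those fields out of such lines is below; the regular expression is written against the sample line only and is an assumption, not a complete description of the postfix log format. The name parse_delivery is hypothetical.

```python
import re

# Matches postfix/smtp delivery lines like the one logged at 19:16:23 above.
DELIVERY_RE = re.compile(
    r"postfix/smtp\[\d+\]: (?P<queue_id>[0-9A-F]+): "
    r"to=<(?P<to>[^>]*)>, relay=(?P<relay>[^,]+), .*?"
    r"dsn=(?P<dsn>[\d.]+), status=(?P<status>\w+)"
)


def parse_delivery(line):
    """Return queue_id/to/relay/dsn/status for a delivery line, else None."""
    match = DELIVERY_RE.search(line)
    return match.groupdict() if match else None


sample = (
    "Apr 14 19:16:23 buildbot-master54 postfix/smtp[4895]: 7B28741423: "
    "to=<xxxxxxxxxxx@gmail.com>, relay=gmail-smtp-in.l.google.com[74.125.130.27]:25, "
    "delay=1.8, delays=0.03/0.02/0.89/0.88, dsn=2.0.0, status=sent "
    "(250 2.0.0 OK 1365992183 v67si18108499yhm.58 - gsmtp)"
)
delivery = parse_delivery(sample)
```

Lines from other postfix daemons (smtpd, cleanup, qmgr) do not match and yield None.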
Attached patch: postfix fixes
Remove ssmtp. "alternatives --auto mta" fixes symlinks (ssmtp uninstall doesn't handle this). Tested.
Attachment #737350 - Flags: review?(dustin)
Blocks: 803823
Comment on attachment 737350
postfix fixes

Huh, where's ssmtp coming from?  Is the AWS kickstart not a clean Base-only image?
Attachment #737350 - Flags: review?(dustin) → review+
(In reply to Dustin J. Mitchell [:dustin] from comment #5)
> Huh, where's ssmtp coming from?  Is the AWS kickstart not a clean Base-only
> image?

It probably comes as part of the Base yum group. I'll check that.
This should be fixed now.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering