
Try mail not being sent from buildbot-master5{1..4}

RESOLVED FIXED

Status

Release Engineering
Release Automation
RESOLVED FIXED
5 years ago
4 years ago

People

(Reporter: nthomas, Assigned: rail)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment)

(Reporter)

Description

5 years ago
The try push http://hg.mozilla.org/try/rev/d7a810265c5a sets -e in the trychooser message, so we try to send email for all jobs. These jobs are failing to run try_mailer.py and end up as dead items that nagios complains about, e.g.
 buildbot-master53.srv.releng.usw2.mozilla.com:Command Queue is CRITICAL: 4 dead items

If you look at one of those logs it has:
Running [u'/builds/buildbot/tests1-linux/bin/python', u'/builds/buildbot/tests1-linux/lib/python2.7/site-packages/buildbotcustom/bin/try_mailer.py', u'--log-url', u'http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/yura.zenevich@gmail.com-d7a810265c5a/try-linux64/try_ubuntu64_vm_test-mochitest-5-bm53-tests1-linux-build29.txt.gz', u'-f', u'tryserver@build.mozilla.org', u'--to-author', u'/builds/buildbot/tests1-linux/master/try_ubuntu64_vm_test-mochitest-5', u'29']
Killing with 2
Killing with 15

Result: -15

That's a 'this took too long, killing it' type of message, so something must be wrong with the mail setup on these masters. That isn't too surprising given they're on new puppet and this is probably the first time we tried to email. I see we swapped from sendmail to smarthost in the move from CentOS 5 to 6.
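
The "Killing with 2", "Killing with 15", "Result: -15" sequence is consistent with a runner that sends SIGINT first, then SIGTERM, and reports death by signal N as -N (the convention Python's subprocess module uses). The following is only a sketch of that escalation, not the command runner's actual code: the `sleep` stand-in, the SIGINT-ignoring child, and the one-second delays are all assumptions. Note that a shell reports death by signal N as 128+N, so SIGTERM appears as 143 here rather than -15.

```shell
#!/bin/sh
# Stand-in for a hung try_mailer.py: a child that ignores SIGINT and blocks.
sh -c 'trap "" INT; exec sleep 30' &
pid=$!

sleep 1              # give the child time to start (assumed delay)
kill -INT "$pid"     # "Killing with 2": the child ignores it
sleep 1              # hypothetical grace period before escalating
kill -TERM "$pid"    # "Killing with 15": this one terminates the child

status=0
wait "$pid" || status=$?
signal=$((status - 128))
echo "shell status: $status, signal: $signal"
```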
(Reporter)

Comment 1

5 years ago
I've moved the failing jobs away to /dev/shm/queue/commands/bug861692/ to clear nagios alerts.
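
Parking the dead items can be sketched as below. This is an illustration on a throwaway directory tree, not the exact commands that were run: the `park_items` helper and the `dead` source directory name are assumptions, and only the `bug861692` holding directory name comes from this bug.

```shell
#!/bin/sh
# park_items SRC DEST: move every regular file from SRC into DEST.
park_items() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    for item in "$src"/*; do
        [ -f "$item" ] || continue   # also skips the unexpanded glob when SRC is empty
        mv "$item" "$dest"/ || return 1
    done
}

# Demo on a scratch queue; "dead" is an assumed directory name.
queue=$(mktemp -d)
mkdir -p "$queue/dead"
touch "$queue/dead/1365954452-1-15896jBvFKV.5"
park_items "$queue/dead" "$queue/bug861692"
ls "$queue/bug861692"
```

Keeping the items in a holding directory, rather than deleting them, leaves them available to be requeued once mail delivery works again.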
(Assignee)

Updated

5 years ago
Assignee: nobody → rail
(Assignee)

Comment 2

5 years ago
Apr 14 03:53:46 buildbot-master52 postfix/smtpd[1959]: fatal: open database /etc/aliases.db: No such file or directory

I ran the following commands on all masters to make postfix start:

# /usr/sbin/postalias /etc/aliases
# service postfix restart

Puppet tries to use newaliases to generate aliases.db, but on these hosts newaliases comes from ssmtp, does nothing, and still exits 0:

# newaliases
newaliases: Aliases are not used in sSMTP
# echo $?
0
# rpm -qf `which newaliases`
ssmtp-2.61-15.el6.x86_64
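
Because this newaliases exits 0 without writing anything, its exit status cannot tell a caller that /etc/aliases.db is missing. A minimal sketch of a check that looks at the database file itself follows. The `aliases_db_ok` helper is hypothetical, the demo runs on scratch files rather than /etc/aliases, and the `-nt` test is assumed to be supported by the shell (it is in bash and dash, but is not strictly POSIX).

```shell
#!/bin/sh
# aliases_db_ok SRC: succeed only if SRC.db exists and is not older than SRC.
aliases_db_ok() {
    src=$1
    db=$1.db
    [ -f "$db" ] || return 1
    [ "$src" -nt "$db" ] && return 1
    return 0
}

# Demo on scratch files instead of the real /etc/aliases.
tmp=$(mktemp -d)
echo 'postmaster: root' > "$tmp/aliases"
if aliases_db_ok "$tmp/aliases"; then before=ok; else before=missing; fi
touch "$tmp/aliases.db"     # stand-in for the file postalias would create
if aliases_db_ok "$tmp/aliases"; then after=ok; else after=missing; fi
echo "before: $before, after: $after"
```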

Also, when I tried to use "mail" to send a test message, I got:
Apr 14 18:50:58 buildbot-master52 sSMTP[763]: Unable to locate mail
Apr 14 18:50:58 buildbot-master52 sSMTP[763]: Cannot open mail:25

After I removed ssmtp (which provides the sendmail binary):
$ echo 123 | mail meee@mozilla.com
/usr/sbin/sendmail: No such file or directory
"/root/dead.letter" 8/197
. . . message not sent.

To fix this I did the following:
# ln -s /usr/lib/sendmail /usr/sbin/sendmail
(/usr/lib/sendmail is provided by postfix)
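
To see what a sendmail path actually resolves to before and after such a fix, something like the following can help. It is a sketch only: the `mta_target` helper is hypothetical, the demo builds a scratch tree instead of touching /usr, and `readlink -f` is assumed to be available (it is part of GNU coreutils).

```shell
#!/bin/sh
# mta_target PATH: print the file PATH finally resolves to, or "missing"
# if PATH does not exist or is a dangling symlink.
mta_target() {
    if [ -e "$1" ]; then
        readlink -f "$1"
    else
        echo missing
    fi
}

# Demo on a scratch tree that mimics the layout described above.
root=$(mktemp -d)
mkdir -p "$root/usr/lib" "$root/usr/sbin"
touch "$root/usr/lib/sendmail"        # stand-in for the postfix-provided binary
before=$(mta_target "$root/usr/sbin/sendmail")
ln -s "$root/usr/lib/sendmail" "$root/usr/sbin/sendmail"
after=$(mta_target "$root/usr/sbin/sendmail")
echo "before: $before"
echo "after:  $after"
```

On a real master the equivalent call would be `mta_target /usr/sbin/sendmail`.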

Also, postfix complains about file ownership:
Apr 14 18:58:21 buildbot-master52 postfix/postfix-script[979]: warning: not owned by root: /etc/postfix/main.cf
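
The ownership warning can be checked for directly. This is a rough sketch with hypothetical helper names; it assumes GNU `stat` (the `-c '%u'` form), and the demo inspects a scratch file rather than /etc/postfix/main.cf.

```shell
#!/bin/sh
# owner_uid FILE: print the numeric uid that owns FILE (GNU stat syntax).
owner_uid() {
    stat -c '%u' "$1"
}

# root_owned FILE: succeed only if FILE is owned by uid 0.
root_owned() {
    [ "$(owner_uid "$1")" -eq 0 ]
}

# Demo on a scratch file, which is owned by whoever runs the script.
f=$(mktemp)
uid=$(owner_uid "$f")
echo "owner uid: $uid"
```

On a master, `root_owned /etc/postfix/main.cf` would show whether postfix still has reason to warn.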


I'll prepare a puppet patch to fix the issue.
(Assignee)

Comment 3

5 years ago
$ cp bug861692/1365954452-1-15896jBvFKV.5 new/1365954452-1-15896jBvFKV

Apr 14 19:16:21 buildbot-master54 postfix/smtpd[4891]: warning: dict_nis_init: NIS domain name not set - NIS lookups disabled
Apr 14 19:16:21 buildbot-master54 postfix/smtpd[4891]: connect from localhost.localdomain[127.0.0.1]
Apr 14 19:16:21 buildbot-master54 postfix/smtpd[4891]: 7B28741423: client=localhost.localdomain[127.0.0.1]
Apr 14 19:16:21 buildbot-master54 postfix/cleanup[4894]: 7B28741423: message-id=<20130415021621.7B28741423@buildbot-master54>
Apr 14 19:16:21 buildbot-master54 postfix/qmgr[3990]: 7B28741423: from=<tryserver@build.mozilla.org>, size=1107, nrcpt=1 (queue active)
Apr 14 19:16:21 buildbot-master54 postfix/smtpd[4891]: disconnect from localhost.localdomain[127.0.0.1]
Apr 14 19:16:23 buildbot-master54 postfix/smtp[4895]: 7B28741423: to=<xxxxxxxxxxx@gmail.com>, relay=gmail-smtp-in.l.google.com[74.125.130.27]:25, delay=1.8, delays=0.03/0.02/0.89/0.88, dsn=2.0.0, status=sent (250 2.0.0 OK 1365992183 v67si18108499yhm.58 - gsmtp)
Apr 14 19:16:23 buildbot-master54 postfix/qmgr[3990]: 7B28741423: removed
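
The line that matters in the log above is the postfix/smtp record with status=sent. A small sketch for pulling the queue ID and status out of such lines follows. The `delivery_status` helper is hypothetical, and the pattern assumes short hexadecimal queue IDs like the one shown here; the sample input is copied from the log above.

```shell
#!/bin/sh
# delivery_status: read postfix log lines on stdin and print
# "<queue-id> <status>" for each postfix/smtp delivery record.
delivery_status() {
    sed -n 's|.*postfix/smtp\[[0-9]*]: \([0-9A-F]*\): .*status=\([a-z]*\).*|\1 \2|p'
}

# Two sample lines; only the postfix/smtp one carries a delivery status.
log=$(mktemp)
cat > "$log" <<'EOF'
Apr 14 19:16:21 buildbot-master54 postfix/smtpd[4891]: 7B28741423: client=localhost.localdomain[127.0.0.1]
Apr 14 19:16:23 buildbot-master54 postfix/smtp[4895]: 7B28741423: to=<xxxxxxxxxxx@gmail.com>, relay=gmail-smtp-in.l.google.com[74.125.130.27]:25, delay=1.8, delays=0.03/0.02/0.89/0.88, dsn=2.0.0, status=sent (250 2.0.0 OK 1365992183 v67si18108499yhm.58 - gsmtp)
EOF
result=$(delivery_status < "$log")
echo "$result"
```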
(Assignee)

Comment 4

5 years ago
Created attachment 737350 [details] [diff] [review]
postfix fixes

Remove ssmtp. "alternatives --auto mta" fixes symlinks (ssmtp uninstall doesn't handle this). Tested.
Attachment #737350 - Flags: review?(dustin)
(Assignee)

Updated

5 years ago
Blocks: 803823

Comment 5

Comment on attachment 737350 [details] [diff] [review]
postfix fixes

Huh, where's ssmtp coming from?  Is the AWS kickstart not a clean Base-only image?
Attachment #737350 - Flags: review?(dustin) → review+
(Assignee)

Comment 6

5 years ago
(In reply to Dustin J. Mitchell [:dustin] from comment #5)
> Huh, where's ssmtp coming from?  Is the AWS kickstart not a clean Base-only
> image?

It probably comes in as part of the Base yum group. I'll check that.
(Assignee)

Comment 7

5 years ago
Comment on attachment 737350 [details] [diff] [review]
postfix fixes

https://hg.mozilla.org/build/puppet/rev/53df2b645bd9
Attachment #737350 - Flags: checked-in+
(Assignee)

Comment 8

5 years ago
This should be fixed now.
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering