Closed Bug 1034848 Opened 10 years ago Closed 10 years ago

all puppetagain servers (os x, centos, ubuntu) can't send mail to root

Categories

(Infrastructure & Operations :: RelOps: Puppet, task)

Type: task
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: dustin, Assigned: dustin)

Attachments

(1 file)

Jul  5 00:00:01 mobile-imaging-001 postfix/sendmail[27335]: warning: the Postfix sendmail command has set-uid root file permissions
Jul  5 00:00:01 mobile-imaging-001 postfix/sendmail[27335]: warning: or the command is run from a set-uid root process
Jul  5 00:00:01 mobile-imaging-001 postfix/sendmail[27335]: warning: the Postfix sendmail command must be installed without set-uid root file permissions
Jul  5 00:00:02 mobile-imaging-001 postfix/pickup[27155]: 13CB03804EF: uid=48 from=<root>
Jul  5 00:00:02 mobile-imaging-001 postfix/cleanup[27338]: 13CB03804EF: message-id=<20140705070002.13CB03804EF@mobile-imaging-001>
Jul  5 00:00:02 mobile-imaging-001 postfix/qmgr[1651]: 13CB03804EF: from=<root@mobile-imaging-001.p1.releng.scl3.mozilla.com>, size=875, nrcpt=1 (queue active)
Jul  5 00:00:02 mobile-imaging-001 postfix/smtp[27340]: 13CB03804EF: to=<apache@mobile-imaging-001.p1.releng.scl3.mozilla.com>, orig_to=<apache>, relay=none, delay=0.22, delays=0.21/0/0/0, dsn=5.4.6, status=bounced (mail for mobile-imaging-001.p1.releng.scl3.mozilla.com loops back to myself)
Jul  5 00:00:02 mobile-imaging-001 postfix/cleanup[27338]: 442CC3804F0: message-id=<20140705070002.442CC3804F0@mobile-imaging-001>
Jul  5 00:00:02 mobile-imaging-001 postfix/bounce[27341]: 13CB03804EF: sender non-delivery notification: 442CC3804F0
Jul  5 00:00:02 mobile-imaging-001 postfix/qmgr[1651]: 442CC3804F0: from=<>, size=2995, nrcpt=1 (queue active)
Jul  5 00:00:02 mobile-imaging-001 postfix/qmgr[1651]: 13CB03804EF: removed
Jul  5 00:00:02 mobile-imaging-001 postfix/smtp[27340]: 442CC3804F0: to=<root@mobile-imaging-001.p1.releng.scl3.mozilla.com>, relay=none, delay=0.05, delays=0.04/0/0/0, dsn=5.4.6, status=bounced (mail for mobile-imaging-001.p1.releng.scl3.mozilla.com loops back to myself)
Jul  5 00:00:02 mobile-imaging-001 postfix/qmgr[1651]: 442CC3804F0: removed
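
For picking these bounces out of a longer log, a minimal sketch follows. The function name find_loop_bounces is hypothetical, and the sed pattern assumes the syslog line layout shown above (queue ID followed by "to=<...>").

```shell
# Print "QUEUEID RECIPIENT" for every message that Postfix bounced with
# the "loops back to myself" error. Reads log lines on stdin.
find_loop_bounces() {
    grep 'status=bounced' | grep 'loops back to myself' |
        sed -n 's/^.*: \([0-9A-F]\{1,\}\): to=<\([^>]*\)>.*$/\1 \2/p'
}

# Usage (the log path is an assumption; it varies by OS):
#   find_loop_bounces < /var/log/maillog
```

Run against the log above, this should list both the original message to apache and the non-delivery notification to root.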
Same happens on old mozpool servers in scl1, as well as on slaveapi1.  The latter successfully sent an abrtd email as recently as June 19, and the hosts' smtp config hasn't changed since then (or long before).  That host was upgraded to CentOS 6.5 on June 11, so it's not a CentOS version issue.  Anyway, I see the same on foopy117.tegra.releng.scl3.mozilla.com which is CentOS 6.2.

Easy to replicate with 'echo test | mail -s test root'.

tcpdump shows no smtp traffic.  The DNS traffic is

12:07:53.054881 IP mobile-imaging-005.p5.releng.scl1.mozilla.com.34716 > ns-vip.infra.scl1.mozilla.com.domain: 36954+ MX? mobile-imaging-005.p5.releng.scl1.mozilla.com. (63)
12:07:53.055078 IP mobile-imaging-005.p5.releng.scl1.mozilla.com.59808 > ns-vip.infra.scl1.mozilla.com.domain: 52109+ PTR? 11.75.12.10.in-addr.arpa. (42)
12:07:53.056028 IP ns-vip.infra.scl1.mozilla.com.domain > mobile-imaging-005.p5.releng.scl1.mozilla.com.34716: 36954* 0/1/0 (123)
12:07:53.056138 IP mobile-imaging-005.p5.releng.scl1.mozilla.com.46167 > ns-vip.infra.scl1.mozilla.com.domain: 17511+ A? mobile-imaging-005.p5.releng.scl1.mozilla.com. (63)
12:07:53.056219 IP ns-vip.infra.scl1.mozilla.com.domain > mobile-imaging-005.p5.releng.scl1.mozilla.com.59808: 52109* 1/2/2 PTR ns-vip.infra.scl1.mozilla.com. (164)
12:07:53.056342 IP mobile-imaging-005.p5.releng.scl1.mozilla.com.45093 > ns-vip.infra.scl1.mozilla.com.domain: 48054+ PTR? 33.132.12.10.in-addr.arpa. (43)
12:07:53.056906 IP ns-vip.infra.scl1.mozilla.com.domain > mobile-imaging-005.p5.releng.scl1.mozilla.com.46167: 17511* 1/2/2 A 10.12.132.33 (158)
12:07:53.057400 IP ns-vip.infra.scl1.mozilla.com.domain > mobile-imaging-005.p5.releng.scl1.mozilla.com.45093: 48054* 1/2/2 PTR mobile-imaging-005.p5.releng.scl1.mozilla.com. (181)
12:07:53.161189 IP mobile-imaging-005.p5.releng.scl1.mozilla.com.47660 > ns-vip.infra.scl1.mozilla.com.domain: 17513+ MX? mobile-imaging-005.p5.releng.scl1.mozilla.com. (63)
12:07:53.162133 IP ns-vip.infra.scl1.mozilla.com.domain > mobile-imaging-005.p5.releng.scl1.mozilla.com.47660: 17513* 0/1/0 (123)
12:07:53.162255 IP mobile-imaging-005.p5.releng.scl1.mozilla.com.48930 > ns-vip.infra.scl1.mozilla.com.domain: 51418+ A? mobile-imaging-005.p5.releng.scl1.mozilla.com. (63)
12:07:53.163024 IP ns-vip.infra.scl1.mozilla.com.domain > mobile-imaging-005.p5.releng.scl1.mozilla.com.48930: 51418* 1/2/2 A 10.12.132.33 (158)

which is to say, it's looking up its own hostname for MX and then A, and delivering there -- so it's ignoring /etc/aliases.

[root@mobile-imaging-005.p5.releng.scl1.mozilla.com ~]# postconf  | grep alias_da
alias_database = hash:/etc/aliases

I've run 'newaliases' and 'postfix reload' to no avail.
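
One more thing worth checking here: Postfix only consults the alias tables for domains it treats as local, so if the host's FQDN is not in mydestination, /etc/aliases is never reached and the message goes to the SMTP client instead. A rough sketch of that check follows. The function name in_destination_list is hypothetical, and it assumes mydestination is a plain list of names with no lookup tables.

```shell
# Succeed if domain $1 appears in the comma- or space-separated list $2.
# The comparison is a plain string match, so case differences are not handled.
in_destination_list() {
    domain=$1
    for entry in $(printf '%s' "$2" | tr ',' ' '); do
        [ "$entry" = "$domain" ] && return 0
    done
    return 1
}

# Usage (assumes a postconf new enough to support -x, which expands
# $myhostname and friends; older versions print the unexpanded value):
#   in_destination_list "$(hostname -f)" "$(postconf -xh mydestination)" \
#       || echo "FQDN is not a local destination"
```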

I'm betting that some postfix config changes could fix this -- but this used to work, so what changed?
I can't figure out why this used to work (or even find concrete proof that it did).  I suspect what I'm remembering as "working" was really cron tasks with an explicit MAILTO=releng-shared@mozilla.org in them.

I found that removing myhostname, mydomain, and myorigin from the config makes it work again.  Also, the relay stuff is wrong, and isn't necessary.  Also, we're listening on port 25 on all interfaces, when we really should just be listening on localhost.
Ah, but setting relayhost = [smtp.mozilla.org] (not relay_domains) will avoid postini.
Summary: mobile-imaging servers can't send mail → all puppetagain servers (os x, centos, ubuntu) can't send mail to root
Attached patch bug1034848.patch
I tested this with manual reconfiguration of postfix on OS X (signing servers) and Ubuntu (Openstack servers), and with a puppet run on CentOS (mobile-imaging-stage1).

Orgs that don't specify relayhost won't have any trouble - the server will just deliver based on the domain in the email address.
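
To illustrate the intent, here is a minimal sketch of the resulting main.cf settings. It is not the attached patch; the function name relay_settings is hypothetical. It emits the loopback-only listener unconditionally and a relayhost line only when one is given. The square brackets tell Postfix to skip the MX lookup and connect to the named host directly.

```shell
# Emit the main.cf lines discussed in this bug. $1 is the relay host,
# or empty for organizations that deliver directly.
relay_settings() {
    relayhost=$1
    # Listen on localhost only, not on every interface.
    echo 'inet_interfaces = loopback-only'
    if [ -n "$relayhost" ]; then
        echo "relayhost = [$relayhost]"
    fi
}

# Usage:
#   relay_settings smtp.mozilla.org   # with a relay
#   relay_settings ''                 # no relay: deliver by recipient domain
```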
Assignee: relops → dustin
Attachment #8457007 - Flags: review?(arich)
Comment on attachment 8457007 [details] [diff] [review]
bug1034848.patch

Review of attachment 8457007 [details] [diff] [review]:
-----------------------------------------------------------------

I presume you undef the relayhost in case this is ever used on a mail server. I wonder if this will break mail relaying outside of moco (releng), but I suspect it's already broken.  Might be worth a note to other people using puppetagain that they'd also need to set this variable to smtp.mozilla.com to get their mail relaying to work.
Attachment #8457007 - Flags: review?(arich) → review+
I hope we'll never find ourselves running our own mail server!

To be clear, the default is no relayhost, which is the safe option -- every host just delivers directly to the email address's domain.  At Mozilla, we have a specific relay host that works better (avoids postini, among other things), so we use that.
Attachment #8457007 - Flags: checked-in+
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED