Closed Bug 1109646 Opened 10 years ago Closed 10 years ago

reconfigure puppet emails

Categories

(Infrastructure & Operations :: RelOps: Puppet, task)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: dustin, Assigned: dustin)

Details

(Whiteboard: [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/4218] )

Attachments

(1 file, 1 obsolete file)

Puppet emails currently go to the releng-shared account, which under Gmail is a "delegated" account and thus not visible unless you switch to it and away from your own mail.

Options I see:

 * change this from a shared account to a google group, with all of relops and release on it
 * set up filters in the releng-shared account to forward to individuals
 * figure out some other alerting mechanism based on error rate
   * some way for nagios to check average error rate in foreman? but foreman breaks a lot..
   * log parsing on the puppetmasters with rate detection?
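
For the "log parsing with rate detection" option, a minimal sketch might look like the following. It is only an illustration, not something that was deployed for this bug: the class name, the 10-minute window, the threshold, and the "Could not" marker string are all assumptions, and it expects syslog-style timestamps at the start of each line.

```python
from collections import deque
from datetime import datetime, timedelta


class ErrorRateDetector:
    """Track error timestamps in a sliding window and flag bursts."""

    def __init__(self, window=timedelta(minutes=10), threshold=5):
        self.window = window          # assumed window size
        self.threshold = threshold    # assumed alert threshold
        self.events = deque()

    def record(self, when):
        """Record one error at `when`; return True if the rate is exceeded."""
        self.events.append(when)
        # Drop errors that have fallen out of the window.
        while self.events and when - self.events[0] > self.window:
            self.events.popleft()
        return len(self.events) >= self.threshold


def scan(lines, detector, marker="Could not"):
    """Yield each matching line at which the error rate is at or over threshold.

    `marker` is a hypothetical substring identifying error lines.
    Syslog timestamps carry no year, so strptime defaults to 1900;
    that is fine for computing differences within one log file.
    """
    for line in lines:
        if marker not in line:
            continue
        when = datetime.strptime(line[:15], "%b %d %H:%M:%S")
        if detector.record(when):
            yield line
```

Something like this could feed a nagios check or a single summary mail, so that one broken manifest produces one alert rather than one mail per host.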
centralized logging + alerting?
Sure, it's just that we don't have time for this to be a "project".  "Small task" is about all we have.
I don't quite understand why we don't just send mail to release@ like we do for cronmail, release mail, etc. Even if that's not ideal, maybe it's a good short term workaround?
relops isn't on that list.  We could use it and add ourselves, or just add relops@ and release@ to a new google group.

It's sounding like a google group is the way to go.
Dec 17 08:48:51 releng-puppet2 postfix/smtp[11669]: 7D186C029E: to=<releng-puppet-mail@mozilla.com>, relay=smtp.mozilla.org[63.245.216.69]:25, delay=0.17, delays=0.03/0.01/0.06/0.07, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as 94FAFF22FC)

yet still no mail..
It looks like about a 10-minute delay on mail.  That's not so good!
You can filter with
  Has the words: list:"releng-puppet-mail.mozilla.com"
And more recently,

Dec 17 09:26:48 releng-puppet2 postfix/pickup[18145]: AD7C0C0612: uid=519 from=<dmitchell>
Dec 17 09:26:48 releng-puppet2 postfix/cleanup[18403]: AD7C0C0612: message-id=<20141217172648.AD7C0C0612@releng-puppet2.srv.releng.scl3.mozilla.com>
Dec 17 09:26:48 releng-puppet2 postfix/qmgr[18146]: AD7C0C0612: from=<dmitchell@releng-puppet2.srv.releng.scl3.mozilla.com>, size=551, nrcpt=1 (queue active)
Dec 17 09:26:48 releng-puppet2 postfix/smtp[18405]: AD7C0C0612: to=<releng-puppet-mail@mozilla.com>, relay=smtp.mozilla.org[63.245.216.70]:25, delay=0.2, delays=0.03/0.01/0.06/0.1, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as C6411F246F)
Dec 17 09:26:48 releng-puppet2 postfix/qmgr[18146]: AD7C0C0612: removed

arrived at 9:28, which isn't so bad.
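
To compare the local hand-off time with the arrival time, the delay fields can be pulled out of the postfix/smtp lines above. This is a rough sketch; the function name and the returned dictionary keys are made up for illustration. The four `delays=` values are, in order, time before the queue manager, time in the queue manager, connection setup, and message transmission.

```python
import re

# Matches the recipient/relay/delay portion of a postfix/smtp log line.
SMTP_LINE = re.compile(
    r"to=<(?P<to>[^>]*)>.*?"
    r"relay=(?P<relay>[^,]+), "
    r"delay=(?P<delay>[\d.]+), "
    r"delays=(?P<delays>[\d./]+), "
    r"dsn=(?P<dsn>[\d.]+), "
    r"status=(?P<status>\w+)"
)


def parse_smtp_line(line):
    """Return delivery details from a postfix/smtp line, or None if no match."""
    m = SMTP_LINE.search(line)
    if m is None:
        return None
    before_qmgr, in_qmgr, conn_setup, transmit = (
        float(x) for x in m.group("delays").split("/")
    )
    return {
        "to": m.group("to"),
        "relay": m.group("relay"),
        "delay": float(m.group("delay")),   # total seconds on this host
        "before_qmgr": before_qmgr,
        "in_qmgr": in_qmgr,
        "conn_setup": conn_setup,
        "transmit": transmit,
        "status": m.group("status"),
    }
```

For both messages above the total local delay is a fraction of a second, so any multi-minute lag is introduced after the message leaves the puppetmaster.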
Attached patch bug1109646.patch (obsolete)
Hal, since you were asking about this..
Attachment #8537927 - Flags: review?(hwine)
Ed found a misconfiguration in the forwarding path for this email that's probably adding some of the time.  He will fix that, which should make things faster.

Note that until attachment 8537927 lands, mail is still going to the shared mailbox -- so keep an eye on that if landing patches.
The cause was this

[root@mx1.mail.corp.phx1 postfix]# dig mozilla.com mx +short
10 mx1.corp.phx1.mozilla.com.
10 mx2.corp.phx1.mozilla.com.
10 ASPMX3.GOOGLEMAIL.com.

Internal DNS was still pointing mozilla.com's MX at mx1/mx2, which we don't want. In this case mail will go round and round until it hits ASPMX3 and is then sent out to Google. I just updated DNS, so let's wait and see.
Should be fixed now

[root@mx1.mail.corp.phx1 postfix]# dig mozilla.com mx +short
1 ASPMX.L.GOOGLE.com.
5 ALT1.ASPMX.L.GOOGLE.com.
5 ALT2.ASPMX.L.GOOGLE.com.
10 ASPMX2.GOOGLEMAIL.com.
10 ASPMX3.GOOGLEMAIL.com.
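
A check for this kind of stale MX record could be sketched as below. It parses `dig ... mx +short` output and lists any exchanger that is not a Google host. The function names and the allowed-suffix list are assumptions based only on the "fixed" output above, not an official list.

```python
def parse_mx(dig_output):
    """Parse `dig <domain> mx +short` output into sorted (preference, host) pairs."""
    records = []
    for line in dig_output.splitlines():
        parts = line.split()
        # Skip prompts, blank lines, and anything not of the form "<pref> <host>".
        if len(parts) != 2 or not parts[0].isdigit():
            continue
        pref, host = parts
        records.append((int(pref), host.rstrip(".").lower()))
    return sorted(records)


def unexpected_hosts(records, allowed_suffixes=(".google.com", ".googlemail.com")):
    """Return MX hosts that do not end in one of the allowed suffixes (assumed list)."""
    return [host for _, host in records if not host.endswith(allowed_suffixes)]
```

Run against the first dig output this would report mx1 and mx2.corp.phx1.mozilla.com; against the second it reports nothing.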
Comment on attachment 8537927
bug1109646.patch

Review of attachment 8537927:
-----------------------------------------------------------------

that'll teach me to ask questions.

two followups:

1) does it make sense to put any sort of error message in modules/smarthost/templates/main.cf.erb if relayhost is unset? (Or is that a global shared template)

2) does relayhost also need to be added to manifests/qa-config.pp ?
I shouldn't have left the relabs-config change in there -- it's already in place on the relabs branch, as is QA's.

And yes, (1) is a different bug that's on my list.
Attachment #8537927 - Attachment is obsolete: true
Attachment #8537927 - Flags: review?(hwine)
Attachment #8537944 - Flags: review?(hwine)
Comment on attachment 8537944
bug1109646-r2.patch

Review of attachment 8537944:
-----------------------------------------------------------------

makes sense now :)
Attachment #8537944 - Flags: review?(hwine) → review+
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
I'll get the releng-shared mailbox deleted by-and-by.