
PuppetAgain error e-mails should be processed in a more timely manner.



6 years ago
5 months ago


(Reporter: Callek, Unassigned)





6 years ago
So, there was a brief syntax error checked in earlier today, which caused all ec2 slaves to scream about puppet errors.

That, however, has been gradually filling our shared inbox for the better part of 7 hours now, which would surely mask any other real issues, transient or code-related, if there were any.

The e-mail headers show ~4pm (ET), while it's now 11:17pm.

We should find a way to let these get processed/sent faster. I'm not sure if the delay is on a PuppetAgain system, Mozilla's e-mail servers, or somewhere else, but I think this is a huge pain, and it has confused me in the past, since the e-mails arrive well after the incident.

Comment 1

6 years ago
Note that e-mails are still strolling in for this same event, so far more than 14 hours after it.
As I understand it, that's because they're in AWS.  I don't know how AWS handles email, but presumably it's rate-limited to reduce spamming.
Assignee: server-ops-releng → nobody
Component: Server Operations: RelEng → Release Engineering: Machine Management
QA Contact: arich → armenzg
I think that's our smarthost limiting us: 

Feb 12 05:52:41 puppetmaster-02 postfix/error[28966]: B4B5639FE: to=<>, relay=none, delay=59398, delays=59355/43/0/0, dsn=4.4.1, status=deferred (delivery temporarily suspended: connect to[]:25: Connection timed out)
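For reference, the delay fields in a Postfix log line like the one above can be pulled apart mechanically. The following is a minimal sketch, not something that was run on the puppetmaster: the helper name `parse_delay` and the dictionary keys are made up for illustration. It relies on the standard Postfix meaning of `delays=a/b/c/d` (time before the queue manager, time in the queue manager, connection setup, message transmission), all in seconds.

```python
import re

# Sample line copied from the comment above; the recipient and relay host
# are elided in the source and are left elided here.
LOG_LINE = (
    "Feb 12 05:52:41 puppetmaster-02 postfix/error[28966]: B4B5639FE: to=<>, "
    "relay=none, delay=59398, delays=59355/43/0/0, dsn=4.4.1, status=deferred "
    "(delivery temporarily suspended: connect to[]:25: Connection timed out)"
)

FIELD_RE = re.compile(
    r"delay=(?P<delay>[\d.]+), delays=(?P<delays>[\d./]+), "
    r"dsn=(?P<dsn>[\d.]+), status=(?P<status>\w+)"
)


def parse_delay(line):
    """Return the delay breakdown of one Postfix delivery log line, or None.

    parse_delay is a hypothetical helper written for this illustration.
    """
    match = FIELD_RE.search(line)
    if match is None:
        return None
    before_qmgr, in_qmgr, conn_setup, transmission = (
        float(part) for part in match.group("delays").split("/")
    )
    total = float(match.group("delay"))
    return {
        "delay_seconds": total,
        "delay_hours": total / 3600.0,
        "before_queue_manager": before_qmgr,
        "in_queue_manager": in_qmgr,
        "connection_setup": conn_setup,
        "transmission": transmission,
        "dsn": match.group("dsn"),
        "status": match.group("status"),
    }


result = parse_delay(LOG_LINE)
print(result)
```

Read this way, the message had been waiting roughly 16.5 hours in total, almost all of it before the queue manager picked it up for this attempt, with no time spent on connection setup or transmission. That fits mail sitting deferred in the local queue rather than slow delivery once a connection is made.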

Comment 4

6 years ago
Point of reference: error e-mails are still flowing in, now 27 hours after the event.

Comment 5

6 years ago
Ok, so props go to IT for helping me track this down.

tl;dr: we have to ask Amazon to fix this for us, and I have submitted that request.

Limed, Solarce, Ravi, Justdave, and I were throwing ideas around.

We realized, for sure, that we have routes from AWS --> smtp.m.o
We confirmed that smtp.m.o maps to mx[12].corp....
We confirmed that the AWS puppetmaster can connect to mx[12] on port 25

The mails are held in the postfix queue on the puppetmaster because connections to mx[12] are timing out.
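A single successful connection to port 25 does not rule out throttling, so a repeated probe gives a better picture than a one-off check. The following is a rough sketch of such a probe, not the check that was actually run here: the function names `can_connect` and `probe` are made up for illustration, and the host to test has to be supplied by whoever runs it (the mx hostnames are abbreviated in this bug and are not filled in).

```python
import socket
import time


def can_connect(host, port=25, timeout=10.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and DNS failures alike.
        return False


def probe(host, port=25, attempts=5, timeout=10.0, pause=0.0):
    """Try to connect several times; return how many attempts succeeded.

    probe is a hypothetical helper: a count well below `attempts` would be
    consistent with intermittent throttling, but does not prove it.
    """
    successes = 0
    for _ in range(attempts):
        if can_connect(host, port, timeout):
            successes += 1
        if pause:
            time.sleep(pause)
    return successes


# Example usage (hostname is a placeholder, not a real relay):
#   print(probe("mail-relay.example.invalid", port=25, attempts=20, pause=1.0))
```

Only a TCP connect is attempted; no SMTP commands are sent, so this says nothing about whether the relay would accept the mail.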

solarce noticed (specifically last response there)

That thread describes Amazon limiting SMTP connections by default, even within the VPC, and points to a way to file a request to have the limit raised.

I filled out said request at

Rail also mentions that this likely was the root cause of Bug 824485

Comment 6

6 years ago

We've reviewed and approved your request for the removal of the EC2 e-mail sending limitations on your Amazon Web Services account. There are no longer limitations on your account for any IPs and instances under your account. If you requested removal of e-mail sending limits on Amazon Elastic IPs, they've also been removed.

Because reverse DNS record entries are commonly considered in anti-SPAM filters, we recommend assigning a reverse DNS record to the Elastic IP address you use to send email to third parties. Please use the form located at this link to request a reverse DNS entry:

Note that a corresponding forward DNS record mapping only one domain to one Elastic IP address must exist before we can create the reverse DNS record on our side.

Thank you for your inquiry. Did I solve your problem?
If yes, please click here:<SNIP>
If no, please click here:<SNIP>

Best regards,

Oren M.

---- Original message: ----

AWS AccountId                        <SNIP>
AccountEmailAddress                        <SNIP>
UseCaseDescription                        We have puppet masters in AWS EC2 instances, which we want to e-mail us if there are errors in puppetizing our machines.

When a puppet-wide error happens (say, a typo in a node definition), we'd normally get hundreds of e-mails per machine over the course of a few minutes. With the SMTP limiting I see, though (based on experimentation and ), it takes well over a day for a very brief event to get purged from the mail queue we have on the puppet machine(s).

All mails are internal to us (routing from our AWS systems to an in-house mail server). I would like the ability for our machines to send a max of 1000 e-mails within an hour.

The max will rarely be reached at present, but when a burst of e-mails happens we want them to still get delivered without blocking reports of future issues.
Last Resolved: 6 years ago
Resolution: --- → FIXED


5 years ago
Product: → Release Engineering


5 months ago
Product: Release Engineering → Infrastructure & Operations