Closed
Bug 1086923
Opened 10 years ago
Closed 10 years ago
MDN: unable to connect to smtp.socketlabs.com
Categories
(Infrastructure & Operations Graveyard :: WebOps: Community Platform, task)
x86
macOS
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: groovecoder, Unassigned)
Details
(Whiteboard: [kanban:https://kanbanize.com/ctrl_board/4/1708] )
10 of these in the last 3 days:
socket:error: [Errno 110] Connection timed out [1]
I thought the first few might be transient errors, but now I think the email server is down or inaccessible. It's trying to use smtp.socketlabs.com (see settings_local.py).
[1] https://rpm.newrelic.com/accounts/263620/applications/3172075/traced_errors/2410522397
Comment 1•10 years ago
best guess is that netflows are restricting this access outbound. copying :dcurado and :XioNoX from netops for their thoughts here.
[cturra@developer1.webapp.scl3 ~]$ nc -zv smtp.socketlabs.com 465
nc: connect to smtp.socketlabs.com port 465 (tcp) failed: Connection timed out
Flags: needinfo?(dcurado)
Flags: needinfo?(arzhel)
Comment 2•10 years ago
How about this: can you ping it, but not get to port 465/tcp? That's a great way to tell if the firewalls are blocking something. We let all icmp through. Thanks, Dave
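The TCP half of that comparison could be scripted roughly as follows. This is a minimal sketch, assuming Python is available on the source host; `tcp_port_reachable` is a hypothetical helper name, not something from MDN's code. If ping succeeds while this returns False for the same host, a firewall dropping that TCP port is a plausible cause, though not the only one.

```python
import socket


def tcp_port_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration only.
    """
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts (such as Errno 110), refused connections,
        # and name-resolution failures.
        return False


# Example call (needs network access, so it is left commented out):
# tcp_port_reachable("smtp.socketlabs.com", 465)
```

A return value of False does not say why the connection failed; distinguishing a timeout from a refusal would need the exception to be inspected instead of swallowed.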
Comment 3•10 years ago
pings are getting through...
[cturra@developer1.webapp.scl3 ~]$ ping -c5 smtp.socketlabs.com
PING in.socketlabs.com (142.0.180.14) 56(84) bytes of data.
64 bytes from lb.h.in.socketlabs.com (142.0.180.14): icmp_seq=1 ttl=114 time=95.7 ms
64 bytes from lb.h.in.socketlabs.com (142.0.180.14): icmp_seq=2 ttl=114 time=79.8 ms
64 bytes from lb.h.in.socketlabs.com (142.0.180.14): icmp_seq=3 ttl=114 time=79.2 ms
64 bytes from lb.h.in.socketlabs.com (142.0.180.14): icmp_seq=4 ttl=114 time=78.5 ms
64 bytes from lb.h.in.socketlabs.com (142.0.180.14): icmp_seq=5 ttl=114 time=78.7 ms
--- in.socketlabs.com ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4084ms
rtt min/avg/max/mdev = 78.548/82.422/95.743/6.675 ms
Comment 4•10 years ago
Can you specify the src FQDNs and/or IPs, and if there are any other dst hosts and/or any other ports? Thanks.
Comment 5•10 years ago
:groovecoder - pls correct me if any of these details are incorrect (specifically tcp port used in your code).
(In reply to Dave Curado :dcurado from comment #4)
> Can you specify the src FQDNs and/or IPs, and if there are any other dst
> hosts and/or any other ports?
looks to be happening on all dev/stage/prod nodes, which have a source of:
dev: developer1.dev.webapp.scl3.mozilla.com (10.22.81.16)
stage: developer1.stage.webapp.scl3.mozilla.com (10.22.81.17)
prod: developer1.webapp.scl3.mozilla.com (10.22.81.18)
      developer2.webapp.scl3.mozilla.com (10.22.81.19)
      developer3.webapp.scl3.mozilla.com (10.22.81.20)
i have no idea about the dst.
Comment 6•10 years ago
Thanks for providing the source side of the information. Will Luke know about the destination host(s)? If not, can you guys figure out who knows and update the bug when you get the info? That would be greatly appreciated. Thanks
Comment 7•10 years ago
all i know about this socketlabs smtp service is what dns tells me.
$ dig +short smtp.socketlabs.com
in.socketlabs.com.
142.0.179.10
:groovecoder - anything else you know here?
Flags: needinfo?(lcrouch)
Reporter
Comment 8•10 years ago
We don't override the EMAIL_PORT setting, so it should be the default port 25.
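For illustration, the relevant mail settings might look like the fragment below. EMAIL_HOST is the host named in the bug description; the explicit EMAIL_PORT line is an assumption added here, since the real settings file reportedly leaves it unset and relies on Django's default of 25. That would make port 25, not the 465 tried in comment 1, the port worth testing.

```python
# settings_local.py (illustrative sketch, not the actual MDN file)

# Host named in the bug description.
EMAIL_HOST = 'smtp.socketlabs.com'

# Assumption: spelled out only for clarity. Django's default EMAIL_PORT
# is 25, which is what applies when the setting is not overridden.
EMAIL_PORT = 25
```

Setting the port explicitly changes nothing at runtime in this case, but it saves the next reader from having to look up the default.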
Flags: needinfo?(lcrouch)
Reporter
Comment 9•10 years ago
:cturra - note the email tasks are executed by the developer-celeryN hosts.
Comment 10•10 years ago
(In reply to Luke Crouch [:groovecoder] from comment #9)
> :cturra - note the email tasks are executed by the developer-celeryN hosts.
with this info... my tests look fine from those hosts. could this issue be outside of our control?
[cturra@developeradm.private.scl3 ~]$ issue-multi-command celery nc -zv smtp.socketlabs.com 25
[2014-10-21 15:32:13] [developer-celery1.webapp.scl3.mozilla.com] running: nc -zv smtp.socketlabs.com 25
[2014-10-21 15:32:13] [developer-celery2.webapp.scl3.mozilla.com] running: nc -zv smtp.socketlabs.com 25
[2014-10-21 15:32:13] [developer-celery3.webapp.scl3.mozilla.com] running: nc -zv smtp.socketlabs.com 25
[2014-10-21 15:32:14] [developer-celery1.webapp.scl3.mozilla.com] finished: nc -zv smtp.socketlabs.com 25 (0.335s)
[developer-celery1.webapp.scl3.mozilla.com] out: Connection to smtp.socketlabs.com 25 port [tcp/smtp] succeeded!
[2014-10-21 15:32:14] [developer-celery3.webapp.scl3.mozilla.com] finished: nc -zv smtp.socketlabs.com 25 (0.393s)
[developer-celery3.webapp.scl3.mozilla.com] out: Connection to smtp.socketlabs.com 25 port [tcp/smtp] succeeded!
[2014-10-21 15:32:14] [developer-celery2.webapp.scl3.mozilla.com] finished: nc -zv smtp.socketlabs.com 25 (0.451s)
[developer-celery2.webapp.scl3.mozilla.com] out: Connection to smtp.socketlabs.com 25 port [tcp/smtp] succeeded!
Updated•10 years ago
Severity: critical → major
Comment 11•10 years ago
clearing the needinfo from me, as comment #10 suggests there, uh, isn't an issue? (please pull me in again if needs be!) Thanks
Comment 12•10 years ago
And I dropped the severity because this was paging me.
Flags: needinfo?(dcurado)
Reporter
Comment 13•10 years ago
There's no outage report from SocketLabs. [1] Another possibility is the SocketLabs account we're using has been closed. :cturra - does WebOps have a new SocketLabs account?
[1] http://status.socketlabs.com/
Flags: needinfo?(cturra)
Comment 14•10 years ago
no, we (webops) have no access to socketlabs. in fact, this is the only site that i know of using them. additionally, looking at the configuration, it looks like morgamic was the one who created it. from the settings file...
# bug 869588
EMAIL_HOST = 'smtp.socketlabs.com'
EMAIL_HOST_USER = 'morgamic'
EMAIL_HOST_PASSWORD = <REMOVED>
Flags: needinfo?(cturra)
Comment 15•10 years ago
(In reply to Luke Crouch [:groovecoder] from comment #13)
> There's no outage report from SocketLabs. [1] Another possibility is the
> SocketLabs account we're using has been closed. :cturra - does WebOps have a
> new SocketLabs account?
>
> [1] http://status.socketlabs.com/
Oh this is likely. I heard something about people (in the SF office) asking about Socketlabs and which credit card it was going to. I'll try to track it down tomorrow if I can.
Luke, can you guys in the meantime figure out how to get your own Socketlabs account? In case this is closed... that would be the way forward.
Updated•10 years ago
Flags: needinfo?(arzhel)
Comment 16•10 years ago
Hi there, we're seeing more timeouts. Please provide alternative SMTP credentials, MDN's users aren't getting emails because of that.
Severity: major → critical
Comment 17•10 years ago
Webops are looking. Lowering the priority so Onduty doesn't get paged.
Severity: critical → normal
Comment 18•10 years ago
(In reply to Ludovic Hirlimann [:Usul] from comment #17)
> Webops are looking. Lowering the priority so Onduty doesn't get paged.
:jezdez - socketlabs is not a service we've (IT) purchased or provided. as mentioned in comment 14, it was set up by morgamic.
:groovecoder - if MDN needs to continue using this service, we'll need you guys to provide us with updated credentials for socketlabs.
Flags: needinfo?(lcrouch)
Comment 19•10 years ago
:cturra I've seen the comment. :groovecoder is on PTO today and can't organize a new socketlabs account. I don't have a credit card or access to the socketlabs account to create an alternative. Since morgamic isn't with the company anymore, I would guess this is a legacy that we should get rid of as soon as possible, ideally now.
:cyliang mentioned on IRC that one of the two people able to log into socketlabs, Wil Clouser, is also on PTO today. The other person is Jared Hirsch, whom she hasn't been able to get in touch with yet.
All in all, I know it's not your fault, but I honestly don't know who to talk to instead. This is the third day we're seeing SMTP timeouts, even though it was only filed yesterday (UTC), which is why I ask for your help here.
Comment 20•10 years ago
After much sturm und drang, we think that:
1. The old morgamic credentials should be working (based on the fact that marketplace also uses them).
2. There might be an issue with SMTP timeouts due to a new host being added to their service: "A new IP address of [ 54.213.1.165 ] has been added to DNS resolution of the smtp.socketlabs.com gateway".
@dcurado: Do you know if there is anything, networking-wise, where we might need to add a new host to an ACL, list of white-listed IPs, etc.? (The list of IPs that are probably already in that list can be found at https://support.socketlabs.com/index.php/Knowledgebase/Article/View/94.)
Flags: needinfo?(lcrouch) → needinfo?(dcurado)
Comment 21•10 years ago
Amended to point out that attempts to netcat to the new IP never return:
[cliang@developer-celery1.webapp.scl3 ~]$ nc -vz 54.213.1.165 25
^C
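Because the hostname resolves to several addresses and only some may be blocked, a test against the name alone can pass or hang depending on which address DNS hands back. A rough sketch of probing every resolved address individually is shown here; `resolve_ipv4` and `probe_addresses` are hypothetical helper names, and the approach assumes Python is available on the celery hosts.

```python
import socket


def resolve_ipv4(host):
    """Return the sorted, de-duplicated IPv4 addresses the resolver gives for host."""
    infos = socket.getaddrinfo(
        host, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for AF_INET, sockaddr is an (address, port) pair.
    return sorted({info[4][0] for info in infos})


def probe_addresses(addresses, port, timeout=5.0):
    """Map each address to True or False depending on whether a TCP connect succeeds."""
    results = {}
    for addr in addresses:
        try:
            with socket.create_connection((addr, port), timeout=timeout):
                results[addr] = True
        except OSError:
            # Timeout, refusal, or unreachable network: treated alike here.
            results[addr] = False
    return results


# Example call (needs network access, so it is left commented out):
# probe_addresses(resolve_ipv4("smtp.socketlabs.com"), 25)
```

The timeout matters: without it, a silently dropped connection would hang much like the `nc` run above. Note that a single lookup may not return every address the provider publishes, so the provider's own IP list is still the authoritative source.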
Comment 22•10 years ago
Here's how this request should have been written:
Please ensure that the following list of source hosts:
developer1.dev.webapp.scl3.mozilla.com (10.22.81.16)
developer1.stage.webapp.scl3.mozilla.com (10.22.81.17)
developer1.webapp.scl3.mozilla.com (10.22.81.18)
developer2.webapp.scl3.mozilla.com (10.22.81.19)
developer3.webapp.scl3.mozilla.com (10.22.81.20)
can get to the following destination hosts:
142.0.179.10
142.0.180.14
23.23.219.154
54.86.14.32
54.213.1.165
54.187.77.82
on the SMTP port (port 25/tcp).
You could also add that there may already be an existing policy, and that 54.187.77.82 is new and may need to be added.
Comment 23•10 years ago
The existing policy was:
From zone: webapp, To zone: untrust
Source addresses:
  developer-celery3: 10.22.81.83/32
  developer-celery2: 10.22.81.82/32
  developer-celery1: 10.22.81.40/32
  developer3: 10.22.81.20/32
  developer2: 10.22.81.19/32
  developer1: 10.22.81.18/32
  developer1.stage: 10.22.81.17/32
  developer1.dev: 10.22.81.16/32
Destination addresses:
  cidr-block.socketlabs.com: 142.0.176.0/20
  lb.sg.in.socketlabs.com: 142.0.179.10/32
  lb.h.in.socketlabs.com: 142.0.180.14/32
  lb.rsc.in.socketlabs.com: 184.106.77.171/32
  lb.east.aws.in.socketlabs.com: 23.23.219.154/32
Application: junos-smtp
  IP protocol: tcp, ALG: 0, Inactivity timeout: 1800
  Source port range: [0-0]
  Destination port range: [25-25]
The policy now is:
From zone: webapp, To zone: untrust
Source addresses:
  developer3: 10.22.81.20/32
  developer2: 10.22.81.19/32
  developer1: 10.22.81.18/32
  developer-celery3: 10.22.81.83/32
  developer-celery2: 10.22.81.82/32
  developer-celery1: 10.22.81.40/32
  developer1.stage: 10.22.81.17/32
  developer1.dev: 10.22.81.16/32
Destination addresses:
  socketlabs-54.187.77.82: 54.187.77.82/32
  socketlabs-54.213.1.165: 54.213.1.165/32
  socketlabs-54.86.14.32: 54.86.14.32/32
  cidr-block.socketlabs.com: 142.0.176.0/20
  lb.sg.in.socketlabs.com: 142.0.179.10/32
  lb.h.in.socketlabs.com: 142.0.180.14/32
  lb.rsc.in.socketlabs.com: 184.106.77.171/32
  lb.east.aws.in.socketlabs.com: 23.23.219.154/32
Application: junos-smtp
  IP protocol: tcp, ALG: 0, Inactivity timeout: 1800
  Source port range: [0-0]
  Destination port range: [25-25]
So there were missing source hosts and missing destination hosts. I hope this fixes the problem. If not, please update this bug and specify the source IP, dest IP, port and protocol that is not working for you. Thanks!
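The gap between the old destination list and the requested hosts can be checked mechanically. The sketch below uses Python's standard `ipaddress` module; the address lists are copied from this bug, while the names `OLD_POLICY`, `NEW_POLICY`, `REQUESTED`, and `uncovered` are hypothetical and exist only for this illustration.

```python
import ipaddress

# Destination entries from the old policy quoted in this bug.
OLD_POLICY = [
    "142.0.176.0/20",
    "142.0.179.10/32",
    "142.0.180.14/32",
    "184.106.77.171/32",
    "23.23.219.154/32",
]

# The updated policy adds three /32 entries.
NEW_POLICY = OLD_POLICY + [
    "54.187.77.82/32",
    "54.213.1.165/32",
    "54.86.14.32/32",
]

# Destination hosts listed in the rewritten request (comment 22).
REQUESTED = [
    "142.0.179.10",
    "142.0.180.14",
    "23.23.219.154",
    "54.86.14.32",
    "54.213.1.165",
    "54.187.77.82",
]


def uncovered(destinations, policy_cidrs):
    """Return the destinations that fall outside every CIDR block in the policy."""
    networks = [ipaddress.ip_network(cidr) for cidr in policy_cidrs]
    return [
        dest
        for dest in destinations
        if not any(ipaddress.ip_address(dest) in net for net in networks)
    ]


print(uncovered(REQUESTED, OLD_POLICY))  # the three 54.x addresses
print(uncovered(REQUESTED, NEW_POLICY))  # empty list
```

Running the same check whenever SocketLabs publishes a changed IP list would show whether a new ACL request is needed before timeouts start appearing.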
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Comment 24•10 years ago
Thanks! I wasn't sure if there was an existing policy; now that I know that there is one, we'll know to file an ACL if SocketLabs updates their list again.
Comment 25•10 years ago
Just for documentation purposes, the socketlabs IP addresses came from [1], which hasn't changed since April 2012, but might change in the future :-) [1] https://support.socketlabs.com/index.php/Knowledgebase/Article/View/94
Comment 26•10 years ago
x
Updated•10 years ago
Flags: needinfo?(dcurado)
Updated•6 years ago
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard