Closed
Bug 1436623
Opened 7 years ago
Closed 6 years ago
Redistribute connections among Pulse nodes
Categories
(Webtools :: Pulse, enhancement)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: mcote, Unassigned)
Details
Right now there are 18 socket descriptors in use on Pulse node 1, 2 on node 2, and 127 on node 3. This makes memory usage on node 3 quite a bit higher than the rest. This is likely what triggered the memory-usage alerts of the past day or so.
As far as I can tell, most of these connections are from taskcluster-queue. They *may* have shifted over after I rebooted node 3 and then node 1, although why almost none are on node 2, I'm not sure.
Can we somehow redistribute these connections across the nodes to equalize load?
Updated•7 years ago
Assignee: nobody → dustin
Comment 1•7 years ago
dustin@jemison ~ $ dig pulse.mozilla.org
;; ANSWER SECTION:
pulse.mozilla.org. 35 IN CNAME orange-antelope.rmq.cloudamqp.com.
orange-antelope.rmq.cloudamqp.com. 5 IN CNAME ec2-52-52-230-243.us-west-1.compute.amazonaws.com.
ec2-52-52-230-243.us-west-1.compute.amazonaws.com. 86375 IN A 52.52.230.243
so we're not getting the DNS round-robin we might expect here. It looks like this is just connecting to one of the three instances (-01, specifically):
dustin@jemison ~ $ host orange-antelope-01.rmq.cloudamqp.com.
orange-antelope-01.rmq.cloudamqp.com is an alias for ec2-52-52-230-243.us-west-1.compute.amazonaws.com.
ec2-52-52-230-243.us-west-1.compute.amazonaws.com has address 52.52.230.243
dustin@jemison ~ $ host orange-antelope-02.rmq.cloudamqp.com.
orange-antelope-02.rmq.cloudamqp.com is an alias for ec2-52-52-230-113.us-west-1.compute.amazonaws.com.
ec2-52-52-230-113.us-west-1.compute.amazonaws.com has address 52.52.230.113
dustin@jemison ~ $ host orange-antelope-03.rmq.cloudamqp.com.
orange-antelope-03.rmq.cloudamqp.com is an alias for ec2-52-8-30-112.us-west-1.compute.amazonaws.com.
ec2-52-8-30-112.us-west-1.compute.amazonaws.com has address 52.8.30.112
Repeatedly querying the authoritative DNS server for this domain (Route 53) switches, apparently at random, between -01 and -03:
dustin@jemison ~ $ dig @ns-1998.awsdns-57.co.uk. orange-antelope.rmq.cloudamqp.com.
;; ANSWER SECTION:
orange-antelope.rmq.cloudamqp.com. 30 IN CNAME ec2-52-52-230-243.us-west-1.compute.amazonaws.com.
dustin@jemison ~ $ dig @ns-1998.awsdns-57.co.uk. orange-antelope.rmq.cloudamqp.com.
;; ANSWER SECTION:
orange-antelope.rmq.cloudamqp.com. 30 IN CNAME ec2-52-8-30-112.us-west-1.compute.amazonaws.com.
So I think this is a service misconfiguration, rather than something in the taskcluster libs.
Component: Operations → Pulse
Product: Taskcluster → Webtools
Version: unspecified → other
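The manual check in comment 1 could be repeated programmatically. The following is a rough sketch, not part of any Taskcluster or CloudAMQP tooling: `tally_answers` and `missing_nodes` are hypothetical helper names, and the default resolver (`socket.gethostbyname`) goes through the local system resolver, so caching may hide rotation that querying the authoritative server directly with `dig @...` would show. The offline demonstration uses a stand-in resolver that alternates between the -01 and -03 addresses seen above.

```python
import itertools
import socket
from collections import Counter


def tally_answers(hostname, attempts=20, resolve=socket.gethostbyname):
    """Resolve `hostname` repeatedly and count how often each address is returned."""
    seen = Counter()
    for _ in range(attempts):
        seen[resolve(hostname)] += 1
    return seen


def missing_nodes(seen, expected_addresses):
    """Return the expected addresses that never showed up in the answers."""
    return sorted(set(expected_addresses) - set(seen))


# Addresses of the three instances, as reported by `host` in comment 1.
NODE_ADDRESSES = {
    "52.52.230.243": "orange-antelope-01",
    "52.52.230.113": "orange-antelope-02",
    "52.8.30.112": "orange-antelope-03",
}

# Offline stand-in for DNS: alternates between -01 and -03 only.
_answers = itertools.cycle(["52.52.230.243", "52.8.30.112"])


def fake_resolve(hostname):
    return next(_answers)


seen = tally_answers("pulse.mozilla.org", attempts=6, resolve=fake_resolve)
never_returned = missing_nodes(seen, NODE_ADDRESSES)
print([NODE_ADDRESSES[a] for a in never_returned])  # ['orange-antelope-02']
```

Calling `tally_answers("pulse.mozilla.org")` without the `resolve` argument would perform real lookups; the answers will depend on the current DNS configuration and on resolver caching.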
Updated•7 years ago
Assignee: dustin → nobody
Comment 2•6 years ago
This is fixed in the new TC-lib-pulse, which reconnects periodically to distribute connections.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
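To illustrate why periodic reconnection spreads load, here is a small simulation sketch. It is not the tc-lib-pulse implementation: the function names, the 25% jitter, and the 600-second base lifetime are all assumptions chosen for the example, and a uniform random choice stands in for whatever endpoint selection (DNS or otherwise) a real client would see. If the endpoint selection itself skips a node, as the DNS answers in comment 1 did, reconnecting alone would not fill that node.

```python
import random
from collections import Counter


def jittered_lifetime(base_seconds, jitter_fraction, rng):
    """Connection lifetime with random jitter so clients do not reconnect in lockstep."""
    spread = base_seconds * jitter_fraction
    return base_seconds + rng.uniform(-spread, spread)


def simulate(initial_counts, nodes, duration, base_lifetime, rng):
    """Follow each connection for `duration` seconds; whenever its lifetime
    expires it reconnects to a node picked uniformly at random (an assumption)."""
    final = Counter()
    for node, count in initial_counts.items():
        for _ in range(count):
            current = node
            expires_at = jittered_lifetime(base_lifetime, 0.25, rng)
            while expires_at <= duration:
                current = rng.choice(nodes)
                expires_at += jittered_lifetime(base_lifetime, 0.25, rng)
            final[current] += 1
    return final


# Starting point: the socket counts from the description of this bug.
start = {"node1": 18, "node2": 2, "node3": 127}
nodes = sorted(start)
after_an_hour = simulate(start, nodes, duration=3600, base_lifetime=600,
                         rng=random.Random(0))
print(dict(after_an_hour))
```

Because every connection reconnects at least once within the simulated hour, the final placement no longer depends on where connections started, and the counts end up close to an even split under the uniform-choice assumption.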