Pulse fails with 'connection timeout' errors

RESOLVED FIXED

Status

Webtools
Pulse
RESOLVED FIXED
7 years ago
7 years ago

People

(Reporter: jgriffin, Assigned: christian)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

(Reporter)

Description

7 years ago
Starting any new pulse listener (with code that previously worked fine) fails with an error such as:

Traceback (most recent call last):
  File "latestbuild.py", line 85, in <module>
    monitor.listen()
  File "c:\mozilla\pulsebuildmonitor\pulsebuildmonitor\pulsebuildmonitor.py", line 284, in listen
    self.pulse.listen()
  File "build\bdist.win32\egg\mozillapulse\consumers.py", line 115, in listen
  File "c:\mozilla\build2\python\lib\site-packages\carrot-0.10.7-py2.6.egg\carrot\backends\pyamqplib.py", line 245, in queue_declare
    return self.channel.queue_declare(queue=queue,
  File "c:\mozilla\build2\python\lib\site-packages\carrot-0.10.7-py2.6.egg\carrot\backends\pyamqplib.py", line 187, in channel
    connection = self.connection.connection
  File "c:\mozilla\build2\python\lib\site-packages\carrot-0.10.7-py2.6.egg\carrot\connection.py", line 135, in connection
    self._connection = self._establish_connection()
  File "c:\mozilla\build2\python\lib\site-packages\carrot-0.10.7-py2.6.egg\carrot\connection.py", line 148, in _establish_connection
    return self.create_backend().establish_connection()
  File "c:\mozilla\build2\python\lib\site-packages\carrot-0.10.7-py2.6.egg\carrot\backends\pyamqplib.py", line 208, in establish_connection
    connect_timeout=conninfo.connect_timeout)
  File "build\bdist.win32\egg\amqplib\client_0_8\connection.py", line 131, in __init__
  File "build\bdist.win32\egg\amqplib\client_0_8\abstract_channel.py", line 89, in wait
  File "build\bdist.win32\egg\amqplib\client_0_8\connection.py", line 198, in _wait_method
  File "build\bdist.win32\egg\amqplib\client_0_8\method_framing.py", line 215, in read_method
socket.error: [Errno 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond

There is also a pulse listener powering the Orange Factor website, and this hasn't received any messages for about 80 minutes (since around 6pm Monday).
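
For anyone trying to narrow this down from the client side, here is a minimal sketch (not part of mozillapulse or carrot) that only checks whether a plain TCP connection to the broker is accepted, refused, or times out. The helper name probe_broker is hypothetical, and port 5672 in the usage comment is the standard AMQP port, assumed here rather than taken from this report.

```python
import socket


def probe_broker(host, port, timeout=5.0):
    """Classify one TCP connection attempt to host:port.

    Returns "open", "timeout", "refused", or "error: <reason>".
    This checks TCP reachability only; it does not speak AMQP.
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except (socket.timeout, TimeoutError):
        # No answer within `timeout` seconds.
        return "timeout"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on the port.
        return "refused"
    except OSError as exc:
        # DNS failures, unreachable networks, and similar.
        return "error: %s" % exc
    sock.close()
    return "open"


# Example call (port 5672 is an assumption):
#   probe_broker("pulse.mozilla.org", 5672)
```

The traceback above ends inside read_method, that is, while waiting for data from the server. If this probe reports "open" while listeners still time out, that would suggest the host accepts connections but the broker is not answering, though a probe like this cannot confirm it.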
(Assignee)

Comment 1

7 years ago
I'll look into it...
(Assignee)

Comment 2

7 years ago
Well, the VM looked fine and I was poking around in the logs. Then my SSH session froze and now I can't get back in. The website on pulse.mozilla.org also won't connect. I'll clone this bug for IT to bounce the server and use this one to keep tracking down why it wedged in the first place.
(Assignee)

Updated

7 years ago
Blocks: 632467
(Assignee)

Comment 3

7 years ago
Ok, the freezing was a datacenter networking issue. Still looking through the logs but does the connection work now?
(Reporter)

Comment 4

7 years ago
It's working again, the reboot seems to have fixed whatever was wrong.  Thanks!
(Reporter)

Updated

7 years ago
Status: NEW → RESOLVED
Last Resolved: 7 years ago
Resolution: --- → FIXED
(Reporter)

Comment 5

7 years ago
So, this problem is recurring, exactly the same as before. If you start any pulse listener (I've tried the hg and buildbot consumers), it dies with a connection timeout.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
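
Until the root cause is found, a long-running listener could be wrapped so that it retries instead of dying on the first timeout. This is a minimal sketch under stated assumptions: listen_with_retry is a hypothetical helper, not part of mozillapulse or pulsebuildmonitor, and it assumes the listener is exposed as a zero-argument callable (such as the monitor.listen seen in the traceback) that raises a socket error on failure.

```python
import time


def listen_with_retry(listen, max_attempts=5, base_delay=1.0,
                      max_delay=60.0, sleep=time.sleep):
    """Call listen() and retry with exponential backoff on socket errors.

    `listen` is any zero-argument callable. The last error is re-raised
    once max_attempts calls have failed.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return listen()
        except OSError as exc:  # socket.error is an alias of OSError in Python 3
            if attempt == max_attempts:
                raise
            # Delay doubles each time: base_delay, 2x, 4x, ... capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            print("listen failed (%s); retrying in %.0fs" % (exc, delay))
            sleep(delay)
```

A consumer would call something like listen_with_retry(monitor.listen). Backoff only helps the client come back after the server recovers; it does nothing about the server-side problem itself.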
(Assignee)

Comment 6

7 years ago
Yep, I reopened the other bug as well. I've been watching it today :-/
(Assignee)

Comment 7

7 years ago
Should be back. I need to figure out what's gobbling up all the RAM in the VM.
(Reporter)

Comment 8

7 years ago
It was working last night, but I checked it this morning and it is failing again in the same way.
(Assignee)

Comment 9

7 years ago
Yeah, digging into this today. I think I know what is going on now.
(Assignee)

Comment 10

7 years ago
The Python scrapers were using tons of memory, causing the VM to page, which in turn caused RabbitMQ to go into flow control mode (which most Python clients can't handle). I've moved the scrapers off to another machine, so even if they screw up, pulse.mozilla.org will be just fine.

The scrapers are going away, can't wait!
Status: REOPENED → RESOLVED
Last Resolved: 7 years ago
Resolution: --- → FIXED