Starting any new pulse listener (with code that previously worked fine) fails with an error such as: Traceback (most recent call last): File "latestbuild.py", line 85, in <module> monitor.listen() File "c:\mozilla\pulsebuildmonitor\pulsebuildmonitor\pulsebuildmonitor.py", li ne 284, in listen self.pulse.listen() File "build\bdist.win32\egg\mozillapulse\consumers.py", line 115, in listen File "c:\mozilla\build2\python\lib\site-packages\carrot-0.10.7-py2.6.egg\carro t\backends\pyamqplib.py", line 245, in queue_declare return self.channel.queue_declare(queue=queue, File "c:\mozilla\build2\python\lib\site-packages\carrot-0.10.7-py2.6.egg\carro t\backends\pyamqplib.py", line 187, in channel connection = self.connection.connection File "c:\mozilla\build2\python\lib\site-packages\carrot-0.10.7-py2.6.egg\carro t\connection.py", line 135, in connection self._connection = self._establish_connection() File "c:\mozilla\build2\python\lib\site-packages\carrot-0.10.7-py2.6.egg\carro t\connection.py", line 148, in _establish_connection return self.create_backend().establish_connection() File "c:\mozilla\build2\python\lib\site-packages\carrot-0.10.7-py2.6.egg\carro t\backends\pyamqplib.py", line 208, in establish_connection connect_timeout=conninfo.connect_timeout) File "build\bdist.win32\egg\amqplib\client_0_8\connection.py", line 131, in __ init__ File "build\bdist.win32\egg\amqplib\client_0_8\abstract_channel.py", line 89, in wait File "build\bdist.win32\egg\amqplib\client_0_8\connection.py", line 198, in _w ait_method File "build\bdist.win32\egg\amqplib\client_0_8\method_framing.py", line 215, i n read_method socket.error: [Errno 10060] A connection attempt failed because the connected pa rty did not properly respond after a period of time, or established connection f ailed because connected host has failed to respond There is also a pulse listener powering the Orange Factor website, and this hasn't received any messages for about 80 minutes (since around 6pm Monday).
I'll look into it...
Well, the VM looked fine and I was poking around in the logs. Then my SSH session froze and now I can't get back in. the website on pulse.mozilla.org also won't connect. I'll clone for It to bounce the server and use this to keep tracking down why it wedged in the first place.
Ok, the freezing was a datacenter networking issue. Still looking through the logs but does the connection work now?
It's working again, the reboot seems to have fixed whatever was wrong. Thanks!
So, this problem is recurring again, exactly the same as before. If you start any pulse listener (I've tried hg and buildbot consumers), it dies with a connection timeout.
Yep, I reopened the other bug as well. I've been watching it today :-/
Should be back. I need to figure out what's gobbling up all the ram in the VM
It was working last night, but I checked it this morning and it is failing again in the same way.
Yeah, digging into this today. I think I know what is going on now.
The python scrapers were using tons of memory, causing the VM to page, causing rabbitmq to go into flow control mode (which most python clients can't handle). I've moved the scrapers off to another machine so even if they screw up pulse.mozilla.org will be just fine. The scrapers are going away, can't wait!