landfill's connection has problems about once per day

RESOLVED INVALID

Status

Product: mozilla.org Graveyard
Component: Server Operations
Opened: 11 years ago
Updated: 3 years ago

People

(Reporter: Max Kanat-Alexander, Assigned: reed)

(Reporter)

Description

11 years ago
About once per day, sometimes around 4am (sometimes not), there's some unknown problem with landfill's internet connection.

The symptoms are that bugbot and logbot (who both live on that server) disappear from IRC and then reconnect. Also, I have a script on that server that calls out to another server about every five minutes, and sometimes it sends me a text message saying that it wasn't able to get out, around the same time that the bots disappear.
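
A minimal sketch of the kind of outbound check described above. This is not the reporter's actual script, which is not shown in this bug. It assumes a plain TCP connect is a good enough reachability test, and the names `probe`, `format_failure`, and `run`, the log path, and the target host and port are all hypothetical. Writing a UTC timestamp for each failure makes it possible to line the outages up against the times the bots drop off IRC.

```python
import socket
import time
from datetime import datetime, timezone


def probe(host, port, timeout=5.0, connect=socket.create_connection):
    """Return True if a TCP connection to (host, port) opens within `timeout`.

    `connect` is injectable so the function can be exercised without a network.
    """
    try:
        conn = connect((host, port), timeout)
    except OSError:
        # Covers refused connections, timeouts, and DNS failures alike.
        return False
    conn.close()
    return True


def format_failure(when, host, port):
    """Build one log line for a failed probe."""
    return f"{when.isoformat()} FAIL {host}:{port}"


def run(host, port, interval=300, log_path="probe.log"):
    """Probe every `interval` seconds (300 = five minutes) and log failures."""
    while True:
        if not probe(host, port):
            line = format_failure(datetime.now(timezone.utc), host, port)
            with open(log_path, "a") as log:
                log.write(line + "\n")
        time.sleep(interval)
```

Something like `run("other-server.example", 80)` (a placeholder host) would leave a file of failure timestamps that can be compared with the quit times in the IRC logs.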

Basically, it seems like the connection "hiccups" about once a day.

It could easily be a problem with the server OS itself, but this never happened before landfill was moved onto the VMware server.

Comment 1

11 years ago
Are you seeing any similar issues with any of the VMs sitting behind landfill?
(Reporter)

Comment 2

11 years ago
It's harder to say, because the VMs sitting behind landfill aren't as active with outbound connections.

I have seen the tinderbox VM have trouble once in a while, but it doesn't have any constant outgoing connections, so it could be related to something else.

Comment 3

11 years ago
Passing to reed for debugging. cg-centos01 is the other public-facing VM on there.
Assignee: server-ops → reed
(Assignee)

Comment 4

11 years ago
(In reply to comment #3)
> Passing to reed for debugging.  cg-centos01 is the other public facing VM on
> there.

I've set up an irssi session on cg-centos01. Let's see how it does.
Status: NEW → ASSIGNED
(Assignee)

Updated

11 years ago
Whiteboard: Set up irssi session on cg-centos01 to debug
(Assignee)

Comment 5

11 years ago
I haven't seen any problems. Are you still seeing issues on landfill?
(Reporter)

Comment 6

11 years ago
Logbot hasn't died lately, as far as I can see. I think it might be good to give it a bit longer, just to be sure.
(Assignee)

Comment 7

11 years ago
[11:47:16AM] * logbot has quit (Quit: connection timed out)
[11:47:17AM] * logbot (glob@moz-90A89D35.bugzilla.org) has joined #mozwebtools
[12:05:06PM] <mkanat> reed^^^^^^^
[12:05:11PM] <mkanat> reed: There went logbot.

Note that the quit message has the "Quit:" prefix. That means that the bot is actually doing "/quit connection timed out" instead of that message coming from the server.
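
The distinction above can be checked mechanically across a whole log. This is a rough sketch that assumes log lines shaped like the excerpt in this comment and takes the "Quit:" prefix convention as described here; the function name `classify_quit` and the "Ping timeout" line in the usage note are hypothetical.

```python
import re

# Matches lines such as:
#   [11:47:16AM] * logbot has quit (Quit: connection timed out)
QUIT_LINE = re.compile(r"\* (?P<nick>\S+) has quit \((?P<reason>.*)\)\s*$")


def classify_quit(line):
    """Return (nick, origin, reason) for a quit line, or None for other lines.

    origin is "client" when the reason carries the "Quit:" prefix, meaning the
    client itself sent the quit, and "server" otherwise.
    """
    match = QUIT_LINE.search(line)
    if match is None:
        return None
    nick = match.group("nick")
    reason = match.group("reason")
    if reason.startswith("Quit:"):
        return (nick, "client", reason[len("Quit:"):].strip())
    return (nick, "server", reason)
```

For the line from the log above this returns `("logbot", "client", "connection timed out")`, while a made-up line ending in `(Ping timeout)` would be classed as `"server"`.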

Also, my irssi session on cg-centos01 hasn't had problems since I connected it.

So, it's either just a problem with logbot or an issue locally on landfill.
(Assignee)

Comment 8

11 years ago
mkanat: Any update on this?
(Reporter)

Comment 9

11 years ago
My current guess is that it's a load problem--that something (most likely mxr, PLEASE somebody move that off of landfill) is causing load levels to peak and that prevents the bots from responding in time.
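
One way to test the load guess would be to record load averages and compare the timestamps with the bots' quit times. The sketch below is only an illustration under assumptions: it is Unix-only because it relies on `os.getloadavg`, the threshold of twice the CPU count is an arbitrary choice, and the names `sample_load`, `is_overloaded`, and `watch` and the log path are hypothetical.

```python
import os
import time


def sample_load(getloadavg=None):
    """Return the 1, 5, and 15 minute load averages as a dict."""
    if getloadavg is None:
        getloadavg = os.getloadavg  # Unix-only
    one, five, fifteen = getloadavg()
    return {"1m": one, "5m": five, "15m": fifteen}


def is_overloaded(sample, cpu_count, factor=2.0):
    """Assumed rule of thumb: 1-minute load above `factor` times the CPU count."""
    return sample["1m"] > cpu_count * factor


def watch(interval=60, log_path="load.log"):
    """Log a timestamped sample whenever the machine looks overloaded."""
    cpus = os.cpu_count() or 1
    while True:
        sample = sample_load()
        if is_overloaded(sample, cpus):
            stamp = time.strftime("%Y-%m-%dT%H:%M:%S%z")
            with open(log_path, "a") as log:
                log.write(f"{stamp} load={sample}\n")
        time.sleep(interval)
```

If the logged spikes line up with the "connection timed out" quits, that would support the load explanation; if they do not, the cause is more likely elsewhere.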
Status: ASSIGNED → RESOLVED
Last Resolved: 11 years ago
Resolution: --- → INVALID
(Assignee)

Comment 10

11 years ago
(In reply to comment #9)
> My current guess is that it's a load problem--that something (most likely mxr,
> PLEASE somebody move that off of landfill) is causing load levels to peak and
> that prevents the bots from responding in time.

Uh, mxr.mozilla.org is hosted on an IT-supported VM, not landfill. You're probably thinking of mxr-test, which is run by timeless on landfill.
Whiteboard: Set up irssi session on cg-centos01 to debug
Product: mozilla.org → mozilla.org Graveyard