Closed Bug 955380 Opened 10 years ago Closed 10 years ago

Irc Networks disconnect then imminently reconnect again.

Categories

(Chat Core :: IRC, defect)

x86
All
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: bugzilla, Assigned: aleth)

Details

Attachments

(7 files)

*** Original post on bio 1942 by qlum <qlumreg AT gmail.com> at 2013-04-20 12:03:00 UTC ***

Every 40 minutes or so irc networks disconnect then reconnect agian, this only happens when the chat is somewhat idle, Trying to talk just before it does that also triggers it.

From what I heard about it in the chat it are ping outs, they are either caused by my isp or my router as switching these 2 was the start of my problems. 

here are some logs of when it happens: 
http://pastebin.instantbird.com/174550
http://pastebin.instantbird.com/174555
*** Original post on bio 1942 at 2013-04-20 12:44:23 UTC ***

I think, as both Flo and I suggested, that making http://lxr.instantbird.org/instantbird/source/chat/modules/socket.jsm#284 configurable and having you lower it might help.

It does look like there's a ping that was never responded to and then 6 minutes later you're logged off. The server isn't timing you out though since we would receive a message about that, I think.
*** Original post on bio 1942 by qlum <qlumreg AT gmail.com> at 2013-04-26 16:14:56 UTC ***

I want to add some stuff here about what happens, maybe those details would be
of use.

Firstly not all nets disconnect at the same frequency regardless of activity. I
am currently connected to 
irc.quakenet.org, irc.tweakers.net, irc.stormbit.org, irc.mibbit.net,
irc.ps3sanctuary.co.uk, and irc.mozilla.org

Here are some things I noticed about them

firstly Quakenet disconnects the most often, I have seen disconnects 30 minutes
apart and even 2 minutes after the last message I recieved. In addition to that
it also still keeps my nick in the server the moment I disconnect. Here is also
another log from quakenet: http://pastebin.instantbird.com/183157

irc.mozilla.org hardly ever seems to disconnect even if its just as idle or
even more idle then the other nets. also a log of that one for good measure:
http://pastebin.instantbird.com/183168

the other nets disconnect at the frequency posted above. 

I see it being moved to resolved fixed, I don't know if its in the latest
nightly but if it is the issue is still there for me.
OS: Windows XP → All
*** Original post on bio 1942 by qlum <qlumreg AT gmail.com> at 2013-06-20 23:24:11 UTC ***

So lets sum up what I did now:
changing the timeout from one minute to 12 seconds seemed to work perfectly and for the limited time I tested it at the 2 minute windows client disconnected and reconnected twice on quakenet where the 12 second one did not, on mibbit I tested it once but still the same result. 
However changing it to 30 seconds quakenet disconnected at around the same time as the 2 minute one however this just could be an coincidance as mibbit is still running but it could also mean it doesn't always work on 30 seconds. I will test more tomorrow but this is what I got today.
*** Original post on bio 1942 by qlum <qlumreg AT gmail.com> at 2013-06-21 11:10:05 UTC ***

Okay I did some more testing today and I came to the conclusion that changing the value does not work. The first test at 12 seconds gave me hope but it seems that was just coincidence as I tried it with 30 seconds later and it did not work and I got similar amounts of disconnects. Then I returned to 12 seconds this time on 4 nets at the same time and it seems it still disconnects at around the same rate. I'd say its back to the start with this one.
*** Original post on bio 1942 at 2013-06-21 11:57:56 UTC ***

I see the logs expired. Can you get a log of when this happens and attach it? Thanks.

Also, this is starting to sound like it's just network issues. If we're pinging every 15 seconds and still getting knocked off...that's ridiculous. (Unless we're getting killed for flooding. :))
*** Original post on bio 1942 by qlum <qlumreg AT gmail.com> at 2013-06-21 21:59:41 UTC ***

okay here is a log from quakenet as it happens the most there: http://pastebin.instantbird.com/174555

It is on my desktop as my laptop has connection issues on wireless and I don't want to lose internet on my desktop atm. so its with the old values still.


I may still add here that off all the nets I am connected to mozilla is the only one where I never got a disconnect like this. Regardless of activity. I may also add here that an active channel doesn't help that much as it can disconnect and reconnect during a downtime of only a few minutes (3/4 minutes is the minimum), as long as a certain amount of time has past without disconnecting before. .
*** Original post on bio 1942 by qlum <qlumreg AT gmail.com> at 2013-06-21 22:01:35 UTC ***

oops forgot to press send, derp: http://pastebin.instantbird.com/228787
*** Original post on bio 1942 by qlum <qlumreg AT gmail.com> at 2013-06-21 22:18:37 UTC ***

here is a second one: http://pastebin.instantbird.com/228788
here is a third one: pastebin.instantbird.com/228789
and here is one from the only net without ever disconnecting like this: http://pastebin.instantbird.com/228790


The only odd thing I notice myself in these logs as someone who doesn't know much about it is a lack of anything for around 6 minutes before it disconnects, while it otherwise never has a gap for more then 2 minutes in the log.
*** Original post on bio 1942 as attmnt 2499 by qlumreg AT gmail.com at 2013-06-21 22:42:00 UTC was without comment, so any subsequent comment numbers will be shifted ***
*** Original post on bio 1942 as attmnt 2500 by qlumreg AT gmail.com at 2013-06-21 22:43:00 UTC was without comment, so any subsequent comment numbers will be shifted ***
*** Original post on bio 1942 as attmnt 2501 by qlumreg AT gmail.com at 2013-06-21 22:44:00 UTC was without comment, so any subsequent comment numbers will be shifted ***
*** Original post on bio 1942 as attmnt 2502 by qlumreg AT gmail.com at 2013-06-21 22:45:00 UTC was without comment, so any subsequent comment numbers will be shifted ***
*** Original post on bio 1942 at 2013-06-22 13:42:42 UTC ***

The only thing I can see from the logs is that we stop receiving PINGs (and don't receive any other messages) whenever this happens. I would suspect network issues, but it is strange that it does not happen for all servers simultaneously. Possibly brief outages that a server might notice if it is trying to send something during one of them?
*** Original post on bio 1942 at 2013-06-22 15:21:40 UTC ***

I can't be certain the following was an instance of this bug, but this was seen on #instantbird:

04:36:07 PM - qlum has left the room (Quit: Ping timeout).
04:36:36 PM - aleth: ^^ there goes bug 955380 (bio 1942), I suspect
04:36:41 PM - instantbot: Bug https://bugzilla.mozilla.org/show_bug.cgi?id=955380 (bio 1942) nor, --, ---, nobody, UNCO, Irc Networks disconnect then imminently reconnect again.
04:39:28 PM - clokep: ?
04:39:46 PM - aleth: "04:36:07 PM - qlum has left the room (Quit: Ping timeout)."
04:40:09 PM - clokep: Ah.
04:46:06 PM - qlum [Instantbir@...IP] entered the room.

It suggests the server sent a ping which was not received. Maybe qlum can confirm?
*** Original post on bio 1942 by qlum <qlumreg AT gmail.com> at 2013-06-22 15:44:57 UTC ***

(In reply to comment #14)
> I can't be certain the following was an instance of this bug, but this was seen
> on #instantbird:
> 
> 04:36:07 PM - qlum has left the room (Quit: Ping timeout).
> 04:36:36 PM - aleth: ^^ there goes bug 955380 (bio 1942), I suspect
> 04:36:41 PM - instantbot: Bug
> https://bugzilla.mozilla.org/show_bug.cgi?id=955380 (bio 1942) nor, --, ---, nobody,
> UNCO, Irc Networks disconnect then imminently reconnect again.
> 04:39:28 PM - clokep: ?
> 04:39:46 PM - aleth: "04:36:07 PM - qlum has left the room (Quit: Ping
> timeout)."
> 04:40:09 PM - clokep: Ah.
> 04:46:06 PM - qlum [Instantbir@...IP] entered the room.
> 
> It suggests the server sent a ping which was not received. Maybe qlum can
> confirm?

This was something else, looking at the time I was on my laptop and not even on instantbird, my laptop has kind of a poor connection on wireless in my room so it cuts out from time to time which caused that disconnect. Generally speaking of the 7 nets I am connected to mozilla is the only one that not once got this problem.

anyway what it does give checking from xchat when it happens is this:
qlum has quit (Read error: Operation timed out) (quakenet)
qlum has quit (Ping timeout: 276 seconds) (freenode)
qlum has quit (Ping timeout) (ps3sanctuary)
qlum has quit (Ping timeout) (mibbit)

so yea it's still a ping timeout.
*** Original post on bio 1942 by qlum <qlumreg AT gmail.com> at 2013-06-22 15:46:08 UTC ***

except on quakenet as someone else got a ping timeout before me which makes this a different thing.
*** Original post on bio 1942 at 2013-06-24 09:27:46 UTC ***

Thanks qlum for checking up on the log in comment #14!

I wonder if we could improve the situation here by actually sending pings of our own (which according to the logs here we don't do).
*** Original post on bio 1942 at 2013-06-24 12:29:47 UTC ***

> I wonder if we could improve the situation here by actually sending pings of
> our own (which according to the logs here we don't do).

We SHOULD be doing this. I'm unsure why we wouldn't, but they're not in the logs. The only way I could see this not happening is if the socket dies. But in that case we should throw an error and reconnect.
Attached patch PatchSplinter Review
*** Original post on bio 1942 as attmnt 2523 at 2013-06-27 15:21:00 UTC ***

We currently don't call resetPingTimer() when receiving a new message, as this.connected is undefined. (Either isConnected on the socket or account.connected was intended.) Fixing this may help with this bug.
Attachment #8354291 - Flags: review?(clokep)
Comment on attachment 8354291 [details] [diff] [review]
Patch

*** Original change on bio 1942 attmnt 2523 at 2013-06-27 15:25:10 UTC ***

This is a bad bug. :( It would mean we're never sending pings. We should ensure the XMPP code using this API is correct.
Attachment #8354291 - Flags: review?(clokep) → review+
Whiteboard: [checkin-needed]
*** Original post on bio 1942 at 2013-06-27 23:49:24 UTC ***

Checked in as http://hg.instantbird.org/instantbird/rev/8106088485c8

I'm going to leave this open for now, qlum can you please try an updated nightly (sometime tomorrow, assuming the builds pass) and see if this is still an issue?

Thanks.
Assignee: nobody → aleth
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Whiteboard: [checkin-needed]
*** Original post on bio 1942 at 2013-06-28 08:51:12 UTC ***

Changing the timeout as discussed in comments above should now also have more of an effect ;)
*** Original post on bio 1942 by qlum <qlumreg AT gmail.com> at 2013-06-28 21:40:23 UTC ***

okay just to check in after my first day post patch. 

I got 4 disconnects on 3 different nets today and all of them where significantly less bad then the ones I got before and got once every 40 minutes or so as the time between the last message I received and the disconnect was a lot shorter. All in I would say this looks to be pretty acceptable and while I would not call this thing fully solved I be fine with it staying like this.  

I will add some logs of those as well but since I was pretty late noticing it after it happens it is still a bit of scrolling up.
*** Original post on bio 1942 as attmnt 2534 by qlumreg AT gmail.com at 2013-06-28 21:42:00 UTC was without comment, so any subsequent comment numbers will be shifted ***
*** Original post on bio 1942 as attmnt 2535 by qlumreg AT gmail.com at 2013-06-28 21:43:00 UTC was without comment, so any subsequent comment numbers will be shifted ***
*** Original post on bio 1942 at 2013-06-29 08:33:35 UTC ***

These logs show correct behaviour imho, given a ping timeout of 30s. Our default ping timeout is currently 2 minutes, so I assume you changed this?
*** Original post on bio 1942 at 2013-06-29 08:38:07 UTC ***

(In reply to comment #26)
> These logs show correct behaviour imho, given a ping timeout of 30s. Our
> default ping timeout is currently 2 minutes, so I assume you changed this?

Sorry, please ignore that comment. Our default disconnection timeout is 30s, it's the time before we ping that's 2 minutes.
*** Original post on bio 1942 by qlum <qlumreg AT gmail.com> at 2013-06-30 19:59:47 UTC ***

I may want to add here that after running it for a few days I came to the conclusion that the disconnects still happen but less often in addition to them being less severe due to the maximum time I did not receive messages is just 2 minutes 30 seconds. However there was a shift in the nets where it happens. It does not happen anymore in mibbit, quakenet and freenode but it now happens in mozilla irc where it did not happen before.
Unlike what I first thought it is not a ping gone wrong disconnecting me when there is no need as I saw in an comparisons of logs that I was already missing stuff before the timeout. so that is not the case. 

Either way it happens once or twice a day now only on the what I would call smaller nets stormbit, ps3sanctuary, tweakers, and moznet but it clearly still happens.

I could have waited longer but I am pretty sure about it now 0 disconnects in quakenet, freenode, and mibbit in 3 days and the others having a dc once or twice a day. either way I prefer to have it solved.

I will trying the thing to ping more often tomorrow if I can find the post that told me how to do it that is.
*** Original post on bio 1942 at 2013-07-02 09:04:30 UTC ***

(In reply to comment #28)
> I will trying the thing to ping more often tomorrow if I can find the post that
> told me how to do it that is.
That's in comment #1.

FYI some moznet servers have been unstable for the last 1-2 weeks, so it's not just you who has been seeing frequent disconnects there.
I believe this was fixed by the patch in this bug.
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → 1.5
You need to log in before you can comment on or make changes to this bug.