Closed Bug 1151462 Opened 9 years ago Closed 9 years ago

XMPP gets disconnected after 6 minutes of no outgoing activity on Openfire servers

Categories

(Chat Core :: XMPP, defect)

x86_64
Windows 7
defect
Not set
normal

Tracking

(thunderbird38+ fixed, thunderbird39 fixed, thunderbird40 fixed)

VERIFIED FIXED
Tracking Status
thunderbird38 + fixed
thunderbird39 --- fixed
thunderbird40 --- fixed

People

(Reporter: murilomentor, Assigned: aleth)

Details

Attachments

(1 file, 1 obsolete file)

User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.101 Safari/537.36

Steps to reproduce:

1 - Added a xmpp account.
2 - Started to chat with a coworker.


Actual results:

The chat message keeps saying that my account is disconnected (Your account is disconnected), and then reconnects (log in brazilian portuguese):
11:24:19 - Sua conta está desconectada (o status de [name redacted] não é mais conhecido).
11:24:20 - Sua conta foi reconectada ([name redacted] está Off-line).
11:24:20 - [name redacted] agora está Disponível: Online.
11:30:21 - Sua conta está desconectada (o status de [name redacted] não é mais conhecido).
11:30:22 - Sua conta foi reconectada ([name redacted] está Off-line).
11:30:22 - [name redacted] agora está Disponível: Online.
11:36:23 - Sua conta está desconectada (o status de [name redacted] não é mais conhecido).
11:36:24 - Sua conta foi reconectada ([name redacted] está Off-line
11:36:24 - [name redacted] agora está Disponível: Online.

The error also appears on the console:

Hora: 06/04/2015 11:24:20
Alerta: Unhandled IQ error stanza.
Arquivo-fonte: resource:///modules/xmpp.jsm
Linha: 973
Código-fonte:
prpl-jabber: XMPPAccountPrototype.onIQStanza

Hora: 06/04/2015 11:30:22
Alerta: Unhandled IQ error stanza.
Arquivo-fonte: resource:///modules/xmpp.jsm
Linha: 973
Código-fonte:
prpl-jabber: XMPPAccountPrototype.onIQStanza

Hora: 06/04/2015 11:36:24
Alerta: Unhandled IQ error stanza.
Arquivo-fonte: resource:///modules/xmpp.jsm
Linha: 973
Código-fonte:
prpl-jabber: XMPPAccountPrototype.onIQStanza


Expected results:

The account should stay connected.
It looks like you get disconnected every 6 minutes for some reason.

Could you look at a debug log (available from the context menu of the account in the IM account manager) and attach the relevant parts here? In particular, the error stanza that's not being handled likely contains some information.
(In reply to aleth [:aleth] from comment #1)
> It looks like you get disconnected every 6 minutes for some reason.
> 
> Could you look at a debug log (available from the context menu of the
> account in the IM account manager) and attach the relevant parts here? In
> particular, the error stanza that's not being handled likely contains some
> information.

Ok, the IQ error stanza appear in the following sections of the debug log:


[07/04/2015 11:12:16] DEBUG (@ prpl-jabber: Socket.onTransportStatus resource:///modules/socket.jsm:557)
onTransportStatus(STATUS_RECEIVING_FROM)
[07/04/2015 11:12:16] LOG   (@ prpl-jabber: XMPPParser.prototype._logReceivedData resource:///modules/xmpp-xml.jsm:312)
received:
<iq xmlns="jabber:client" type="error" id="6a0cc0aa-1a49-4e56-b2dd-a6c4e3df99ed" to="s202437@zabbix.trt18.jus.br/Thunderbird">
 <error xmlns="jabber:client" code="500" type="wait">
  <internal-server-error xmlns="urn:ietf:params:xml:ns:xmpp-stanzas"/>
 </error>
</iq>

[07/04/2015 11:12:16] WARN. (@ prpl-jabber: XMPPAccountPrototype.onIQStanza resource:///modules/xmpp.jsm:973)
Unhandled IQ error stanza.

.
.
.

[07/04/2015 11:18:17] DEBUG (@ prpl-jabber: Socket.onTransportStatus resource:///modules/socket.jsm:557)
onTransportStatus(STATUS_SENDING_TO)
[07/04/2015 11:18:17] DEBUG (@ prpl-jabber: Socket.onTransportStatus resource:///modules/socket.jsm:557)
onTransportStatus(STATUS_RECEIVING_FROM)
[07/04/2015 11:18:17] LOG   (@ prpl-jabber: XMPPParser.prototype._logReceivedData resource:///modules/xmpp-xml.jsm:312)
received:
<iq xmlns="jabber:client" type="error" id="ca5ce802-a29e-4e7c-815f-34ecfe8fd507" to="s202437@zabbix.trt18.jus.br/Thunderbird">
 <error xmlns="jabber:client" code="500" type="wait">
  <internal-server-error xmlns="urn:ietf:params:xml:ns:xmpp-stanzas"/>
 </error>
</iq>

[07/04/2015 11:18:17] WARN. (@ prpl-jabber: XMPPAccountPrototype.onIQStanza resource:///modules/xmpp.jsm:973)
Unhandled IQ error stanza.
Thanks. You haven't provided enough of the debug log to see the context in which the error happens, but "internal-server-error" suggests the problem probably isn't with Thunderbird: it means "the server has experienced a misconfiguration or an otherwise-undefined internal error that prevents it from servicing the stream".
(In reply to aleth [:aleth] from comment #3)
> Thanks. You haven't provided enough of the debug log to see the context in
> which the error happens, but "internal-server-error" suggests the problem
> probably isn't with Thunderbird: it means "the server has experienced a
> misconfiguration or an otherwise-undefined internal error that prevents it
> from servicing the stream".

Sorry for the lack of information from the debug log, I'll attach today's full debug log to make it easier to identify the context in which the error occurs.
(In reply to murilomentor from comment #5)
> Created attachment 8589125 [details]
> xmpp debug log from 2015-04-07

Thanks for the debug log. We decided to hide it as it seems to contain some potentially private data (contact names and XMPP addresses). You may want to check future debug logs before attaching them to ensure they don't contain any messages etc.

The internal server error stanza is received in response to our sending the vcard. It appears the server you are using does not support that? At any rate, as the disconnect only happens later, it doesn't seem to have anything to do with your problem.

There appear to be a lot of pubsub stanzas (http://www.xmpp.org/extensions/xep-0060.html) and user tune stanzas (http://xmpp.org/protocols/tune/) which TB does not currently support, but that shouldn't be a problem.

The disconnect happens because we eventually receive (without an error preceding it)
</stream:stream>

So we need to figure out why that happens.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Summary: xmpp account disconnects very often → XMPP account disconnects after 6 minutes
When did this start happening? Or has it always been a problem?
Component: Instant Messaging → XMPP
Product: Thunderbird → Chat Core
Version: 38 → trunk
This happens since I started using Thunderbird to chat (version 33.0).
Before I was using Pidgin or Instantbird, and the problem did not occur or did not cause the disconnection.
The reference to http://www.igniterealtime.org/projects/smack/ in the debug log and finding [1],[2] leads me to suspect you are connecting to an Openfire/Smack server, which has a known bug [3]: it disconnects clients without warning if they have not sent anything for 6 minutes.

[1] https://developer.pidgin.im/ticket/10767
[2] https://issues.jenkins-ci.org/browse/JENKINS-25676
[3] https://igniterealtime.org/issues/browse/OF-341

Libpurple (Pidgin/Instantbird) landed a workaround [4] which sends whitespace keepalives regardless of whether anything was received from the server.

[4] http://hg.pidgin.im/pidgin/main/rev/24dcb476e62ba5c89a8189740419e230a071343b

Should we add a similar hack?
Flags: needinfo?(clokep)
That sounds reasonable. This should be very easy with the ping stuff we added for IRC.
Flags: needinfo?(clokep)
(In reply to Patrick Cloke [:clokep] from comment #11)
> That sounds reasonable. This should be very easy with the ping stuff we
> added for IRC.

Would you like to take a look into this then? 

Note we already send XMPP pings (for which there is a spec), but that's distinct from sending whitespace keepalives.
You're right. I called the server management staff and they confirmed we're using Openfire here.
Checking the Ignite Realtime's issue tracker I found [https://igniterealtime.org/issues/browse/OF-341] which explains exactly the problem we're facing here.
I think the comments by  Guus der Kinderen and Guenther Niess on the issue give us a good explanation on why it happens and why is acceptable to implement the whitespace keepalive workaround.
There is even a RFC BIS pointed by Niess that mentions the use of whitespace in order to keep the connection alive [http://tools.ietf.org/html/draft-ietf-xmpp-3920bis-04#section-5.3.3].
(In reply to Murilo Farias from comment #13)
> You're right. I called the server management staff and they confirmed we're
> using Openfire here.

Thanks for checking!

> Checking the Ignite Realtime's issue tracker I found
> [https://igniterealtime.org/issues/browse/OF-341] which explains exactly the
> problem we're facing here.

Yes, that's the bug I linked to in comment 10.

> There is even a RFC BIS pointed by Niess that mentions the use of whitespace
> in order to keep the connection alive
> [http://tools.ietf.org/html/draft-ietf-xmpp-3920bis-04#section-5.3.3].

This is long obsolete, the current XMPP RFC is RFC 6120. We should check what the current spec says about idle streams/whitespace keepalives though!
> This is long obsolete, the current XMPP RFC is RFC 6120. We should check
> what the current spec says about idle streams/whitespace keepalives though!

Oh, sorry about that, I should have stayed a little longer on Google. ;)
With regard to the RFC 6120. the topic we are looking for is in session 4.6.1 [https://tools.ietf.org/html/rfc6120#section-4.6.1].
Abdelrhman, maybe you'd be interested in looking at this bug next?
Summary: XMPP account disconnects after 6 minutes → XMPP gets disconnected after 6 minutes of no outgoing activity on Openfire servers
Attached patch xmppping.diff (obsolete) — Splinter Review
Rather than duplicating the whole ping timer mechanism to send whitespce keepalives, this patch instead avoids resetting the ping timer if we haven't sent anything recently. I think it's both bearable and safer (given possible syndication) to send a full XMPP ping (and not just some whitespace, which would be the alternative).

Note two minutes is the current ping interval, the XMPP spec recommends five minutes (RFC 6120 4.6.4.). Should we change this? 5 minutes seems a long time for a client to have to wait to find out the connection died...
Assignee: nobody → aleth
Status: NEW → ASSIGNED
Attachment #8594349 - Flags: review?(clokep)
Attached patch xmppping.diff v2Splinter Review
The send part should have the same "are we fully connected yet" check as the receive part.
Attachment #8594349 - Attachment is obsolete: true
Attachment #8594349 - Flags: review?(clokep)
Attachment #8594354 - Flags: review?(clokep)
Attachment #8594354 - Flags: review?(clokep) → review+
Could you confirm this fixes the problem for you? The fix will be in the next TB daily build.
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Flags: needinfo?(murilomentor)
Resolution: --- → FIXED
Target Milestone: --- → 1.6
(In reply to aleth [:aleth] from comment #20)
> Could you confirm this fixes the problem for you? The fix will be in the
> next TB daily build.

Ok, I'll test the daily build from 24-Apr-2015.
I will give feedback as soon as I finish it.
(In reply to Patrick Cloke [:clokep] from comment #16)
> Abdelrhman, maybe you'd be interested in looking at this bug next?

This bug was interesting, but I was busy the last two weeks.
The problem was solved!
Thunderbird stopped disconnecting every 6 minutes. I will not even attach the debug log because the error messages are no longer there. But as you can see on the screenshot:
http://i.imgur.com/mJTZEVO.png
There was 6m47s of inactivity, and then 22m of inactivity, without disconnecting.
The messages "Your account is disconnected"..."Your account is reconnected" which polluted the chat screen, does not appear anymore.
Thank you for all the time and work you have devoted to the issue!

P.S. - Do you saw something strange in the screenshot? Maybe something you've done for Instantbird and should not be in Thunderbird? Something related to bubbles?   :)
Flags: needinfo?(murilomentor)
(In reply to Murilo Farias from comment #23)

Thanks for checking the fix! We'll uplift it for TB 38 (the next release).

> P.S. - Do you saw something strange in the screenshot? Maybe something
> you've done for Instantbird and should not be in Thunderbird? Something
> related to bubbles?   :)

Nice :-) Yes, all message styles should work in TB too. There's just no UI to install/select them.
Status: RESOLVED → VERIFIED
Comment on attachment 8594354 [details] [diff] [review]
xmppping.diff v2

[Approval Request Comment]
User impact if declined: Painful experience for people connecting to Openfire servers
Testing completed (on c-c, etc.): Verified fixed
Risk to taking this patch (and alternatives if risky): Low, it just sends more pings than before.
Attachment #8594354 - Flags: approval-comm-beta?
Attachment #8594354 - Flags: approval-comm-aurora?
It would be helpful in the future if chat bugs, since they do not have the normal Target Milestone that point to a specific Thunderbird release, would update the status flags for the specific affected Thunderbird releases. I have to guess based on initial push date.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: