Closed Bug 1864924 Opened 2 years ago Closed 1 year ago

ThunderBird Server Connection Issues, and Error Message Not very clear

Categories

(MailNews Core :: Networking: SMTP, defect, P2)

Thunderbird 115
x86_64
All

Tracking

(thunderbird_esr115+ fixed, thunderbird122 affected)

RESOLVED FIXED
123 Branch

People

(Reporter: developers, Assigned: gds)

Attachments

(1 file, 1 obsolete file)

User Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.105 Safari/537.36

Steps to reproduce:

Encountering an error condition; I think it is related to accepting calendar invites, but it is a little hard to diagnose, hence the bug report. The server logs indicate that the attachment name was successfully extracted during the SMTP transaction, the inline virus scan passed, and then a 250 OK was issued by the server. Total time of the SMTP transaction was only a few seconds. Strangely, the client (TB) reported a timeout, and the server also experienced a timeout: after it sent the 250 OK (post DATA), it waited 30 seconds for the client to respond.

Actual results:

Client Timed Out Message 4.xx on Thunderbird, related to Calendar invite responses.

Sending of the message failed.
An error occurred while sending mail: Outgoing server (SMTP) error. The server responded: timeout (#4.4.2).

Expected results:

Delivered email with the same methods as any other SMTP session. The Calendar invite 'appears' to be doing a 'fast talker'.. or something else while sending the ICS during the DATA phase.

Nov 15 09:20:18 fe2 msd[1712360]: ======================================================================
Nov 15 09:20:18 fe2 msd[1712360]: Successfully extracted attachment name from content-type: [invite.ics]
Nov 15 09:20:18 fe2 msd[1712360]: virus scan: /var/spool/qmail/mess/13/196778: OK//----------- SCAN SUMMARY -----------/Infected files: 0/Time: 0.036 sec (0 m 0 s)
Nov 15 09:20:18 fe2 msd[1712360]: virus scan: Start Date: 2023:11:15 09:20:18/End Date: 2023:11:15 09:20:18
Nov 15 09:20:18 fe2 msd[1712360]: Returning 250 ok [qp 1712366] for data
Nov 15 09:20:48 fe2 msd[1712360]: Timeout encountered reading smtp command <<---------=!!!
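
As a rough check of the 30-second gap in the server log above, here is a minimal Python sketch that parses the two relevant syslog lines (trimmed). The year 2023 is taken from the scan summary line, since the syslog timestamp itself omits it; the helper names are illustrative only and not part of any mail server tooling.

```python
from datetime import datetime

# Two lines copied (trimmed) from the server log above.
LOG_LINES = [
    "Nov 15 09:20:18 fe2 msd[1712360]: Returning 250 ok [qp 1712366] for data",
    "Nov 15 09:20:48 fe2 msd[1712360]: Timeout encountered reading smtp command",
]


def parse_syslog_time(line: str, year: int = 2023) -> datetime:
    """Parse the leading 'Mon DD HH:MM:SS' stamp of a syslog line.

    The year is an assumption supplied by the caller, because classic
    syslog timestamps do not carry one.
    """
    stamp = " ".join(line.split()[:3])
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def idle_gap_seconds(ok_line: str, timeout_line: str) -> float:
    """Seconds between the '250 ok' line and the server-side timeout line."""
    delta = parse_syslog_time(timeout_line) - parse_syslog_time(ok_line)
    return delta.total_seconds()


gap = idle_gap_seconds(*LOG_LINES)
print(f"Server waited {gap:.0f} s after 250 ok before timing out")
```

The gap matches the 30-second idle limit described in the report, which suggests the server was waiting for a next command (such as QUIT) rather than for message data.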

Please get an smtp log for Thunderbird. See https://wiki.mozilla.org/MailNews:Logging

Hey Magnus, here you go.. (edited for brevity and privacy)

Calendar: _createTempImipFile path: /tmp/itipTemp CalItipEmailTransport.jsm:359
mailnews.smtp: Sending message <af86f521-a7b7-4a7d-8c29-6dfa5cfe9500@wizard.ca> SmtpService.jsm:88:18
mailnews.smtp: Connecting to smtp://mail.REDACCT.com:587 SmtpClient.jsm:123:19
mailnews.smtp: Connected SmtpClient.jsm:395:17
mailnews.smtp: S: 220 fe3.REDACCT.com ESMTP
SmtpClient.jsm:418:17
mailnews.smtp: C: EHLO [192.168.1.55] SmtpClient.jsm:622:19
mailnews.smtp: S: 250-fe3.REDACCT.com
250-AUTH LOGIN
250-AUTH=LOGIN
250-STARTTLS
250-SIZE 29000000
250-HELP
250 8BITMIME
SmtpClient.jsm:418:17
mailnews.smtp: C: STARTTLS SmtpClient.jsm:622:19
mailnews.smtp: S: 220 Ready to start TLS
SmtpClient.jsm:418:17
mailnews.smtp: C: EHLO [192.168.1.55] SmtpClient.jsm:622:19
mailnews.smtp: S: 250-fe3.REDACCT.com
250-AUTH LOGIN
250-AUTH=LOGIN
250-AUTH LOGIN PLAIN
250-AUTH=LOGIN PLAIN
250-SIZE 29000000
250-CLIENTID
250-HELP
250 8BITMIME
SmtpClient.jsm:418:17
mailnews.smtp: Possible auth methods: PLAIN,LOGIN SmtpClient.jsm:909:17
mailnews.smtp: C: Logging suppressed (it probably contained auth information) SmtpClient.jsm:618:19
mailnews.smtp: S: 250 OK
SmtpClient.jsm:418:17
mailnews.smtp: Current auth method: PLAIN SmtpClient.jsm:662:17
mailnews.smtp: Authentication via AUTH PLAIN SmtpClient.jsm:677:21
mailnews.smtp: C: Logging suppressed (it probably contained auth information) SmtpClient.jsm:618:19
mailnews.smtp: S: 235 ok, go ahead (#2.0.0)
SmtpClient.jsm:418:17
mailnews.smtp: Authentication successful. SmtpClient.jsm:1132:17
mailnews.smtp: C: MAIL FROM:<REDACCT> BODY=8BITMIME SIZE=6457 SmtpClient.jsm:622:19
mailnews.smtp: S: 250 ok
SmtpClient.jsm:418:17
mailnews.smtp: MAIL FROM successful, proceeding with 1 recipients SmtpClient.jsm:1166:17
mailnews.smtp: Adding recipient... SmtpClient.jsm:1171:17
mailnews.smtp: C: RCPT TO:<REDACCT> SmtpClient.jsm:622:19
mailnews.smtp: S: 250 ok
SmtpClient.jsm:418:17
mailnews.smtp: RCPT TO done, proceeding with payload SmtpClient.jsm:1231:19
mailnews.smtp: C: DATA SmtpClient.jsm:622:19
mailnews.smtp: S: 354 go ahead
SmtpClient.jsm:418:17
mailnews.smtp: Sending 6457 bytes of payload SmtpClient.jsm:590:17
mailnews.smtp: S: 250 ok 1700162394 qp 2029036 uuid 1eeca3ea-84b5-11ee-b1d6-23ecfd9e674e
SmtpClient.jsm:418:17
mailnews.smtp: Message sent successfully. SmtpClient.jsm:1290:21
Calendar: iTIP on REQUEST: found 1 items. calItipUtils.jsm:1468
Calendar: iTIP operations: 0 calItipUtils.jsm:1921
mailnews.smtp: S: 451 timeout (#4.4.2)
SmtpClient.jsm:418:17
mailnews.smtp: Command failed: 451 timeout (#4.4.2); currentAction=_actionIdle SmtpClient.jsm:545:19
mailnews.send: Sending failed; An error occurred while sending mail: Outgoing server (SMTP) error. The server responded: timeout (#4.4.2)., exitCode=2153066732, originalMsgURI=undefined MessageSend.jsm:337:32
mailnews.smtp: Socket closed. SmtpClient.jsm:518:17
mailnews.smtp: Connection to mail.cityemail.com closed SmtpClient.jsm:164:19
2147942487
Missing resource in locale en-CA: devtools/client/toolbox.ftl

......

On the surface, don't see any form of a QUIT being sent after the 250 is received post DATA, and/or no disconnect.. However, it 'looks' like there is additional data being transmitted after the 250 OK.. according to the server logs...

It SHOULD be noted (probably needs a separate bug ticket) that the CLIENTID was NOT issued, even though the SMTP server did have CLIENTID enabled and credentials.. It does affect the ability to send Calendar notifications by email. And assuming this was a recent change that caused this, as it wasn't a historical problem before upgrading to this version.

Earlier we created a new connection for each mail. Now the connection is cached.

We should probably not limit closing the connection to 421 errors but also other errors during idle.
https://searchfox.org/comm-central/rev/02a02172ea0c4d3cf5dd7e94b7b7897f2e6be4fa/mailnews/compose/src/SmtpClient.jsm#573
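
To make the suggestion concrete, here is a minimal Python sketch of such a policy. It is not the SmtpClient.jsm code: the function, the state names, and the decision record are hypothetical, and closing an in-transaction connection only on 421 is an assumption of the sketch. It only illustrates the idea that any error reply arriving while idle closes the connection without being reported as a failed send.

```python
from dataclasses import dataclass


@dataclass
class ReplyDecision:
    close_connection: bool      # drop the socket after this reply
    report_send_failure: bool   # surface an error to the user


def handle_reply(code: int, state: str) -> ReplyDecision:
    """Decide how to react to a server reply (hypothetical policy).

    state is "idle" (no message in flight) or "in_transaction".
    """
    is_error = code >= 400
    if state == "idle" and is_error:
        # Nothing is being sent, so the reply (421, 451, ...) only tells us
        # the server is done with this connection. Close it quietly.
        return ReplyDecision(close_connection=True, report_send_failure=False)
    if is_error:
        # A message is in flight, so the failure is real. In this sketch the
        # connection is closed only for 421 (an assumption, not TB behaviour).
        return ReplyDecision(close_connection=(code == 421), report_send_failure=True)
    return ReplyDecision(close_connection=False, report_send_failure=False)


# The case from this bug: 451 timeout received while idle.
print(handle_reply(451, "idle"))
```

With a rule like this, the "451 timeout (#4.4.2)" seen in the log after "Message sent successfully" would end the connection but would not produce a "Sending of the message failed" notification.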

While caching IMAP makes sense, caching SMTP connections does NOT. IF there is a pending email to be actioned, and there IS an existing connection, it MAY perform a new transaction in the same session, eg by sending a new MAIL FROM, but this can get complicated real quickly

Caching SMTP connections should NOT be the default, IMHO. It keeps connections to the SMTP server open longer than desired, and MTAs may have a limited number of SMTP connections that can be open at a given time, so connections are meant to be opened and closed quickly.

I would flag this as undesirable behaviour, and it might alarm MTA system administrators, who may react to such events in an unfriendly way (blocking).

For clarity: I reviewed logs for other connections and see other cases where Thunderbird appears not to close connections. Again, these show up with recurring Calendar events.

<log output>
mailnews.smtp: C: DATA SmtpClient.jsm:622:19
mailnews.smtp: S: 354 go ahead
SmtpClient.jsm:418:17
mailnews.smtp: Sending 3474 bytes of payload SmtpClient.jsm:590:17
mailnews.smtp: S: 250 ok 1700160170 qp 2388471 uuid f197a778-84af-11ee-b900-174dd2d4b8b3
SmtpClient.jsm:418:17
mailnews.smtp: Message sent successfully. SmtpClient.jsm:1290:21

^^^ After this, expect that TB would either send a QUIT or terminate the connection..
Instead, 30 seconds later the MTA throws a 451 timeout error, which TB receives, and then closes the connection from the MTA end.

mailnews.smtp: S: 451 timeout (#4.4.2)
SmtpClient.jsm:418:17
mailnews.smtp: Command failed: 451 timeout (#4.4.2); currentAction=_actionIdle SmtpClient.jsm:545:19
mailnews.send: Sending failed; An error occurred while sending mail: Outgoing server (SMTP) error. The server responded: timeout (#4.4.2)., exitCode=2153066732, originalMsgURI=imap-message://michael%40wizard.ca@mail.cityemail.com/INBOX#555780 MessageSend.jsm:337:32

Now we get a 'Command Failed' error, when technically the command succeeded once the 250 OK after data was received.

mailnews.smtp: Socket closed. SmtpClient.jsm:518:17
mailnews.smtp: Connection to mail.cityemail.com closed SmtpClient.jsm:164:19
NS_ERROR_ABORT: Component returned failure code: 0x80004004 (NS_ERROR_ABORT) [nsIWindowWatcher.openWindow]
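
For anyone checking their own logs for the same pattern, here is a small Python sketch that flags a "Command failed" line logged in the idle state after a successful send. The sample text is copied from the log above; the function name and the regular expression are illustrative assumptions about the log layout, not a documented format.

```python
import re

SAMPLE_LOG = """\
mailnews.smtp: Message sent successfully. SmtpClient.jsm:1290:21
mailnews.smtp: S: 451 timeout (#4.4.2)
mailnews.smtp: Command failed: 451 timeout (#4.4.2); currentAction=_actionIdle SmtpClient.jsm:545:19
"""

# Captures the reply code and the client action named in the failure line.
FAILED = re.compile(r"Command failed: (\d{3}) .*currentAction=(\w+)")


def find_idle_failures(log_text: str) -> list:
    """Return reply codes of failures logged while idle after a successful send."""
    sent_ok = False
    hits = []
    for line in log_text.splitlines():
        if "Message sent successfully" in line:
            sent_ok = True
            continue
        match = FAILED.search(line)
        if match and sent_ok and match.group(2) == "_actionIdle":
            hits.append(int(match.group(1)))
    return hits


print(find_idle_failures(SAMPLE_LOG))
```

Any code it returns points at a message that was already accepted by the server before the error was reported.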

Component: Untriaged → Networking: SMTP
Product: Thunderbird → MailNews Core

There's now a change (currently only in Daily, I think) that allows you to not reuse a connection here: Bug 1854567.
It requires setting pref mail.smtpserver.default.max_cached_connections to 0.
With the pref at zero, each message sent uses its own connection which is immediately terminated after QUIT is sent.

I tend to agree with you that most users don't need to "reuse" an smtp connection or have more than 1 at a time "cached". However, this is helpful for users who queue up a lot of messages in Local Folders/Outbox for sending later, and it was a requested feature. Maybe max_cached_connection should default to 0, and the small subset of users who need to queue up messages to send later could set it to greater than 0 if they really want it.

Edit: Reporter DevTeam, I'm not a calendar user so I'm not familiar with how it reports SMTP errors. My question is are you seeing the SMTP errors in user notifications or just in console or error logs that only an "advanced" user would see?

See Also: → 1854567

(In reply to gene smith from comment #6)

I tend to agree with you that most users don't need to "reuse" an smtp connection or have more than 1 at a time "cached". However, this is helpful for users who queue up a lot of messages in Local Folders/Outbox for sending later, and it was a requested feature. Maybe max_cached_connection should default to 0, and the small subset of users who need to queue up messages to send later could set it to greater than 0 if they really want it.

I second the idea to restore the old behavior of Thunderbird 102 (and earlier) to only use a single SMTP connection for each outgoing message - by default.

While it leads indeed to a nice performance improvement, when sending quite a lot of (small) messages in a short period of time, e.g. by using my own add-on Mail Merge, this is under normal circumstances not strictly needed. We have already observed quite some unexpected regressions due to the new behavior. (And it can also very easily lead to more problems like "Too many emails sent in X minutes / hours / days".)

Shall we discuss this in a separate bug report, or what is the procedure to eventually change this default?

Besides changing the default, it would probably also be helpful to have more control over the caching mechanism, e.g. via dedicated preferences to enable / disable caching, set the maximum number of recipients, set the maximum number of messages and set the maximum time period - individually for each configured SMTP server.

And it would certainly be a welcome addition to have these settings easily available in the UI. See:
https://update.buhl.de/faq-images/accountance/MeinBuero/FAQ/Newsletter_IONOS/Bild001.png
https://phabricator.services.mozilla.com/D190146#6374457

Caching the connection the way we do now is perhaps not great. Maybe we can keep caching but add a timer to always disconnect after 30s inactivity?

(In reply to gene smith from comment #6)

There's now a change (currently only in Daily, I think) that allows you to not reuse a connection here: Bug 1854567.
It requires setting pref mail.smtpserver.default.max_cached_connections to 0.
With the pref at zero, each message sent uses its own connection which is immediately terminated after QUIT is sent.

I tend to agree with you that most users don't need to "reuse" an smtp connection or have more than 1 at a time "cached". However, this is helpful for users who queue up a lot of messages in Local Folders/Outbox for sending later, and it was a requested feature. Maybe max_cached_connection should default to 0, and the small subset of users who need to queue up messages to send later could set it to greater than 0 if they really want it.

Edit: Reporter DevTeam, I'm not a calendar user so I'm not familiar with how it reports SMTP errors. My question is are you seeing the SMTP errors in user notifications or just in console or error logs that only an "advanced" user would see?

In User Notifications as well.. Could send over screen shots if valuable, but I think these are standard user notifications.. as mentioned this is related to SMTP sessions getting terminated .. Probably extends to anything in TB that tries caching SMTP. REALLY would like that fix backported to 115.5.0, as mentioned.. SMTP caching WILL cause problems with many email implementations..

See Also: → 1857757

(In reply to DevTeam from comment #9)

...REALLY would like that fix backported to 115.5.0, as mentioned.. SMTP caching WILL cause problems with many email implementations..

Let me make sure I understand your terms. We actually have two things going on here:

  1. Currently there can be up to 3 connections to an SMTP server at a time. This is what we call "caching" and is set by parameter mail.smtpserver.default.max_cached_connections, which defaults to 3.

  2. As long as max_cached_connection is set greater than 0, each connection to the server can send multiple messages. We call this "reuse".

So I think when you (DevTeam) refer to "SMTP caching" you are referring to our "reuse" concept.
Am I right?

Anyhow, a change in Bug 1854567 (not yet in ESR 115) is that setting max_cached_connection to 0 disables "reuse" and "caching". However, that patch does not change the default from 3.

(In reply to DevTeam from comment #5)

While caching IMAP makes sense, caching SMTP connections does NOT. IF there is a pending email to be actioned, and there IS an existing connection, it MAY perform a new transaction in the same session, eg by sending a new MAIL FROM, but this can get complicated real quickly
Caching SMTP connections should NOT be the default, IMHO. It keeps connections to the SMTP server open longer than desired, and MTAs may have a limited number of SMTP connections that can be open at a given time, so connections are meant to be opened and closed quickly.

There seem to be competing interests here. Bug 136871 and its duplicates explicitly asked for multiple messages accumulated in the Outbox to be sent via the same connection to the MTA. Sadly the implementation in that bug overshot the requirement and now even allows multiple connections to the same SMTP server which are in fact not used when sending from the outbox. See bug 1857757 for more discussion. Maybe rolling back most of the implementation from bug 136871 and settling for a simple pref to either close the SMTP connection after sending a message or leaving it open for re-use would be best. Even if left open, the server will eventually time out and it will be closed then.

Duplicate of this bug: 1868939

(In reply to Magnus Melin [:mkmelin] from comment #8)

Caching the connection the way we do now is perhaps not great. Maybe we can keep caching but add a timer to always disconnect after 30s inactivity?

I made a timer to do this and it works by sending QUIT after the final "dot" is sent if there is no message sent within 5 seconds on the connection. I think setting it to 30 seconds would be too long since, as described above, the subject server waited 30 seconds before reporting an error to TB on the idle connection. Maybe it needs to be a pref, default 15 seconds?

Also, defaulting to no reuse along with this timer could be a good solution. And maybe only ever allow reuse when messages are sent from Outbox?

I guess it depends on how you implemented the timer. A short time like even 5-10 sec could be fine if we're sure the connection is really active before that.

If we have the short timer I'd think we can keep the connection reuse on by default.

See Also: → 1868160
Status: UNCONFIRMED → NEW
Ever confirmed: true
OS: Unspecified → Linux
Hardware: Unspecified → x86_64

When a message is successfully sent on an smtp connection, a timer is now started with
a time set by mail.smtpserver.default.smtp_quit_timeout (default 5000 ms). If another
message is received by the protocol for sending within this time, the timer is just canceled.
But if no more messages are received by the protocol during this period, the timer
times out and causes an SMTP quit to be sent. Sending quit ensures a quick and orderly
connection shutdown so the server doesn't report a timeout error response, which is
sometimes seen as an error by TB.
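
The timer behaviour described above can be sketched roughly as follows. This is a simplified Python model and not the patch itself: the class and method names are hypothetical, and time is passed in explicitly (in milliseconds) instead of using a real timer, so that the logic is easy to follow and test. Only the 5000 ms default comes from the description.

```python
class QuitTimer:
    """Idle timer that asks for QUIT when no new message arrives in time."""

    def __init__(self, timeout_ms: int = 5000):
        self.timeout_ms = timeout_ms
        self.deadline_ms = None  # None means the timer is not running

    def message_sent(self, now_ms: int) -> None:
        """A message was sent successfully: start (or restart) the timer."""
        self.deadline_ms = now_ms + self.timeout_ms

    def new_message(self) -> None:
        """Another message arrived for sending: cancel the timer."""
        self.deadline_ms = None

    def should_quit(self, now_ms: int) -> bool:
        """True once the timer has run out without being cancelled."""
        return self.deadline_ms is not None and now_ms >= self.deadline_ms


timer = QuitTimer()
timer.message_sent(now_ms=0)
print(timer.should_quit(now_ms=4999))  # False: still inside the 5 s window
print(timer.should_quit(now_ms=5000))  # True: send QUIT now
```

If `new_message()` is called before the deadline, `should_quit()` stays false until the next `message_sent()`, which corresponds to the "timer is just canceled" case.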

Assignee: nobody → gds
Status: NEW → ASSIGNED

Related to bug 1851767?

Gene (for the record, sorry I wasn't able to comment more over the last couple of weeks), I'm still not understanding the motivation for encoding this in TB at all. But if you are, your choice of default is probably the maximum reasonable value; otherwise ISPs will notice the lag, and it can affect their ability to respond, given that the majority of email SMTP transactions take on the order of only 500 ms.

Are there any other examples of email clients that hold SMTP connections open that you can reference or compare with?

Of course, assuming this will also unify the SMTP transactions in the calendar invite engines with regular SMTP transactions?

(In reply to betterbird.project+13 from comment #11)

(In reply to DevTeam from comment #5)

While caching IMAP makes sense, caching SMTP connections does NOT. IF there is a pending email to be actioned, and there IS an existing connection, it MAY perform a new transaction in the same session, eg by sending a new MAIL FROM, but this can get complicated real quickly
Caching SMTP connections should NOT be the default, IMHO. It keeps connections to the SMTP server open longer than desired, and MTAs may have a limited number of SMTP connections that can be open at a given time, so connections are meant to be opened and closed quickly.

There seem to be competing interests here. Bug 136871 and its duplicates explicitly asked for multiple messages accumulated in the Outbox to be sent via the same connection to the MTA. Sadly the implementation in that bug overshot the requirement and now even allows multiple connections to the same SMTP server which are in fact not used when sending from the outbox. See bug 1857757 for more discussion. Maybe rolling back most of the implementation from bug 136871 and settling for a simple pref to either close the SMTP connection after sending a message or leaving it open for re-use would be best. Even if left open, the server will eventually time out and it will be closed then.

Sorry for the comments being out of order and a bit late, but vacations interfered. And yes, there really is a confusion, IMHO. For the end user with a lot of messages queued while offline (like working on an airplane), when they come back online there isn't really a need to 'cache' a connection; the connection should simply remain open until all messages in the queue are processed and/or an arbitrary limit is reached (e.g. number of recipients, since many ISPs and email providers have outbound limits). This should be considered 'batch' processing of messages. Whether this single session should have multiple transactions delimited by RSET commands, or simply by MAIL FROM, I leave to a separate discussion.

Bug 136871 should be termed 'batching'. IMHO rollback that patch, and redo as a 'batch' ability?

That will enable a separate discussion on caching SMTP connections.. which I don't see either a demand or a reasonable use case for, that won't break real world implementations. TB is not typically used as a 'bulk' SMTP transaction tool, but on demand.

From comment 18:

there isn't really a need to 'cache' a connection

Well, it might be argued that if you have 99 messages queued in Outbox to "Send Later", and then, on send, if 3 connections to SMTP server are made and used in parallel, each sending 33 messages, that it would be faster. But it doesn't currently work like that. Even with max_cached_connection at 3, there is only one connection and the messages are sent serially on that one connection. So Outbox/Send Later never takes advantage of the potential 3 connections.

I currently see only 2 definite ways (plus 1 possible way) that more than one simultaneous smtp connection to the same server is ever established:

  1. Set up a filter to forward a message to more than 1 address.
  2. Compose several emails on screen and quickly click send on each of them.
  3. Maybe calendar notifications sent to multiple addresses (haven't tried this so not sure).

The smtp RFC seems to have no problem with sending more than one message on a connection. As can be seen here:
https://datatracker.ietf.org/doc/html/rfc5321#section-4.1.4 (going down a bit from there, emphasis mine):

MAIL ... MUST NOT be
sent if a mail transaction is already open, i.e., it should be sent
only if no mail transaction had been started in the session, or if
the previous one successfully concluded with a successful DATA
command, or if the previous one was aborted, e.g., with a RSET or new
EHLO.
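
As an illustration of the rule quoted above, here is a minimal Python sketch of the bookkeeping a client needs before it may send another MAIL on the same session. The state and event names are made up for this sketch, and a failed DATA command is deliberately not modelled.

```python
from enum import Enum, auto


class TxState(Enum):
    NO_TRANSACTION = auto()  # nothing started, or previous one finished/aborted
    OPEN = auto()            # MAIL was sent, transaction still in progress


def next_state(state: TxState, event: str) -> TxState:
    """Advance the transaction state for one event.

    event is one of "MAIL", "DATA_OK" (successful DATA), "RSET", "EHLO".
    """
    if event == "MAIL":
        if state is TxState.OPEN:
            raise ValueError("MAIL MUST NOT be sent while a transaction is open")
        return TxState.OPEN
    if event in ("DATA_OK", "RSET", "EHLO"):
        # All three leave the session with no open transaction.
        return TxState.NO_TRANSACTION
    raise ValueError(f"unknown event: {event}")


# Two messages on one session: allowed, because DATA_OK closes the first.
state = TxState.NO_TRANSACTION
for event in ["MAIL", "DATA_OK", "MAIL", "DATA_OK"]:
    state = next_state(state, event)
print(state.name)  # NO_TRANSACTION
```

So reuse as such is within the RFC; what the RFC does not say is how long a client may sit idle between transactions, which is where the server-side timeout comes in.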

The RFC also implies that multiple connections are OK here: https://datatracker.ietf.org/doc/html/rfc5321#section-4.5.4.1

Similarly, to achieve timely delivery, the SMTP client MAY support
multiple concurrent outgoing mail transactions. However, some limit
may be appropriate to protect the host from devoting all its
resources to mail.

So our default limit is now 3, but, as noted above, only 1 is ever actually established by Outbox/Send Later.

Gene, there 'could' in theory be a need for multiple connections to a 'server' per its normal definition, but not to the TB definition of a server/service, e.g. one associated with only one outgoing account/server (MAIL FROM).

Generally a SESSION can be re-used for multiple transactions, but from a client perspective it is rare. As well, most ISPs or email providers allow only a limited number of simultaneous connections, so it is frowned on for one customer/client to use up more than their expected share. And with most email transactions being SO fast nowadays, unlike the slow dialup days, there is no real need to 'improve performance' at the session/connection level.

However, an email client can quickly trip outgoing rate limiters, so I would recommend only doing 'batches' of 25 emails per session: close the session, and open another session for another 25 transactions. You can even put a wait period between 'batches' if you want. (I personally know of quite a few email servers with sending limits of only 100 MAIL FROMs or even RCPT TOs per connection, because of spammers using compromised email accounts.)

Does that make more sense than the explicit description in the RFCs?

And other systems like Calendars etc. can't generate emails fast enough to warrant keeping SMTP connections open, given how quickly a new connection can be opened on demand. But there are many different threat tools out there that will expect the email client's behavior to be similar to human behavior.
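
The "batches of 25 emails per session" suggestion above can be sketched in Python like this. The function name is hypothetical, and the sketch only groups queued messages; opening and closing the SMTP session around each batch (and any wait between batches) is left out.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def batches(messages: Sequence[T], per_session: int = 25) -> Iterator[List[T]]:
    """Split queued messages into groups, one group per SMTP session."""
    if per_session <= 0:
        raise ValueError("per_session must be positive")
    for start in range(0, len(messages), per_session):
        yield list(messages[start:start + per_session])


queued = [f"msg-{i}" for i in range(60)]
print([len(batch) for batch in batches(queued)])  # [25, 25, 10]
```

Each yielded group would be sent on its own session: connect, send the group, QUIT, then connect again for the next group.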

This is a modified https://phabricator.services.mozilla.com/D196701
With this there can only be 1 connection per server, and if the connection
is reusable, a default limit of 10 messages can be sent before a new
connection is created for additional messages to be sent. QUIT timer is not
started if recipient count has reached >= 100 or if reuse is disabled or if
number of messages on the connection has reached its limit; otherwise, it is.
This also moves the reset of the QUIT timer earlier in the send and
makes sure QUIT isn't triggered if a new message occurs right before the
QUIT would be sent but before the timer is canceled (using new server
property "sendIsActive").

Comment 21 is a new "fix candidate" that implements most of the suggestions by DevTeam in comment 20. It keeps the parameter max_cached_connections but now defaults it to 1. And if a user sets it larger it is still seen as one. So you can only have 1 connection at a time per server. Also, if max_cached_connections is set to 0 or less, it still prevents reuse of the single connection so there is a new connection for each message sent.
This adds a new parameter messages_per_connection, currently defaulted to 10. It forces a connection close and a new connection after that many messages (a batch) are sent. This should definitely be user settable since, as we've seen, the allowed number of messages on a connection is SMTP server dependent.
I haven't put in any delay before sending more messages on the new connection yet -- not sure if it's needed or, if so, how much.
If the new parameter messages_per_connection is set to 0 or less, it allows an unlimited number of messages on the connection (as it currently works).
Note that this doesn't remove any code that supports multiple connections since it just forces the parameter (max_cached_connections) that controls the number to 1 (or to 0, as it currently works, to disable reuse).
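
The parameter rules described in this comment and in the patch summary can be summarised in a short Python sketch. It is a model of the description only, not the patch: the class and method names are hypothetical, while the two parameter names, their defaults (1 and 10), and the 100-recipient threshold are taken from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SmtpPolicy:
    max_cached_connections: int = 1    # values above 1 are treated as 1
    messages_per_connection: int = 10  # 0 or less means unlimited

    @property
    def reuse_enabled(self) -> bool:
        """0 or less disables reuse: every message gets a new connection."""
        return self.max_cached_connections > 0

    def message_limit_reached(self, sent_on_connection: int) -> bool:
        """True when the current batch is full and a new connection is due."""
        if self.messages_per_connection <= 0:
            return False
        return sent_on_connection >= self.messages_per_connection

    def start_quit_timer(self, sent_on_connection: int, recipients: int) -> bool:
        """Whether the QUIT timer is started after a successful send."""
        if not self.reuse_enabled:
            return False
        if recipients >= 100:
            return False
        if self.message_limit_reached(sent_on_connection):
            return False
        return True


policy = SmtpPolicy()
print(policy.start_quit_timer(sent_on_connection=1, recipients=1))   # True
print(policy.start_quit_timer(sent_on_connection=10, recipients=1))  # False
```

In the cases where the timer is not started, the connection is presumably closed right away instead of waiting; that reading is an assumption of this sketch.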

... a default limit of 10 messages can be sent before a new connection is created for additional messages to be sent.

Limiting the number of messages per connection was suggested in bug 1854567. We're still providing this feature:
https://github.com/Betterbird/thunderbird-patches/blob/main/115/bugs/1854567-limit-number-messages-per-connection.patch

Attachment #9369596 - Attachment is obsolete: true
Attachment #9369596 - Attachment is obsolete: false
Attachment #9369154 - Attachment is obsolete: true

Magnus, I've decided to go with this diff: https://phabricator.services.mozilla.com/D196942 that has reviewer "resigned" so punching NI to make sure you see it. (Didn't see a way to un-resign the reviewer.)
It's described some in comment 21 and comment 22.

Flags: needinfo?(mkmelin+mozilla)
Flags: needinfo?(mkmelin+mozilla)

Pushed by mkmelin@iki.fi:
https://hg.mozilla.org/comm-central/rev/5b2af0b7af47
Keep QUIT timer, add msg count limit and at most 1 conn per server. r=mkmelin

Status: ASSIGNED → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → 123 Branch
Severity: -- → S3
Priority: -- → P2
Whiteboard: [TM 115.8.+]
Whiteboard: [TM 115.8.+] → [TM 115.7.+]

I don't know why I had changed this from 115.8 to 115.7. I'm thinking this should have a full cycle of beta, thus 115.8.0. Unless you feel the patch is either critical, or trivial and super safe.

Flags: needinfo?(gds)

(In reply to Wayne Mery (:wsmwk) from comment #26)

I don't know why I had changed this from 115.8 to 115.7. I'm thinking this should have a full cycle of beta, thus 115.8.0. Unless you feel the patch is either critical, or trivial and super safe.

It's probably mostly of interest to users like myaddons (Alexander Bergmann, to limit the number of messages per connection) and maybe the reporter DevTeam (no longer caches connections by default).
So unless they respond back indicating they need this feature in the release, I'd say just let it stay on beta for the cycle.

Flags: needinfo?(myaddons)
Flags: needinfo?(gds)
Flags: needinfo?(developers)
Whiteboard: [TM 115.7.+] → [TM 115.8.0]

Out of curiosity, how long does a beta cycle typically last?

While we (our office) have been able to work around the issue by using the Config Editor and manually changing the max_cached_connections values to 0, I suspect that this is not something that will be easy for the average user to do.

To put this in perspective, the crux of the issue is manifesting as follows - and yes, this is specific to the 'lightning' calendar plugin, but...

My mail server will automatically disconnect idle SMTP connections after a period of 30 seconds.

I go to the calendar plugin, and add an event.

I invite someone else to this event.

I click to save the new event.

The plugin attempts to send a calendar invitation email to the person I invited.

It throws an error message, the invitation is never sent (because the 'cached' connection timed out).

Now... we have been getting quite a few complaints about this behavior already from our customers. While yes, we have been giving them the workaround steps, this strikes me as being something that is probably presenting a larger scale of issues than just our customers... and there was already a lengthy delay just getting to the stage where Config Editor workarounds could be applied (Start of December 2023, updates didn't make it to OS distro level until mid January...).

Aside from the whole argument of 'should there be cached connections', or workarounds to add 'timer timeouts' (which should in theory work in my opinion) I am also curious if the failure to try to reconnect / spawn a new SMTP connection upon finding the 'cached' connection as invalid/dead is a shortcoming specific to the lightning calendar plugin, or something that might need to be examined at the core SMTP level?
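
The reconnect behaviour asked about above could look roughly like the following Python sketch. Everything in it is hypothetical (the exception, the fake connection class, and the function); it does not describe how Thunderbird or the calendar code handles a dead connection, only the general "try cached, then open a fresh one" pattern.

```python
class StaleConnection(Exception):
    """Raised when the server has already closed the cached connection."""


class FakeConnection:
    """Stand-in for an SMTP connection, used only to exercise the sketch."""

    def __init__(self, alive: bool = True):
        self.alive = alive
        self.sent = []

    def send(self, message: str) -> None:
        if not self.alive:
            raise StaleConnection("server closed the idle connection")
        self.sent.append(message)


def send_with_reconnect(message, cached, open_connection):
    """Try the cached connection first; if it is stale, open a new one once."""
    if cached is not None:
        try:
            cached.send(message)
            return cached
        except StaleConnection:
            pass  # fall through and use a fresh connection
    fresh = open_connection()
    fresh.send(message)
    return fresh


dead = FakeConnection(alive=False)
used = send_with_reconnect("invite.ics", dead, FakeConnection)
print(used is dead, used.sent)  # False ['invite.ics']
```

A retry like this is only safe when the stale connection is detected before the message data has been handed to the server; otherwise the recipient could receive the message twice.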

Flags: needinfo?(developers)

115.8.0 is expected to ship February 20

DevTeam,
Have you tried the latest beta to see if it fixes the issue?

I'm assuming that your steps below are with the release and not the beta?

I go to the calendar plugin, and add an event.

I assume it doesn't try to send a mail at this point.
Also, was there a message sent by TB recently (within your server's 30s timeout) before this point? If so, the connection to SMTP server may still be open.

I invite someone else to this event.
I click to save the new event.
The plugin attempts to send a calendar invitation email to the person I invited.
It throws an error message, the invitation is never sent (because the 'cached' connection timed out).

I don't know why the message would not be sent. I thought the timeout was due to your server timing out the connection 30s after the message was sent because, with the current release, there is no QUIT sent and the connection is just kept open until TB is shut down or the server (or routers) times it out.
Anyhow, with the beta (and the future release) a QUIT will be sent 5s after a message is sent unless another message is sent within the 5 seconds.

Maybe if you could repeat the steps of comment 28 with your release version, with smtp.loglevel at ALL and with timestamps also enabled, it might show what is going on. Then maybe repeat the process with the beta and you should see better behavior and also a QUIT sent 5 seconds after the message is sent.

(In reply to gene smith from comment #27)

It's probably mostly of interest to users like myaddons (Alexander Bergmann, to limit the number of messages per connection) and maybe the reporter DevTeam (no longer caches connections by default).
So unless they respond back indicating they need this feature in the release, I'd say just let it stay on beta for the cycle.

The problems my users and I experienced before have all been fixed in Thunderbird 115.6.0 with the patch in Bug 1854567.

Flags: needinfo?(myaddons)

(In reply to gene smith from comment #30)

DevTeam,
Have you tried the latest beta to see if it fixes the issue?

Not yet no - our policy is to not run beta software in our office standard environment - I'll have to locate a sandbox environment to try that out on.

I'm assuming that your steps below are with the release and not the beta?

That is correct - release 115.6.0 - current stable available in Ubuntu 20.04

I go to the calendar plugin, and add an event.

I assume it doesn't try to send a mail at this point.
Also, was there a message sent by TB recently (within your server's 30s timeout) before this point? If so, the connection to SMTP server may still be open.

There was no message sent prior to this - this was from a freshly started instance of Thunderbird.

I invite someone else to this event.
I click to save the new event.
The plugin attempts to send a calendar invitation email to the person I invited.
It throws an error message, the invitation is never sent (because the 'cached' connection timed out).

I don't know why the message would not be sent. I thought the timeout was due to your server timing out the connection 30s after the message was sent because, with the current release, there is no QUIT sent and the connection is just kept open until TB is shut down or the server (or routers) times it out.
Anyhow, with the beta (and the future release) a QUIT will be sent 5s after a message is sent unless another message is sent within the 5 seconds.

Maybe if you could repeat the steps of comment 28 with your release version, with smtp.loglevel at ALL and with timestamps also enabled, it might show what is going on. Then maybe repeat the process with the beta and you should see better behavior and also a QUIT sent 5 seconds after the message is sent.

I gave this another try - and then remembered that I had another custom setting in place specific to turning off the JSM module for SMTP and IMAP (a previous bug with CLIENTID implementation that has since been fixed).

After reverting to using JSM and restarting Thunderbird, I am now getting different results.

Calendar event is created fine.

SMTP launches fine and sends a message out to my invited recipient.

After about 15? seconds, I get a new error message:

The message was sent successfully, but could not be copied to your Sent folder.
An error occurred while sending mail: Outgoing server (SMTP) error. The server responded:  timeout (#4.4.2).
Would you like to return to the compose window?

In the Error console I see:

 Component returned failure code: 0x80550021 [nsIUrlListener.OnStopRunningUrl] ImapClient.jsm:1858
 mailnews.smtp: S: 451 timeout (#4.4.2)    SmtpClient.jsm:429:17
mailnews.smtp: Command failed: 451 timeout (#4.4.2); currentAction=_actionIdle SmtpClient.jsm:578:19
    _onCommand resource:///modules/SmtpClient.jsm:578
    _parse resource:///modules/SmtpClient.jsm:379
    _onData resource:///modules/SmtpClient.jsm:436
 mailnews.send: Sending failed; An error occurred while sending mail: Outgoing server (SMTP) error. The server responded:  timeout (#4.4.2)., exitCode=2153066732, originalMsgURI=undefined MessageSend.jsm:337:32
08:16:54.979 mailnews.smtp: Socket closed. SmtpClient.jsm:550:17
08:17:47.844
mailnews.imap.4: NetworkTimeoutError: a Network error occurred
 NS_ERROR_NET_TIMEOUT: Component returned failure code: 0x804b000e (NS_ERROR_NET_TIMEOUT) [nsIUrlListener.OnStopRunningUrl] ImapClient.jsm:1858
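
As an aside on reading the `451 timeout (#4.4.2)` line in the log above: a 4xx SMTP reply is a transient failure, and the parenthesized `#4.4.2` looks like an enhanced status code embedded in the reply text by the server. A minimal sketch of splitting such a line apart is below. The function and type names are hypothetical, and the `(#x.y.z)` pattern is an assumption based only on the log lines in this report, not on any Thunderbird API.

```python
import re
from typing import NamedTuple, Optional


class SmtpReply(NamedTuple):
    code: int                # three-digit SMTP reply code, e.g. 451
    enhanced: Optional[str]  # embedded enhanced status code, e.g. "4.4.2"
    text: str                # remainder of the reply line
    transient: bool          # True for 4xx replies (temporary failure)


# "451 timeout (#4.4.2)" -> code "451", text "timeout (#4.4.2)"
_REPLY_RE = re.compile(r"^(\d{3})(?:[ -](.*))?$")
# Assumed server convention seen in this report: "(#4.4.2)" inside the text.
_ENHANCED_RE = re.compile(r"\(#(\d\.\d{1,3}\.\d{1,3})\)")


def parse_reply(line: str) -> SmtpReply:
    """Split one SMTP reply line into its code, enhanced code, and text."""
    match = _REPLY_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not an SMTP reply line: {line!r}")
    code = int(match.group(1))
    text = match.group(2) or ""
    enhanced = _ENHANCED_RE.search(text)
    return SmtpReply(
        code=code,
        enhanced=enhanced.group(1) if enhanced else None,
        text=text,
        transient=400 <= code < 500,
    )


if __name__ == "__main__":
    print(parse_reply("451 timeout (#4.4.2)"))
```

Note that in the log the 451 arrives while `currentAction=_actionIdle`, so it is the server's idle timeout on the already-finished connection rather than a failure of the message itself.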

Events created AFTER this initial error - same thing again.

This may be a separate unrelated issue....

Do we need a new bug report for devteam's comment 32?

Flags: needinfo?(gds)

And should this proceed as uplift to v115?

(In reply to Wayne Mery (:wsmwk) from comment #34)

And should this proceed as uplift to v115?

Comment 26 says this needs a cycle on beta, but I don't see anywhere in this report that it was ever put in a beta. Maybe it has, but I haven't checked the push logs. So if it has only been on daily, I don't know how much usage this has gotten. However, I've sent emails using daily and haven't seen a problem, so it doesn't seem to break typical usage. I think the main thing this prevents is spurious disconnect errors after a message is sent, as reported by DevTeam, so a jump from daily to v115 (if never on beta) is probably OK.

Do we need a new bug report for devteam's comment 32?

Hard for me to say since I don't use the calendar features much at all. Not sure why saving to the Sent folder would have a problem only when sending via calendar functions. It may be that the same spurious error (due to not sending QUIT) that is fixed by this bug is triggering it, so the problem wouldn't be seen with this fix in place. Then again, saving to Sent sometimes just fails because a connection to the IMAP server can't be made. But if it fails, a dialog should appear offering to retry or save the message locally.

Flags: needinfo?(gds)

We're good on the beta testing front - Target Milestone = 123 means it has been on beta for almost two full betas, 123 and 124.
So I was asking in case there were factors other than testing that needed consideration.

OS: Linux → All
See Also: → 136871
Summary: ThunderBird Server Connection Issues, Error Message Not very clear → ThunderBird Server Connection Issues, and Error Message Not very clear
Whiteboard: [TM 115.8.0]

Comment on attachment 9369596 [details]
Bug 1864924 - Keep QUIT timer, add msg count limit and at most 1 conn per server. r=mkmelin

[Triage Comment]
Approved for esr115 - considering it has gone through two betas, and this result from bug 136871 may mask other bugs/regressions

Attachment #9369596 - Flags: approval-comm-esr115+

FYI, there was a delay in responding on our end due to time constraints in building a new installation for testing. However, the calendar issue is avoided by changing caching to 0, but that of course is not workable for general public usage. We will try to analyze more deeply in the coming days, but yes, the way calendar invites handle confirmations looks different from standard SMTP initialization. We don't see any reason at this time to hold these changes back from esr115, though. We can open a new ticket if needed post release.

See Also: → 1965690
Regressions: 1965690
See Also: → 1988329