Closed Bug 1745130 Opened 2 years ago Closed 1 year ago

sending message hangs when copying msg to sent folder

Categories

(MailNews Core :: Networking: IMAP, defect)

Thunderbird 91
Unspecified
All
defect

Tracking

(thunderbird_esr102 fixed, thunderbird112 fixed)

RESOLVED FIXED
113 Branch
Tracking Status
thunderbird_esr102 --- fixed
thunderbird112 --- fixed

People

(Reporter: oliver.ilk, Assigned: gds)

References

Details

Attachments

(14 files, 8 obsolete files)

186.31 KB, text/plain
Details
68.01 KB, image/png
Details
66.70 KB, image/png
Details
4.64 KB, image/png
Details
42.14 KB, image/png
Details
100.83 KB, text/plain
Details
3.52 KB, image/png
Details
48 bytes, text/x-phabricator-request
Details | Review
145.20 KB, video/mp4
Details
64.23 KB, video/mp4
Details
2.69 KB, image/png
Details
133.75 KB, video/mp4
Details
176.35 KB, video/mp4
Details
8.20 KB, text/plain
Details

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:94.0) Gecko/20100101 Firefox/94.0

Steps to reproduce:

  1. Send a new mail (at 04:41 UTC) -> hangs while copying
  2. Cancel (at 04:45 UTC)
  3. Retry (at 04:45 UTC)
    Level-5 IMAP log file attached
    (might relate to 123063/185677/206408 but these are 18-20 years old...)

Actual results:

Pop up window "Copying message to Sent Items folder..." stays forever at 100%
Cancel
Pop up window "Retry"/"Save"/"Don't Save" (says msg was sent which is wrong)
Retry - no effect (3 times)
Save (did not save it)
This happens only sporadically but still quite often - and I had logging activated

Expected results:

Should have sent the msg and copyied to sent folder
According to German TB forum this is a very old bug that could never get resolved due to lack of evidence - hope the logfile helps.
Rather strange that nothing was logged in between Send and Cancel/Retry
04:41:05 ImapThreadMainLoop leaving
04:45:35 TellThreadToDie

I have that issue in a company situation with an IMAP server which apparently isn't 100% stable. I was never able to reproduce it or log it successfully. The only thing that helps is saving the sent message locally and moving it to the IMAP Sent folder later.

Gene, any thoughts?

Flags: needinfo?(gds)

The internet connection is always stable and the IMAP server is fully available when this problem occurs. Even if the server was not 100% stable this apparently doesn't bother other email clients at all. Maybe they implemented some fault tolerance with retries.

It is a shame that nothing gets logged when this happens (not for the first time). Is there any more granular logging level than IMAP:5 that could be used? Or any other method to find out what TB is actually doing during this time?

Something else:
when I run TB with logging activated (Set MOZ_LOG=IMAP:5,timestamp) it keeps creating empty "child" files.

see uploaded screenshots:
2021-12-09 TB logfiles child-1-23.moz_log.png
2021-12-09 TB logfiles child-24-43.moz_log.png

when I try to delete them after I stopped TB, some of them (1, 3-4, 35-37, 41-44) are in use by Firefox (which I have +/- always running).

I don't know the Mozilla code base, but apparently some Firefox code interferes with Thunderbird logging.
Maybe also with network traffic or other pgm logic?

see comments below

see comment above

I don't want to flood you with meaningless text, but given the bug might be 20y old, I just provide all evidence I have.
Just tell me if I should stop.

Here are 2 summaries of previous cases for which I can provide logfiles if needed/useful.

2021-11-26

  1. 03:10:00 send
  2. 03:12:00 hangs when copying to Sent Items -> cancel
  3. 03:12:30 2nd send -> hangs again
  4. 03:13:00 save
  5. 03:14:00 3rd send -> ok

Also in this case the logfile shows no activity between 03:10 (send) and 03:13 (save)

2021-11-26 03:10:31.126000 UTC - [Parent 31480: IMAP]: D/IMAP SetConnectionStatus(0x0)
2021-11-26 03:13:01.436000 UTC - [Parent 31480: Main Thread]: D/IMAP proposed url = Drafts folder for connection Sent Items has To Wait = false can run = false

2021-12-07

  1. 09:14:45 send
  2. 09:14:?? hangs when copying to Sent Items -> cancel
  3. 09:14:?? retry 5x -> no effect
  4. 09:16:55 save, send -> ok

Also in this case the logfile shows no activity between 09:14 (send) and 09:16 (save, send),
but it shows error code 0x80470002 as last entry before it goes quiet.

2021-12-07 09:14:47.752000 UTC - [Parent 9304: IMAP]: D/IMAP URL failed with code 0x80470002 (imap://user01%2Enam%40bluewin%2Ech@imaps.bluewin.ch:993/appendmsgfromfile%3E/Sent%20Items)
2021-12-07 09:16:57.031000 UTC - [Parent 9304: Main Thread]: D/IMAP proposed url = Sent Items folder for connection INBOX has To Wait = false can run = false

I couldn't find any list of error codes in the Mozilla universe,
except for a probably inofficial https://james-ross.co.uk/mozilla/misc/nserror_list
saying "operation is performed on a stream that has already been closed"

First, the "child" files I see too. I filed a bug regarding these with mozilla devs but no response. TB does share some logging code with mozilla but it runs it independently. But the environment var can clash which causes these weird files to appear. You can just ignore them and/or delete them.

Are you saying the messages is not being sent? If so, that would be handled by the SMTP process in TB and not IMAP. SMTP has been re-written in javascript so the MOZ_LOG for SMTP:5 no longer works. You have to use the internal console in tb to log SMTP activity which I'm not sure if it' s documented.

I think with 91 you can still revert back to the original c++ implementation of SMTP with pref mailnews.smtp.jsmodule set to false and restart TB. You might try that and see if you get better sending behavior if you are sure no messages is getting sent.
If that doesn't help, you should set the pref back to true and restart tb.

In you descriptions you mention "retry". Do you mean you are trying to send the message again or do you mean TB is trying again? Sorry, not sure exactly what you mean.

In the IMAP log you attached, it looks like TB does save a draft of the message you were trying to send. However, the attempt to save to "Sent Items" never occurs. Of course, if nothing was actually sent, nothing could/should be saved to "Sent Items".

Flags: needinfo?(gds)

I believe the message is sent since in my case I request and get a read receipt for it. The UI suggests that it is sent and that storing the sent message in the IMAP folder fails. Otherwise it wouldn't make sent so offer the "Retry / Store locally / Cancel" panel since you would store a message locally that was never sent. So it doesn't matter whether you use the old C++ or the new JS module for sending in TB 91; also, the issue already happened in TB 78. Just as a matter of interest: Where is the sent message held in TB after sending and before trying to move/store it in an IMAP folder. Still in the local Outbox? Interesting that according to the log (I haven't looked) no storing in the Sent folder is happening. But doesn't the log cover various retries? Surely some action would have been taken there.

How does the successful storing in the Sent folder after sending look in the logs? Perhaps the reporter can check his various logs for that.

newsfan, I was going by what the reporter Scrambler said a couple times in the opening comment 0:

Pop up window "Retry"/"Save"/"Don't Save" (says msg was sent which is wrong)

Expected results:

Should have sent the msg and copyied to sent folder

which sounds like the message wasn't sent. Not sure that's what is meant.

The message sent is put into a temp file and the temp file is sent to IMAP process of TB which makes a connection to the "Sent" folder and then does an imap APPEND to send the message in the file to the server to store it into the "Sent" folder. From the attached log, it appears that the tcp/ip connection made for appending to the "sent" folder is being refused or dropped. If it worked, you would see the whole message in the log getting sent (APPENDED) to the server. (While composing the message, the log does seem to show a correct APPEND of the draft message to Drafts folder. This is a very similar activity.)

Maybe you said, but what are the types of IMAP server being used, e.g., dovecot, cyrus, exchange, etc.

I would say the reporter didn't phrase this correctly and the message was sent, what was missing was the copying to the Sent folder, in his case "Sent Items". But he can clarify that.

This is in the log:

2021-12-09 04:45:35.656000 UTC - [Parent 8844: Main Thread]: I/IMAP creating protocol instance to play queued url:imap://user01.nam@bluewin.ch@imaps.bluewin.ch:993/appendmsgfromfile>/Sent Items
2021-12-09 04:45:35.656000 UTC - [Parent 8844: Main Thread]: D/IMAP proposed url = Sent Items folder for connection Drafts has To Wait = false can run = false
2021-12-09 04:45:35.656000 UTC - [Parent 8844: Main Thread]: I/IMAP playing queued url:imap://user01.nam@bluewin.ch@imaps.bluewin.ch:993/appendmsgfromfile>/Sent Items

I can't see the dropping or refusing of the connection. In any case, is that happens, TB should abort the APPEND operation with an error instead of hanging until the user presses Cancel.

Maybe you said, but what are the types of IMAP server being used, e.g., dovecot, cyrus, exchange, etc.

Looking closer at the attached log it appear the server type is "Openwave" which seems to be the same type used by my ISP but now called "Intermail". Probably not exactly the same (e.g., supports IDLE, mime doesn't) but I don't seem to have any problems sending mail through them.

It appear the reporter was selected on "Sent Items" when getting ready to append the temp file the connections was dropped with error 0x8047002

2021-12-09 04:41:05.015000 UTC - [Parent 8844: Main Thread]: I/IMAP failed creating protocol instance to play queued url:imap://user01.nam@bluewin.ch@imaps.bluewin.ch:993/appendmsgfromfile>/Sent Items
2021-12-09 04:41:05.037000 UTC - [Parent 8844: IMAP]: D/IMAP ReadNextLine [rv=0x0 stream=1b2372f2280 nb=0 needmore=1]
2021-12-09 04:41:05.037000 UTC - [Parent 8844: IMAP]: D/IMAP ReadNextLine [rv=0x80470002 stream=1b2372f2280 nb=0 needmore=0]
2021-12-09 04:41:05.037000 UTC - [Parent 8844: IMAP]: I/IMAP 1b228895000:imaps.bluewin.ch:S-Sent Items:CreateNewLineFromSocket: clearing IMAP_CONNECTION_IS_OPEN - rv = 80470002

Then it appears that TB immediately tries to open another connection to retry the append but that fails too after TB waits 30 seconds for a server response:

2021-12-09 04:41:05.037000 UTC - [Parent 8844: Main Thread]: I/IMAP creating protocol instance to retry queued url:imap://user01.nam@bluewin.ch@imaps.bluewin.ch:993/addmsgflags>UID>/Sent Items>1121>2
2021-12-09 04:41:05.037000 UTC - [Parent 8844: Main Thread]: I/IMAP retrying  url:imap://user01.nam@bluewin.ch@imaps.bluewin.ch:993/addmsgflags>UID>/Sent Items>1121>2
2021-12-09 04:41:05.037000 UTC - [Parent 8844: IMAP]: D/IMAP ImapThreadMainLoop entering [this=1b22bbe7800]
2021-12-09 04:41:05.037000 UTC - [Parent 8844: IMAP]: D/IMAP ImapThreadMainLoop leaving [this=1b228895000]
2021-12-09 04:45:35.008000 UTC - [Parent 8844: IMAP]: I/IMAP 1b22bbe7800::NA:TellThreadToDie: close socket connection
2021-12-09 04:45:35.008000 UTC - [Parent 8844: IMAP]: D/IMAP ImapThreadMainLoop leaving [this=1b22bbe7800]

But a parallel attempt to open a connection to Drafts does work:

2021-12-09 04:45:35.008000 UTC - [Parent 8844: IMAP]: D/IMAP ImapThreadMainLoop entering [this=1b227a5d000]
2021-12-09 04:45:35.023000 UTC - [Parent 8844: Main Thread]: I/IMAP 1b227a5d000:imaps.bluewin.ch:NA:SetupWithUrlCallback: clearing IMAP_CONNECTION_IS_OPEN
2021-12-09 04:45:35.023000 UTC - [Parent 8844: IMAP]: I/IMAP 1b227a5d000:imaps.bluewin.ch:NA:ProcessCurrentURL: entering
2021-12-09 04:45:35.023000 UTC - [Parent 8844: IMAP]: I/IMAP 1b227a5d000:imaps.bluewin.ch:NA:ProcessCurrentURL:imap://user01%2Enam%40bluewin%2Ech@imaps.bluewin.ch:993/appenddraftfromfile%3E/Drafts%3EUID%3E:  = currentUrl
2021-12-09 04:45:35.139000 UTC - [Parent 8844: IMAP]: D/IMAP ReadNextLine [rv=0x0 stream=1b2372f2820 nb=31 needmore=0]
2021-12-09 04:45:35.139000 UTC - [Parent 8844: IMAP]: I/IMAP 1b227a5d000:imaps.bluewin.ch:NA:CreateNewLineFromSocket: * OK IMAP4 PROXY server ready

But then TB tries again to run the queued "append to sent" URL on the connections used to save the Draft (shown in comment 10) but the draft gets appended again instead of the the sent file (via a new URL). Not sure why the draft gets saved twice since I think the minimum time between draft saves is 1m and it appears there is about 10 seconds between draft saves. So it almost appears that the attempt to retry the the "save to sent" gets converted to a "save to drafts" URL.

Looks to me as if two events together lead to the problem: First a connection drops (error 80470002, NS_BASE_STREAM_CLOSED), then saving the sent item causes a conflict with saving a draft. Aren't drafts saved automatically as long as the compose window is open, which it is here. Yes, these two in 10 seconds difference are surprising:

2021-12-09 04:45:35.525000 UTC - [Parent 8844: IMAP]: I/IMAP 1b227a5d000:imaps.bluewin.ch:S-Drafts:SendData: 48 append "Drafts" (\Draft) {1857}
2021-12-09 04:45:46.484000 UTC - [Parent 8844: IMAP]: I/IMAP 1b227a5d000:imaps.bluewin.ch:S-Drafts:SendData: 54 append "Drafts" (\Draft) {1857}

Any possibility to build an ESR version with more diagnostics for the reporter to run? Apparently he frequently runs into the issue (I stopped short of saying: He can reproduce the issue). Another option would be to programmatically simulate error 80470002 at the right location to see how the system behaves afterwards. You would simulate

Looks like despite what the reporter claims, the connection isn't stable, there are are also:

2021-12-09 04:45:41.661000 UTC - [Parent 8844: Main Thread]: I/IMAP failed creating protocol instance to play queued url:imap://user01.nam@bluewin.ch@imaps.bluewin.ch:993/fetch>UID>/INBOX>71011

twice in the log at the same time.

We notice David Bienvenu's debug here:
https://searchfox.org/comm-central/rev/30b885249ef2f353b846bc20a41e83d3189a03ce/mailnews/imap/src/nsImapIncomingServer.cpp#681
He must have been debugging in nsImapIncomingServer::GetImapConnection() but that wasn't turned into logging.

Finally, reporter, have you changed "Maximum number of server connections to cache" setting in the advanced server settings or is it still at 5?

I guess it's possible the reporter did a manual save of the draft to cause the 2nd one, but unlikely.

That algorithm tb uses to choose a connection to run a URL has always been a bit unclear to me so I need to look at it again (with real/simulated network errors going on).

Finally, reporter, have you changed "Maximum number of server connections to cache" setting in the advanced server settings or is it still at 5?

Yes, I meant to ask this too in my previous comment. Also, did you (reporter) do a manual "save to drafts" and what is your "save to draft time interval" set to?

Flags: needinfo?(oliver.ilk)
Flags: needinfo?(oliver.ilk)

Not sure about your questions, but here are my settings that might relate to them:
"auto save every 5 minutes"
mail.compose.autosave.interval 5
mail.server.server1.max_cached_connections 5
network.http.max-persistent-connections-per-server 6

Re. your other questions :
I don't know what type of IMAP server they are using - how could I find out?
And no, I didn't manually save or retry, just used the buttons/options in the windows/panels that were displayed.

Here is a new case where I tried to reflect the sequence in more detail.
Will also attach the complete (downstripped) logfile in which you can also see what happens when a mail gets sent normally

2021-12-15

  1. 08:53:58 send (out of compose mail window) -> displays "copying to Sent Items..." window with empty progress bar and "cancel" button
  2. 08:54:02 hangs while still displaying "copying to Sent Items..." window with progress bar at 100% and "cancel" button
    (see attachment "2 Sending Message hangs.png")
  3. 08:55:00 press cancel -> displays "retry/save/don't save" window, and (sometimes) "Server ... has disconnected"
    (see attachment "3 Retry-Save--Server disconnected.png")
  4. 08:56:00 press retry -> immediately redisplays "retry/save/don't save" window
  5. 08:57:00 press save -> back on compose mail window
  6. 08:58:00 press send -> "copying to Sent Items..."
  7. 08:58:02 (done) -> mail compose window closes -> back on main window (mail list)

Also in this case the logfile shows no activity between 08:53:58 (send) and 08:57 (save),
so steps 3 (cancel) and 4 (retry) seem to have no effect.
And the 2nd last last entry before "going to sleep" shows 0x80004005

2021-12-15 08:53:58.487000 UTC - [Parent 21428: IMAP]: I/IMAP 2a235c68000:imaps.bluewin.ch:S-Spam:ProcessCurrentURL: entering
2021-12-15 08:53:58.487000 UTC - [Parent 21428: IMAP]: I/IMAP 2a235c68000:imaps.bluewin.ch:S-Spam:ProcessCurrentURL:imap://user01%2Enam%40bluewin%2Ech@imaps.bluewin.ch:993/appendmsgfromfile%3E/Sent%20Items: = currentUrl
2021-12-15 08:53:58.487000 UTC - [Parent 21428: IMAP]: I/IMAP null imapMailFolderSink
2021-12-15 08:53:58.487000 UTC - [Parent 21428: IMAP]: I/IMAP 2a235c68000:imaps.bluewin.ch:S-Spam:ProcessCurrentURL: aborting queued urls
2021-12-15 08:53:58.487000 UTC - [Parent 21428: IMAP]: I/IMAP 2a235c68000:imaps.bluewin.ch:S-Spam:SendData: clearing IMAP_CONNECTION_IS_OPEN
2021-12-15 08:53:58.487000 UTC - [Parent 21428: IMAP]: D/IMAP SetConnectionStatus(0x80004005)
2021-12-15 08:53:58.487000 UTC - [Parent 21428: IMAP]: D/IMAP ImapThreadMainLoop leaving [this=2a235c68000]
2021-12-15 08:57:00.534000 UTC - [Parent 21428: Main Thread]: D/IMAP proposed url = Drafts folder for connection INBOX has To Wait = false can run = false
2021-12-15 08:57:00.534000 UTC - [Parent 21428: IMAP]: D/IMAP ImapThreadMainLoop entering [this=2a235904000]
2021-12-15 08:57:00.534000 UTC - [Parent 21428: Main Thread]: I/IMAP 2a235904000:imaps.bluewin.ch:NA:SetupWithUrlCallback: clearing IMAP_CONNECTION_IS_OPEN
2021-12-15 08:57:00.534000 UTC - [Parent 21428: IMAP]: I/IMAP 2a235904000:imaps.bluewin.ch:NA:ProcessCurrentURL: entering
2021-12-15 08:57:00.534000 UTC - [Parent 21428: IMAP]: I/IMAP 2a235904000:imaps.bluewin.ch:NA:ProcessCurrentURL:imap://user01%2Enam%40bluewin%2Ech@imaps.bluewin.ch:993/appenddraftfromfile%3E/Drafts%3EUID%3E: = currentUrl
2021-12-15 08:57:00.666000 UTC - [Parent 21428: IMAP]: D/IMAP ReadNextLine [rv=0x0 stream=2a23b9101f0 nb=31 needmore=0]
2021-12-15 08:57:00.666000 UTC - [Parent 21428: IMAP]: I/IMAP 2a235904000:imaps.bluewin.ch:NA:CreateNewLineFromSocket: * OK IMAP4 PROXY server ready
2021-12-15 08:57:00.666000 UTC - [Parent 21428: IMAP]: D/IMAP SetConnectionStatus(0x0)

More points:

  1. mail not sent in Step 3 (Cancel): I actually don't know if it was sent. Was just assuming it since there was no evidence in the log.
    With "request receipt" would I have to wait until the recipient has read it or would indicate delivered into his mailbox?

  2. Gene said: "You have to use the internal console in tb to log SMTP activity"
    How can I do this?
    Or can I activate both IMAP and SMTP logging? What would the setting be for MOZ_LOG?
    I just find it hard to accept that TB seems to do just nothing between send and save - completely ignoring cancel and retry!

  3. Newsfan said: "Apparently he frequently runs into the issue (I stopped short of saying: He can reproduce the issue)"
    Unfortunately I cant' reproduce it, but frequently is correct ;-)

  4. "couldn't find any list of error codes in the Mozilla universe, except for a probably inofficial https://james-ross.co.uk/mozilla/misc/nserror_list"
    The last case now shows 0x80004005 (not 80470002)
    NS_ERROR-FAILURE - This is the most general of all the errors and occurs for all errors for which a more specific error code does not apply
    Ist there an official list of error codes?

re. 1 - I was wrong: the mail actually gets sent. I could just reproduce it with a mail I cc'd to myself, which got stuck. After cancel, save, send I had received it twice...

re. 2 - found this at mozilla wiki:
SMTP --- NOTE! for Thunderbird 91+ SMTP logging is changed and does not use this way of logging. There, set the mailnews.smtp.loglevel pref to "All" and open the Error Console (Ctrl+Shift+J).

Will start "playing" with it...

re. 4 - there must be something official. Who decided to hide such important information and why??

an earlier comment I wrote was:

"I don't know the Mozilla code base, but apparently some Firefox code interferes with Thunderbird logging.
Maybe also with network traffic or other pgm logic?"

Generally using a common framework is perfect. "Re-use" is always better than re-inventing the wheel.

But TB logfiles being allocated by an external process (Firefox) indicate some potentially severe issues:

  • bypassing process and filesystem integrity of the operating system (which normally only hackers and their Trojans do!)
  • potential "protection exceptions" leading to data integrity issues like crashes or BSoD

These are no "newbee" issues. One has to really buckle down into the OS fundamentals to achieve this!
The headline for this would be: "Mozilla (Firefox, Thunderbird etc.) deploy malicious code!"

I would have expected all alarm bells to ring in the Mozilla community with such an issue:
"Hey, Mozilla, we have a fundamental problem!"

But no: nothing, not even in the expert community - highly disturbing...

Just for clarity - do you have the following settings:

'Account Settings' > 'Copies & Folders' for the account
Select 'Place a Copy in'
Select 'Other' and choose 'Sent Items on mail account name'

If no:
please select those settings and then restart Thunderbird.
Report on sending results.

Flags: needinfo?(oliver.ilk)

my settings are exactly what you describe - and have always been for ages btw

shouldn't this be more than clear given that the screenshot I uploaded says "copying to Sent Items..."?

Flags: needinfo?(oliver.ilk)

In the 2nd log (where a successful save to SENT finally occus) it does appear the tb makes several failed attempt to connect to the server before the successful attempt works and the message is appended to SENT mailbox.

If you select Inbox and shutdown tb and then restart and remain on Inbox and then compose messages to yourself and send them, do you see problems sending several messages? I don't know how often you see the "save to sent" problem but what I'm suggesting is that your server is seeing too many connections and rejecting the attempt to save to sent mailbox. If you just stay on Inbox and don't visit other folders, there will be 3 connections maximum probably, one to inbox, one to drafts when a draft is saved and one to Sent when the message is saved to sent.

Another way to cause fewer server connection is to temporarily turn off the drafts auto-save feature under General/Composition. The will prevent tb from making the save to drafts connection. Restart tb if you make this change to clear out any existing connections.

If the above works and you don't see failure to save to sent, you might try reducing the number of "cached connections" in advanced server setting down from default of 5 to 4. This will force tb to reuse an existing connection instead of opening a new 5th one. So I'm thinking maybe the server will accept 4 connections but doesn't like a 5th one.

If you can try the above experiments and let us know if it helps, hurts or makes no difference, it would be good to know.
Thanks.

Reporter wrote in comment 2:

Even if the server was not 100% stable this apparently doesn't bother other email clients at all.

Is it possible that your "other email clients" are also connected to the server when TB's "save to sent" fails? If so, this may also point to the server limiting the number of simultaneous connections from your account.
Also, the other non-TB clients may just use a single imap connection so the attempt to obtain another connection never occurs so can't fail when saving to sent, which would be similar to setting TB's cached connections to 1 (not recommended).

Also, I added in comment 24 that you should restart tb if you turn off the drafts auto-save to limit connections.

No, I'm normally not using multiple email clients in parallel - just did this twice in Oct to find out if the behavior was the same.

Is there a "common" or "accepted" standard of IMAP servers re. concurrent connections?
Wouldn't it maybe even make sense to serialize some IMAP communications rather than flooding the server with too many sessions?
I think it might make a lot of sense for an IMAP server to accept only 1 concurrent session per folder in order to prevent from data clashes.

So, why would "cached connections 1" not be recommended - if it might resolve the issue?

"If you just stay on Inbox and don't visit other folders, there will be 3 connections maximum"
why do you conclude this? As far as I saw in the logs TB connects to and opens sessions with inbox, drafts, sent, trash, spam - which is already 5

Anyway, I will run the tests as you suggested:

  1. restart TB in sent mail list (which I cut down to 0 meanwhile, same as for inbox) and try to compose+send a few mails to myself
  2. turn off auto save and try some compose+send to myself
  3. cut down cached connections to 4 and try some compose+send to myself

So, why would "cached connections 1" not be recommended - if it might resolve the issue?

With just one "cached" connection you will only have one imap thread so you should get better performance with 5 threads. Most servers have no problem with 5 connections but we've seen problems with Yahoo where it was refusing the 5th and to solve the problem the users reduced the number of connection down to 4 to fix it. Your server appears to be Openwave which I've not seen before from random users I dealt with over the years. However, my ISPs server was also called Openwave a few years ago but now called InterMail. Not sure if they are the same. I occasionally see failures to connect, usually when saving Drafts, but "retry" fixes it. Haven't seen failure to save to sent.

"If you just stay on Inbox and don't visit other folders, there will be 3 connections maximum"
why do you conclude this? As far as I saw in the logs TB connects to and opens sessions with inbox, drafts, sent, trash, spam - which is already 5

Well, if you just stay on Inbox it will make it less likely to open connections (imap select) to other folders. However, I have seen others get connected even when they haven't been visited and not sure why. But this should minimize the number of connections. And if you turn off auto-save that should definitely reduce at least one connection.

Another thing that may cause additional connections is if you are checking for new mail in folders other than Inbox. This is set by right-clicking the folder and going to the folder properties and selecting "check this folder for new mail". But normally tb just re-uses an existing connection to do this.

I turned auto-save off and reduced cached connections to 4. So far I could not reproduce the problem but I'll keep you posted.

Now, when I was thinking again about our discussion on concurrent connections, a sneaking suspicion came up my mind - probably somewhat cringey for myself...

I have IMAP mail activated on my iPhone which is +/- always running. Is it possible that these server access problems are caused by my phone coincidentally contacting the same IMAP server with my credentials at the same time, thus causing too many concurrent connections together with TB?

I will turn off my phone for some more tests, but this would really be highly embarrassing and I feel ashamed that I didn't think of this before.

Sorry, I'm just cursorily following this, so please excuse me if I missed something. Whatever the number of cached connections and whatever other devices in the same household potentially on the same IP address bombarding the IMAP server, why is that a reason for TB to hang?

As I said it's just a suspicion. We know that the IMAP server only allows for a certain number of concurrent connections to the same folder. My iPhone definitely uses the same IMAP server and the same folders with the same credentials as TB on my PC. So if the phone happens to update the mail folders at the same time as I send a mail on TB on my PC this could exceed the allowed number of concurrent connections to those folders, one of which is "Sent Items", that could then refuse the additional connection for TB to copy into it. Again, just an assumption, but it could explain why this only happens sporadically.

(In reply to newsfan from comment #29)

Sorry, I'm just cursorily following this, so please excuse me if I missed something. Whatever the number of cached connections and whatever other devices in the same household potentially on the same IP address bombarding the IMAP server, why is that a reason for TB to hang?

Well, I don't think a tb hang in the sense that you have to restart or reboot is occurring. It can be cleared by retry or cancel and then reporter can do other activities OK, I think.

If the server really is refusing the connection to do the activity, and tb has other connections and they are idle, tb maybe could hijack an idle connection to complete the activity. However, it currently doesn't do this and would probably be a major change to do this. It only does this now when the max number of cached connections is actually reached.

@Gene: "tb maybe could hijack an idle connection to complete the activity. However, it currently doesn't do this and would probably be a major change to do this"
Does this mean it will never be done or considered?

What do you think of my smartphone theory?

(In reply to 5crambler from comment #32)

@Gene: "tb maybe could hijack an idle connection to complete the activity. However, it currently doesn't do this and would probably be a major change to do this"
Does this mean it will never be done or considered?

I would never say never. But it sounds like it might be a big change. I'll look into it when I finish what I'm currently working on, if something else doesn't come up first.

What do you think of my smartphone theory?

Sounds possible. It probably depends on whether your ISP limits the number of connections per device or per imap account.

After a while now running TB with no auto-save and limited cached connections I still experience the same errors - just less frequently.
Maybe I'm the only one in this universe, which I doubt. So, what do you experts suggest?
I've provided loads of evidence and I have spent as much time chasing a bug in a piece of SW as I have spent using it - which is what it is meant for.
Any chance to get a meaningful resolution after all, like "tb maybe could hijack an idle connection to complete the activity"?

Lately I'm seeing this behavior with a particular (company) server:
In about - say - 20% of the cases, copying the sent message to the Sent IMAP folder hangs. After clicking "Cancel" after a while, the first "Retry" fails and the second one succeeds. This is 100% consistent.

So there appear to be two bugs: 1) TB doesn't detect the hang and blocks the UI 2) Something is wrong with the first retry.

Wouldn't it be possible to simulate erroneous server behavior (or a network problem) either in the server itself or in some TB code? Should I try to get an IMAP log for when this happens?

Flags: needinfo?(gds)

(In reply to 5crambler from comment #34)

After a while now running TB with no auto-save and limited cached connections I still experience the same errors - just less frequently.
Maybe I'm the only one in this universe, which I doubt. So, what do you experts suggest?
I've provided loads of evidence and I have spent as much time chasing a bug in a piece of SW as I have spent using it - which is what it is meant for.
Any chance to get a meaningful resolution after all, like "tb maybe could hijack an idle connection to complete the activity"?

Since you wrote this I have learned that TB's imap subsystem is scheduled for a complete re-write (in javascript). So my "hijack" idea might be a bit much to add to a dead code-base. I don't know if the intention is to port the current imap functionality to the new code or re-design the system from scratch.

(In reply to newsfan from comment #35)

Lately I'm seeing this behavior with a particular (company) server:
In about - say - 20% of the cases, copying the sent message to the Sent IMAP folder hangs. After clicking "Cancel" after a while, the first "Retry" fails and the second one succeeds. This is 100% consistent.

Not sure what you mean by "Sent IMAP folder hangs". Which window is this that hangs? Also, why are you clicking "Cancel". Also which window has "Retry" and are you clicking "Retry". Probably need a more exact "steps to reproduce" here.

So there appear to be two bugs: 1) TB doesn't detect the hang and blocks the UI 2) Something is wrong with the first retry.

Wouldn't it be possible to simulate erroneous server behavior (or a network problem) either in the server itself or in some TB code? Should I try to get an IMAP log for when this happens?

If you could provide a IMAP:5 log that captures this problem, it might show something. But if there's a network error it is sometimes hard to tell from the log who is dropping the connection. You might also need to run wireshark to tell which end is doing what. (If your server connection uses TLS you may want to run TB and wireshark with environment variable SSLKEYLOGFILE to decrypt and see the actual IMAP data in wireshark.)

Flags: needinfo?(gds)
Attached image copy-hang.png

Not sure what you mean by "Sent IMAP folder hangs".

Well, the full sentence was:

"copying the sent message to the Sent IMAP folder hangs" or to make is super clear:
"copying the sent message to the IMAP folder called 'Sent' hangs".

In my case it's called "Sent2". "Retry" refers to the panel that is shown after clicking "Cancel", I'm sure you're familiar with it since you implemented it in bug 1366591 and bug 1414022. I click "Cancel" since the process hangs, that seems to be the obvious choice.

The simulation would be to prevent the IMAP connection while leaving the SMTP connection untouched. That way the message can be sent but the sent message can't be copied. First the "hang" needs to be fixed, that is TB needs to time out, and then the fact that only the second "Retry" works. It might be better to simulate under controlled conditions than to wait for it to happen again.

It failed twice in a row now and after the first failure I started with logging, and I have a log now. Sadly it contains a lot of private information and the message which was saved on the 2nd "Retry", so I can't publish it here.

Ok, I wasn't sure what you meant by "folder hangs".
Did the log show a successful save to drafts? I think the original reporter's log showed that.
Like I said, you may need to just run wireshark and see who is dropping or refusing the connection.
In the mean time, I'll look into a simulation like you suggest.

Drafts go to a local folder. I've sent you the log, didn't you see it? Sorry, never used Wireshark. Irrespective of what's going wrong, TB shouldn't hang. There needs to be a timeout.

Ok, I see your log in my inbox, thanks.
When trying to save to sent2 the log shows that it's OK to use the current connection for Sent2 (line 17949, can run = true) then a few lines down it changes its mind about Sent2 (line 17952, can run = false) which triggers the url failure. I need to look closer and hopefully determine why this strange thing happens. It appears there is only one connection to Sent2.

Also, in the log it appears there are actually 2 appendmsgfromfile to Sent2, at line 18243 and 30229. Maybe the 2nd one you just sent OK without any problems?

I also see what looks like a failure to authenticate at line 254 (aborting queued urls) and a few other places mostly near the top of the log. Don't know if this is relevant to this bug.

To see if other accounts are affecting the problem account, you can temporarily "disable" them by un-setting "check for new messages at startup", "check for new messages every X min" and "notify immediately new mail" and restart tb. Then don't visit the disabled accounts when trying to send from the problem account. I don't see how this would be a problem, but who knows.

Thanks for looking at the log. Somehow your line numbers don't match mine:

Line 23886: 2022-04-01 19:29:06.822000 UTC - [Parent 15736: Main Thread]: D/IMAP proposed url = Sent2 folder for connection Sent2 has To Wait = false can run = true
Line 23889: 2022-04-01 19:29:06.823000 UTC - [Parent 15736: Main Thread]: D/IMAP proposed url = Sent2 folder for connection Sent2 has To Wait = true can run = false
Line 23895: 2022-04-01 19:29:06.823000 UTC - [Parent 15736: Main Thread]: D/IMAP proposed url = Sent2 folder for connection Sent2 has To Wait = true can run = false

But yes, "can run" changes from true to false. Yes, there are two attempts to appendmsgfromfile. As I wrote in the e-mail I sent you with the log: The message was send at 19:29 resulting in the first appendmsgfromfile which hung. I cancelled at 19:30 and after about 30 seconds hit "Retry". Then at 19:31 I hit "Retry" again leading to the second appendmsgfromfile. During the entire minute of there is 19:30 nothing in the log for that account, only for the others. A colleague who has the same problem told me that his profile only contains one account on the server which causes the issue, so the other account don't appear to be relevant. However, I can follow your steps to "disable" the other accounts.

If I understand the prior discussion in this bug with the original reporter correctly, those hangs lead to nothing being reported in the log, see comment #6 (quote): Also in this case the logfile shows no activity between 09:14 (send) and 09:16 (save, send).
BTW, one of the logs supplied by the reporter shows the same pattern:

Line 144: 2021-12-09 04:41:04.999000 UTC - [Parent 8844: Main Thread]: D/IMAP proposed url = Sent Items folder for connection Sent Items has To Wait = false can run = true
Line 147: 2021-12-09 04:41:05.015000 UTC - [Parent 8844: Main Thread]: D/IMAP proposed url = Sent Items folder for connection Sent Items has To Wait = true can run = false
Line 151: 2021-12-09 04:41:05.015000 UTC - [Parent 8844: Main Thread]: D/IMAP proposed url = Sent Items folder for connection Sent Items has To Wait = true can run = false
Search "Sent Items folder for connection Sent Items" (1 hit in 1 file of 1 searched)

And a question: What function is triggered behind the "Retry" button?

And a question: What function is triggered behind the "Retry" button?

Looks like _mimeDoFcc() should be called:
https://searchfox.org/comm-central/rev/426b56ce23c5fb85d61e7e11a143f62052da7639/mailnews/compose/src/MessageSend.jsm#462

Haven't follow how that gets converted to appendmsgfromfile URL and passed to imap.

Sorry about the line nums. Looks like TB puts in a trailing 0x0d at the end of lines containing IMAP data that my editor doesn't count as an extra line but other editors do, so you see lines like "....Ready\n\n\r" that my editor (vim) sees as a single line while yours sees it as two.

Anyhow, the first thing I don't understand is why it looks through 6 cached connection and looks at the cached connection to Sent2 twice. I'm assuming you still have max cached conns set to 5. This seems to be causing the first failure to save to Sent2.

mail.server.serverNN.max_cached_connections is set to 5 for all 5 accounts in the profile.

Yes, _mimeDoFcc() is at
https://searchfox.org/comm-central/rev/426b56ce23c5fb85d61e7e11a143f62052da7639/mailnews/compose/src/MessageSend.jsm#870
That will try to copy the message again, and that will trigger some IMAP stuff under the covers.

So where from here? Is there a logic error and not a network dropout on the server? Why 6 cached connections?

No luck in duplicating what you see. But I do see one thing in you log that I ignored before but don't understand. After the point where it fails to run the "save to Sent2" url, I see an attempt to store a flag value on a message in Sent2. I don't see any "store" imap command so it appears this fails too.
Specifically it is creating protocol instance to retry queued url:imap://<your server>:993/addmsgflags>UID>/Sent2>1963>64 which means it is trying to set flag value 64 (0x0040 == kImapMsgForwardedFlag) on message UID 1963
I don't see this url run when I send a message from here.
You can identify which message in Sent2 has URL 1963 by adding the column "Order Received" which actually lists the message UID.

Edit: I see the same thing if I forward a message in my sent folder. When I test with a server that does not include "forward" flag as a permanent flag (like your server), I see the same thing. The addmsgflags url occurs but no network activity occurs since the flag is only stored locally for the message. So this is probably not significant.

Also, FWIW, after clicking "retry" the second time, it appears the the message got appended twice to Sent2, according to your log. I don't know if you ended up with two copies of the message in Sent2 or if your server saw the second one as an exact duplicate and deleted it.

Yes, message 1963 has a forwarded flag in the Sent2 folder.

As I said: Storing/appending failed at 19:29, clicked "Cancel" at 19:30 followed by "Retry". None of that is in the log and no message got stored for the first "Retry". Then the second "Retry" at 19:31 succeeded which resulted in the sent message being stored once, not twice.

See Also: → 123063, 185677, 206408

newsfan, any chance you only see this when you select Sent2 and then forward a message from there? I've been doing this myself and yet to see a problem.

I don't know if relevant, but can you recall the order (from your log) you selected (visted) folders before you tried to send the message? I can see several "selects" occurring in the log for the subject account but not sure which are true "visits".
Can you duplicate the error by just sending or forwarding a message to yourself?

Does this occur on all accounts or just one?

Well, in this particular case, the course of events was this: Reply to a message from the Sent2 folder (reply to self) as a follow-up. That already hung, Cancel + 2x Retry fixed that resulting in message 1963. That was later forwarded and hung again. So the two hangs were at the time the Sent2 folder was selected. So perhaps copying the message to the folder conflicts with some other action on the folder. However, this is intermittent. I've just tried again replying to a message from the Sent2 folder with no issue.

It happens on the default account in the profile, the one that has the Sent2 folder. All other four accounts in the profile are not used for sending, although I believe on one occasion I also had a hang sending from one for the other four. I still believe that there is some sort of connection problem to the server of the account where it fails and TB doesn't treat the condition correctly and instead hangs. If this were a more common issue, you'd see more bugs about it.

In the meantime I've "disabled" the other four accounts, so no e-mail is fetched there. We'll see whether the problem persists.

So in your simulation, what happens if you block the IMAP port while sending a message? It's possible that the server where the issue happens can't handle five (or six) open connections. That's why someone from the IT department had suggested in the past to set mail.server.serverNN.max_cached_connections to 1, but that also didn't solve the issue.

Wow! A meaningful feedback here suggesting I was not fantasizing after all? Thank you, newsfan!
If there's anything I can contribute, just tell me! - I really think it's about time to resolve this 20y+ issue!

newsfan, Ok, thanks for the info. I didn't realize you had "two hangs at the same time".

From comment 44:

Why 6 cached connections?

There appear to be only 4 connections in place when you sent the message, Inbox, Inbox/sb1 and Inbox/sb2 and Sent2. When it looks for a connection to use in the loop it sees Sent2 as "can run = true" so it should have immediately used that connection to run the appendmsgfromfile url and save to sent message. (The connection to Inbox is normally not hijacked and used for other purposes unless max cached conns is set to 1.) But the log shows it looking at the three non-inbox conns twice, which doesn't make sense.

19:29:06.822000 proposed url = Sent2 folder for connection INBOX/sb1 has To Wait = false can run = false
19:29:06.822000 proposed url = Sent2 folder for connection INBOX/sb2 has To Wait = false can run = false    
19:29:06.822000 proposed url = Sent2 folder for connection Sent2 has To Wait = false can run = true             
19:29:06.823000 proposed url = Sent2 folder for connection INBOX/sb1 has To Wait = false can run = false
19:29:06.823000 proposed url = Sent2 folder for connection INBOX/sb2 has To Wait = false can run = false    
19:29:06.823000 proposed url = Sent2 folder for connection Sent2 has To Wait = true can run = false             

Is there any chance you edited something out between the first 3 lines and the bottom 3 lines? I haven't seen anything like this in my log or in the original report's (Scrambler) logs.

Also, it probably doesn't matter but are you checking for new mail in a lot of non-Inbox folders? I see it cycling though them doing an imap STATUS command on each which is used to check for new mail.

Yes, I had two hangs, but I only started logging after the first one, so the log only contains the second one. I didn't edit anything, but you're not quoting the full log:

2022-04-01 19:29:06.822000 UTC - [Parent 15736: Main Thread]: D/IMAP proposed url = Sent2 folder for connection INBOX/xxx has To Wait = false can run = false
2022-04-01 19:29:06.822000 UTC - [Parent 15736: Main Thread]: D/IMAP proposed url = Sent2 folder for connection INBOX/yyy has To Wait = false can run = false
2022-04-01 19:29:06.822000 UTC - [Parent 15736: Main Thread]: D/IMAP proposed url = Sent2 folder for connection Sent2 has To Wait = false can run = true
2022-04-01 19:29:06.823000 UTC - [Parent 15736: Main Thread]: D/IMAP proposed url = Sent2 folder for connection INBOX/xxx has To Wait = false can run = false
2022-04-01 19:29:06.823000 UTC - [Parent 15736: Main Thread]: D/IMAP proposed url = Sent2 folder for connection INBOX/yyy has To Wait = false can run = false
2022-04-01 19:29:06.823000 UTC - [Parent 15736: Main Thread]: D/IMAP proposed url = Sent2 folder for connection Sent2 has To Wait = true can run = false
2022-04-01 19:29:06.823000 UTC - [Parent 15736: Main Thread]: I/IMAP queuing url:imap://XXXX:993/appendmsgfromfile>/Sent2
2022-04-01 19:29:06.823000 UTC - [Parent 15736: Main Thread]: I/IMAP considering playing queued url:imap://XXXX/appendmsgfromfile>/Sent2
2022-04-01 19:29:06.823000 UTC - [Parent 15736: Main Thread]: I/IMAP creating protocol instance to play queued url:imap://XXXX/appendmsgfromfile>/Sent2
2022-04-01 19:29:06.823000 UTC - [Parent 15736: Main Thread]: D/IMAP proposed url = Sent2 folder for connection INBOX/xxx has To Wait = false can run = false
2022-04-01 19:29:06.823000 UTC - [Parent 15736: Main Thread]: D/IMAP proposed url = Sent2 folder for connection INBOX/yyy has To Wait = false can run = false
2022-04-01 19:29:06.823000 UTC - [Parent 15736: Main Thread]: D/IMAP proposed url = Sent2 folder for connection Sent2 has To Wait = true can run = false
2022-04-01 19:29:06.823000 UTC - [Parent 15736: Main Thread]: I/IMAP failed creating protocol instance to play queued url

So yes, there are three lines between the top three and the bottom three. BTW, xxx and yyy are filter target folders.

newsfan from comment 48:

So in your simulation, what happens if you block the IMAP port while sending a message?

I can DROP packets (using iptables) to port 143 on my local dovecot server where I am saving to Sent. When I remove the DROP rule it ACCEPTS 143 by default. With it set to DROP before sending the message, the dialog like attachment 9270480 [details] goes to 100%. If I cancel, I immediately get the retry dialog and if I click retry it fails again and the retry dialog comes back. I can do this several times and the retry dialog comes back. If I then delete the DROP rule and click retry, the save to Sent completes OK.

I repeat the test with DROP rule and just let the dialog of attachment 9270480 [details] sit at 100% and don't click cancel. If I wait a while and then remove the DROP rule the save to Sent completes normally with no prompting. The retry dialog doesn't appear.

I do the test again with DROP rule and just let the dialog of attachment 9270480 [details] sit at 100% and again don't click cancel. After a few minutes (I didn't time it) 130 seconds the retry dialog appears under the attachment 9270480 [details] dialog and only retry dialog accepts clicks. When I remove the DROP rule and click retry, the message saves to Sent as expected.

I did notice some semi-undocumented behavior when retry is clicked more than once and the first one doesn't work (because still in DROP mode). It causes a appenddraftfromfile URL to occur which appends the sent message to the designated Drafts folder. The drafts folder is on a server that I haven't blocked the imap port so I ended up with 2 copies of the sent message, one in Drafts and the other in Sent. Also, once I got two copies the same sent message in Sent folder after doing retry with the DROP rule removed.

Note: in your log (sent via email) I also see two successful appends of the same message to Sent2 folder (about 1 second apart). Dovecot will show exact duplicate messages but other servers may not.

After a few minutes (I didn't time it) the retry dialog appears ...

Well, how long is the user meant to wait? Maybe longer for a large message, but for a short message, it should time out after a few seconds. Is there a pref for that?

I did notice some semi-undocumented behavior when retry is clicked more than once and the first one doesn't work ...

Strange, in Drafts, for the sent message. Well, typically Drafts and Sent are on the same server, so one wouldn't notice.

Note: in your log (sent via email) I also see two successful append of the same message to Sent2 folder (about 1 second apart).

Yes, you're right, I hadn't noticed, there are also two messages in the Sent2 folder, which I also hadn't noticed. I hate to repeat it, that happened at 19:31. The "Retry" clicked after the "Cancel" just after 19:30 is not registered in the log at all. Or are you saying, both got executed after 19:31. It's become a habit to click "Cancel" followed by 2x "Retry", and apart from this occurrence I only found one other doubled-up message at the beginning of February. Anyway, that's another bug, it shouldn't happen, regardless of how many times "Retry" is clicked.

I don't see a pref for that. Linux takes exactly 130 seconds before showing the retry dialog. I just tested on Windows10 (with imap port DROPed) and it took only 22 seconds before the retry appeared. So apparently it is OS dependent. FWIW, removing the DROP allowed retry to save to sent with TB on win.

I hate to repeat it, ...

You don't have to, I understood it the first time...

Anyhow, looks like blocking/dropping the imap port really doesn't simulate what you and reporter Scambler see.

Do you see any messages that failed to save to Sent end up in Drafts?

I just tested on Windows10 (with imap port DROPed) and it took only 22 seconds before the retry appeared.

I've certainly timed this before and waited various minutes before hitting cancel.

Anyhow, looks like blocking/dropping the imap port really doesn't simulate what you and reporter Scambler see.

Well, yes, but you already see some issues: Strange timeout behavior and saving to Drafts.

Do you see any messages that failed to save to Sent end up in Drafts?

Which Drafts is that? The Drafts folder on the IMAP server or the configured drafts folder in "Local Folders". I'm not saving drafts to the server. Now that you mention it, I've seen "left over" drafts which I blamed on some defect during compose. Maybe those were the ones saved by the "undocumented" feature/bug. I'll keep an eye out for it.

(In reply to newsfan from comment #55)

I just tested on Windows10 (with imap port DROPed) and it took only 22 seconds before the retry appeared.

I've certainly timed this before and waited various minutes before hitting cancel.

I think you hitting an internal tb bug and has nothing to do with network connectivity, e.g., as pointed to here: comment 50. I don't see anything in the code that would cause what your log shows. But I'll look again ASAP.

Anyhow, looks like blocking/dropping the imap port really doesn't simulate what you and reporter Scambler see.

Well, yes, but you already see some issues: Strange timeout behavior and saving to Drafts.

Do you see any messages that failed to save to Sent end up in Drafts?

Which Drafts is that? The Drafts folder on the IMAP server or the configured drafts folder in "Local Folders". I'm not saving drafts to the server. Now that you mention it, I've seen "left over" drafts which I blamed on some defect during compose. Maybe those were the ones saved by the "undocumented" feature/bug. I'll keep an eye out for it.

It would be the drafts pointed to by the account you are sending the message from. It could be the account's drafts, another imap account's drafts or a local folder, however:
I don't see anything in the code that causes save to drafts as a explicit back-up for failure to save to sent. The reason I see it is because I had my save to drafts time set to 1 minute (the minimum, for testing something else) so I had just waited more then 1 minute to send the message while fooling around with blocking/unblocking the port. Once the message is sent and tb is trying to save to sent, I don't see any more saves to drafts.

To me, the timeouts look normal, for windows and linux. They are about what I observed years ago: bug 1366591 comment 14 -- linux timeout was over 2 minutes and windows timeout is quick.

However, it appears that reporter Scrambler's logs show some sort of real network error when save to sent (which I don't know how to simulate or reproduce). So his main complaint is the "hang" while saving to sent with only the "cancel" button visible. Maybe some additional text in that dialog would be helpful that you may need to click cancel to see the prompt to retry the save to sent. Currently it kind of implies that if you click cancel, it will just give up trying to save to sent. However, I'm not sure if after he clicked cancel, did he actually try to save the message to local folders? In comment 0 he wrote "Save (did not save it), so apparently he did and it didn't work. This was after several unsuccessful retry clicks.

Adding save to local folders was the main improvement of bug 1366591 - After copy to IMAP "Sent" folder failed, "Retry" just dismisses the panel without retrying and loses the sent message (making bug 28211 even worse) - circa Thunderbird 56 beta.

I looked again at reporter Scrambler's logs and I see a pattern. Every time the error NS_BASE_STREAM_CLOSED (0x8047002) occurs, tb is trying to exit IDLE state via the DONE command and instead of the server sending <tag> OK idle done response , the NS_BASE_STREAM_CLOSED error is reported, apparently due to the server disconnecting.

So one thing you can try, if you still see the save-to-sent issue, is to temporarily disable IDLE for the openwave (bluewin.ch) server. This is done via server settings by un-selecting "Allow immediate notification when new messages arrive" (an probably restart tb after making the change to make sure existing connection are exited and re-initialized with IDLE not in effect).

Note: the openwave server provided by my ISP doesn't support IDLE at all. If this change proves effective, you may want to reduce the "check for new messages every X minutes" some to compensate for the lack of IDLE.

It just happened again, and that was will all the other accounts in the profile "muted" so they never check anything. I cancelled the hung copying to the Sent folder after about three minutes. It should really time out after at most 30 seconds. "Retry" worked the second time around, like always.

(In reply to newsfan from comment #57)

It just happened again, and that was will all the other accounts in the profile "muted" so they never check anything. I cancelled the hung copying to the Sent folder after about three minutes. It should really time out after at most 30 seconds. "Retry" worked the second time around, like always.

I don't suppose you recorded a log. Would be curious if we would see the same/similar log, that didn't seem to make sense, as before. Anyhow, I've been unable to duplicate the "hang" that you occasionally see. Also, glad at least "retry" finishes the save to sent.

I've been looking closer at the algorithm used to select an existing connection or create a new one at bug 1745205. Maybe some changes there would be useful and may help with hang, not sure.

See Also: → 1745205
See Also: → 1606047

5crambler, does this still reproduce for you?

Flags: needinfo?(oliver.ilk)
Blocks: 1795390
Duplicate of this bug: 1795390

Our reporter is probably gone :(

Component: Untriaged → Networking: IMAP
Product: Thunderbird → MailNews Core

This still occasionally reproduces for me with 1.02.4.1 and a shaky/slow-ish but for other programs usable 2mbit/s internet connection and GMail IMAP. If you need logs, let me know some logging mechanism I should enable. (Preferably one that can run for days without producing gigabytes or that rotates the log somehow to not completely fill my disk. It happens every few days or so, for some mails it's fine.)

(In reply to Ellie from comment #63)

This still occasionally reproduces for me with 1.02.4.1 and a shaky/slow-ish but for other programs usable 2mbit/s internet connection and GMail IMAP. If you need logs, let me know some logging mechanism I should enable. (Preferably one that can run for days without producing gigabytes or that rotates the log somehow to not completely fill my disk. It happens every few days or so, for some mails it's fine.)

There's a semi-documented "ROTATE" option to MOZ_LOG described here: bug 1786252 comment 5 and in the code here: https://searchfox.org/comm-central/source/mozilla/netwerk/docs/http/logging.rst#225. (Sorry, the "permalink" to the code over there doesn't seem to be very permanent!)
I assume you know how to record the a non-rotating IMAP:5 log but in case you don't, here the main link to the instructions: https://wiki.mozilla.org/MailNews:Logging I don't think the official docs there mention ROTATE option.

I'd just like to add a comment on this one. I've been experiencing similar problems for a long while - I'm running 102.5.0 64 bit Windows now.

One thing I've found that seems to sort things out every time (so far) is doing a "Get Messages" request on the account I'm sending from.

It seems that the message is actually sent correctly - I once sent a message to an email address with an autoreply "Thanks for your email blah blah". I received this autoreply whilst Thunderbird was still "copying to sent folder". IIRC, as soon as it downloaded this autoreply, the copying message to sent message cleared up and closed. Had similar with a customer who was very quick replying to a mail I sent too.

I'm not sure if my IMAP server is caching authentication for so long, and up to this time, any "pushed" transaction is fine, but beyond this time, it requires a re-authentication (perhaps initiated by the "Get Messages" command) before any more pushed transactions are allowed. If it is something like this, it might explain the wide variety of experiences, based on the wide variety of IMAP server configurations.

Might help. Might not. But it's all about trying to narrow down the cause.

Matt, do you let tb save your password?

I'm not sure if that matters but your comment gave me an idea to test. I set up my main account to not login at tb startup (don't check for mail at startup and don't check mail every X minutes). Then I select Inbox on another account and restarted tb with an IMAP log recording. On restart, the log indicates no connection to the main account as expected. I then send a message while selected on other account but change the "from selector" to my main account and send the message. Looking at the log, I see the sent messages gets imap appended to the main account sent folder as it should after tb logs into the main account using my stored password. The sent message appears in main account sent folder as it should.

If I didn't store the main account password in tb, would it have asked for the imap password when I tried to send? I haven't checked that yet.

Anyhow, if you don't let tb store your password and you enter it on startup, tb keeps the password so that if the imap connection times out (typically after 30 minutes of doing nothing) tb won't ask for the password again and just go ahead and login when it needs to.

Not sure what you mean by "pushed" transactions...

I sadly haven't seen this for a bit and not gotten to set up the logs anyway :-( but I think the main reason is that my internet connection got slightly more stable due to antenna placement. So the key to this possibly is having a shaky connection, and not just slow but likely some sort of random varying degree of package loss on top. Maybe at least on Linux it would be possible to simulate and provoke this via some kernel networking shenanigans to mess with the connection speed and induce lost packages at some sort of varying rate.

(In reply to gene smith from comment #66)

Matt, do you let tb save your password?

I do. I'd like to say it never asks more than once, but that's not quite true. It does ask occasionally, and when it does, it pops up the "Please put in your password" box, pre-populated with something (circles instead of the actual password of course...). I just click on OK, and it carries on. Not sure whether that's a connection issue, or the server being awkward.

Not sure what you mean by "pushed" transactions...

It was a poorly worded attempt to describe a transaction that might (but probably not, having had time to think about it) be different.

I imagine the normal sync messages would be "Hi, I'm Bob, my password is 'somethinglongandcomplicated'. Please send me all my messages from my account", whereas the copy to sent folder that hangs might be more on the lines of "Put this message in sent items".

If there's no authentication with it (because TB assumes it's cached, but either the server doesn't cache, or it has expired), then it might explain why a "Get Messages" request allows the transaction to proceed - "Hi, I'm Bob, my password is 'somethinglongandcomplicated'. Please send me all my messages from my account AND put this message in sent items".

Matt

"Our reporter is probably gone :("
No, he's not! - And yes: he's still suffering from this same bug occasionally.

Meanwhile just amused about the non-progress shown on this issue despite all sorts of log files, evidence and explanations.
This is a 20+ years issue, folks!

Sorry for being sarcastic...

Flags: needinfo?(oliver.ilk)

...while the REAL bug - big problem - just gets ignored!
When activating TB logging it starts cross-writing to/from Firefox!
Now, this is a REAL issue, but nobody at Mozilla seems to care...

(In reply to 5crambler from comment #70)

When activating TB logging it starts cross-writing to/from Firefox!

An accurate response requires knowing exactly what you specified, and where, to create your logging. But I will try to help - I have dealt with this in the past. Disclaimer, it was a long ttime ago, so ...

Regarding Firefox and log files, assuming Windows:

  1. Do not specify logging variables as global variables in Windows (you probably didn't, because the wiki page doesn't offer that information) - specify them only in the shell in which you start Thunderbird.
  2. You may not be able to delete log files that are in use, but you can copy them. And you might be able rename them.
  3. If you still have trouble managing log files, just shut down Firefox.
  4. In Firefox do about:logging and click "stop logging" - whether this will help or not I am not sure
  5. Try starting Firefox before you set the Thunderbird variables
  1. I don't use global vars, but local ones
  2. this is the case - and implies that FF allocates the TB log files, which it should never do
  3. same as 2)
  4. I never activated logging in FF
  5. doesn't make a difference
    As stated: this is clear misbehaviour of FF, which should be fixed. But you guys don't seem to care - for whatever strange reason

5crambler, please see end of comment 56 (in bold) for something to try.

Also, I tested on Windows using a daily build, and not able to reproduce the intermixing of Thunderbird and Firefox in the log files under normal conditions documented in the Thunderbird logging wiki.

Child log files may be ignored - this is a result of multi-process, more relevant in Firefox.

(In reply to 5crambler from comment #73)

...
As stated: this is clear misbehaviour of FF, which should be fixed. But you guys don't seem to care - for whatever strange reason

The people in this bug report clearly care about resolving the copy issue.

Caring about the logging difficulties is a different bag of worms. But assuming we "Cared", caring isn't sufficient to get something fixed. 1) No one disputes that you are having a logging issue. But it decidedly impossible to offer a workaround (or a fix) if no one else can reproduce or if no one knows why it is happening on your computer, 2) Gecko Logging issues are not likely to be within the Thunderbird team's power to fix.

Please note, you are not the only person who is frustrated.

When activating TB logging it starts cross-writing to/from Firefox!

I've been seeing this too lately. It was reported a couple years ago and I worked on it a lot in this bug 1687377. When you run TB with the logging env. vars set and then click a link in TB that opens in FF, FF sees the env. vars and writes 0's to the log. So the workaround is not click a link in TB that activates FF while recording a MOZ_LOG.

There is another workaround at bug 1687377 comment 5 to not set MOZ_LOG_FILE env. var. and let the logging redirect to the console.
Also, using the "append" option that as described in bug 1687377 comment 18 works around the problem.
I think there is also a batch file you can make that helps that I mentioned somewhere in that bug but don't remember the details.
Anyhow, a true fix would be in the mozilla code and there seems to be no interest by them in fixing it; I even submitted a patch but it was ignored.

Anyhow, not saving your sent file to sent folder is probably due to a network failure to connect. If this is a big problem you might consider changing to a mail provider that automatically delivers anything sent to the imap sent folder. Gmail does this and possibly yahoo too and maybe others, not sure.
Another way to avoid network failures on save to sent is to direct your sent messages to a local folders instead of to an imap folder and copy or maybe filter them to imap sent if you really need sent messages put in imap for visibility of other clients.

The problem is less the network failures, but that Thunderbird just gets plain stuck doing the sent folder copy. It doesn't seem to be handling a network timeout correctly, or in some other way falls over itself when it retries or something. This also usually happens when the network is very unreliable but not gone, so that a retry would actually make sense. It seems like if the network is kinda half there rather than fully gone, Thunderbird can get extremely confused.

(Confused to the point that if the network bounces back fully and web browsing in a separate window works fine, Thunderbird will still remain stuck with the sent copying forever. I mean literally forever, I've had it sit once for hours just as a test.)

Gene, Ellie,
I'm not sure it's a general "network failure" either. I'm on a mobile broadband 4G connection which is pretty quick for my "out in the middle of nowhere" location (~80Mbs down, ~40Mbs up), but latency can be a bit slow - Just measured at 123ms down, but nearly 300ms up, versus the Ookla "norm" of 100ms.

When I get the hang, I've found that doing a "Get Messages", or even receiving a message in the mean time, will "wake up" whatever had gone to sleep, and let the message be copied correctly.

I'd be interested to know (and maybe others would be too) if when you have the hang, if doing a "Get Messages" clears things up, and the copy to sent messages unhangs itself, and carries on. Might give clues as to what's broken...

"Get Message" usually doesn't fix it for me, no. Nothing I tried usually does, which more than once included completely disabling any internet access (which you would think would interrupt any copy attempt after some timeout and lead to the error & retry dialog but it didn't). I've just had to kill thunderbird a lot when that happened to get the stuck copy dialog to disappear.

My guess is that more common network failures (like connection reset) are handled correctly, but somewhere in that operation some less obvious network failure (long timeout) is not and leads to the operation aborting on a socket level but the thunderbird copy operation still thinking it's in progress. So at that point, it doesn't seem to matter what is actually happening with the underlying network since thunderbird is essentially just waiting for nothing forever. But that's just an uneducated guess as for what is happening.

(In reply to gene smith from comment #76)

When activating TB logging it starts cross-writing to/from Firefox!

I've been seeing this too lately. It was reported a couple years ago and I worked on it a lot in this bug 1687377. When you run TB with the logging env. vars set and then click a link in TB that opens in FF, FF sees the env. vars and writes 0's to the log. So the workaround is not click a link in TB that activates FF while recording a MOZ_LOG.

There is another workaround at bug 1687377 comment 5 to not set MOZ_LOG_FILE env. var. and let the logging redirect to the console.
Also, using the "append" option that as described in bug 1687377 comment 18 works around the problem.
I think there is also a batch file you can make that helps that I mentioned somewhere in that bug but don't remember the details.

You may be thinking of Bug 1687377 comment 47? Or 37 or 52?

(In reply to Ellie from comment #63)

This still occasionally reproduces for me with 1.02.4.1 and a shaky/slow-ish but for other programs usable 2mbit/s internet connection and GMail IMAP. If you need logs, let me know some logging mechanism I should enable. (Preferably one that can run for days without producing gigabytes or that rotates the log somehow to not completely fill my disk. It happens every few days or so, for some mails it's fine.)

Ellie, suggest you use something like

/path/thunderbirdxxx --MOZ_LOG=IMAP:5,timestamp,rotate:50 --MOZ_LOG_FILE=/tmp/log_file

You will also need to set mailnews.smtp.loglevel to string value ALL. See https://wiki.mozilla.org/MailNews:Logging#Setting_a_Preference

How would I set MOZ_LOG and MOZ_LOG_FILE in a way that they apply when I just launch thunderbird normally through the UI? It might be a few weeks before this happens again so I'll forget if I need to use some quirky script file. I also use firefox, so I can't just set these user-wide (since I assume then firefox will also start messing with the log file).

I simulated a network that is "very unreliable but not gone" by disabling my normal wifi on the computer running TB and setting my phone to to be a wifi hotspot and placing the phone at a distance where the signal is flaky (I see 20-30% signal strength per network manager, whatever that means so the connection comes and goes). With computer connected to phone hotspot, "ping yahoo.com" sometimes works and then doesn't.

With this setup, an attempt to send an email usually results in a send error from SMTP. This produces a good error message telling you your message was not sent and when dismissed you can try to send again. No problem with this.

If the messages does send, then the "copying message to Sent folder..." progress dialog occurs which some see never go away (i..e, hangs). This message doesn't occur unless the message was actually sent. However, its not clear looking at the dialog that the message was definitely sent. That should be stated there. Also, I haven't looked at the code but I don't think having a progress bar here is appropriate since I suspect it will always show 100% since I doubt there is a way to monitor the progress of an imap APPEND command, but I might be wrong.
Anyhow, there needs to be an indication on the "copying to sent" box that the message was successfully sent.

I haven't seen a total hang in the "save to sent" progress dialog yet but if you click "cancel" when it hangs, does that dismiss the dialog and then go to "retry or save to local" dialog so you can still save your sent message to local? Or do you really have to kill TB to dismiss the dialog and thus lose your sent msg?

Wayne wrote (but I changed the 50 to 80):

/path/thunderbirdxxx --MOZ_LOG=IMAP:5,timestamp,rotate:80 --MOZ_LOG_FILE=/tmp/log_file

You can put this in your windows shortcut for default tb startup. You will eventually get 4 files each with a max size of 20M with names log_file.moz_log.[1..4] I think and then file 1 get overwritten, etc. You can set the total size to more or less than 80 to whatever you are willing to allow the total files sizes to be. But if you try to click a link to open FF in an email it will zero the currently open log file so you might want to just copy the link and paste and go in FF while recording a long-term TB logs.

Now that you say it, I think the cancel usually becomes grayed out at some point. But I can't fully remember. But I think for when it wasn't fully grayed out, "Cancel" will also not allow a retry and just discard the message which is NEVER what I want, so I learned to just never click that even if it's available.

(In reply to Wayne Mery (:wsmwk) from comment #82)

(In reply to gene smith from comment #76)

I think there is also a batch file you can make that helps that I mentioned somewhere in that bug but don't remember the details.

You may be thinking of Bug 1687377 comment 47? Or 37 or 52?

Yes, bug 1687377 comment 37 show exactly how to make and use a ff.bat file to open links in TB into FF that won't zero any MOZ_LOG files.

I was wrong above in that it DOES try to monitor the send progress. The sending of the "file" content occurs in a loop here:
https://searchfox.org/comm-central/rev/5fb08affe686ce86e49678f02d82fcce06370525/mailnews/imap/src/nsImapProtocol.cpp#6330 on the call to SendData(). I put a 1 sec delay in the loop and changed it so it only sends 1 char per SendData() call. This slows down the progress indicator and then I disable the wifi connection before it gets to 100% but I get NO failure from the SendData() call -- rv is always NS_OK. It just keeps "sending" the bytes until all are supposedly sent and the "Sending to sent" dialog goes away as if it worked, but it didn't -- the message doesn't appear in Sent folder since the wifi was disabled.

For some reason the code for SendData() doesn't detect that the network is down and just keeps accepting bytes and returning NS_OK. It's here:
https://searchfox.org/comm-central/rev/5fb08affe686ce86e49678f02d82fcce06370525/mailnews/imap/src/nsImapProtocol.cpp#2173
I tried adding an rv = m_inputStream->Available(&avialBytes); call in SendData() thinking that may detect the network being down but it didn't. I'm thinking that to detect the network down, I have to try to do an actual m_inputStream->Read() call since I see it get detected on Read() for other threads when I turn off wifi. I haven't tried that yet but doing a read while sending could easily mess things up.

(In reply to Ellie from comment #86)

Now that you say it, I think the cancel usually becomes grayed out at some point. But I can't fully remember. But I think for when it wasn't fully grayed out, "Cancel" will also not allow a retry and just discard the message which is NEVER what I want, so I learned to just never click that even if it's available.

I haven't seen it infinitely hang on the Sending dialog yet but by sending 1 byte per cycle and delaying 1 sec per cycle the dialog stays there for several minutes while sending a small message. The cancel button does appear gray but when I move the mouse over it it lights up and I can click it and it brings up the retry screen and I can save to local folders. Anyhow, not saying you are wrong in what you remember since I haven't yet seen an infinite hang.

Infinite hang can be tested: if I disable all internet and it continues to "send" even after multiple minutes without showing me a failure and retry option then it's obviously broken and hanging forever. (Since there is no technical way it can make progress, so if it wasn't malfunctioning it should notice that after a reasonable timeout.) I tried this, and this happens, it appears that it can hang literally forever.

Sorry, I guess you meant "infinite hang that cannot be canceled". It's very likely that if the button looks grayed out I just assumed canceling wasn't possible even when it was, so maybe it actually can always be canceled and I was wrong on that part. But it would be highly preferable if I didn't need to actively cancel for it to eventually offer me a retry.

(In reply to Ellie from comment #90)

Infinite hang can be tested: if I disable all internet and it continues to "send" even after multiple minutes without showing me a failure and retry option then it's obviously broken and hanging forever. (Since there is no technical way it can make progress, so if it wasn't malfunctioning it should notice that after a reasonable timeout.) I tried this, and this happens, it appears that it can hang literally forever.

How did you "disable all internet" exactly while it is sending/appending the message to the server? If you disable it too soon, the email won't send via SMTP and it never reaches the sending/appending to imap server point. That's why I had to tweak the code to slow it down so I could disable internet after SMTP finishes and while sending the message file via imap to the server is in progress. (Hope this makes sense, It's kind of hard to explain.)

I've been working some more on this but even though I haven't seen the "infinite hang" I have fixed a few issues that may affect what has been reported:

  • The .png attachments above show the copy to sent at 100% even thought nothing was actually copied to sent folder. This is caused by two places in mailnews/compose/content/sendProgress.js. When the progress bar is first updated in onChangeProgress() the percentage value is initially determined to be 100% because the progress percentage is is also used by activities other than "save to sent" such as fetching messages that doesn't show a progress bar. If no save to sent progress occurs, the progress bar just stays at the last percentage captured by the previous activity, typically 100%.
    Edit: For save to sent the "previous" progress activity is actually the progress of sending the message to the SMTP server. So the initial 100% is actually the completing of the send to SMTP.

  • But even with that fixed, there is still an initial "glitch" to 100% and then back to 0% on the progress bar. This is caused by the call to dialog.progress.removeAttribute("value") on the "progressListener" event STATE_START. Removing the call to removeAttribute() gets rid of the 100% initial glitch.

  • Added an new informational note to the save to sent (and drafts) progress dialog saying "If progress stops or never starts, click Cancel for other options." The save to sent progress dialog only appears, when composing/sending a message, if the message was successfully sent. So this lets the user know that it's OK to click cancel if the progress never starts or if it stops.
    Edit: Magnus didn't think this is needed. I tend to agree since the progress dialog is "shared" with SMTP send and clicking cancel during the SMTP transfer doesn't actually stop the message from going out and causes other problems and confusions. Maybe this is a separate bug.

  • When testing, I noticed that progress dialog reaches 100% and then closes (appear to have successfully saved to sent) if the network goes down while in the imap append loop here: https://searchfox.org/comm-central/rev/5fb08affe686ce86e49678f02d82fcce06370525/mailnews/imap/src/nsImapProtocol.cpp#6330 because errors are only detected when receiving data. The loop just sends the whole message file, one chunk per loop, and nothing is received until the very end. However, there was no check for command success there so any possible failure was ignored.

  • Added a check for DeathSignalReceived() on each pass through the loop which also indicates that the network is down and skips the fruitless call to SendData() in the loop. Without this, if cancel is clicked with appends in progress, the appends continue sending until tb is shutdown.

  • After the loop finishes and the append completes successfully, there are a few other things done that I don't completely understand, depending on capability UIDPLUS. I added some error check in there so if that fails, the save to sent is now considered to have failed.

  • Added the local variable urlOK to keep track of success or failure of this activity triggered by URL appendmsgfromfile. urlOk is initialized to false and remains false when an error condition is detected followed by "goto done;". It is set true if the append command appear to succeed but might go back false depending on success or failure of the UIDPLUS dependent code near the end.

  • Now inside the "goto:" label, I now mark the command as failed with this: GetServerStateParser().SetCommandFailed(true). This ensures that the URL knows that the command failed so the "retry/save to local" dialog appears. Without this, network errors may be ignored even though save to sent did not succeed.

Here's a try build with the changes: https://treeherder.mozilla.org/jobs?repo=try-comm-central&revision=7a6d200ff3f8559ae47feea55cc75c5096ecf227
This has some temporary code for easier fault insertion. Before the append starts, there's a 10 sec delay. The append chunks sizes are set to 75 bytes so usually more passes through the loop occur. And finally, there is a 1 sec delay in the loop so the network can be perturbed while in the process of sending the file.
There's also some console dumps and printfs you will see if you run it.

There are a few unit test failure that sound like this may have triggered. I haven't look at them yet but they are maybe caused by the delays I added.

From comment 93 above:

Here's a try build with the changes: https://treeherder.mozilla.org/jobs?repo=try-comm-central&revision=7a6d200ff3f8559ae47feea55cc75c5096ecf227

This try build is pretty much "busted" due to problems caused by big UI changes recently merged into comm central. The problems I saw are described here: bug 1811720.
The reason for the UI change is posted here: https://thunderbird.topicbox.com/groups/daily/T7df1556699d655bb

FYI, recent archived daily builds are also busted. The last workable version was on Jan 18 here: https://archive.mozilla.org/pub/thunderbird/nightly/2023/01/2023-01-18-10-06-38-comm-central/

Edit: Also from comment 93 above:

There are a few unit test failure that sound like this may have triggered. I haven't look at them yet but they are maybe caused by the delays I added.

Yes, the suspicious unit test failures go away if I remove the artificial delays.

Status: UNCONFIRMED → NEW
Ever confirmed: true
OS: Unspecified → All

Here's a new "try" build with all my artificial delays and printfs removed. It's built against the current daily which still has problems but is not totally broken like described in comment 94.
https://treeherder.mozilla.org/jobs?repo=try-comm-central&revision=798b1093c5ce33771f3352b48ee651e2db007186

Assignee: nobody → gds
Status: NEW → ASSIGNED
Attached patch save-to-sent-fix-v1.diff (obsolete) — Splinter Review

Most of the changes for the mozphab diff are described in comment 93 above.
This attachment is the about the same as the mozphab diff but with commented out printfs and dumps for debugging info and some delays added so network errors can be inserted (disconnect wifi or ethernet) while the message is being sent to the imap server via the imap append loop. This may be helpful for future testing.

Attachment #9316664 - Attachment is obsolete: true
Target Milestone: --- → 112 Branch

Pushed by martin@humanoids.be:
https://hg.mozilla.org/comm-central/rev/77d8bb0563ce
Improve error detection and reporting during imap save to sent. r=mkmelin

Status: ASSIGNED → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED

I received a report via email that a user manually applied the changes but still sees the problem (saving to sent progress hangs at 100%). It's a problem that rarely occurs when sending messages but the user still sees it occasionally. As I mentioned above, I had not been able to duplicate the problem myself before submitting the now landed patch: https://hg.mozilla.org/comm-central/rev/77d8bb0563ce

I have since found a simple way to duplicate the problem and it produces almost the same imap log as reporter scrambler attached above: attachment 9255638 [details]. It only occurs on servers supporting imap IDLE and if several folders on the server are in imap SELECTed state (have been recently visited):

  1. Open 4 or 5 or more non-Inbox imap folders on the account in succession.
  2. Compose a short message and have it ready to send but don't sent it yet.
  3. Block incoming responses from the server's imap port 143 or 993 if TLS in use: sudo iptables -A INPUT -p tcp --sport 143 -j DROP
  4. Now send the message. (The iptables command doesn't affect the outgoing SMTP mesage since it goes to a completely different port.)
  5. The screen shown in attachment 9255630 [details] will remain at 100% forever or until cancel is clicked. This will occur even with the landed patch applied.
  6. Note that it says message was 100% sent. It wasn't actually sent to to imap sent folder at all.

The message is sent successfully via SMTP but the attempt to send a copy of the message to imap server fails due to the iptables rule. At this point, tb deems the connection as not usable and attempts to automatically retry the send on another existing connection. It chooses an existing connection in idle to use. But to use that connection an imap DONE command must be sent to the server to exit idle state. The iptables rule also prevents the DONE command from succeeding. But due to a bug in the code the URL that initiates all the activity, appendmsgfromfile, never sees an error or completion signal so the progress dialog hangs (unless Cancel is clicked).

Note that if cancel is clicked, the retry/save dialog does appear and if the rule is then deleted with sudo iptables -D INPUT -p tcp --sport 143 -j DROP and Retry is clicked, the save to sent finally succeeds but only on the 2nd click of Retry.

The problem is due to this line of code near the end of EndIdle(): https://searchfox.org/comm-central/rev/db81589f13e9d7b1d47c2a466d19b15715dcf277/mailnews/imap/src/nsImapProtocol.cpp#7755
Before being set null, that folder sink pointer references the currently selected folder for the connection in imap IDLE state, not the "Sent" folder. When the automatic retry of URL appendmsgfromfile occurs using the IDLE connection, it changes the sink pointer for that connection to the "Sent" folder. Then right before EndIdle() return it sets that sink folder pointer to null and then the DONE imap command fails and the connection status is set to failed. The bad connection status is never processed here as it should (due to the null m_imapMailFolderSink checked above this point): https://searchfox.org/comm-central/rev/e029583d2aca9c03975a4e8a919e4b575136d598/mailnews/imap/src/nsImapProtocol.cpp#2075.

The problem is related to changes I made in bug 1708981 but that really didn't directly cause a regression. However, had I changed EndIdle() so that m_imapMailFolderSinkSelect is set null instead of m_imapMailFolderSink this bug may have been fixed then.

Note also that with my new fix in place, save to sent dialog will still display as 100% sent but the hang will only persist until the response times out after about a minute or two and then the retry/save dialog will appear. That's what I see with the iptables rule blocking port 143 responses.

I'll submit a moz-phab diff next to fix the pointer. I would also kind off like to make a change to a comment in SendProgress.js to clarify why STATE_START is not handled, so I might include that in the diff too.

Status: RESOLVED → REOPENED
Resolution: FIXED → ---
See Also: → 1708981

Before attaching a new moz-phab diff/patch, I'll describe a few other differences from my description in previous comment 100.

  • In nsImapProtocol::EndIdle() instead of setting m_imapMailFolderSinkSelected null I decided to continue to set m_imapMailFolderSink null but only do it if TB is shutting down (!waitingForResponse) or if send of "DONE" results in a successful (OK) idle response. This is more like how it has always been but will now keep the pointer around so that the appendmsgfromfile URL failure caused by an unsuccessful ending of IDLE can still be reported and is not ignored.
  • Even with this fix, when the network problems occur that cause attachment 9270480 [details] to appear, that progress dialog can remain for up to 2 minutes on linux (some shorter on windows, not sure about macosx). This implies that the message has been 100% copied to the "Sent" folder (called Sent2 in this example) even though nothing was actually sent.
    • The 100% can be just the final progress left over from the initial and successful SMTP send of the message. So at the end of the SMTP send, I now set the progress meter back to 0%. This is only helpful if no imap threads are available for the "append" URL so nsImapProtocol::UploadMessageFromFile() never occurs which would have also zeroed the progress meter.
    • But even with this change, if imap thread is available and the message is short it will still be successfully queued in UploadMessageFromFile() even with the network down and show 100% save to sent complete. To avoid showing 100% complete, in UploadMessageFromFile I now only allow the progress meter to go to 99%. So now the progress dialog will stay for up to 2m at 99% instead of 100%. This is still not really true but at least it doesn't imply that the save to sent was a complete success. (After a successful save to sent, 100% will still be displayed and is set in the STATE_STOP handler in sendProgress.js, but it's probably not visible since the dialog is fading out by then.)
      • To make sure the meter doesn't set a 99.9% complete transfer to 100%, I now use "trunc" instead of "round" in sendProgress.js.

Another issue is that the "indeterminate" progress state is removed with the original patch. I've received reports that some might consider this change unexpected or surprising since users are used to seeing a quick "flash" filling the progress bar when sending a message. Currently the indeterminate progress state is the same color as percentage progress but with a slightly lighter color modulating the fully colored bar from left to right (throbbing). When sending a message this would only appear during the first "Creating mail message" message state which typically is very fast and, at least on linux, is not even visible when small messages are sent.

I personally still think showing the indeterminate state during "Creating mail message" state is not helpful and just showing an empty progress bar with no text percentage during this state, as it now does, is better. But some may disagree and may also prefer a more distinctive color and/or modulation pattern for the indeterminate state progress graphic.

Attached video sending dialogue.mp4

When sending a message this would only appear during the first "Creating mail message" message state which typically is very fast and, at least on linux, is not even visible when small messages are sent.

It's not necessarily fast, particularly not for larger and encrypted messages. It's also used for "Sending message...", see attached video where we have reinstated the start processing and coloured the intermediated state differently (CSS below) to make it obvious, especially when the user has disabled animation with pref ui.prefersReducedMotion at 1 or a corresponding Windows setting:

+  background-image: linear-gradient(90deg, rgba(255, 255, 255, 0.8) 0%,
+                                           rgba(255, 255, 255, 0.2) 25%,
+                                           rgba(255, 255, 255, 0.8) 50%,
+                                           rgba(255, 255, 255, 0.2) 75%,
+                                           rgba(255, 255, 255, 0.8) 100%);

This is on a fairly new laptop with heaps of RAM and an Intel i7 processor. Besides, the dialogue is also used for posting to news groups.

Yes, the indeterminate progress can display for a few seconds like you show and I see that too when I send a 16M message. Otherwise, on linux, I don't see it with my typical sized email. Maybe it lasts longer too with encrypted email sending but I haven't really checked.

the dialogue is also used for posting to news groups.

Been a while since I posted to a newsgroup so don't know how that would differ from sending an email. I guess mainly there is no "sent" folder to copy to?

Anyhow, I still don't see what the indeterminate display is telling you other than maybe "I'm alive and doing something but I can't tell you the exact amount of progress I've made". So I guess that would be a good thing if that's what it means and if the user knows what the indeterminate pattern means.

It's also used for "Sending message..."

Actually STATE_START triggers the indeterminate progress and occurs only once right before the "Creating mail message..." so indeterminate doesn't occur again during "Sending message...".

At one time I thought the indeterminate (remove progress attribute) was causing a false 100% progress and that was mainly why I took it out. But it really has nothing to do with that, so I have no problem putting it back if that's what is wanted. However, I do need to be told exactly what CSS setting it should have (if that needs to be changed in this bug) since I know zero about that.

Attached video sending news.mp4

(In reply to gene smith from comment #104)

Anyhow, I still don't see what the indeterminate display is telling you other than maybe "I'm alive and doing something but I can't tell you the exact amount of progress I've made". So I guess that would be a good thing if that's what it means and if the user knows what the indeterminate pattern means.

That's what it means. A lot of other software also shows a throbbering effect to indicate activity. Removing the indeterminate state is changing long established behaviour and is unnecessarily surprising.

Attached a video for posting a news article for your information.

BTW, setting the progress to 99% in UploadMessageFromFile() at the end results in 100% never showing. That's rather confusing.

The change in SmtpService.jsm leads to 0% being shown after a successful 100% if the message is later not IMAP copied, so for Sent folder on local folders or with no sent copy. That's equally confusing. Please consider removing those this hunk and the one mentioned in the previous comment again.

Attached a video for posting a news article for your information.

All I see there is indeterminate state and no progress % that I can tell. Maybe you were just posting a very small message and it was too short to show a progress %?

BTW, setting the progress to 99% in UploadMessageFromFile() at the end results in 100% never showing. That's rather confusing.

The change in SmtpService.jsm leads to 0% being shown after a successful 100% if the message is later not IMAP copied, so for Sent folder on local folders or with no sent copy.

Yes, I noticed these issues too but thought maybe they wouldn't be noticed that much unless someone recorded the screen and played it back frame by frame.

The dialog should quickly transitions to output the string "Your message was sent" and set the progress to 100% in the STATE_STOP handler. However, I think something is causing the dialog to "fade away" before you can see the success message. Seem to me there should be some way to cause the final success message to show for a while (maybe 1/2 second) before it completely fades away. Maybe another CSS setting?
But as it is, the 99% is not really noticeable since the dialog is fading away at that point.

I've found another way to get around these issues which I'll submit next via moz-phab. The idea is to no longer try to set an artificial % of 0 or 99 but instead set the progress to indeterminate at those points. You will never see forced 0% or 99% but instead you will see just the indeterminate progress display after each phase reaches 100% or while waiting for the save/retry dialog to appear.
With this method, the final 100% is sometimes not seen due to the early fade out of the progress dialog. This is also true of how it has always worked, I think.
I also went back to setting indeterminate at STATE_START and I updated the .css as per comment 103.

Attached image progress-bar.png

All I see there is indeterminate state and no progress % that I can tell.

Correct. News messages are plaintext and usually short. The video is just an illustration that without the START processing you'd see an empty progress bar.

As for the latest patch: Please reinstate the comment in messenger.css. It still applies, only the is CSS changed. The CSS establishes two identical patterns, from 0% to 50% and from 50% to 100% of the gradient. One part is not visible initially since the pattern is displayed with 200% width. Only the animation varies the background position between 0 and -100% so the other part becomes visible. To understand this better, set this CSS in your local build on in the inspector: linear-gradient(90deg, red 0%, red 50%, green 50%, green 100%).

Note that the change in CSS also affects getting messages as per the attached screenshot, but that's an improvement since before it was also not easily distinguishable.

We've applied the patch to try it.

The change in nsISmtpService.sendMailMessage() leads to the 100% completion display being cleared quickly (if it's at all visible) and being replaced by an indeterminate state of the progress meter, now quite distinguishable due to the improved CSS. This doesn't appear necessary since a subsequent IMAP move already clears the progress meter here: https://hg.mozilla.org/comm-central/rev/77d8bb0563ce#l2.13.

The change in IMAP's UploadMessageFromFile() for "move sent message to IMAP Sent folder" similarly leads to the 100% completion display being cleared quickly.

This is better than the previous version that dropped back to 0% (SMTP) or 99% (IMAP). However, we doubt the usefulness of those two changes. Cases where sending or moving to the IMAP Sent folder fails are rare, and the system should not be geared towards them. It is more informative to the user that the operation completed successfully with 100% and that the 100% remains shown when sending/moving is finalised rather than showing the indeterminate state again. We still suggest to drop those two hunks.

Replacing round() with trunc() has been dropped. Did that not offer a bit more security not to display 100% for something that was only 99.5% complete and hence could still fail?

Attached video sending 1.mp4

Apologies, looking at this again, maybe the two hunks make sense and offer an advantage. In this video the indeterminate state can be seen after sending/moving when things can still go wrong. But is https://hg.mozilla.org/comm-central/rev/77d8bb0563ce#l2.13 still required if SMTP switches to indeterminate?

Attached video sending 2.mp4

This video shows a less than ideal situation where "Copying message to Sent folder..." isn't actually shown at all; instead 0%, 90% and 100% progress are shown under the description "Checking mail server capabilities...". Anyway, it confirms the previous comment that switching to indeterminate display makes sense after showing 100%.

Re: comment 112

But is https://hg.mozilla.org/comm-central/rev/77d8bb0563ce#l2.13 still required if SMTP switches to indeterminate?

Yes, that could be removed if we end SMTP send with indeterminate.

Re: comment 113

... "Checking mail server capabilities...". ...

I see this too in my testing when there is no initial connection to the imap server before saving to sent. It's really goofed up when this happens. You see the save to sent progress go from 0 to 100 but the label says something else in the message display like "checking caps", "finding folders", "connecting to server" etc. Haven't looked how to fix this but it's not caused by my proposed changes that I can tell.

Another bug is that the progress dialog usually goes away or fades to a very light and unreadable state before the STATE_STOP occurs so you never see (or just barely see) "Your message was sent" and "100%". Maybe this is another .css setting? I think it would be good to hold off erasing or fading the progress dialog for maybe 1/2 second after STATE_STOP so you see a confirmation that your message was successfully sent.

(In reply to gene smith from comment #114)

Yes, that could be removed if we end SMTP send with indeterminate.

Maybe add it to the patch in order not to forget.

Another bug is that the progress dialog usually goes away or fades to a very light and unreadable state before the STATE_STOP occurs so you never see (or just barely see) "Your message was sent" and "100%". Maybe this is another .css setting? I think it would be good to hold off erasing or fading the progress dialog for maybe 1/2 second after STATE_STOP so you see a confirmation that your message was successfully sent.

Please clarify what you mean by fading. STATE_STOP itself sets the progress to 100% here:
https://searchfox.org/comm-central/rev/39e2f40bf7dd57bc8c9279875025d1250c874601/mailnews/compose/content/sendProgress.js#36
That won't have a lot of effect since the window is closed immediately after.

You can put the subsequent window.close() on a timeout: window.setTimeout(() => window.close(), 500);

Maybe add it to the patch in order not to forget.

OK.

You can put the subsequent window.close() on a timeout: window.setTimeout(() => window.close(), 500);

That fixes the problem!

With that change the progress shows "Your message was sent" with 100% progress and you can see it before the progress dialog goes away. Well, it actually still does "fade away" if you record to a file and look at it in a video viewer that can show each frame. I.e., at the end before calling window.close() it is full brightness and gradually gets dimmer until it's gone; but with the delay you show, it doesn't matter.

To me 500 is a bit too quick and I prefer 1000 as a delay.
I'll add this delay to the patch too.

It should work with setTimeout(() => window.close(), 500);, too. 1 second seems long, perhaps 750 ms? We're still not sure what you mean by "fading". Do you have a screenshot or does it show in our videos? Maybe there's animation in your OS which fades out windows which are closed.

Attached image Fading away after window closed.png (obsolete) —

Is this unique to linux/GTK/KDE? No idea. After window.close() the progress thing doesn't just suddenly go away but becomes transparent and gradually "fades away". I only notice it if I slow down a video screen shot. I couldn't tell about your attached examples since they are all cut off before STATE_STOP it seems.

1000 seems OK to me. But I'll compromise and go with 750 like you request.

Never mind, sorry for all the comments about "fading". I forgot about a bunch of "desktop effects" in KDE that are on by default. One was about fading in/out popups but didn't affect the progress fading. Another is about fading windows in/out and with that one turned off the progress dialog just suddenly vanishes when window.close() is called.
Anyhow, regardless of the "fading" tangent, the 750ms delay will help a lot so the user can actually see the final sending status.

Attachment #9320389 - Attachment is obsolete: true

Getting the send progress to work has a big problem on Windows: Closing the compose window here:
https://searchfox.org/comm-central/rev/5d858dd9cf653d4b90f107b6e29b8674ef3c5fca/mailnews/compose/src/nsMsgCompose.cpp#3299
destroys the send progress child window immediately as per this comment:
https://searchfox.org/mozilla-central/rev/a94fc6d9ececbf64a995d7ce9dc6eba0ffad7b67/widget/windows/nsWindow.cpp#1281
It's different on Linux where the modal progress child window can survive its parent:
https://searchfox.org/mozilla-central/rev/a94fc6d9ececbf64a995d7ce9dc6eba0ffad7b67/widget/gtk/nsWindow.cpp#591

So the timeout introduced in sendProgress.js has no effect on Windows. In fact, on Windows "Your message has been sent."
https://searchfox.org/comm-central/rev/5d858dd9cf653d4b90f107b6e29b8674ef3c5fca/mail/locales/en-US/chrome/messenger/messengercompose/sendProgress.properties#20
is never displayed when sending whereas when saving a draft and not closing the compose window, "Your message has been saved."
https://searchfox.org/comm-central/rev/5d858dd9cf653d4b90f107b6e29b8674ef3c5fca/mail/locales/en-US/chrome/messenger/messengercompose/sendProgress.properties#21
is displayed, and the suggested 750ms or at most 1000ms are enough to see this.

Until the Window closing is fixed, comment #111 holds true, setting an indeterminate state in the progress after sending/moving removes the 100% display. That's not helpful for the user. This can be seen in the video in comment #112. For now, setting the indeterminate state could/should only be done on Linux (and maybe Mac).

This fixes comment #120:
https://github.com/Betterbird/thunderbird-patches/blob/main/102/bugs/1745130-close-compose-later.patch
With this additional patch, D171035 is viable on all platforms.

Preface: I didn't see BB's comment 120 and comment 121 before writing this (didn't reload). So this repeats some of the things already stated there.

The latest change on this works fine for me on linux. But user BB on windows wasn't seeing the same results. The last fix was suppose to keep the progress dialog visible for 750ms at the end so the user can see the successful result: "Your message was sent" and 100% progress. But user BB reported that this final status never appears and only the indeterminate state in the progress dialog was last seen after sending a message.

To confirm the problem I created a try build for win64 and saw the same thing when I booted into windows and sent a message.

User BB did some research and found that for Windows when a window closes (the compose window in this case) any child window also closes (the progress dialog). But for linux (GTK) closing a parent window doesn't close the child window so I didn't see the problem on linux.

User BB has provided a patch to keep the compose window in existence but hidden for a few seconds before destroying it so that the progress dialog remains long enough that "Your message was sent -- 100%" can be seen by the user on both linux and windows after a message is sent. I'll include the patch (with credit given) in the next moz-phab update.

This probably still needs to be verified to work on macosx since AFAIK the changes haven't been tried there yet and I don't have the hardware. I'll produce a try build with a link.

One other tweak: I set the time delay before closing the progress dialog to 1 second. Previously it was 750ms. To me this seems a bit more see-able.

Attachment #9319927 - Attachment description: Bug 1745130 - Fix hang at 100% during failed imap save to sent. r=mkmelin → Bug 1745130 - Fix hang at 100% during failed imap save to sent. r=mkmelin,micahilbery

The moz-phab update is now there.
Here's the try build. A test of this on real OSX based h/w is probably needed:
https://treeherder.mozilla.org/jobs?repo=try-comm-central&revision=e1280e166c0e549fdd1d0b334964cdd5d396b411&selectedTaskRun=WdPjQ1tsQb-8QnzJUT7e1Q.0

Just send an email message and watch the screen and verify that you see at the end "Your message was sent" with progress showing 100%. With a short message most of it goes by pretty quick but the final status should be visible for 1 second.

Looking at the try run, you seem to have a test failure in browser_ext_compose_onBeforeSend.js, particularly:
TEST-UNEXPECTED-FAIL | comm/mail/components/extensions/test/browser/browser_ext_compose_sendMessage.js | Found an unexpected unknown window with document URI: chrome://messenger/content/messengercompose/sendProgress.xhtml and title: Sending Message - Second message at the end of test run

This comes from here:
https://searchfox.org/mozilla-central/rev/ed1ebe7bc313eaaec1ed9c2a761fa0fc53e85210/testing/mochitest/browser-test.js#473

Apparently the send progress windows hangs around for a second longer now and the test finds it and complains. A possible fix would be to delay the completion of the test by 2-3 seconds so the progress window closes before completion.

Adding this to the end of browser_ext_compose_sendMessage.js fixes the test for us. Please attribute this change to our project:

+
+add_task(function test_wait_at_the_end() {
+  let { mc } = ChromeUtils.import(
+    "resource://testing-common/mozmill/FolderDisplayHelpers.jsm"
+  );
+  // Need to wait until the progress window goes away.
+  mc.sleep(3000);
+});

Curiously enough, importing FolderDisplayHelpers.jsm at the outer scope breaks the test entirely.

Attached patch fix-test-on-top-of-try.patch (obsolete) — Splinter Review

Apparently https://hg.mozilla.org/try-comm-central/rev/2fbb3d9e455fb73c0c9a67a770b237defe08d1e0 was based on our suggestion from comment #125. Here's another patch to fix browser_ext_compose_sendMessage.js. Note that browser_filelinkTelemetry.js also fails and will also need a "wait task" at the end. Please attribute our work accordingly in case you find it useful.

Here's a better approach: Instead of changing the test, change the window type.
BTW, the latest try https://treeherder.mozilla.org/jobs?repo=try-comm-central&selectedTaskRun=BJRXJltGRWmCPdWa5KM6eQ.0&revision=d596a69adcd7a8ada709a7a30d166fd117491d8e shows another failure in M(bct5) in browser_forwardRFC822Attach.js. That will also need a "wait task" at the end.

Comment on attachment 9321293 [details] [diff] [review]
change-window-type-to-fix-test.patch

OK, the "better" approach led to a whole lot more test failures:
https://treeherder.mozilla.org/jobs?repo=try-comm-central&revision=19fd66f9199931d27dd60f967245abc7e90ed68f
Apparently changing the window type to "msgcomposedestroy" upsets the entire notification system, most prominently here:
https://searchfox.org/comm-central/search?q=%22msgcompose%22&path=ext-mail&case=false&regexp=false

So please forget attachment 9321293 [details] [diff] [review] and revert to
https://hg.mozilla.org/try-comm-central/rev/d596a69adcd7a8ada709a7a30d166fd117491d8e
from
https://treeherder.mozilla.org/jobs?repo=try-comm-central&revision=d596a69adcd7a8ada709a7a30d166fd117491d8e

Note that browser_forwardRFC822Attach.js, needs to be fixed the same way the other tests were fixed.

Attachment #9321293 - Attachment is obsolete: true

Thanks for the latest try runs at (they appear to have identical code):
https://treeherder.mozilla.org/jobs?repo=try-comm-central&revision=040d8fca8bbc6024319457bdf1176d0d11fc6dcb
https://treeherder.mozilla.org/jobs?repo=try-comm-central&revision=544f88fecafd341cfc894fd4a8bdd0377545f8cf

The former shows a failure in browser_newmsgComposeIdentity.js, otherwise it looks good, the latter has an additional failure in browser_editDraftTemplate.js.

Turning to the Mac failures in browser_ext_compose_onBeforeSend.js which passes on Windows/Linux. To understand what's happening, start TB, open a compose window and then paste this code into the error console:

let composeWindows = [...Services.wm.getEnumerator("msgcompose")];
let basewindow = composeWindows[0].docShell.treeOwner.QueryInterface(Ci.nsIBaseWindow);
basewindow.visibility = false; basewindow.enabled = false; console.log(basewindow.visibility, basewindow.enabled);

The result on Windows is true false, that means despite setting window visibility and "enabled" to false the former is returned true. But the test filters on the latter:

composeWindows = composeWindows.filter(w => w.docShell.treeOwner.QueryInterface(Ci.nsIBaseWindow).enabled);

Running the same JS on Mac, we see true true returned. So on Mac, the test changes don't make the test pass.

One option would be to disable that test on Mac by adding skip-if = os == 'mac' after this line:
https://searchfox.org/comm-central/rev/1c667de681fc87d7050f21ddf33ec1ce652c9819/mail/components/extensions/test/browser/browser.ini#51
Another option would be to not check the window count and Mac, and if there are two, do a composeWindows.splice(0,1) to remove the first one which was left behind from the previous test task.

Mac also has an additional failure in browser_generalContentPolicy.js which can be fixed by adding a "wait task" at the end.

Attached patch add-attribute-to-fix-test.patch (obsolete) — Splinter Review

Here is another option to fix the test failures also on Mac. An attribute is added the soon-to-be-destroyed window which the test can filter on. Apply https://hg.mozilla.org/try-comm-central/rev/544f88fecafd341cfc894fd4a8bdd0377545f8cf first.

Please run another try after adding the "wait task" to browser_newmsgComposeIdentity.js, browser_editDraftTemplate.js and browser_generalContentPolicy.js.

Attachment #9321247 - Attachment is obsolete: true

Patch https://phabricator.services.mozilla.com/D171035?id=688824 was approved at Fri, Mar 3, 6:34 PM. But it was quickly found that there were numerous mochitest failures, so more changes were needed to address this.

Most of the test failures were because the approved patch allows the progress window to remain open for 1 second after the test finishes. The test code then detected an unexpected window as present and flagged it as an error. To fix this, on each of the test files that produce a progress window an add_task() doing just a 3 second delay is appended at the end of the file to allow time for the progress window to close and avoid the error.

Another test failure was caused by the delay of 2 seconds in closing the compose window (needed specifically for M/S Windows) and resulted in failures in mail/components/extensions/test/browser/browser_ext_compose_onBeforeSend.js or mail/components/extensions/test/browser/head.js which detected an unexpected window from the previous test as still present. The solution for this is to add a new attribute to the compose window called "destroywindow" in mailnews/compose/src/nsMsgCompose.cpp and in browser_ext_compose_onBeforeSend.js and head.js, filter out and ignore old dying/closing compose windows from a possible previous test having the new attribute "destroywindow".

Both solution were provided by commenter BB.

I'll submit an update moz-phab patch next.
Here's my last try build with the final fix in place:
https://treeherder.mozilla.org/jobs?repo=try-comm-central&revision=2514aee1693fa13bad319c3951dd5198b64a9ec3

Attachment #9319927 - Attachment description: Bug 1745130 - Fix hang at 100% during failed imap save to sent. r=mkmelin,micahilbery → Bug 1745130 - Improve error detection and reporting during imap save to sent. r=mkmelin

And out-of-date comment was left in the patch. Please change:

    // We need to disble the window since, due to a bug in the Mozilla
    // platform, invisible windows still return visibility of true.
    // In our tests we need to have a way to ignore "soon to die" windows.

to

    // In our tests we need to have a way to ignore "soon to die" windows.
    // Due to a bug in the Mozilla platform, invisible windows still return
    // visibility of true. To distinguish we set a custom attribute.
Attachment #9319927 - Attachment description: Bug 1745130 - Improve error detection and reporting during imap save to sent. r=mkmelin → Bug 1745130 - Fix hang at 100% during failed imap save to sent. r=mkmelin

(In reply to BB from comment #132)

And out-of-date comment was left in the patch. Please change:

Ok, no problem. Actually glad you want a change since I had put an old description in the 1st line of commit message.

Sorry I think we need to backpaddle on having the timeout. It seems to cause unreasonable complications.

The point is that on TB's most prominent platform Windows (96%) the send progress never displays "Your message has been sent", see comment #120 for details. So if you want TB to behave well on Windows as it behaves on Linux, you need to delay destroying the the compose window. Since that is necessary anyway, you might was well use the timeout suggested by Gene.

The functionality likely broke with the the removal of the compose window recycling here:
https://hg.mozilla.org/comm-central/rev/158bced7b129#l8.120

Before the window was only hidden and the send progress window survived so the user could actually read it.

Since Magnus didn't say which timeout to remove, I assumed he meant at both places: on the window.close() in sendProgress.js to keep the progress window present longer and in nsMsgCompose.cpp to delay the compose window close. With them both out, there is little difference between linux and window. On linux the "Your message was sent... 100%" will be "seen" as a "subliminal" message for about 1/25 second. But it won't be seen at all on Windows due to closing the compose window during STATE_STOP in sendProgress.js.

Note: my ISP somehow disabled my email account yesterday which caused my bugzilla account to also be "suspended". All is fixed now. Maybe I sent too many "spam" messages testing this bug? They didn't say why it got disabled.

Attachment #9316670 - Attachment is obsolete: true

(In reply to BB from comment #134)

The point is that on TB's most prominent platform Windows (96%) the send progress never displays "Your message has been sent", see comment

I don't think it's a necessity for the message to be shown. If it took time and you saw the progress approach 100%, as I user I'd assume it was done once reached.

(In reply to Magnus Melin [:mkmelin] from comment #136

I don't think it's a necessity for the message to be shown. If it took time and you saw the progress approach 100%, as I user I'd assume it was done once reached.

I guess that's true and how it's always worked. But it would be nice to see some kind of positive confirmation that your message was sent and/or saved after the progress stuff flashes on your screen. But, as you noticed, it does take a lot of tweaking to get that to happen without breaking a lot of tests.
Anyhow, the main fix for this is to prevent the save-to-sent from hanging and instead go to the retry/save dialog. But if it still hangs, don't want to show that save to sent is 100%. So need to get this into the real world to see if the fixes for these really help.

Target Milestone: 112 Branch → 113 Branch

Pushed by mkmelin@iki.fi:
https://hg.mozilla.org/comm-central/rev/4e26ea4560b9
Fix hang at 100% during failed imap save to sent. r=mkmelin

Status: REOPENED → RESOLVED
Closed: 1 year ago1 year ago
Resolution: --- → FIXED

safe for uplift to beta?

Flags: needinfo?(gds)

Yes, I think it's good.

Flags: needinfo?(gds)

Comment on attachment 9319927 [details]
Bug 1745130 - Fix hang at 100% during failed imap save to sent. r=mkmelin

[Triage Comment]
approved for beta

Attachment #9319927 - Flags: approval-comm-beta+

Comment on attachment 9317251 [details]
Bug 1745130 - Improve error detection and reporting during imap save to sent. r=mkmelin

[Triage Comment]
approved for beta

Attachment #9317251 - Flags: approval-comm-beta+

Comment on attachment 9317251 [details]
Bug 1745130 - Improve error detection and reporting during imap save to sent. r=mkmelin

Already on beta, landed in 112 milestone.

Attachment #9317251 - Flags: approval-comm-beta+
Duplicate of this bug: 1824339
Duplicate of this bug: 1825224
See Also: → 1824624
See Also: → 1824401
Attachment #9317251 - Attachment is obsolete: true
Attachment #9321695 - Attachment is obsolete: true

Any objection to taking this on 102?

Flags: needinfo?(mkmelin+mozilla)
Flags: needinfo?(gds)

I guess that should be fine

Flags: needinfo?(mkmelin+mozilla)

Comment on attachment 9319927 [details]
Bug 1745130 - Fix hang at 100% during failed imap save to sent. r=mkmelin

[Triage Comment]
Approved for esr102

Attachment #9319927 - Flags: approval-comm-esr102+
Flags: needinfo?(gds)
Attachment #9335501 - Attachment is obsolete: true

Comment on attachment 9335504 [details]
1745130-esr.diff

[Triage Comment]
Approving esr-version of previously approved patch.
This needs to be applied on top of D169577.

Attachment #9335504 - Flags: approval-comm-esr102+
Attachment #9319927 - Flags: approval-comm-esr102+
No longer duplicate of this bug: 1825224
See Also: → 1868160
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: