Open Bug 707933 Opened 13 years ago Updated 3 months ago

Global inbox, Local Folders, access not serialized/synchronized when automatically checking multiple POP3 accounts every n minutes causing POP3_MESSAGE_FOLDER_BUSY error "This folder is being processed. Please wait until processing is complete..."

Categories: MailNews Core :: Networking: POP, defect
Hardware: x86, OS: All, Type: defect, Priority: Not set, Severity: major
Tracking: thunderbird_esr115 affected
People: Reporter: petr.v, Unassigned
References: Blocks 2 open bugs
Whiteboard: [implement comment 36?][workaround, partial: comment 10]
Attachments: 1 file, 1 obsolete file (151.29 KB, application/octet-stream)

User Agent: Mozilla/5.0 (Windows NT 5.1; rv:8.0.1) Gecko/20100101 Firefox/8.0.1
Build ID: 20111120135848

Steps to reproduce:

- three POP3 accounts imported from Outlook Express
- all accounts store emails in the global Local Folders and use Message Filters
- all messages are left on the mail server
- all accounts have enabled regular checking every 5 minutes
- 'Automatically download new messages' is checked for all accounts




Actual results:

From time to time, the following error appears in the Error Console. It usually happens when messages are checked automatically. Manual checking with the "Get New Messages" button does not cause the error:

Error: [Exception... "'Component is not available' when calling method: [nsIActivityManager::removeActivity]"  nsresult: "0x80040111 (NS_ERROR_NOT_AVAILABLE)"  location: "JS frame :: resource:///modules/activity/pop3Download.js :: <TOP_LEVEL> :: line 158"  data: no]
Source File: resource:///modules/activity/pop3Download.js
Line: 158

When it occurs, the POP3 log always contains at least the following sequence ("ERROR: 4029"):

2011-12-06 12:33:13.521000 UTC - 0[e2f140]: POP3: Entering state: 8
2011-12-06 12:33:13.521000 UTC - 0[e2f140]: BeginMailDelivery folder locked
2011-12-06 12:33:13.521000 UTC - 0[e2f140]: Calling ReleaseFolderLock from AbortMailDelivery
2011-12-06 12:33:13.521000 UTC - 0[e2f140]: ReleaseFolderLock haveSemaphore = FALSE
2011-12-06 12:33:13.521000 UTC - 0[e2f140]: ERROR: 4029
2011-12-06 12:33:13.521000 UTC - 0[e2f140]: POP3: Entering state: 24
2011-12-06 12:33:13.521000 UTC - 0[e2f140]: POP3: Entering state: 25

And (most of the time) it continues as follows:

2011-12-06 12:33:13.521000 UTC - 0[e2f140]: Clearing server busy in POP3_FREE
2011-12-06 12:33:13.521000 UTC - 0[e2f140]: Clearing running protocol in POP3_FREE
2011-12-06 12:33:13.521000 UTC - 0[e2f140]: Clearing server busy in OnStopRequest
2011-12-06 12:33:13.521000 UTC - 0[e2f140]: Calling ReleaseFolderLock from ~nsPop3Sink
2011-12-06 12:33:13.521000 UTC - 0[e2f140]: ReleaseFolderLock haveSemaphore = FALSE
2011-12-06 12:34:53.525000 UTC - 0[e2f140]: Entering NET_ProcessPop3 0
2011-12-06 12:34:53.525000 UTC - 0[e2f140]: POP3: Entering state: 24
2011-12-06 12:34:53.525000 UTC - 0[e2f140]: Calling ReleaseFolderLock from AbortMailDelivery
2011-12-06 12:34:53.525000 UTC - 0[e2f140]: ReleaseFolderLock haveSemaphore = TRUE
2011-12-06 12:34:53.525000 UTC - 0[e2f140]: POP3: Entering state: 25
2011-12-06 12:34:53.525000 UTC - 0[e2f140]: Clearing server busy in POP3_FREE
2011-12-06 12:34:53.525000 UTC - 0[e2f140]: Clearing running protocol in POP3_FREE
2011-12-06 12:34:53.525000 UTC - 0[e2f140]: Calling ReleaseFolderLock from ~nsPop3Sink
2011-12-06 12:34:53.525000 UTC - 0[e2f140]: ReleaseFolderLock haveSemaphore = FALSE
2011-12-06 12:37:43.059000 UTC - 0[e2f140]: Setting server busy in nsPop3Protocol::LoadUrl
2011-12-06 12:37:43.119000 UTC - 0[e2f140]: Setting server busy in nsPop3Protocol::LoadUrl
2011-12-06 12:37:43.149000 UTC - 0[e2f140]: Entering NET_ProcessPop3 85
2011-12-06 12:37:43.149000 UTC - 0[e2f140]: POP3: Entering state: 1
2011-12-06 12:37:43.149000 UTC - 0[e2f140]: POP3: Entering state: 2
2011-12-06 12:37:43.149000 UTC - 0[e2f140]: POP3: Entering state: 4



Expected results:

No errors when checking multiple POP3 accounts.
Attached file Detailed POP3 log (obsolete) —
The log contains multiple POP3 sessions. Interesting ones are those with "BeginMailDelivery folder locked" message.
Attached file Detailed POP3 log (7zip) —
The log contains multiple POP3 sessions. Interesting ones are those with "BeginMailDelivery folder locked" message
Attachment #579314 - Attachment is obsolete: true
(In reply to Petr Vones from comment #0)
There is some discussion about the same problem here: http://forums.mozillazine.org/viewtopic.php?f=39&t=1911257
FYI, the following bugs are open or were updated within the last year and mention ReleaseFolderLock in a comment:
  bug 584655, bug 604444, bug 609674, bug 675503, bug 707933, bug 700987
More observations about the issue:

- it happens ONLY during automatic new message checking via the "Check for new messages every 5 minutes" option.
- it NEVER happens for manual checking via the "Get All Messages" button or the Shift+F5 keyboard shortcut.

Since those two actions are different (periodic automatic checking does not show its progress in the Status Bar, while manual checking does), I suspect a folder synchronization issue or something similar, given the "BeginMailDelivery folder locked" error message in the log.

If there were an add-on running exactly the "Get All Messages" action every 5 minutes, I believe the problem would go away.
Another problem is that the POP3 session where the error occurs is *not* terminated properly with a QUIT command. See log:

... the POP3 session which fails ...
2011-12-06 12:33:13.521000 UTC - 0[e2f140]: Entering NET_ProcessPop3 15
2011-12-06 12:33:13.521000 UTC - 0[e2f140]: POP3: Entering state: 12
2011-12-06 12:33:13.521000 UTC - 0[e2f140]: RECV: 276 388066797
2011-12-06 12:33:13.521000 UTC - 0[e2f140]: POP3: Entering state: 12
2011-12-06 12:33:13.521000 UTC - 0[e2f140]: Entering NET_ProcessPop3 15
2011-12-06 12:33:13.521000 UTC - 0[e2f140]: POP3: Entering state: 12
2011-12-06 12:33:13.521000 UTC - 0[e2f140]: RECV: 277 388066798
2011-12-06 12:33:13.521000 UTC - 0[e2f140]: POP3: Entering state: 12
2011-12-06 12:33:13.521000 UTC - 0[e2f140]: Entering NET_ProcessPop3 15
2011-12-06 12:33:13.521000 UTC - 0[e2f140]: POP3: Entering state: 12
2011-12-06 12:33:13.521000 UTC - 0[e2f140]: RECV: 278 388066799
2011-12-06 12:33:13.521000 UTC - 0[e2f140]: POP3: Entering state: 12
2011-12-06 12:33:13.521000 UTC - 0[e2f140]: Entering NET_ProcessPop3 15
2011-12-06 12:33:13.521000 UTC - 0[e2f140]: POP3: Entering state: 12
2011-12-06 12:33:13.521000 UTC - 0[e2f140]: RECV: 279 388066800
2011-12-06 12:33:13.521000 UTC - 0[e2f140]: POP3: Entering state: 12
2011-12-06 12:33:13.521000 UTC - 0[e2f140]: Entering NET_ProcessPop3 16
2011-12-06 12:33:13.521000 UTC - 0[e2f140]: POP3: Entering state: 3
2011-12-06 12:33:13.521000 UTC - 0[e2f140]: RECV: +OK 227 791964
2011-12-06 12:33:13.521000 UTC - 0[e2f140]: POP3: Entering state: 8
2011-12-06 12:33:13.521000 UTC - 0[e2f140]: BeginMailDelivery folder locked
2011-12-06 12:33:13.521000 UTC - 0[e2f140]: Calling ReleaseFolderLock from AbortMailDelivery
2011-12-06 12:33:13.521000 UTC - 0[e2f140]: ReleaseFolderLock haveSemaphore = FALSE
2011-12-06 12:33:13.521000 UTC - 0[e2f140]: ERROR: 4029
2011-12-06 12:33:13.521000 UTC - 0[e2f140]: POP3: Entering state: 24
2011-12-06 12:33:13.521000 UTC - 0[e2f140]: POP3: Entering state: 25
2011-12-06 12:33:13.521000 UTC - 0[e2f140]: Clearing server busy in POP3_FREE
2011-12-06 12:33:13.521000 UTC - 0[e2f140]: Clearing running protocol in POP3_FREE
2011-12-06 12:33:13.521000 UTC - 0[e2f140]: Clearing server busy in OnStopRequest
2011-12-06 12:33:13.521000 UTC - 0[e2f140]: Calling ReleaseFolderLock from ~nsPop3Sink
2011-12-06 12:33:13.521000 UTC - 0[e2f140]: ReleaseFolderLock haveSemaphore = FALSE
2011-12-06 12:34:53.525000 UTC - 0[e2f140]: Entering NET_ProcessPop3 0
2011-12-06 12:34:53.525000 UTC - 0[e2f140]: POP3: Entering state: 24
2011-12-06 12:34:53.525000 UTC - 0[e2f140]: Calling ReleaseFolderLock from AbortMailDelivery
2011-12-06 12:34:53.525000 UTC - 0[e2f140]: ReleaseFolderLock haveSemaphore = TRUE
2011-12-06 12:34:53.525000 UTC - 0[e2f140]: POP3: Entering state: 25
2011-12-06 12:34:53.525000 UTC - 0[e2f140]: Clearing server busy in POP3_FREE
2011-12-06 12:34:53.525000 UTC - 0[e2f140]: Clearing running protocol in POP3_FREE
2011-12-06 12:34:53.525000 UTC - 0[e2f140]: Calling ReleaseFolderLock from ~nsPop3Sink
2011-12-06 12:34:53.525000 UTC - 0[e2f140]: ReleaseFolderLock haveSemaphore = FALSE
2011-12-06 12:37:43.059000 UTC - 0[e2f140]: Setting server busy in nsPop3Protocol::LoadUrl
2011-12-06 12:37:43.119000 UTC - 0[e2f140]: Setting server busy in nsPop3Protocol::LoadUrl
2011-12-06 12:37:43.149000 UTC - 0[e2f140]: Entering NET_ProcessPop3 85


... new session starts here ...
2011-12-06 12:37:43.149000 UTC - 0[e2f140]: POP3: Entering state: 1
2011-12-06 12:37:43.149000 UTC - 0[e2f140]: POP3: Entering state: 2
2011-12-06 12:37:43.149000 UTC - 0[e2f140]: POP3: Entering state: 4
2011-12-06 12:37:43.149000 UTC - 0[e2f140]: RECV: +OK X1 <1908.1323175064655@xxx.xxx.xxx> pop3 service at xxx.xxx.xxx
2011-12-06 12:37:43.149000 UTC - 0[e2f140]: POP3: Entering state: 28
2011-12-06 12:37:43.149000 UTC - 0[e2f140]: SendCapa()
2011-12-06 12:37:43.149000 UTC - 0[e2f140]: SEND: CAPA
Summary: Random 'Component is not available' errors when checking multiple POP3 accounts → Local Folders access not serialized/synchronized when periodically checking multiple POP3 accounts every n minutes
It looks like a serious folder access serialization issue. The number of errors depends on how often two or more POP3 sessions start concurrently. Since access is properly serialized for manual checking via the "Get New Messages" button, the issue is visible for automatic periodic checking only.

For manual checking by "Get New Messages" the access is properly serialized. That's why there are no errors:

2011-12-09 18:44:36.375000 UTC - 0[e2b140]: Setting server busy in nsPop3Protocol::LoadUrl
  2011-12-09 18:44:36.703000 UTC - 0[e2b140]: BeginMailDelivery acquiring semaphore
  2011-12-09 18:44:36.859000 UTC - 0[e2b140]: Calling ReleaseFolderLock from EndMailDelivery
  2011-12-09 18:44:36.859000 UTC - 0[e2b140]: ReleaseFolderLock haveSemaphore = TRUE
2011-12-09 18:44:36.875000 UTC - 0[e2b140]: Clearing server busy in POP3_FREE

2011-12-09 18:44:36.890000 UTC - 0[e2b140]: Setting server busy in nsPop3Protocol::LoadUrl
  2011-12-09 18:44:37.015000 UTC - 0[e2b140]: BeginMailDelivery acquiring semaphore
  2011-12-09 18:44:39.421000 UTC - 0[e2b140]: Calling ReleaseFolderLock from EndMailDelivery
  2011-12-09 18:44:39.421000 UTC - 0[e2b140]: ReleaseFolderLock haveSemaphore = TRUE
2011-12-09 18:44:39.437000 UTC - 0[e2b140]: Clearing server busy in POP3_FREE

2011-12-09 18:44:39.437000 UTC - 0[e2b140]: Setting server busy in nsPop3Protocol::LoadUrl
  2011-12-09 18:44:39.593000 UTC - 0[e2b140]: BeginMailDelivery acquiring semaphore
  2011-12-09 18:44:40.109000 UTC - 0[e2b140]: Calling ReleaseFolderLock from EndMailDelivery
  2011-12-09 18:44:40.109000 UTC - 0[e2b140]: ReleaseFolderLock haveSemaphore = TRUE
2011-12-09 18:44:40.125000 UTC - 0[e2b140]: Clearing server busy in POP3_FREE


But for periodic checking every X minutes, the error occurs in the following scenario, where access to the global Local Folders is not serialized properly:

2011-12-09 18:49:26.500000 UTC - 0[e2b140]: Setting server busy in nsPop3Protocol::LoadUrl <- first session
2011-12-09 18:49:26.531000 UTC - 0[e2b140]: Setting server busy in nsPop3Protocol::LoadUrl <- second session
  2011-12-09 18:49:26.625000 UTC - 0[e2b140]: BeginMailDelivery acquiring semaphore  <- first session, ok
  2011-12-09 18:49:26.734000 UTC - 0[e2b140]: BeginMailDelivery folder locked        <- second session, error
  2011-12-09 18:49:26.734000 UTC - 0[e2b140]: Calling ReleaseFolderLock from AbortMailDelivery
  2011-12-09 18:49:26.734000 UTC - 0[e2b140]: ReleaseFolderLock haveSemaphore = FALSE
  2011-12-09 18:49:26.734000 UTC - 0[e2b140]: ERROR: 4029
  2011-12-09 18:49:26.734000 UTC - 0[e2b140]: Clearing server busy in POP3_FREE

2011-12-09 18:49:56.546000 UTC - 0[e2b140]: Setting server busy in nsPop3Protocol::LoadUrl
  2011-12-09 18:49:57.359000 UTC - 0[e2b140]: BeginMailDelivery folder locked        <- third session, error
  2011-12-09 18:49:57.359000 UTC - 0[e2b140]: Calling ReleaseFolderLock from AbortMailDelivery
  2011-12-09 18:49:57.359000 UTC - 0[e2b140]: ReleaseFolderLock haveSemaphore = FALSE
  2011-12-09 18:49:57.359000 UTC - 0[e2b140]: ERROR: 4029
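
The two log patterns above can be illustrated with a minimal sketch. It assumes the folder semaphore behaves like a non-blocking lock: a session that finds it taken fails immediately instead of waiting. The class and function names are hypothetical and only model the log messages. They are not Thunderbird's actual implementation.

```python
import threading

FOLDER_BUSY = 4029  # error code seen in the POP3 log


class GlobalInbox:
    """Hypothetical stand-in for the shared Local Folders inbox."""

    def __init__(self):
        self._semaphore = threading.Lock()

    def begin_mail_delivery(self):
        # Non-blocking attempt: returns False instead of waiting,
        # which corresponds to "BeginMailDelivery folder locked".
        return self._semaphore.acquire(blocking=False)

    def end_mail_delivery(self):
        self._semaphore.release()


def overlapping_sessions(inbox):
    """Second session starts while the first still holds the folder."""
    first = inbox.begin_mail_delivery()    # first session, ok
    second = inbox.begin_mail_delivery()   # second session, folder locked
    results = [0 if first else FOLDER_BUSY, 0 if second else FOLDER_BUSY]
    if first:
        inbox.end_mail_delivery()
    return results


def serialized_sessions(inbox):
    """Each session finishes before the next one starts."""
    results = []
    for _ in range(2):
        ok = inbox.begin_mail_delivery()
        results.append(0 if ok else FOLDER_BUSY)
        if ok:
            inbox.end_mail_delivery()
    return results
```

In this model the serialized run never reports 4029, while the overlapping run fails for whichever session arrives second.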


The issue can be reduced by setting a different checking period for each POP3 account (e.g. 4, 5, 6 instead of 5, 5, 5 minutes) so that the "crashes" do not happen as often. This is not an acceptable workaround, of course, because the POP3 session is corrupted each time the error occurs.

I can reproduce it on multiple computers.
The 4029 error means:

Status - write error occurred.
This folder is being processed. Please wait until processing is complete to get messages.

It confirms the description above.
Summary: Local Folders access not serialized/synchronized when periodically checking multiple POP3 accounts every n minutes → Local Folders access not serialized/synchronized when periodically checking multiple POP3 accounts every n minutes causing POP3_MESSAGE_FOLDER_BUSY error
Yes, I don't think biff (check for new mail) preflights the check that a common inbox isn't busy. If so, the check should probably queue itself, or back off.
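
A minimal sketch of the "preflight, then back off" idea, assuming a callable that reports whether the common inbox is busy. The function names, the base delay and the cap are illustrative assumptions, not existing Thunderbird code or preferences. The sketch only computes the delays it would schedule and does not actually sleep.

```python
def next_biff_delay(attempt, base_seconds=5, cap_seconds=300):
    """Delay before retrying a deferred check: doubles per attempt, capped."""
    return min(cap_seconds, base_seconds * (2 ** attempt))


def run_biff(inbox_is_busy, max_attempts=5):
    """Preflight the shared inbox before checking; back off while it is busy.

    inbox_is_busy: callable returning True while another session holds the folder.
    Returns (checked, delays), where delays lists the waits that were scheduled.
    """
    delays = []
    for attempt in range(max_attempts):
        if not inbox_is_busy():
            return True, delays  # safe to start the POP3 session
        # Defer instead of starting a session that would fail with 4029.
        delays.append(next_biff_delay(attempt))
    return False, delays  # give up for this round; the next biff tick retries
```

For example, if the inbox is busy for the first two preflights, the check would be deferred by 5 and then 10 seconds before it runs.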
Status: UNCONFIRMED → NEW
Ever confirmed: true
I think I've found a workaround:

- set the same checking interval for all POP3 accounts
- set preference mail.biff.add_interval_jitter to false

There are no longer 4029 (POP3_MESSAGE_FOLDER_BUSY) errors in POP3 log because multiple account checks are serialized now.
Unfortunately the workaround does not work if an email is received because the checking period is re-scheduled or something like that.

It must be fixed.
Component: Folder and Message Lists → Networking: POP
Product: Thunderbird → MailNews Core
QA Contact: folders-message-lists → networking.pop
Would new mail check intervals like the following reduce the frequency of your problem?
 POP3 account-1:  7 minutes (1st prime number after 2, 3, 5; 1 + first perfect number)
 POP3 account-2: 11 minutes (2nd prime number after 2, 3, 5)
 POP3 account-3: 13 minutes (3rd prime number after 2, 3, 5)
 POP3 account-4: 17 minutes (4th prime number after 2, 3, 5)
 POP3 account-5: 19 minutes (5th prime number after 2, 3, 5)
Unfortunately it does not help. There are still about 20 or more errors during a 14 hour session.
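
A rough way to see why prime intervals only reduce the collisions rather than remove them is to count the minutes in which at least two accounts are due at once. The sketch below is a simplified model, not Thunderbird's real scheduler: it assumes all accounts start at minute 0, poll exactly on their interval, and are neither jittered nor re-scheduled when mail arrives.

```python
def coincident_minutes(intervals, total_minutes):
    """Count minutes in which at least two accounts are due to poll.

    Assumes every account starts at minute 0 and polls exactly every
    `interval` minutes (no jitter, no re-scheduling).
    """
    count = 0
    for minute in range(1, total_minutes + 1):
        due = sum(1 for interval in intervals if minute % interval == 0)
        if due >= 2:
            count += 1
    return count


FOURTEEN_HOURS = 14 * 60

print(coincident_minutes([5, 5, 5], FOURTEEN_HOURS))    # 168
print(coincident_minutes([4, 5, 6], FOURTEEN_HOURS))    # 112
print(coincident_minutes([7, 11, 13], FOURTEEN_HOURS))  # 24
```

Under these assumptions, three accounts on 7, 11 and 13 minutes still coincide 24 times in 14 hours, which is in the same range as the "about 20 or more errors" reported above.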
How exactly did you set up the 3 accounts so that they download into Local Folders?
By importing them from Outlook Express. The setup is pretty simple, multiple POP3 accounts targeting all kind of folders into corresponding "Local Folders".
I mean, did you do it in Account Settings -> account -> Server Settings -> Advanced -> store into Global Inbox?
Or did you set Account Settings -> account -> Local Directory to point to the same filesystem folder for all accounts?
The first option: Server Settings -> [Advanced] -> (o) Global Inbox (Local Folders Account). 

I did not change Local Directories; I left them as they were created by the import from Outlook Express. All three POP3 accounts have their own separate local directories.
Good.
Line 158 does not exist in pop3Download.js any longer. Can you please paste an updated message (from TB24) ?
I reported Bug 827897 in January 2013, unaware of this bug.  But in my case, I was tolerating collisions (POPUP busy messages) that always seemed to happen only when doing a manual GETMAIL of about 15 POP3 accounts, all going to a global inbox (except for a few filters).  It was only when the problem started to happen on automatic fetches that I got concerned.  Automatic timing varies from 10 minutes for a few important (but not busy) accounts, up to an hour for several, and even one or two that are only polled manually.

My feeling was that the popup message should time out after a minute and try again, rather than just hang forever.  I may not be at my computer for hours and could miss something.  (Also causes emails to pile up waiting to be downloaded.)  An automated process should be fully automatic, not requiring an OK from an operator who can't do anything about the situation anyway.
Blocks: 827897
(In reply to :aceman from comment #19)
> Line 158 does not exist in pop3Download.js any longer. Can you please paste
> an updated message (from TB24) ?

Sorry, no. I have already moved everything to a single IMAP account.
Severity: normal → major
OS: Windows XP → All
Wayne, my bug #827897 was resolved as being a duplicate of *THIS* bug, yet if I understand the flags here, *THIS* bug is a dup of mine.  A bit circular?

In any case, the issue of busy messages (whatever the cause) still happens today -- regularly if I do a manual GetMail to poll all mailboxes immediately, and occasionally by automatic polling over random times (ie- every once in a while I see the email polling hung waiting for an OK to continue).
(In reply to Dan Pernokis from comment #23)
> Wayne, my bug #827897 was resolved as being a duplicate of *THIS* bug, yet
> if I understand the flags here, *THIS* bug is a dup of mine.  A bit circular?

If you mean the blocking field, no, this bug blocks yours.  But I duped yours because this bug has more technical detail. In addition, yours is newer, and we tend to keep the older bug report when duping. And it is preferable to keep the blocking field intact, in case fixing this bug does not fix your issue.
Summary: Local Folders access not serialized/synchronized when periodically checking multiple POP3 accounts every n minutes causing POP3_MESSAGE_FOLDER_BUSY error → Global inbox, Local Folders, access not serialized/synchronized when automatically checking multiple POP3 accounts every n minutes causing POP3_MESSAGE_FOLDER_BUSY error "This folder is being processed. Please wait until processing is complete..."
Whiteboard: [workaround, partial: comment 10]
Petrv states that all messages are kept on the server. This slows down checking for new mail, so perhaps it contributes to making this problem more frequent?

FWIW, a number of "processing" bugs have been fixed in the past https://bugzilla.mozilla.org/buglist.cgi?quicksearch=101584%20205379%20466933%20223062&list_id=14264786
Recently, I happened to have TB open/visible while logged into Webmail.  I noticed a message in Webmail that was not yet in TB.  I clicked getmail for the account to DL the message.  As TB was starting its poll, a new message appeared in Webmail -- and TB got the popup about being busy.  So this is a rare but suspected case where TB and the server collide -- TB is polling just as the server is receiving (displaying) a new message.

But I've only ever actually seen this ONCE this century!  Odds are against it.  But if you do a Getmail (all) twice in succession -- both running concurrently -- sooner or later (in a short time) they WILL collide over something if one has enough mailboxes (I currently have 14 in my list).  Or if I click Getmail for a specific account while Getmail (all) is in process, I might get a collision.

So this tells me TB is doing something strange, like perhaps not grabbing a lock or something before it runs along.  But my issue is that once it says "busy", I have to MANUALLY click OK to continue -- the process that detects busy does not time out and disengage and try again -- it simply hangs forever waiting.  That causes the backlog, because no further polling occurs on any of the other mailboxes.  And it can happen spontaneously too, because occasionally two automatic polls for specific mailboxes might happen simultaneously -- and lock each other out, hanging everything.

Since all of my incoming mail goes to the same shared INBOX, I suspect the common case is that my main INBOX is busy (receiving a message) rather than the server being busy with that account.  Because even when specifically running Getmail (all), a single Getmail poll could be triggered automatically by one of the other mailbox accounts just doing its normal timed fetch, and this could cause busy for that mailbox.  This still happens often enough to be more than just a chance collision.

And, as I said (here and elsewhere, in various words):  The biggest impact of the problem is that the process that detects busy does not time out and disengage, thereby hanging (suspending) everything further.

Dan, was your testing with version 60?

Giacomo, you wrote that you think this is gone, was that with version 60?

Flags: needinfo?(giacomo.mazzini)
Flags: needinfo?(dannyfox)

This issue still exists with the new Thunderbird 60.5.0 I just installed.
Why do we need an error popup in a modal window? I sometimes absentmindedly click the get mail box twice, and have to close 10 modal windows advising me uselessly of something I know and don't care about.
It's stupid. Really stupid.
This type of error should be able to be disabled especially after all these years, this bug languishing here.

Hi Wayne. If by testing you mean what was I using in my Comment #26: That would be TB 52.9.1 (not replaced until 60.3.0, the next public release I am aware of, on 05-Nov-2018). I installed 60.5.0 just now and clicked two GETMAILs (ALL) -- immediately got a few collisions and popups -- had to click OK to acknowledge and bypass the popups.

Unlike these induced collisions, I have not seen any spontaneous popups during normal (automatic timed) polling in a very long time. But that doesn't mean they won't happen -- if the mailserver just happens to be busy for a split-second, TB will hang waiting for OK.

On the other hand, local collisions do occur occasionally but regularly if I run a GETMAIL (ALL) and an automatic poll overlaps, or if I GETMAIL a specific account just as it is being polled automatically. I always have to click OK to continue.

BTW, I somewhat agree with Steve (Comment #28) -- for sure the popup should time out on its own and continue polling, but there might be a need to know that a collision has occurred. The tricky thing would be: When should the timeout retry the current failed (collided) poll and when should it move on and continue with the next item? We wouldn't want it repolling forever if something goes wrong with the server, for example, so it should give up when a retry limit is exceeded. But I would like to see a message if polls are failing, especially if the polling skips an account and moves on to the next. (I think Steve is saying the popup should be eliminated altogether -- a good idea in many cases -- but I would want it to be a configurable option because some of us need to know when the polls are failing.)
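
The behaviour described here (retry a collided poll a limited number of times, then move on and report what was skipped) could look roughly like the following sketch. `poll_all`, `try_poll` and `retry_limit` are hypothetical names. A real implementation would presumably also wait between retries, which is left out here.

```python
def poll_all(accounts, try_poll, retry_limit=3):
    """Poll each account; retry collisions a bounded number of times, then move on.

    try_poll(account) -> True on success, False when the folder or server is busy.
    Returns the accounts that were skipped after exhausting their retries,
    so they can be reported without blocking the remaining polls.
    """
    skipped = []
    for account in accounts:
        for _attempt in range(retry_limit):
            if try_poll(account):
                break
        else:
            # Retry limit exceeded: give up on this account and continue.
            skipped.append(account)
    return skipped
```

The returned list is what a non-blocking notification could display, optionally controlled by a user setting.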

Flags: needinfo?(dannyfox)

Any popup that requires user interaction before normal processing will continue on any computer process should be automatically dismissed, because if it's not it hangs the system.
What if the system is unattended and processing 27 mail accounts, like me, and it's stopped by a transient error? And no one is looking at it? It could sit there for years.
This is anything on any computer. Alerts can be useful but they cannot stop the system from functioning.

Maybe it should be a notification of some kind, not a modal window.
Does Thunderbird have that? Yeah, I think it does.

I guess not, this one has been languishing for 7 years. Someone at Mozilla is going to have to make an executive decision some day and help clean this up.
I have been involved in these bugzilla discussions before, and there have indeed been those who deny that a problem is a problem until someone comes by and says "Hey, this is not right."
This is one of those times.
Modal windows displaying a transient non fatal failure is just inappropriate.
The queue should just continue processing as soon as it can, because this is not a problem that stops everything on an automated system. And no, the user does not necessarily care about that. If there is a collision and it is resolved 45ms later, why do I need to be told?
This is automation.

I agree with Steve that the process should continue without interruption if at all possible. I also agree that there is no need for a popup or warning if the issue resolves itself milliseconds later -- and I'd say even seconds later.

However, if a poll fails for quite some time (a minute or more?), then a persistent non-halting popup should occur which the user can acknowledge when seen. That is, a popup should say that a problem occurred, the process should try to continue but eventually time out (give up) AND THEN CONTINUE automatically without intervention, and the user can see later that something happened.

I also have a number of accounts being polled -- and TB hangs in a permanent wait if any one of them fails. Not good -- and I've also mentioned this for years. (See my Comment 20 and Comment 23 above.)

It looks like we already postpone a server if it is busy, when automatically checking for new mail in the background, https://searchfox.org/comm-central/source/mailnews/base/src/nsMsgBiffManager.cpp#305

For manual checking, we could add a check for busy server e.g. at https://searchfox.org/comm-central/source/mailnews/local/src/nsPop3IncomingServer.cpp#729 or pass a flag into nsPop3Service::GetNewMail() there and skip errors silently somewhere at https://searchfox.org/comm-central/source/mailnews/local/src/nsPop3Protocol.cpp#2461. But that could get messy. Magnus, what do you think?
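
As a language-neutral illustration of the first option (check for a busy server before starting a manual check), here is a minimal sketch. `Pop3Server` and `get_new_mail` are hypothetical models and do not reflect the real nsPop3IncomingServer or nsPop3Service interfaces.

```python
from dataclasses import dataclass


@dataclass
class Pop3Server:
    """Hypothetical model of an incoming server, not the real class."""
    name: str
    server_busy: bool = False


def get_new_mail(servers, start_session):
    """Start a session only for servers that are not already busy.

    Returns the names of the servers that were skipped, so no error
    dialog is needed for them.
    """
    skipped = []
    for server in servers:
        if server.server_busy:
            skipped.append(server.name)  # a check is already running; skip quietly
            continue
        server.server_busy = True  # mirrors "Setting server busy" in the log
        start_session(server)
    return skipped
```

As sketched, this only guards against a second check of the same server. Contention between different accounts over a shared inbox would need a separate check.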

Flags: needinfo?(mkmelin+mozilla)

(In reply to :aceman from comment #36)

It looks like we already postpone a server if it is busy, when automatically checking for new mail in the background, https://searchfox.org/comm-central/source/mailnews/base/src/nsMsgBiffManager.cpp#305

In that case, should we undupe Dan's manual checking bug 827897 and handle the manual checking issue there? (And manual checking apparently is also Steve's issue)

As for automatic checks, this bug, Giacomo wrote me last fall that "bug 707933's message appears only when I press "GetMail" while a checkmail process is already running" [i.e. bug 827897]. But could bug 383517 and/or bug 1155614 be related?

Flags: needinfo?(acelists)

Steve, your issue is only with manual checks, and not automatic checks? Also, I'm curious how you came to be here on the exact day I posted the needinfo requests - how did that happen?

Flags: needinfo?(steevo)

I mostly only do manual checks.
As to timing, I think google finds things that are active with a proper query.

Flags: needinfo?(steevo)

Wayne & Aceman... Some clarifications.

Manual: I presume you mean when I press GET MESSAGES (aka doing a GETMAIL ALL), this is a manual process. Likewise, that pressing GETMAIL to a specific account is manual.

So...

  • If I do GETMAIL (ALL) when GETMAIL (ALL) is already in progress, I often get the popup/hang because something collides. (This is an easy way to see if the problem exists if someone has several accounts, as Steve & I do.)
  • If I do a specific GETMAIL when GETMAIL (ALL) is already in progress, I sometimes get the popup/hang if and when the specific collides with the ALL.
  • You could argue that, since we're doing a manual GETMAIL, we're at the keyboard anyway. But this is not true if a lot of stuff is coming down. In any case, the popup stays politely behind the active screen (ie something else being worked on at the moment) so it probably won't be seen for some time -- or not at all if I get up and leave.

Automatic: I presume you mean one (or more) of the accounts internally does a GETMAIL poll according to its schedule.

So...

  • If I do a manual GETMAIL (ALL), I occasionally get the hang because something collides. I've always thought (and still do) that an automatic GETMAIL is the cause. This happens to this day. But I can't tell you whether the manual or automatic process is at fault.
  • On the other hand, it has been a very long time since I have seen a "spontaneous" collision, one where the popup/hang just spontaneously shows up because TB has internally polled an account that happens to be busy. But it does happen if the server is busy for a moment. See my Comment 23 (this case) and Comment 26 (manual poll - but had it been an automated poll at that instant, I'm sure it would have collided.)
  • My feeling is that TB is smart enough to not do two scheduled (automatic) polls simultaneously, so the issue wouldn't appear. But if the server is busy with the target account (Comment 26, an email coming in at the exact time of the poll), that timing is beyond TB's control and the collision (and popup/hang) will happen. The coincidental timing is unlikely, but it does happen, as I inadvertently demonstrated manually.

If I absentmindedly double click get mail, I get 17 popups that must be dismissed.
That is pointless. This should all spool. There is no valid reason why I, the user, should have to manually wait to click a button again, like if I am waiting for email to arrive.
The system should realize I clicked and spool that click for when it can be processed, like any other computer process would.

Additionally, I can set the get mail interval to 1 minute.
Let's say a large message is taking time to arrive. Big attachment, maybe.
Would the system throw an error if it's not done?

No, it should not, it should queue those get mails until they can be started.

If the system cannot start a new task when another is still in process it ought to be smart enough to know that and wait.
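
The "spool the click" idea could be sketched as a small queue that accepts requests at any time and coalesces duplicates. `GetMailQueue` is a hypothetical class used only for illustration. It assumes that a second request for an account that is already running or queued can simply be dropped.

```python
from collections import deque


class GetMailQueue:
    """Hypothetical queue that spools get-mail requests instead of rejecting them."""

    def __init__(self):
        self._pending = deque()
        self._queued = set()
        self.running = None

    def request(self, account):
        """Queue a check; duplicate requests for the same account are coalesced."""
        if account == self.running or account in self._queued:
            return False  # already running or queued: nothing to add
        self._pending.append(account)
        self._queued.add(account)
        return True

    def start_next(self):
        """Call when idle or when the current check finishes; returns the next account."""
        if self._pending:
            self.running = self._pending.popleft()
            self._queued.discard(self.running)
        else:
            self.running = None
        return self.running
```

With this design a double click on Get Mail adds nothing new to the queue, so there is no collision to report. Whether a repeated request for a running account should instead trigger one more check afterwards is a design choice left open here.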

This product has been in use for so many years, and yet no one has thought that tasks requested by the user or by settings, which cannot be started right away, should be queued. Instead the system just gives a popup that must be manually dismissed. That seems like bad architecture.

(In reply to Steve from comment #41)

If I absentmindedly double click get mail, I get 17 popups that must be dismissed.

Perhaps a regression, because long ago we fixed Bug 392680 - Clicking Get Mail twice or during startup (while first round not finished) returns modal "This folder is being processed. Please wait until processing is complete to get messages." dialog / error message [pop]

I think we have plenty of information to proceed.

Flags: needinfo?(giacomo.mazzini)

Perhaps a regression, because long ago we fixed Bug 392680...

Wayne, I started using Thunderbird in early 2010. Being new to the program and to the entire concept of using a non-Microsoft email product, I never really paid attention to the occasional popup, thinking it was just normal -- so I can't say whether I noticed one way or the other which aspect of the problem existed. That is until it got bothersome and I could see the implications, especially as my email account list grew, so I eventually filed a bug report. But I can say it looks to me that the problem in 392680 still exists today, and aside from timing (being at startup, or some other time), the visible symptoms are identical.

FYI, I usually have TB running continuously, but there are many times I have it closed, then open it only when needed. Therefore, I have all my accounts set for delayed startup, because:

  • sometimes I open TB just to check something, then close it immediately and carry on with my work.
  • if I'm pressed for time, the delay allows me to check a few important emails (or perhaps one specific account) without waiting for everything to come down.

So during startup, as I'm doing GETMAIL on a few selected accounts, suddenly I see a bunch of other mailboxes being polled one after another. I thought I must have hit GETMAIL (ALL), or the cursor did it by being off-focus or "between" the account menu lines. Since this happens so often (and I'm not all that clumsy), I eventually realized that it was more likely the startup run of all the mailboxes. Whatever, suddenly bang, there's a collision and a popup/hang on one of the accounts. Manual (by me checking a specific mailbox and colliding with startup)? Or automatic (by startup running all accounts and colliding with my specific mailbox)? Either way, the issue rears its head.

(In reply to Wayne Mery (:wsmwk) from comment #27)

Dan, was your testing with version 60?

Giacomo, you wrote that you think this is gone, was that with version 60?

Yes Wayne, I think it is gone. Maybe it hasn't been showing since last September (when I was running Thunderbird 52.9.1 64bit on MacOSX 10.13.6).

I can click twice on the get mail button and get 17 modal windows to dismiss.
It's here on that basis.
V60.5.0

Just to be clear in my Comment #44:
If I get "a" popup/hang, I usually get several, depending how far through the sequence when the first one hits. I currently have only 14 accounts included in GETMAIL (ALL), but I could have half of them hang waiting for "OK", one after another.
BTW, running 65.5.1

We're talking about error message popups. Not a problem with getting the mail.

It's a dumb computer, if I tell it to check 100 email accounts it ought to go away and do that as best it can without pestering me. Whoever got the idea that a window saying essentially "Check mail more slowly, I am busy for the next two seconds" was a good idea needs to be slapped down, a lot. I don't need the computer to try to train me to check mail at the rate it can do it. I just need it to do the best it can.

Comment 47: BTW, running 65.5.1

I meant 60.5.1 -- latest public release from 13-Feb-2019. (Firefox is v65.)

Comment 48: We're talking about error message popups. Not a problem with getting the mail.

Nobody is disputing that, Steve -- I too have been talking about the error message popups. They understand the problem: A collision happens, the error message pops up, and the mail fetch hangs (does nothing further) until acknowledged. They're trying to get a handle on the actual cause. (And they know it really should just time out and continue immediately -- if at all possible.)

Even though it seems like dumb design, there were probably good technical reasons why the process was originally implemented that way -- such as avoiding "deadly embrace" and other issues if multiple computers are polling the same account at the same time, or TB having to co-exist with other competing (possibly non-friendly) email clients which may be active simultaneously (perhaps even on the same machine!) -- hence the "if at all possible" disclaimer. (Considering TB was developed just after the end of the dial-up age, when users still paid penalties for overtime and excess traffic, the better option in some circumstances might have been to stop polling. The last thing you would want in a non-viable situation is a lot of repetitive polling traffic doing useless handshaking and chewing up your bandwidth.)

Hopefully the process can be changed, but they have to be sure they won't break it worse by fixing it.

Oh, I just want to point to the actual problem, which many seem to not see and feel it's fixed.
Maybe because the vast majority of people only have one account, so it doesn't happen.
But as Wayne said, this was supposed to be fixed and this is a regression. And it might be.

Just updated to the new TB 60.5.2 -- so I tried two overlapping concurrent GETMAIL (ALL) runs, even though Release Notes don't mention our issue being fixed. As expected, there were a few popup hangs (four) during polling of 14 x 2 accounts.

The process was a little slow waiting for my ISP's email server, as is typical for this time of day. But when I manually polled the four accounts that had had popups, one of them did in fact have an email come down which (according to the timestamp) arrived at the server around the time the account was being polled.

Just to be clear: When the mail server is slow to respond, TB waits patiently as it should, then continues properly and normally. Only when TB trips over itself -- or when an email is actually coming into the server (rarely, as it appears to have done just now) -- then the popup hang occurs. (And even though TB had already been running for awhile, there might have been the startup run in effect too, as TB had just restarted itself following the update...)

(In reply to :aceman from comment #36)

For manual checking, we could add a check for busy server e.g. at
https://searchfox.org/comm-central/source/mailnews/local/src/
nsPop3IncomingServer.cpp#729

Sounds reasonable.
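A minimal sketch of the kind of busy check suggested in comment 36, written in Python rather than in the C++ of nsPop3IncomingServer.cpp. The class `Pop3Server` and its methods are hypothetical stand-ins, not Thunderbird APIs; the only point is that a manual check on a server that is already being checked gets skipped quietly instead of raising an error.

```python
import threading


class Pop3Server:
    """Hypothetical stand-in for a POP3 incoming server object."""

    def __init__(self, name):
        self.name = name
        self._busy = threading.Lock()

    def is_busy(self):
        # True while a mail check is running for this server.
        return self._busy.locked()

    def get_new_mail(self, fetch):
        # Try to mark the server busy without waiting.
        if not self._busy.acquire(blocking=False):
            return "skipped"  # already being checked: no popup, no error
        try:
            fetch(self)  # the actual download, supplied by the caller
            return "checked"
        finally:
            self._busy.release()
```

With this shape, a second click on Get Messages while the first check is still running returns "skipped" and leaves the running check alone.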

Flags: needinfo?(mkmelin+mozilla)

OK, I have created a test build with an experimental patch. It is a build of TB 60.x plus patch at https://hg.mozilla.org/try-comm-central/rev/9bc7f82566d4a6f3475b887ee2911d7a3dedd358 .
It would be great if somebody could try it out.
For Windows 64bit, the package is at: https://queue.taskcluster.net/v1/task/X2DUU5y5R0eYnh76dt4dZg/runs/0/artifacts/public/build/target.zip .
You just unpack it somewhere and can run from there, it will not affect your existing TB installation, but will run on your existing TB profile (where all your data are). So you can directly test if the behaviour changes.

Flags: needinfo?(acelists)

testing of comment 53 needed

Flags: needinfo?(steevo)
Flags: needinfo?(dannyfox)

I downloaded the ZIP file from /rev/...358 and unpacked (unzipped?) it, but I can't see anything that looks executable. How do we run this patch to test it?

Use the other link to the zip file with the executable (assuming you are on Windows 64bit).

@aceman - Sorry, no, I'm using 32-bit (and Windows 7/Pro if that further restricts things)

Flags: needinfo?(dannyfox)

I am on windows 7 64 bit. I downloaded the second version, the 64 bit one. The installed version of thunderbird I am using is 32 bit.

I extracted the files and clicked thunderbird.exe got a UAC popup but nothing launches.

I found the plugin-hang-ui executable. Quit the other instance of tb and it launched.
I clicked get mail several times.
I still get the error popups.
Nothing is fixed so far.

Testing on Windows 7 / 32-bit: Aceman's daily unzipped OK (TB 60.6.2 Daily)

I clicked GETMAIL (All) twice in succession for the various tests, which polled 14 POP3 accounts. There were at least two different mail servers involved (at two distinct ISPs).

First tried my existing installation (TB 60.6.1 Release). Tested THREE passes. Each time I clicked GETMAIL (All) twice in a row, varying the delay between launches slightly each time from ~3 seconds down to click click.
First pass I got two busy popups. Second pass one popup. Third pass another two popups.

Then I tested the patch (TB 60.6.2 Daily). Tested FOUR passes. Three times I clicked GETMAIL (All) twice in a row, varying the delay between launches slightly each time from ~3 seconds down to click click.
First pass I again got two busy popups. Second pass one popup. Third pass another two popups.
For the fourth pass, I clicked GETMAIL (All) once and I was deliberately waiting to get about halfway through to start the second GETMAIL. About 10 seconds in, suddenly a busy popup occurred, without my second click. So either the program tripped over itself again, or an automatic timed poll went off and collided with my test pass.

I did notice that the hangs sometimes happened on the first of two getmail sessions within the pass, then shortly later watched the second one just sail right by. This could be an automatic poll interfering.

There also occurred an overlap of "No messages to download" while a poll was in progress -- this would be from an automatic poll that jumped in and ran -- erasing the current display from the current getmails running manually.

Symptoms overall are unchanged, so I agree with Steve -- this test fix hasn't resolved the problem. But thanks, Ace, let's try again.

As to this being a regression. I think I agree with that.

I am at a different location this week, and I have an older version of Thunderbird on a system here which has been checking only one account.

45.6.0

I installed 13 more accounts on it and I get none of these popups.
No problems at all.
It's nagging me to update, but as of yet I am resisting that. Heh.

I don't know what the changes were from that to later versions, or when the problem appeared again.

Flags: needinfo?(steevo)

I was complaining about this issue long before 45.6.0. My Bug 827897 back in January 2013 was for TB 17.0, wherein I stated the problem had already been occurring for a long time. I was polling about 13-14 accounts then too, pretty much the same ones I have today.

If your additional accounts were just newly installed, perhaps the underlying files are of a different type or format on that machine. Once upon a time (including my start with TB), all messages for an account went into one big file (with pointers to each) -- and that's how my TB operates today. But as I understand, a switch was made several versions ago to a format where each message goes into its own file.

I'm not sure you can mix the two, but I'm speculating that your isolated location had this newer version of TB, that it was set up differently from the start, and that it uses the newer format (which I believe is now the default for new installations of TB).

A system that relies on updating pointers to a file must have lockouts to prevent data collisions -- and lockouts are notoriously tricky and subject to tripping over themselves. The newer format would be cleaner and less prone to hiccups. I'm wondering if something in handling the old format is the cause of the problem?
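To illustrate the lockout idea, here is a rough sketch in Python (not Thunderbird code) of two deliveries competing for one shared Inbox file. The names `folder_lock`, `deliver` and `run` are made up for this example. With a fail-fast lock the second delivery reports "busy" straight away, which resembles the situation described in this bug; if it waits for the lock instead, both deliveries complete one after the other.

```python
import threading
import time

folder_lock = threading.Lock()  # one lock guarding the shared Inbox file


def deliver(account, seconds, wait_for_lock, log):
    if wait_for_lock:
        # Serialized access: wait up to 5 s for the other delivery to finish.
        got_lock = folder_lock.acquire(timeout=5)
    else:
        # Fail-fast access: give up at once if the folder is locked.
        got_lock = folder_lock.acquire(blocking=False)
    if not got_lock:
        log.append((account, "busy"))
        return
    try:
        time.sleep(seconds)  # stands in for writing messages to the file
        log.append((account, "delivered"))
    finally:
        folder_lock.release()


def run(wait_for_lock):
    # Start two deliveries at nearly the same moment and collect the outcomes.
    log = []
    threads = [
        threading.Thread(target=deliver, args=(name, 0.2, wait_for_lock, log))
        for name in ("A", "B")
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(status for _, status in log)
```

Calling `run(False)` models the fail-fast behaviour and `run(True)` the serialized one.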

Steve, it would be good to know the file format in use at the isolated location (and your home base too). Go to TOOLS, OPTIONS, ADVANCED, then "General" tab -- look in the bottom box for "Message Store Type for new accounts". Let us know if it is "File per folder (mbox)" [the old format] or "File per message (maildir)" [the new format].

If you are using the new format, it would be a good test if you could upgrade TB to current and let rip a bunch of GetMails. If that system still doesn't lock up, then we may be a few steps closer to a resolution.

So then -- as a workaround for old format users -- the seriousness of the popup window problem can be resolved by letting the popups expire and continue, without hanging forever waiting on a human response. (Subject of course to the previously mentioned caveats...)

The installation that does not exhibit the problem says
File per folder (Mbox)

I won't be back at the other location for a while, but that was an upgraded installation where I tried to import mail and settings from eudora. It was an older version that supported that, but it sure didn't go well at all. I upgraded it after that.

The installation that does exhibit the problem says the same
File per folder (Mbox)
I was eagerly going to change it when I got back which I did today, my hopes are dashed.
This installation still exhibits the problem.
I have no explanation. I did bring the 3 GB thunderbird folder from the other location, and I may put it in instead of this one where I am having problems, still. I do have lots of missing mail that was in Eudora that I could not import. The Eudora mbx files became corrupt at both locations, I suspect one bad email, but I have not been able to find it. Seems to have screwed up their pointers or something in the record structure. And mbx is so simple!

Does anyone have a suggestion for me?
File per folder (Mbox)
is set, and the collisions still occur.

For what it's worth, it seems that (for reasons unknown) running any single GETMAIL may take milliseconds or might take many seconds, depending on (what?) -- seems like it's my service provider responding quickly or not. I can't prove this, only by observation and time of day -- sometimes it's fast and sometimes it isn't. So, a GETMAIL (ALL) may take several millisecond-long fetches and work perfectly, or might take a few minutes in total, pausing for several seconds waiting for each response. Rarely if ever do I get a collision on the really fast services, but I often get one or more collisions when service is slow -- understandable since there is much more time for (and likelihood of) a spontaneous timed GETMAIL interrupting the otherwise sequential flow.

In an ideal world, all service requests would be handled quickly. In reality, things slow down. Either way, collisions can and do happen. Going back to a suggestion I made years ago and we're trying to bring forth now: Please, if a collision popup occurs, let it time out (ie- without requiring user intervention) so the GETMAIL process can continue and finish.
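What "let it time out and continue" could look like, as a sketch only: the function below is hypothetical Python, not anything from the Thunderbird source. `try_fetch` is an assumed callable that returns True on success and False when the folder is busy. Instead of a modal popup, the busy condition just waits a moment and tries again, and nothing waits on a human.

```python
import time


def fetch_with_timeout_retry(try_fetch, attempts=3, wait_seconds=2.0, sleep=time.sleep):
    """Call try_fetch() until it succeeds or the attempts run out."""
    for attempt in range(1, attempts + 1):
        if try_fetch():
            return f"fetched on attempt {attempt}"
        if attempt < attempts:
            # The "popup" simply expires after wait_seconds; no click needed.
            sleep(wait_seconds)
    return "gave up; will retry at next scheduled check"
```

The `sleep` parameter is only there so the waiting can be replaced in a test; the retry count and delay are arbitrary values picked for the example.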

Yeah, the problem is when I absentmindedly, or intentionally click get mail more than once.
Then I often get 10 modal windows that must be dismissed for the processes to continue.
Silly that this would be a part of an automated process, but for some reason someone put it in and it's a problem.
The solution would be to queue the get mail clicks and process them when possible, or ignore extra clicks, but for some reason none of that was done.
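A small sketch of the "queue the clicks, or ignore extra clicks" idea, in Python for illustration. `GetMailQueue` is a hypothetical class, not part of Thunderbird: a request for an account that is already waiting is dropped, and the queued accounts are fetched strictly one at a time so two fetches never touch the shared Inbox together.

```python
from collections import deque


class GetMailQueue:
    """Serializes mail checks and ignores duplicate requests (hypothetical)."""

    def __init__(self):
        self._pending = deque()
        self._queued = set()

    def request(self, account):
        # An extra click for an account that is already waiting is dropped.
        if account in self._queued:
            return False
        self._queued.add(account)
        self._pending.append(account)
        return True

    def run(self, fetch):
        # Process one account at a time, in the order requested.
        done = []
        while self._pending:
            account = self._pending.popleft()
            fetch(account)
            self._queued.discard(account)
            done.append(account)
        return done
```

Clicking Get Messages twice for accounts a, b, c would queue each account once, so `run` fetches a, b, c in order with no collisions and no dialogs.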

So what we have here is a user training mechanism: "Click more slowly, or we will annoy you until you do, or move on to another mail client."

This is to deter the human involved from telling the computer to do something too quickly, whereas I consider the sole purpose of any computer to do my drudgery for me. So I don't have to.

Anytime a developer involves the user in the computer's drudgery is a defect, and this sure is. A fairly significant defect.

But my question today is about why this installation acts differently than the other installation in this one regard, since they are both set the same, with that files per folder (mbx) setting.

As I said, I did bring the entire three gig thunderbird directory from the other location with me. I wonder if there is an INI file or something that I can replace here that would change the behavior? Or should I try to change the entire installation with the other directory from the other location that does not have this problem?

Do you know for a fact that the location without the problem is actually different? If the SETTING changes (ie- mbx vs not), that does not retroactively change the file structures that existed at the time of the change. Existing accounts will maintain their original file structure AFAIK. Perhaps they were set one way when created and the setting was changed afterward and now looks the same as the others but actually isn't the same underneath.

...when I absentmindedly, or intentionally click get mail more than once.

On this point, I think there is something else afoot here too. I often click specific accounts for GETMAIL and occasionally end up with a full GETMAIL of all accounts, of course then resulting in collisions and the busy popups. This is not an absent-minded click-more-than-once. Something triggers the full GETMAIL. Often it's the delayed start of imminent fetches when TB first starts up -- but not always (I've mentioned clicking "between" account menu lines in my Comment #44). Sounds like you might be getting this too, through no fault of your own. So either the focus area has gaps (allowing "all" as a default), or specific clicks randomly switch to GETMAIL ALL (for reasons unknown), or we're both extremely clumsy on the keyboard (not so likely).

I don't know for a fact. This installation was an import from Eudora that didn't go well.
And it works crappy.
I wonder if I can just replace the directory with the one from the other location? I can rename it and plop the other files in and the mail will all resync, I think. Mostly.

Not sure how to advise on this. I have only limited experience with hacking an import -- a POP3 file, actually -- but it went OK and the analogy may work for you.

I had a mailbox file ("Dan Email" for illustration) which I wanted to import. I just created a new account with that name ("Dan Email") so that it created the empty files likewise called "Dan Email". Then I substituted the file to be imported directly in place so TB would recognize it, "repaired" (rebuilt) the folder (since the internal index structure was obviously new & different), and it was done. TB was able to fully handle the file and its content.

I'm not sure how an imap folder of files will work, but I suspect as long as you create the new account so that the top file structure (folder) is in place, then fake it out by -- as you say -- plopping the other files in and re-syncing, I suspect it will probably work. (But you already have more experience than I do with this aspect...) If it doesn't work right away, at least try the "repair folder" option, which I suspect is what you mean by resyncing -- TB has to know what it's looking at. Good luck with it!

aceman, given the feedback, any ideas for a patch v2? Or, what more information is needed?

Flags: needinfo?(acelists)

FYI, still happening (exactly as previously described) on TB 68.1.0 and on 68.1.1 (just installed minutes ago).

And... TB happened to log a crash during its update -- there's a pop-under crash screen with an icon in the taskbar. This would have been the previous instance not closing off properly -- happens to me all the time on Firefox updates (my bug 1544001), but don't recall seeing it happen on TB.

Blocks: 168648

Dan, Do you also see what is described in bug 822625 ?

Flags: needinfo?(acelists) → needinfo?(dannyfox)

I never run with any accounts set to Check-on-Start-up because, when I start, sometimes I want to look at something specific and don't want a bunch of unrelated downloads in my way. But for the sake of testing, I turned on Check-on-Startup for several accounts -- all at the same ISP -- and opened several emails in those various accounts. Then I exited specifically using FILE, EXIT as per steps in 822625, and ran various tests.

Before starting anything, I needed to check my initial situation. I did two overlapping GET MESSAGES (ALL) four times in total -- 14 accounts at 3 ISPs included in the GET ALL MESSAGES list -- and got the popups 4x, 1x, 1x, and 3x. So this proves that the popups still happen (TB 68.10.0) when GETMAILs collide. Then I set up as above and began testing bug 822625. [[ Sorry, I can no longer find the option that used to say "Include this account in Get All Messages?" ]]

From now on, whenever I say "close" or "exit", I mean I used FILE, EXIT to close TB, then waited a couple of seconds before continuing.

First, with the various Checks-on-Startup set and no emails open, I closed and re-launched TB -- no popups. Repeated a few times, nothing.

Then I opened several emails representative of various accounts. I closed TB and re-launched -- there were no popups. I repeated four or five times -- no popups. So this seems to disprove bug 822625 (by having more accounts to open and trip over).

Finally, several times I launched TB and immediately launched GET MESSAGES (ALL) so that it overlapped the startup GETMAILs in progress. No surprise, I got the occasional popup. This proves two things: (1) as above, overlapping GETMAILs still collide, whether automatic or manual; and (2) my own observation of popups during any one single specific manual GETMAIL is in fact due to collision with automatic GETMAILs running in the background -- in this case several at once, but not the complete set as would happen when all items start up after a delay (ie- "not at startup").

If I was guessing, based on my experience with multiple accounts at multiple ISPs (documented above in this bug), it might be a matter of timing too. Even though bug 822625 has only two accounts, the ISP might be a tad slow responding to (and perhaps overlapping) both polls. Or if the accounts are at different ISPs the responses might return almost simultaneously. A quick test for bug 822625 would be -- with no emails open -- to click on GET MESSAGES (ALL) several times in quick succession and try to force a collision. With only two accounts repeating, it might take several tries, but... If a collision happens quickly, that might prove the popup has everything to do with timing or responses and little to do with emails being open (except that TB has to open the emails while it is polling the startups).

Flags: needinfo?(dannyfox)
See Also: → 288896
Blocks: 822625

(In reply to :aceman from comment #53)

OK, I have created a test build with an experimental patch. It is a build of TB 60.x plus patch at https://hg.mozilla.org/try-comm-central/rev/9bc7f82566d4a6f3475b887ee2911d7a3dedd358 .
It would be great if somebody could try it out.
For Windows 64bit, the package is at: https://queue.taskcluster.net/v1/task/X2DUU5y5R0eYnh76dt4dZg/runs/0/artifacts/public/build/target.zip .
Aceman, do you still have your patch?
You just unpack it somewhere and can run from there, it will not affect your existing TB installation, but will run on your existing TB profile (where all your data are). So you can directly test if the behaviour changes.

Whiteboard: [workaround, partial: comment 10] → [implement comment 36?][workaround, partial: comment 10]
See Also: → 929281

I just now noticed something different in my current TB installation (TB 78.6.0, running on Windows 7/32 Pro). As previously reported, I presently have 17 mailboxes for polling, 14 of which are triggered by the GET ALL MESSAGES button (aka GETMAIL).

Triggering two or more GETMAIL streams simultaneously (by pressing the Get Messages button two or more times in quick succession) has always caused the busy popup sooner or later and caused all further email fetches to hang until the popup is acknowledged. Now, if I trigger two GETMAIL streams, eventually one will hang with the popup waiting -- but the other continues to completion (nothing else to crash into). If I trigger three or more, one or more will hang waiting. Any uncrashed streams complete, followed by any others whose popups have been acknowledged.

So it looks to me that THIS version is on the way to a fix -- at least a fully uncrashed stream will complete without hanging whether or not a popup from some other stream is waiting. But the popups that do happen still need to time out and continue, as discussed above in this report and elsewhere.

(In reply to Dan Pernokis from comment #79)

So it looks to me that THIS version is on the way to a fix -- at least a fully uncrashed stream will complete without hanging whether or not a popup from some other stream is waiting. But the popups that do happen still need to time out and continue, as discussed above in this report and elsewhere.

How is version 91?

Flags: needinfo?(dannyfox)

It took me under two minutes -- far less time than it took to write this message -- to prove all things: running multiple GETMAIL streams will cause collisions and frozen streams, but ONE will finish unimpeded. (See caveat)

(1) Running two: One popup waits for acknowledgement and the other stream continues.
(2) Running three: Two popups are inevitable -- and both freeze until ack while the last one finishes. An ack'd popup will eventually finish, and the last waits for ack and finishes too. (See caveat)
(3) Running four (if I can click GETMAIL fast enough, because the mouse can't see it if focus is on a popup): As per three, the last finishes since no collisions are lurking; the second-last runs when ack'd; then the remaining stream completes. (See caveat)

CAVEAT: Also noticed in this quick test: When I was running four, and the last of the streams was polling, one of the timed automatic polls triggered and caused a popup. This is inevitable too, given how I see this thing working now. Whatever it is in the timing that makes a collision happen is still there.

So to answer your question, Wayne: 91.4.1 seems to be no different than 78.6.0 in Comment 79. It is nice that one stream (of multiples) will usually finish, but that has always been the case. Given a number of inboxes being polled, a timed automatic poll can and does trigger at any moment, causing a collision/popup in the GETMAIL stream and hanging the whole process. But at least it appears a given GETMAIL stream doesn't hang its brothers unless they too collide with something.

And as I concluded in Comment 79... "The popups that do happen still need to time out and continue, as discussed above in this report and elsewhere."

Flags: needinfo?(dannyfox)

(In reply to Petr Vones from comment #10)

I think I've found a workaround:

  • set the same checking interval for all POP3 accounts
  • set preference mail.biff.add_interval_jitter to false

What have others found - does this help you, or not help you??

FWIW jitter was added 14 years ago in Bug 235086 - add a bit of jitter to the biff-interval
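A rough illustration of what the comment 10 workaround changes, in Python. This is not the biff manager's actual code; the function name and the jitter range (here up to 5% of the interval either way) are assumptions chosen for the example. With equal intervals and jitter off, every account falls due at the same instant; with jitter on, the due times drift apart, so one account's timed check can land while another account's download is still running.

```python
import random


def next_check_times(start, interval, accounts, jitter_fraction, rng):
    """Return the next due time in seconds for each account."""
    times = {}
    for name in accounts:
        # Assumed jitter model: a random offset of up to +/- jitter_fraction
        # of the interval is added to the nominal due time.
        max_jitter = interval * jitter_fraction
        times[name] = start + interval + rng.uniform(-max_jitter, max_jitter)
    return times


rng = random.Random(1)
same = next_check_times(0, 300, ["a", "b", "c"], 0.0, rng)     # jitter off
spread = next_check_times(0, 300, ["a", "b", "c"], 0.05, rng)  # jitter on
```

Here `same` has all three accounts due at 300 seconds, while `spread` has three different due times within 15 seconds of that.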

(In reply to Dan Pernokis from comment #79)

I just now noticed something different in my current TB installation (TB 78.6.0, running on Windows 7/32 Pro). As previously reported, I presently have 17 mailboxes for polling, 14 of which are triggered by the GET ALL MESSAGES button (aka GETMAIL).

What settings were you using? jitter disabled, or enabled?

Flags: needinfo?(dannyfox)

I can't find anything about jitter in User Preferences, nor do I see anything in the Troubleshooting Info area. So my guess is it's doing whatever the default (undefined) condition is. I checked both my PC Tower and PC Laptop, both running Windows 7/Pro (32-bit). TB = 96.6.1 Standard user release.

FYI, The Tower runs the GETMAIL sequence fairly quickly all the time, whereas on the PC Laptop, there are often long pauses during each poll, especially later in the day, somewhat as described in Bug 235086. Maybe the local defaults are different? Or the jitter calculates differently on the two machines? Or just another red herring?

[ And an update: Currently 15 of 17 mailboxes are now polled automatically or by GET ALL MESSAGES (aka GETMAIL). ]

Flags: needinfo?(dannyfox)
See Also: → 1763491

After at least 11 years, this problem still exists right now in TB 91.11.0, and I have no reason to suspect it would have been fixed in 102. The fact that Bug 1763491 sees the screen means it still happens in some form or other. But I can't test it because using the GET MESSAGES button seems to trigger multiple downloads thereafter until relaunch, so for now I've disabled all accounts from linking to the GM button.

Today I updated to TB 102.0.3 Release and re-enabled my GET MESSAGES button configuration. This bug still occurs.

I'm getting the occasional popup warning that certain accounts are "busy". Happened when I clicked GET MESSAGES first time and the timed fetches started too. In fact, for the first time I had two popups simultaneously -- one immediately on top of the other -- as the two accounts competed for attention, and a total of three popups altogether (out of ten accounts actively polled on this machine).

(In reply to Dan Pernokis from comment #86)

Today I updated to TB 102.0.3 Release and re-enabled my GET MESSAGES button configuration. This bug still occurs.

I'm getting the occasional popup warning that certain accounts are "busy". Happened when I clicked GET MESSAGES first time and the timed fetches started too. In fact, for the first time I had two popups simultaneously -- one immediately on top of the other -- as the two accounts competed for attention, and a total of three popups altogether (out of ten accounts actively polled on this machine).

I take it it's not better even with 102.2?

In fact I upgraded to TB 102.2.0 just last night, and I've re-activated all relevant accounts for GET MESSAGES with the intention of checking for duplicate message downloads. So far so good, no dups, but there were some issues with REPAIR (separate problem).

I just now tested for the lock-ups re-occurring -- and yes they still do. I pressed GET MESSAGES and within the first few accounts I had three collisions and resulting popups, including (now for the second time ever) simultaneous popups. That means Stream A blocked Stream B, while Stream B blocked something in Stream A -- as unlikely as that sounds, but now I've seen it twice!

But something hasn't completed -- the progress bar still shows something being polled for a long time now, and it has happened a few times recently. This is a new bug I've discovered, still to be reported -- the only workaround is to close & relaunch TB.

(And I saw your comment about error console in Bug 707933 -- thanks, but I've run out of time right now -- I'll try it later today and get back to you...)

(In reply to Dan Pernokis from comment #88)

In fact I upgraded to TB 102.2.0 just last night, and I've re-activated all relevant accounts for GET MESSAGES with the intention of checking for duplicate message downloads. So far so good, no dups, but there were some issues with REPAIR (separate problem).

I just now tested for the lock-ups re-occurring -- and yes they still do. I pressed GET MESSAGES and within the first few accounts I had three collisions and resulting popups, including (now for the second time ever) simultaneous popups. That means Stream A blocked Stream B, while Stream B blocked something in Stream A -- as unlikely as that sounds, but now I've seen it twice!

But something hasn't completed -- the progress bar still shows something being polled for a long time now, and it has happened a few times recently. This is a new bug I've discovered, still to be reported -- the only workaround is to close & relaunch TB.

So, still happening but safe to say it's improved a tad? I think there's been a lot of POP3 work done between 102.0.3 and 102.2 so that may have helped out somewhat.

I think I read that you have like 14+ POP3 accounts across two ISPs, right? Now making me wonder if your Internet provider is throttling you somehow seeing 14 accounts suddenly polling for new mail on, I assume, the same port and saying "whoa!" and it possibly setting off triggers or alarms on an IDS or some such similar appliance. Do you have something like 13 Gmail accounts and 1 other account?

Another thought just popped into my head. I'm on IMAP so I don't know if this setting holds true for POP3 as well but can you try something for me out of curiosity?

  1. Right-click one of your accounts and choose Settings.
  2. Click on Server Settings.
  3. Click the Advanced button.
  4. If "Maximum number of server connections to cache" is there and it shows the default of 5, try setting it to 1 for all the accounts. Do note the setting for each account just in case you have to revert back and it's not 5 for all accounts.

I'm just thinking here that maybe the ISP is seeing 10+ accounts caching up to 5 connections (possibly 50 connections in total) and ain't happy about it? It's a stretch but worth exploring.

(And I saw your comment about error console in Bug 707933 -- thanks, but I've run out of time right now -- I'll try it later today and get back to you...)

No worries. I appreciate your follow up here!

(In reply to Arthur K. [He/Him] from comment #89)

So, still happening but safe to say it's improved a tad? I think there's been a lot of POP3 work done between 102.0.3 and 102.2 so that may have helped out somewhat.

Actually I think it is worse. The pop-up "busy" screens still happen, and the polling of accounts takes at least as long. There are weird pauses between too. For example, when I click GET MESSAGES, the new/resurrected Progress Bar shows a "scan" pattern while polling an account, then quickly blips to the next and we see ONLY the scanning progress bar (with no status like "Checking...") for awhile before the next account comes up. If I manually poll one account and immediately another, the first is quickly checked while the second shows the scanning bar and nothing else (no status) for many seconds until the poll completes normally. If I click three accounts, there are similar pauses etc (and occasionally-but-rarely a popup busy message). On the other hand, if I pause several seconds between accounts, they each get polled & checked immediately and independently (ie- no "scanning progress bar without status" situations and no pausing during the poll sequence).

And then there is a new bug I have to document further before reporting formally: I've had two instances now where a manual poll has resulted in a status msg downloading, say, 2 of 4 messages, and the progress bar scans forever. I can manually poll other accounts OK, but the one is hung (with PB scanning forever). Checking network activity at the time shows little or no activity attributable to polling and downloading a message. I have to quit & relaunch TB to abandon the poll. TB shows first of the four messages, and a re-poll brings down 2, 3 and 4 without further issues -- and they're all KB-sized text messages, nothing huge. And at least one of these occurrences happened on TB 102.2.0 during its first test!

I think I read that you have like 14+ POP3 accounts across two ISPs, right? Now it's making me wonder if your Internet provider is throttling you somehow, seeing 14 accounts suddenly polling for new mail on, I assume, the same port and saying "whoa!", possibly setting off triggers or alarms on an IDS or some similar appliance. Do you have something like 13 Gmail accounts and 1 other account?

Nice thought, but no. On the laptop (running TB 102), there are 12 accounts. One is GMAIL, not active and not being polled -- won't connect for new security reasons not relevant here. One is "on-demand only" -- it often contains spam and I poll it manually only when I want to bring stuff down. Of the remaining TEN accounts then, four are with a regional ISP who also provides my actual Internet connection and these four personal email accounts. The other six are with a regional ISP that provides website hosting and the emails for them -- different logical channels for each, I think, but in any case all farmed out to Amazon. The crux of my counter-argument is this: On my Tower (hard-wired to my router and running TB everything up to 98.13.0 now and never installing TB 102), I have these same 12 accounts and a handful more (total ~16). I can do a GET MESSAGES and ALL of them will be polled in seconds, almost never getting collisions unless a timed poll happened to occur during the sequence. But if I click GM twice, I usually WILL get one or sometimes two consecutive collisions. I have never seen overlapping/concurrent collisions as I have just seen now twice under TB 102. Previously on the Laptop (connected via WiFi to the router) likewise running TB up to 98.9.0, the polling sometimes had mild pauses between accounts and would take a minute or more to poll all and also likewise got one or more collisions as per the Tower. But since TB 102 (up to current TB 102.2.0), two ticks on GET MESSAGES produce weird pauses between accounts and WILL almost certainly give multiple consecutive collisions & popups -- and now overlapping/concurrent collisions too.

Another thought just popped into my head. I'm on IMAP so I don't know if this setting holds true for POP3 as well but can you try something for me out of curiosity? (Re: "Maximum number of server settings to cache")

There is no such setting in POP3. (And in any case is same counter-argument as above...)

But thanks for the suggestions, Arthur -- I'll keep poking away at this...

(In reply to Dan Pernokis from comment #90)

There is no such setting in POP3. (And in any case is same counter-argument as above...)

But thanks for the suggestions, Arthur -- I'll keep poking away at this...

Thanks for the reply and detailed info. I'm a user here, not a dev. So if things are/were fine in 91.13.0 (I think that's what you mean, not 98.13.0), then maybe 102 introduced something that's causing racy behavior and collisions like you're seeing. Just my hunch.

Yes, Arthur, I meant TB 91.13.0 (and 91.9.0). Thanks.

But look at the date of THIS bug... This collision issue has been happening for the past 11 YEARS! There was only a minor improvement in the past year, nothing to do with TB 102. And as I said, TB 102 seems to have introduced some new wrinkles that make the situation worse.

Just adding that this is still happening in the current version, which is 115.1.0 as I write this.

Cheers!

See Also: → 1819566