Closed Bug 790968 Opened 12 years ago Closed 7 years ago

Downloads multiple copies of emails from POP server

Categories

(MailNews Core :: Networking: POP, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: trevmrgn+bug, Unassigned)

User Agent: Mozilla/5.0 (Windows NT 5.1; rv:15.0) Gecko/20100101 Firefox/15.0.1
Build ID: 20120905151427



Actual results:

Thunderbird decided to download messages it had already downloaded, from multiple POP servers, multiple times.

This has been happening occasionally for some time.

The accounts are set up to leave messages on the server for 30 days (to allow reading on multiple computers)
One account has filtering, another doesn't.  Other accounts didn't have a problem.

Today it happened twice in a row - whilst I was deleting all the duplicates, it started downloading everything again (presumably on the regular checking schedule).  Two out of five accounts were affected.

I have previously suspected that it was related to compacting folders.  But I haven't said okay to that for a while.
Is your disk full? Is the server dropping the connection occasionally?

There is a file called popstate.dat in the folder where your emails are stored (you can find that folder in account settings -> Local directory field). You can open it with a text editor like Notepad while TB is not running. It contains the IDs of all messages that are still on the server and that TB has already downloaded. You can watch that file while the problem occurs. In TB, View Source (Ctrl+U) on a message shows its ID in the X-UIDL header. So try to see whether the file gets reset to 0 size on any occasion (it should not be if you keep msgs on the server for 30 days), or which messages have the problem.
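To make the file easier to inspect, here is a minimal Python sketch that lists the entries in popstate.dat. The layout is an assumption, not something confirmed in this bug: header lines starting with `#`, a `*host user` line, and entry lines whose last two whitespace-separated fields are the UIDL and a Unix timestamp. The names `PopstateEntry` and `parse_popstate` are made up for illustration.

```python
from typing import List, NamedTuple


class PopstateEntry(NamedTuple):
    uidl: str           # value that should match the message's X-UIDL header
    downloaded_at: int  # assumed to be a Unix timestamp


def parse_popstate(text: str) -> List[PopstateEntry]:
    """Parse the text of a popstate.dat file (layout assumed, see above)."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, '#' header lines and the assumed '*host user' line.
        if not line or line[0] in "#*":
            continue
        fields = line.split()
        # Ignore anything that does not end in "<uidl> <digits>".
        if len(fields) < 2 or not fields[-1].isdigit():
            continue
        entries.append(PopstateEntry(fields[-2], int(fields[-1])))
    return entries
```

Read the file only while TB is closed, then pass its text to `parse_popstate` and compare the UIDLs with the X-UIDL headers of the duplicated messages.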
I had a similar problem some months ago (actually, it lasted for months) and in the end it 'solved itself'. It appears to have been the mail server, because around the time the problem stopped we received a new webmail version (based on a different back end).

The problem was that the mail ids on the server changed constantly, so for popstate.dat the mails were different (even if they had the same content).
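A simplified model of why changing IDs cause repeat downloads, written in Python. This is only a sketch of the comparison described here, not Thunderbird's actual code, and `messages_to_download` is a hypothetical name.

```python
def messages_to_download(server_uidls, known_uidls):
    """Return the server UIDLs that are not recorded locally, in server order."""
    known = set(known_uidls)
    return [uidl for uidl in server_uidls if uidl not in known]


# Stable IDs: nothing is fetched again.
print(messages_to_download(["a1", "a2"], ["a1", "a2"]))
# Server reassigned the IDs: the same two mails look new.
print(messages_to_download(["b1", "b2"], ["a1", "a2"]))
```

The same model also covers an emptied popstate.dat: with no known IDs, every message on the server counts as new.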
> The problem was that the mail ids on the server changed constantly, so for
> popstate.dat the mails were different (even if they had the same content).

Yes, that is also one explanation, which I even experienced myself in the past.
I've had a look at popstate.dat.
For the two accounts that had duplicates, the file creation time of popstate.dat exactly matches the 'From' timestamp in the source of the last duplicate.
For unaffected accounts the creation time is in July (possibly the last time I saw the problem).
All entries in the affected files have a similar timestamp (I presume that's what the other number is).

All copies of the same email have the same X-UIDL.

The affected accounts are with different providers and the format of the X-UIDL is different, so I imagine they are running different server software.

I notice that the popstate.dat files are touched or rewritten every time it checks for mail, even if there are no changes.  That seems a waste / performance hit / opportunity for corruption.

On an account that has had no recent activity the popstate.dat file has no entries, but still has the header, a creation date in July and a recent modification time.
Great, now you are familiar with the file. Yes, the second value on each line is the timestamp of when the message was downloaded (it is used for deleting expired messages on the server after given number of days).

So now you can have TB running and leave the folder with the popstate.dat file open beside it. After TB downloads new messages, see if the file is suddenly truncated (to empty with just the header) or 0 bytes. After that happens, the next attempt to download new messages will download all that are on the server.
Then try to remember if there was some strange behaviour in TB/network that could have caused the resetting of the file.
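If watching the folder by hand is impractical, the check can be scripted. The Python sketch below takes a snapshot of the file and flags the case where it went from having entries to being empty or header-only. It assumes that header lines start with `#` and the account line starts with `*`; the function names are made up for illustration.

```python
def snapshot(path):
    """Return (size_in_bytes, entry_count) for a popstate.dat-like file."""
    with open(path, "rb") as f:
        data = f.read()
    lines = data.decode("utf-8", errors="replace").splitlines()
    # Count lines that are neither blank nor (assumed) '#'/'*' header lines.
    count = sum(
        1 for line in lines
        if line.strip() and line.lstrip()[0] not in "#*"
    )
    return len(data), count


def looks_reset(before, after):
    """True if the file had entries before and is now empty or header-only."""
    _, old_count = before
    new_size, new_count = after
    return old_count > 0 and (new_size == 0 or new_count == 0)
```

One way to use it is to call `snapshot` every few seconds in a loop and log the time whenever `looks_reset(previous, current)` is true, then compare that time with network drop-outs or other odd behaviour.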

There is an extension for removing dupes:
https://addons.mozilla.org/sk/thunderbird/addon/remove-duplicate-messages-alte/?src=search

But I am not sure how it behaves with respect to deleting messages on the server. Be sure to have the option "leave messages on server until I delete them" (in account settings -> server settings) unchecked while you use it to remove duplicate messages.
Trevor, have you seen this again since?

> I notice that the popstate.dat files are touched or rewritten every time it checks for mail, even if there are no changes.  That seems a waste / performance hit / opportunity for corruption.

It is necessary.
Flags: needinfo?(trevmrgn+bug)
It happened again on October 15th. Creation date of popstate.dat changed to that date. Nothing unusual noticed with the file or network at the time (although internet drop-outs are not uncommon and may have occurred on the previous connection attempt).
Flags: needinfo?(trevmrgn+bug)
While it has been months since I had this problem (don't know how I can find back the ticket number), I agree that it's a pain that Thunderbird's mail recognition is not more robust than it is.

Maybe this component could be redesigned with the assumption that sometimes, mail servers are not reliable enough to 'remember' their unique mail identifiers (which at least for me seemed to be the problem, since the problem was solved after a backend migration of the mail providers).
(In reply to Rob from comment #8)
> While it has been months since I had this problem (don't know how I can find
> back the ticket number), I agree that it's a pain that Thunderbird's mail
> recognition is not more robust than it is.

It might be useful to know whether there is a client that works as you suggest. Because Thunderbird is working according to spec.

If the server isn't working to spec, how is the client to work correctly? TB shouldn't be making "guesses" that message X was already downloaded.
(In reply to Rob from comment #9)
I've read through bug 461195
The issue I am seeing is different from what Rob reports there.
* I see the problem on multiple accounts at the same time - different servers from different hosting companies using different software, downloading into different folders (and different popstate.dat files).
* Duplicates have the same X-UIDL.
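
For anyone who wants to confirm the second point on their own folders, a small Python sketch that counts how often each X-UIDL value occurs in a collection of messages. It relies only on the X-UIDL header mentioned earlier in this bug; `duplicates_by_uidl` is a hypothetical helper name.

```python
from collections import Counter


def duplicates_by_uidl(messages):
    """Map each X-UIDL value seen more than once to its number of copies."""
    counts = Counter()
    for msg in messages:
        uidl = msg.get("X-UIDL")
        if uidl:
            counts[uidl.strip()] += 1
    return {uidl: n for uidl, n in counts.items() if n > 1}
```

Assuming the local folder is a plain mbox file, the messages could come from Python's `mailbox.mbox(path, create=False)`, opened on a copy of the folder file while TB is closed. If every duplicate shows up here with a count above 1, the server IDs were stable and the local record of downloaded messages is the more likely culprit.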
Wayne: I know, but since T'bird already contains a feature that allows searching for duplicate mails... the same logic could be re-used?

Trevor: Sorry, I didn't realize this last thing (duplicates have the same X-UIDL). I think that was different for me. In that case, I'll retreat from the discussion :)
Do you still encounter this issue?

I'm not finding a similar issue in open bug reports https://mzl.la/2jqnZIk
Flags: needinfo?(trevmrgn+bug)
Flags: needinfo?(rob.smeets)
Whiteboard: [closeme 2017-12-15]
See Also: → 255745
(In reply to Wayne Mery (:wsmwk) from comment #13)
> Do you still encounter this issue?

No.  I don't remember seeing this for years.

Reviewing our previous comments, I notice that popstate.dat now always has identical creation and modification times.
Flags: needinfo?(trevmrgn+bug)
I haven't had my (different) issue for years either.
Flags: needinfo?(rob.smeets)
Thanks for the quick updates
Whiteboard: [closeme 2017-12-15]
Status: UNCONFIRMED → RESOLVED
Closed: 7 years ago
Resolution: --- → WORKSFORME