Closed Bug 240049 Opened 20 years ago Closed 17 years ago

Duplicate Messages Received with Pop3-Accounts and option "Leave Mail on server"

Categories

(MailNews Core :: Networking, defect)

defect
Not set
major

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: bart, Assigned: Bienvenu)

References

Details

(Keywords: verified1.8.1.5)

Attachments

(3 files, 2 obsolete files)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.0; de-AT; rv:1.7b) Gecko/20040316
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.0; de-AT; rv:1.7b) Gecko/20040316

Since version 1.5 of Mozilla I have been receiving duplicate messages on a
POP3 account where the option "Leave Mail on server" is activated
(as described here:
http://kb.mozillazine.org/index.phtml?title=Thunderbird_:_Issues_:_Duplicate_Messages_Received
for Thunderbird)

Reproducible: Sometimes
Steps to Reproduce:
1. Check email and receive a message
2. Later, check email again and receive the same message again

Actual Results:  
Duplicate Messages Received
Yep, a log would be interesting. And also the content of the popstate.dat for
this account before and after getting a duplicate mail.
Please create attachments for both (see above), no inline paste.

I'm currently working on two bugs (bug 156998 and bug 238087) which can cause
the effect you're seeing. But I want to be sure before duping this here.
Is your popstate.dat file hidden? See Bug 242736.
I have hit the same problem several times (especially when Mozilla crashes
because of some irregularity).
It generates thousands of dupes, and so I'm running out of disk space.

Is there any way or tool for cleaning up these dupes?
(In reply to comment #4)
> I have hit the same problem several times (especially when Mozilla crashes
> because of some irregularity).
> It generates thousands of dupes, and so I'm running out of disk space.

I can only assume that your popstate.dat gets corrupted. But the only situation
where a crash can cause that is while the file is being written, shortly after
a receive finishes.
As with all such problems, I need more information, and the popstate.dat after
such a crash, to verify.

> Is there any way or tool for cleaning up these dupes?

No really easy one that I know of. But you should read the document linked in
the original post.
This is the most annoying bug I experienced with Mozilla Mail.

This bug has used up hundreds of MBytes on my hard disk for thousands of
duplicate mails. I experience it about once a week, with around 2000 mails
waiting on my POP server (of which maybe 50% are spam and most of the rest are
from several mailing lists).
This bug prevents me from recommending Mozilla Mail to other users and makes me
think about looking for another mail client that handles this issue.

I don't really know the mail database system, but purging dupes should be
possible if each message gets its timestamp as a unique id, so that after
sorting by it the dupes could be identified (a much easier job than
filtering spam).

If I instead have to go through all my messages and folders, it will take at
least half a day to clean out the dupes.

As long as this bug keeps producing masses of dupes, Mozilla Mail remains
unsuitable for business use.

(PS: I experienced this bug on Win2000 as well as on XP)
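
The purge of duplicates suggested in the comment above can be sketched roughly as follows. This is only an illustration in Python, not a Mozilla tool: the function name find_duplicates is hypothetical, and it assumes that the Message-ID header (falling back to Date plus Subject) is a good enough key, rather than the timestamp the comment proposes.

```python
from email import message_from_string


def find_duplicates(raw_messages):
    """Return the indices of messages whose key was already seen earlier."""
    seen = set()
    duplicates = []
    for index, raw in enumerate(raw_messages):
        msg = message_from_string(raw)
        # Assumption: Message-ID identifies a message; otherwise use Date + Subject.
        key = msg.get("Message-ID") or (msg.get("Date"), msg.get("Subject"))
        if key in seen:
            duplicates.append(index)
        else:
            seen.add(key)
    return duplicates


inbox = [
    "Message-ID: <a@example.org>\nSubject: hello\n\nbody one\n",
    "Message-ID: <b@example.org>\nSubject: list post\n\nbody two\n",
    "Message-ID: <a@example.org>\nSubject: hello\n\nbody one\n",
]
print(find_duplicates(inbox))  # prints [2]
```

The first copy of each message is kept and only later copies are reported, so a user could review the list before deleting anything.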
Gerold, do you think the bug goes away if you complain about it? Helping us
find the cause of already downloaded messages being fetched again would be far better.

-What version of Mozilla are you using? Did it happen with previous version too?

-What provider/server is it you're using? Or does this happen on various accounts?

-If you get these dupes once a week, are the (now twice) downloaded messages
still on the server after getting them? Or are they deleted while getting them
(even though "leave messages on server" is on)? If they don't get deleted, I guess
the number of messages downloaded increases with each such incident.

-What I wrote in comment #2 is also valid for your case. But I know that it's
hard to have the popstate.dat from before getting a dupe, since you never know
when it will happen.
But if once a week is fairly reliable, copying the popstate.dat to a safe place
before getting messages when the time comes near could do the job.
Also the log of a normal get and of a dupe get would be interesting. You could
create a log right now and save it to another place. And then only start Mozilla
with logging enabled using the batch file until it captures a dupe get. Then
quit Mozilla and save that log away as well.
You can then attach these two popstates and two logs here or mail them to me.

Here you can find instructions on how to create a communications log:
http://www.mozilla.org/quality/mailnews/mail-troubleshoot.html#pop
Product: MailNews → Core
*** Bug 249725 has been marked as a duplicate of this bug. ***
perhaps a dup of 207109?
Still a current issue.

Note: Mail is NOT being downloaded every time it checks for it. So NO dupe of
bug 207109.

Just happens after crashes or user initiated program termination.

Using Thunderbird 1.0.6 (20050716)

Questions:
What is the actual status about this?
What is the technical background of the problem?
Isn't there a similarity checker that compares the subject and body of mails for
congruence and enables the user to remove dupes? -> Especially important for most
users with old versions, to at least purge the inbox.
Yippieh! I found it!

For all fellow sufferers:
https://addons.mozilla.org/extensions/moreinfo.php?application=thunderbird&id=956
(removes duplicate messages semi-automatically, with user interaction)
Found via (though not easily):
http://kb.mozillazine.org/Duplicate_messages_received

So happy! Yeah, yeah yeah!

Good Night! :>

[SCNedit]
After abnormal termination of the POP3 connection, on the next mail fetch my Thunderbird 1.0.7 will sometimes download all messages from the POP3 server, including ones already downloaded many days ago.

"Abnormal termination of the POP3 connection" includes things like momentary loss of WiFi connection, so this is a pretty major issue.
Dupe of bug 188190?  See also bug 92973 comment 15.
Summary: Duplicate Messages Received with Pop3-Accunts and option "Leave Mail on server" → Duplicate Messages Received with Pop3-Accounts and option "Leave Mail on server"
Assignee: sspitzer → bienvenu
Status: UNCONFIRMED → NEW
Ever confirmed: true
This code looks wrong, so I've changed it - if we haven't finished listing the messages on the server, we shouldn't flush popstate.dat, since it'll be wrong.  Other places in the code that flush popstate.dat do the same check.

I still think there are other situations where dropping the connection/stopping the download cause us to redownload all messages but I'll need to debug that some more. It's hard because my isp locks me out for 10 minutes if the connection is dropped!
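
The guard described above can be modelled in a few lines. This is a simplified Python sketch, not the actual patch: the class Pop3Session, its methods, and the "kept"/"new" status strings are all hypothetical stand-ins, and saved_state merely plays the role of popstate.dat.

```python
class Pop3Session:
    """Toy model of one POP3 session; not the real Mozilla protocol code."""

    def __init__(self, saved_state):
        self.saved_state = dict(saved_state)  # stands in for popstate.dat on disk
        self.listed = {}                      # UIDLs seen in this session's listing
        self.list_complete = False

    def on_uidl_line(self, uidl):
        # Carry over the stored status for known messages; others are new.
        self.listed[uidl] = self.saved_state.get(uidl, "new")

    def on_list_finished(self):
        self.list_complete = True

    def flush_state(self):
        """Write the listing to 'disk' only if the listing completed."""
        if not self.list_complete:
            return False  # a partial listing would make the saved state wrong
        self.saved_state = dict(self.listed)
        return True


session = Pop3Session({"uidl-1": "kept", "uidl-2": "kept", "uidl-3": "kept"})
session.on_uidl_line("uidl-1")  # the connection drops after the first line
flushed = session.flush_state()
print(flushed, sorted(session.saved_state))
```

Without the list_complete check, the flush would overwrite three known entries with a single one, and the other two messages would look new on the next fetch.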
Attachment #240230 - Flags: superreview?(mscott)
Comment on attachment 240230 [details] [diff] [review]
handle case where we lose the connection listing the messages

clearing request - I think I need to do more than this...
Attachment #240230 - Flags: superreview?(mscott)
Comment on attachment 240230 [details] [diff] [review]
handle case where we lose the connection listing the messages

I think this will be an improvement - I'm not sure if there's still a problem that occurs if we lose the connection while downloading headers (as opposed to just listing the uidls, which this should fix). I was not able to reproduce a problem losing the connection while downloading headers in my simple tests, but the code is complicated...In geopilot's case of 20,000 messages on the server, losing the connection while listing all 20,000 is not unlikely.
Attachment #240230 - Flags: superreview?(mscott)
Attachment #240230 - Flags: superreview?(mscott) → superreview+
I've checked that patch into the trunk and 2.0 branch. 
I just had this happen to me when there was an error retrieving a partially downloaded message because another client had the inbox locked. The RETR command failed, and we displayed the error message given by the server, and then the next time through we downloaded all the messages on the server. I'll try to look at that control flow...
Dup of 237131?
I have encountered one way to get two pop3 urls running at the same time (this is a recipe for various kinds of disaster, including, I think, corrupting popstate.dat). If you try to do a full download of a partially downloaded message while a mail retrieval is going on, you'll have issues. So I changed the code to check if the server is busy, and not do the partial download, if so. I also display an error message in either case, if the user clicks get new mail while a full download is going on, or clicks the full download link while a new mail retrieval is going on.

One visible change will be that if you do a get new mail while a get new mail is currently going on, we will alert the user - background new mail retrieval won't pop up the alert, because there's no msg window associated with the url. This might be annoying, but I think it's better than silently ignoring it.
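
The busy check described above might look roughly like this. It is a Python model under stated assumptions, not the checked-in code: Pop3Server, start_url, finish_url and the has_msg_window parameter are hypothetical names chosen for illustration.

```python
class Pop3Server:
    """Toy model: allow only one POP3 operation (url) at a time per server."""

    def __init__(self):
        self.busy = False
        self.alerts = []  # messages that would be shown to the user

    def start_url(self, action, has_msg_window=True):
        if self.busy:
            # Background checks have no message window, so they stay silent.
            if has_msg_window:
                self.alerts.append(f"Cannot {action}: the server is busy")
            return False
        self.busy = True
        return True

    def finish_url(self):
        self.busy = False


server = Pop3Server()
first = server.start_url("get new mail")
second = server.start_url("download the rest of the message")
background = server.start_url("get new mail", has_msg_window=False)
server.finish_url()
after = server.start_url("get new mail")
print(first, second, background, after, len(server.alerts))
```

Refusing the second operation outright, instead of queueing it, keeps two urls from ever touching the saved state at the same time.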
Attachment #253767 - Flags: superreview?(mscott)
Attachment #253767 - Flags: superreview?(mscott) → superreview+
attachment 253767 [details] [diff] [review] checked into trunk and branch.
Hi David,

I see this bug isn't marked as fixed, even though you checked in your patch, so I'm not sure if that means the symptoms should still occur or not. Anyway...

A couple of times lately, I've had this problem occur immediately after the connection to my mail server timed out, even though I'm using latest trunk builds. Should this still happen?
Or am I seeing something else related to bug 237131? (or are both bugs really
the same thing?)
I suspect there are multiple issues, but I don't know for sure. Did your connection timeout in the middle of retrieval? A pop3 protocol log of a session where this happened might be useful:

http://www.mozilla.org/quality/mailnews/mail-troubleshoot.html#imap
(In reply to comment #26)
> Did your connection timeout in the middle of retrieval?

Yes, it had downloaded several messages before stalling and timing out. The duplicate downloads started happening as soon as I tried to get mail again.

Thanks for the link to the logging method. I'll try this tomorrow. All the computers in our area at work share a (buggy) wireless link to the main building. Incoming data are frequently lost mid-download when multiple computers are generating traffic at the same time... web pages just hang and never complete, mail downloads stall mid-download on very large mails, etc. So it should be easy to reproduce.
OK, I see what's going on here, at least in one instance. We're looping through the known messages on the server, calling GetMsg and downloading the ones we need to download. For each message, we copy the uidl info from m_pop3ConData->uidlinfo->hash into m_pop3ConData->newuidl. If a msg download fails, we end up calling commit state. Commit state checks if newuidl is non null, and if so, uses it instead of m_pop3ConData->uidlinfo->hash. But if the message we're downloading is in the middle of the list, then newuidl is only partly populated.

I'm not sure why the new message we're trying to download would be in the middle of the pop3 server's list of messages, but I see it happen in my test case - I have 450 messages in my pop3 inbox on the server, but a newly arrived message is number 149.

I'm not sure why we're keeping two hash tables, at least in this case. We can't just not commit our state in the case where the download fails, because we might have downloaded 10 messages and failed on the 11th one. I'll try to figure out how to deal with this...
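
The failure mode described above can be reproduced with a small model. This is a Python sketch of the control flow as described in this comment, not the real C++: get_messages and the "kept" status are hypothetical, and the numbers (450 messages, new message at position 149) are taken from the test case mentioned above.

```python
def get_messages(server_uidls, old_hash, failing_uidl=None):
    """Return the state that would be committed after one mail check."""
    new_hash = {}
    for uidl in server_uidls:
        if uidl in old_hash:
            new_hash[uidl] = old_hash[uidl]  # already downloaded, carry it over
            continue
        if uidl == failing_uidl:
            # Download failed: commit uses new_hash if it is non-empty,
            # even though it has only been populated up to this point.
            return new_hash if new_hash else old_hash
        new_hash[uidl] = "kept"  # downloaded successfully
    return new_hash


server_uidls = [f"uidl-{n}" for n in range(1, 451)]  # 450 messages on the server
new_uidl = "uidl-149"                                # the newly arrived message
old_hash = {u: "kept" for u in server_uidls if u != new_uidl}

committed = get_messages(server_uidls, old_hash, failing_uidl=new_uidl)
forgotten = [u for u in old_hash if u not in committed]
print(len(committed), len(forgotten))  # prints 148 301
```

In this model the 301 forgotten messages are the ones listed after the failing message; on the next check they would no longer be known and would be downloaded again.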
I attached and checked in a patch in bug 352998 - if anyone wants to try a trunk build from tomorrow, and report back, that would be great. If it looks OK, then we might be able to get it into the 2nd 2.0.0.x release (but not the first one, which is coming out pretty soon). Marking this fixed
Status: NEW → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
David, I hate to be the bearer of ill tidings, but I just had this bug recur on a Tbird trunk nightly (20070614). I'll see if I can reproduce it in the latest nightly, but nothing should have changed.
STR:
During a long download of new mails (after being away for 2 days), I thought Tbird had finished prematurely (our network is flaky) when it actually hadn't. I hit Get Mail, and it popped up the alert saying that it was still processing the folder. To get around the hang, I hit Stop, then Get Mail again. This caused it to duplicate (re-download) the ones it had already received. I found that if I kept hitting Stop then Get Mail mid-download, I could increase the number of copies downloaded indefinitely.
I should mention, however, that Tbird only re-downloaded emails from that batch that got interrupted. It didn't try to download earlier emails that were still on the server, which it had done in the past.
Clearer STR (tested with latest trunk nightly):
1. Ensure the Stop button is configured to appear on your Thunderbird toolbar
2. Stack up 5 or so emails to send to yourself. You can make most of them small, but in the middle place one with some attachments big enough to slow down your downloads
3. If you have automatic mail checking, close the main 3pane window in case it starts downloading them before you're ready
4. Send all the messages, with the big one in the middle
5. Open the main (3pane) window and press Get Mail. Immediately hover the mouse over the Stop button
6. Wait till it's delayed downloading the big mail in the middle and hit Stop
7. Press Get Mail again. Watch as it begins downloading all 5 messages (including the first 2 that you already downloaded) again
8. Repeat from step 6, if you want to repeat the effect
9. Eventually let it finish downloading. Hit Get Mail again and note that it won't download them again once it's been allowed to complete
That's true. New, non-truncated messages only get added to the new hash, which is only written out when downloading has finished. This could be compensated for by doing the same for full messages that David's last patch did for headers-only/too-big ones.

The problem I see with this approach is that messages could get lost. Not on the server right away, since the message won't get deleted there. But since TB thinks it's already done, it also won't download the message again until it's eventually gone from the server.

I think we should move adding new entries to the hashes (or at least the old hash) to RetrResponse(), after we know a message is safe.
Thanks for your response :) I'll reopen it, then.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Christian, I'm interested in what you think of this patch. Basically, we copy the entries for the headers we have downloaded from newuidl to m_pop3ConData->uidlinfo->hash, in the case where we've had an error retrieving one of the messages. I think this should prevent us from re-downloading those headers.

The message we had an error on should have been removed from newuidl because we pass PR_FALSE into CommitState, which removes the currently downloading message.
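
The idea in this comment can be sketched as a merge of the two tables. This is a simplified Python model, not the attached patch: commit_after_error is a hypothetical name, plain dicts stand in for newuidl and m_pop3ConData->uidlinfo->hash, and "kept" is an illustrative status.

```python
def commit_after_error(old_hash, new_hash, failed_uidl):
    """Return the state to save when one message download has failed."""
    new_hash = dict(new_hash)
    # The message that was being downloaded must not be recorded as done.
    new_hash.pop(failed_uidl, None)
    # Copy this session's successful entries into the old table rather than
    # replacing the old table with the partly populated new one.
    merged = dict(old_hash)
    merged.update(new_hash)
    return merged


old_hash = {"uidl-1": "kept", "uidl-2": "kept", "uidl-3": "kept"}
new_hash = {"uidl-1": "kept", "uidl-4": "kept", "uidl-5": "kept"}
state = commit_after_error(old_hash, new_hash, failed_uidl="uidl-5")
print(sorted(state))  # prints ['uidl-1', 'uidl-2', 'uidl-3', 'uidl-4']
```

Entries from earlier sessions survive, the message fetched successfully in this session is remembered, and only the failed one will be fetched again.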
Attachment #269614 - Flags: review?(ch.ey)
I can't say anything on the code since you attached the one for bug 33451 here.
trying again...
Attachment #269614 - Attachment is obsolete: true
Attachment #269671 - Flags: review?(ch.ey)
Attachment #269614 - Flags: review?(ch.ey)
Comment on attachment 269671 [details] [diff] [review]
fix to not re-retrieve messages we've retrieved in current session

I also removed some redundant comparisons for next_state (if state == POP3_GET_MSG or SEND_TOP, then it won't equal SEND_DELE) - this makes the code a bit easier to understand
Comment on attachment 269671 [details] [diff] [review]
fix to not re-retrieve messages we've retrieved in current session

That looks right to me and seems to work fine.
But one request: Though it's not as likely as for whole downloads, it's possible that a truncated download fails. In this case we'll still have the entry in our old hash and it gets written to the popstate.dat. 

So I propose you remove again the two lines you added in attachment 267209 [details] [diff] [review]:
// store in old hash table too, just in case our download is aborted.
put_hash(m_pop3ConData->uidlinfo->hash, info->uidl, TOO_BIG, popstateTimestamp);

And please don't reintroduce blanks at line endings.
Attachment #269671 - Flags: review?(ch.ey) → review+
thx, Christian. I've incorporated your suggestion, and removed a couple spaces at the end of some comment lines - those were the only blanks at the end of lines I found, at least in code I added/changed. 

Carrying forward Christian's r=, requesting sr=.
Attachment #269671 - Attachment is obsolete: true
Attachment #269694 - Flags: superreview?(mscott)
Attachment #269694 - Flags: review+
Attachment #269694 - Flags: superreview?(mscott) → superreview+
fixed on trunk, and removed the spaces after the { (must be some weird thing that the xcode editor puts in).
Status: REOPENED → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
Comment on attachment 240230 [details] [diff] [review]
handle case where we lose the connection listing the messages

I'd love to get these fixes into 2.0.0.5 - this bug is very annoying. The fixes all seem to have improved things.
Attachment #240230 - Flags: approval1.8.1.5?
Comment on attachment 269694 [details] [diff] [review]
incorporate Christian's suggestion

I'd like to consider this for 2.0.0.5, along with the patch in bug 352998
Attachment #269694 - Flags: approval1.8.1.5?
Comment on attachment 269694 [details] [diff] [review]
incorporate Christian's suggestion

a=mscott for mailnews only change to 1.8.1.5.
Attachment #269694 - Flags: approval1.8.1.5? → approval1.8.1.5+
fixed for 1.8.1.5
Keywords: fixed1.8.1.5
Comment on attachment 240230 [details] [diff] [review]
handle case where we lose the connection listing the messages

clearing approval flag on the assumption that the "fixed1.8.1.5" keyword means we've checked in everything we actually wanted for this bug.
Attachment #240230 - Flags: approval1.8.1.5?
yes, everything should be landed for 1.8.1.5
OS: Windows 2000 → All
Hardware: PC → All
verified fixed 1.8.1.5 using Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.8.1.5) Gecko/20070716 Thunderbird/2.0.0.5 ID:2007071611 (Thunderbird 2.0.0.5 RC1) 
Great to finally have this fixed. Thanks David!
bug 90422 (mentioned in bug 171110) maybe duplicate?
Product: Core → MailNews Core