Closed Bug 9413 Opened 25 years ago Closed 19 years ago

Don't allow duplicate messages.

Categories

(MailNews Core :: Database, enhancement, P3)

enhancement

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: CodeMachine, Assigned: Bienvenu)

References

Details

(Keywords: fixed1.8)

Attachments

(2 files, 1 obsolete file)

Moz should not allow the storage of duplicate msgs.  This can occur in the
following circumstances.

(a) The same message comes through multiple mailing lists you're subscribed to, e.g. the Mozilla lists.
(b) POP download terminates (crash, line dropout, etc) and the server sends the
msg through again.
(c) Copying a msg from one folder to another.
(d) Could be present from legacy mail folders before the dupe prevention was
introduced.

I'm not sure in which of these cases dupes can be detected, but when they can,
it would be nice if a second copy wasn't saved.

Since the "copy" function exists, you'd probably want to allow dupes in
different folders, although a "Find Dupes" function would still be nice.
Status: NEW → RESOLVED
Closed: 25 years ago
Resolution: --- → LATER
I'm sorry, but there's not much we can do about this - it's fraught with peril
to start throwing away messages.
My remarks are consistent with David Bienvenu's -- this is just more detail.

Identity is a hard problem to solve, and knowing when a message is a dup
requires some global identification scheme that is not practical to define
when many operating contexts will use their own local msg ID definition.

The bug report seems to intend saving storage space by avoiding dups,
since having a dup is not itself an error; it is merely redundant.  But
storage space is cheap, and a very small amount of redundancy is a low cost
compared to the risk of actual data loss that could occur if code ever became
confused about identity and deleted messages too aggressively.  As a
risk, data loss is orders of magnitude worse than excessive space
usage of modest proportions.

David's remarks summarize that modest cost savings is not worth a big
increase in risk for data loss resulting from identity confusion.
I don't care about the disk space, it just *#$%s me off when there's three
copies of every message in a folder cause my POP download crashed in the middle
twice.  Anything that would resolve this would be great.
Ow, sorry about your distress.

Here's an idea to run by David Bienvenu.  It might be feasible to suppress
dups that are bytewise identical from beginning to end.  It would involve
storing a hash of message bytes as an attribute in summary files, and then
one could look for a hash match before doing a bytewise comparison.  This
implies an index by hash value at runtime, which is a big footprint in Mork.
So it would use up RAM, disk, and cycles at runtime to avoid dups that way.
A pref might turn on such checking.  Does pop permit such dup detection?
If we add this feature into the product, would you like to write up the test
plans and test cases for this, matty? The main testing item would be to make
sure that no messages are deleted incorrectly, thus losing someone's data. You'd
need to test it on Win32, MacOS, and Linux.  There are no QA resources for
testing this feature if implemented.  Thanks.
Well, I guess I could try.  I have facilities for Win32 and Linux, but not
Mac; someone else would need to do that.  I'm not sure how easy it would be to test
this on two platforms though, because I'd be using my real mail file and real
mail box.  This could depend a lot on bug #7067 (cross-platform formats), for
me to be able to do this, as I'm not eager to lose mail.

This being said, I've seen other people request this, although I don't know if
it's a POP problem or just mailing lists.

It seems with my ISP POP server if you download, cancel and then restart right
away it redownloads all of the msgs.  I assume this happens if it hasn't
finished deleting and compacting the mail file.  So I can definitely test this.

BTW is this standard behaviour from POP servers?  I do this because I get around
~500 msgs a day and POP dropouts can be bad.  Is there no way for POP servers to
be told to mark deleted messages part way through the transfer to avoid the
whole problem?  I would say build in a cancel and restart after x messages
feature into Moz, except I've already explained my POP server's behaviour.

What I might be able to do is fiddle around a bit in Linux and work out how to
set up a pop server.  Then I could do a client/server combo and fill the mailbox
with test msgs.

I understand the pseudo-heuristic nature of this and the need to
undercompensate.  I gather test cases would be along the lines of no messages,
one, two, ten, thousands etc. with different combinations of dupes in different
orders (together, one apart, two apart, ten apart, two in a row, three in a
row, ten in a row, thousand dupes all together, etc).  These cases are probably
much the same for both overzealous deletion and proper deletion.  Also,
there should be tests on near dupes - mailing list cross-posts might fall into
this category if you're not trying to catch them, maybe also messages whose
text are proper prefixes of other messages.  Anything else?

BTW, how do mailing list dupes differ?  On certain headers I presume, but which?
BTW, I can see the efficiency problems that could be involved here - perhaps it
might be better to do this on a menu item which computes the hashes for all the
messages in a folder, sorts them, does a true compare (maybe?) and removes
dupes.  So it would be manual rather than on the fly.  In these pop crash
situations at least, you generally know when it needs to be done.
Shall I mark this [HELP WANTED]?
Feel free to go ahead and do so.  Thanks.
Sounds like you are on the right track.  Hashing should be involved to help do a
fast accept or reject, where those are terms used in algorithms that attempt to
answer a related question in a fuzzy manner that obviates the need for a more
expensive approach.  For example, if messages are bitwise identical, they will
have the same hash; so if you get a hash you've never seen before, it cannot be a
dup of one you already have in hand, or else the hash would be found in some
table that maps your existing hashes to existing messages.  But if the hash is
the same as one you have seen before, you still don't know, because two messages
can have the same hash (although it's unlikely), so you need to bytewise compare
them.  However, the need to do the bytewise compare is thus reduced to the rare
circumstance when the hash is the same, so you pay the cost very infrequently.
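
To make the accept/reject idea concrete, here is a minimal sketch in standard C++; the hash function, index type, and in-memory message representation are illustrative assumptions, not the Mork-based design discussed above:

#include <string>
#include <unordered_map>
#include <vector>

// Index from a hash of the full message bytes to the messages that hash
// to that value. Collisions are possible, so a hash match alone never
// proves duplication.
using HashIndex = std::unordered_map<std::size_t, std::vector<std::string>>;

// Fast path: an unseen hash cannot be a dup (fast reject). Slow path: on
// a hash match, confirm with a full bytewise comparison (the rare case).
bool isDuplicate(const HashIndex& index, const std::string& msg) {
    std::size_t h = std::hash<std::string>{}(msg);
    auto it = index.find(h);
    if (it == index.end())
        return false;                      // hash never seen: fast reject
    for (const std::string& stored : it->second)
        if (stored == msg)                 // bytewise compare, rarely run
            return true;
    return false;                          // hash collision, not a dup
}

void remember(HashIndex& index, const std::string& msg) {
    index[std::hash<std::string>{}(msg)].push_back(msg);
}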
I suppose the interesting problem is what to do with messages that differ only in
the headers, when the body is the same.  I'll ignore the ways you might decide
two messages might be "equal" even with slightly different headers, and address a
better approach that assumes you have some fine control of message storage that
does not involve the mbox format that stores each message in toto.  You could
break headers apart from bodies, and hash the headers and bodies separately, and
thus prevent any duplication of bodies, and maybe let multiple parent headers use
the same child body.  And this approach loses no data, and keeps all the
messages, but avoids storing a body twice when it is the same, even with
different headers from multiple messages.
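
A sketch of that header/body split, assuming a store with fine control over message storage (i.e. not mbox); every name here is hypothetical:

#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

// Bodies are stored once, keyed by their hash; each message keeps its own
// headers plus a reference to the shared body, so no data is lost even
// when several messages with different headers share one body. A real
// store would bytewise-verify on insert to guard against hash collisions.
struct MessageStore {
    struct Message { std::string headers; std::size_t bodyKey; };
    std::unordered_map<std::size_t, std::string> bodies;  // shared bodies
    std::vector<Message> messages;

    void add(const std::string& headers, const std::string& body) {
        std::size_t key = std::hash<std::string>{}(body);
        bodies.emplace(key, body);           // no-op if body already stored
        messages.push_back({headers, key});  // headers always kept per message
    }
};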
Summary: Don't allow duplicate messages. → [HELP WANTED]Don't allow duplicate messages.
Whiteboard: HELP WANTED
Target Milestone: M15
For what it's worth, I'm not too fussed about mailing list and other body dupes,
because I generally put them in different folders.  One day I'll get around to
converting my filters to using Resent-From to fix up the Moz lists.  =)

The duplicate body idea is an interesting form of natural compression, but would
still require at-arrival hash computation for comparison to existing hashes,
which the manual option doesn't.  You'd know more than me as to what sort of hit
that would lead to though.

I'm not sure many people are concerned with this, however; they want to reduce
the number of database entries rather than the size.  I've personally never looked at
my mail databases and thought that they were too big.  I'm sure it's interesting
to a database implementer though.  =)

I wouldn't say that the hashes are the same in rare circumstances - at least for
the manual option, I imagine dupe msgs and hence hashes would be common.

Setting bug to the current standard for [HELP WANTED] bugs - should I have used
the new M20 milestone?  =)
My main goal was less compression than identification of which messages
actually shared duplicate bodies with others.  This could become an attribute
of a message upon which one could filter with absolute knowledge that being a
body-dup was actually a known fact.
Reopen mail/news HELP WANTED bugs and reassign to nobody@mozilla.org
I don't think the risk of data loss is very high (though it exists) when the
action is triggered by the user, because otherwise he would have to do it by
hand, and that is risky too.

I thought every message has a unique Msg-ID, which is preserved through all
resending and copying. Why not use this?
Exactly.  Very risky.

I don't think there's any disagreement that pop crash and copy message
duplicates can be handled, since they are exactly the same.  The remaining issue
was mailing list dupes, which can differ, for example in "Resent-From".  A good
implementation that ignores this last kind of dupe would work well and be safe.
For POP servers which support the UIDL command, you can safely skip downloading
messages with <server/UIDL> pairs matching messages you've already downloaded.
This was a key motivation for adding the UIDL command to the POP protocol.
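
A sketch of that UIDL bookkeeping; the POP3 session itself is elided and these names are assumptions, but the core is just a persistent set of server/UIDL pairs consulted before each RETR:

#include <set>
#include <string>
#include <utility>

// Set of (server, UIDL) pairs for messages already downloaded. In a real
// client this set would be persisted across sessions.
using SeenSet = std::set<std::pair<std::string, std::string>>;

// Before issuing RETR for a message, check its UIDL; skip it if seen.
bool shouldDownload(const SeenSet& seen,
                    const std::string& server, const std::string& uidl) {
    return seen.find({server, uidl}) == seen.end();
}

void markDownloaded(SeenSet& seen,
                    const std::string& server, const std::string& uidl) {
    seen.insert({server, uidl});
}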
In addition to tracking server/UIDL pairs to avoid duplicate pop3 downloads,
a "download at most <N> messages at once" feature could help limit the
duration of a single pop3 session.  It might be better to download my
three thousand incoming messages in six batches of 500 than to download all
of them at once, for instance.
Keywords: helpwanted
Summary: [HELP WANTED]Don't allow duplicate messages. → Don't allow duplicate messages.
Whiteboard: HELP WANTED
Target Milestone: M15
The mailer shouldn't hide the fact that two mailing lists sent you the same 
message.  If neither mailing list adds something to the message to make it 
clear what list it is from, the mailer should at least indicate to the user 
that the message was received twice.
Please take all issues of reducing the number of messages that need to be
downloaded on POP failure to newly filed bug #69244.

> The mailer shouldn't hide the fact that two mailing lists sent you the same
> message.  If neither mailing list adds something to the message to make it
> clear what list it is from, the mailer should at least indicate to the user
> that the message was received twice.

Maybe, but I'd call that a separate RFE now.  See also bug #60876.
Keywords: mozilla1.2
Please don't make it impossible to store duplicates in a folder -- we already
have this with bookmarks, and it is really annoying. If I *want* a duplicate, it
should be possible. I prefer bug 60876's suggestions. See also bug 87853.
Whiteboard: WONTFIX? (see bugs 60876 and 87853)
*** Bug 117796 has been marked as a duplicate of this bug. ***
> The mailer shouldn't hide the fact that two mailing lists sent you the same
> message.

I think having an option to suppress it (optionally!) would be better... 95% of the
time you don't need to know that you received a duplicate (well, if that option
were out there, I'd use it).

About the difficulties in detecting dupes: isn't the "Message-ID:" header used by
most mail servers out there? I think having dupe control on messages which do
have such a header would be enough to keep everyone happy.
Can I ask a stupid question here: what does the market leader M$ Outlook do in
this situation?

The main reason for the existence of duplicate messages is a failure occurring
during the receiving of messages, after which a mail client redownloads the same
messages when the process is restarted. When I had such an interruption when using
Outlook, it didn't download the same messages again when I restarted. It
appeared that Outlook either sent commands to the POP server to delete messages it
had just downloaded so that they don't stay there, or it could determine which of the
messages on the server had been downloaded and delete them before resuming
downloading the remaining messages, or something else. You can try it yourself:
configure Outlook to work with your server and close it (or better yet,
disconnect from the net during message download) and see what it does.

I think that at the very least, an email client should allow configuring the number
of messages to download at a time, just like a newsgroup client does.
Whether or not you think this should happen, it does. A lot. Besides, I should
be able to merge separate folders and delete duplicates. Regardless of the cause,
a solution must be found.

It has been stated that this would be difficult, and that autodeletion is
dangerous. To this, I say "not really" and "not with a bit of care".

The message header window shows a number of fields. Clicking on any field sorts
on this field. Obviously, sorting on addresses, subjects, size, and date use
different algorithms to determine less than/equal/greater than. So too would the
sort on the Message-ID field.

The care that is required is to compare the message headers and text byte for
byte. Duplicates could be detected and deleted. Partial messages (a subset which
lines up at the left side) can be thrown away too, keeping the full version.

Note here that I am talking about duplicates in one file/folder only. I am also
talking about cleanup on demand. Duplicates are allowed and not checked for by
default, thus avoiding wasting the overhead when not needed.
Gnus does this.  In two different ways.

In cases where the duplicate shows up in the same group (Gnus equivalent of
a folder) as the other one, they are not suppressed by default but can be by
setting a pref; it is display that is suppressed, not storage.  

Across groups, nnml stores dupes (including copies) as symlinks (unless you
disable this, which has to be done if you store your mail on a filesystem that
doesn't support symlinks), and marking a message as read in one group marks
it as read everywhere it occurs.  This (marking all instances as read) also
happens on usenet (nntp backend) with crossposted messages, but multipostings
seem to evade it (possibly because they are not bytewise identical).  I'm not
sure what the other Gnus storage backends do (nnfolder, nnmbox, and so on).  

The ability to store duplicates once and reference them multiple times would
depend heavily on the storage backend.  Additionally, these two features
would accomplish two different things (storage savings versus saving the
user time).  Marking all instances as read, however, probably would not
depend on the backend -- maybe those should be two separate bugs.  The reporter
seems more concerned with the latter, so this bug could be about that, and anyone
who wants the disk savings could file the other separately.
I propose a Duplicate Management function similar to the Junk Management
function that is emerging:

1. Nearly certain duplicates would be moved to a Duplicates folder where they
could be examined and perhaps moved back to the Inbox or to another folder.

2. Suspected duplicates would be flagged in a column similar to the Flag and
Junk columns. If confirmed, they could be moved to the Duplicates folder or deleted.

3. Training would enable the establishment of rules such as, "If duplicate
messages come from the PIML and LIS-LEAF lists, leave the latter in the Inbox
and move the former into the Duplicates folder (or delete it)."

4. Might want to have a macro language, perhaps similar to that of Z-Mail, that
could be used to facilitate various management functions of this kind, including
which of several duplicates to keep and which to move or delete.


Blocks: 195158
I very much support the duplicate removal function as a tool, similar to the
Junk  function. 

Also, I acknowledge the difficulty in defining what duplicates are, and how they
can occur, and how they should be detected. However, I think in such a case, it
pays off to simplify matters and reduce the possible cases. 

Couldn't we tackle *real* duplicates first?

Binary duplicates (or partial binary matches of broken mails) are a real
problem. They can occur through mail server download problems, or in my case,
through severe thrashing of my mail database whilst reorganising mail in folders
using older versions of Mozilla. Moving large volumes of mail to another folder
really caused havoc on my side. I've got many many many binary duplicates all
over the place. Cleaning them manually would take ages, and is error prone.

I'd find a tool which would delete binary duplicates, or partially matching ones
(meaning the fragment fits an existing mail completely), very useful indeed.
I do fully agree with the latest comment, since I have EXACTLY the same problem:
many *real* duplicates in my 50000+ email database, which I can't delete
manually, since it would take ages!!
I think detecting "real" duplicates is not a difficult task. As has been
previously mentioned, messages contain Message-IDs that should uniquely
identify them.
Two circumstances you don't mention:

(e) copying messages from a newsgroup to an archive folder, not always
easy/convenient to manually keep track of which have already been copied across

(f) moving a message posted to a mailing list or newsgroup from Sent to the same
archive folder as the received or on-newsgroup copy has been placed.

But what we really need is a means of searching for duplicates to look through
and delete manually.  Maybe some measure of 'sameness' the user can see as an
aid would be handy.  But I'm not sure:

- If a message is sent to several people at once, will they go with the same
Message-ID?
- Do mailing lists preserve the Message-ID when propagating to the individual
members?
QA Contact: lchiang → nobody
> If a message is sent to several people at once, will they go with the same
> Message-ID?

Yes, if sent at once.

> Do mailing lists preserve the Message-ID when propagating to the individual
> members?

I haven't seen cases otherwise, but there may be some list management software
that changes the Message-ID. I have to check RFC 821/822 (and RFC 2822/2821)
about this. OK, RFC 2822 has the following to say about the Message-ID (it's
optional but marked as 'SHOULD', falling short of 'MUST'):
(http://www.faqs.org/rfcs/rfc2822.html section 3.6.4)

Note: There are many instances when messages are "changed", but those
   changes do not constitute a new instantiation of that message, and
   therefore the message would not get a new message identifier.  For
   example, when messages are introduced into the transport system, they
   are often prepended with additional header fields such as trace
   fields (described in section 3.6.7) and resent fields (described in
   section 3.6.6).  The addition of such header fields does not change
   the identity of the message and therefore the original "Message-ID:"
   field is retained.  In all cases, it is the meaning that the sender
   of the message wishes to convey (i.e., whether this is the same
   message or a different message) that determines whether or not the
   "Message-ID:" field changes, not any particular syntactic difference
   that appears (or does not appear) in the message.

So, list management software and MTA/MDA are not supposed to change the message-id. 
since i'm downloading pop3 messages on more than one computer, i usually keep
messages on the pop3 server for several days.
every now and then it happens that the whole server mailbox is read again and i
end up with a lot of dupes.

i like the idea of a user function (somewhere in a menu) to remove dupes, it could
be something like this:

1. user selects mailboxes which should be searched
2. user selects comparison method (i.e. the method for transforming the full email
source into something which will then be compared); it can be either full
compare, or body compare, or sender+date+subject compare, etc ...
3. all messages in selected mailboxes are read, pre-processed based on the
comparison method selected in step 2, and a hash is computed (from the result of
preprocessing)
4. hashes are sorted, and a second pass compares email sources
(transformed using the method selected in step 2) based on matching hashes
5. found dupes can be either removed or moved to a special folder

i think that this will solve most of the problems with dupes; more clever
comparison methods (some fuzzy magic) can still be added later; but i think that
the most annoying are dups with either the whole sources or at least bodies
completely (bitwise) identical. a sketch of steps 2-5 follows below.
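
A compressed sketch of steps 2-5 under stated assumptions (only two comparison methods shown, headers separated from the body by the first blank line); all names are illustrative:

#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

enum class CompareMethod { FullSource, BodyOnly /* sender+date+subject, ... */ };

// Step 2/3: transform a message source according to the chosen method.
std::string canonicalize(const std::string& source, CompareMethod m) {
    if (m == CompareMethod::BodyOnly) {
        std::size_t split = source.find("\r\n\r\n");  // headers end at blank line
        return split == std::string::npos ? source : source.substr(split + 4);
    }
    return source;  // FullSource: compare the raw bytes
}

// Steps 3-5: bucket messages by the hash of their canonical form, confirm
// matches with a real comparison, and report indices of duplicates.
std::vector<std::size_t> findDupes(const std::vector<std::string>& msgs,
                                   CompareMethod m) {
    std::unordered_map<std::size_t, std::vector<std::size_t>> buckets;
    std::vector<std::size_t> dupes;
    for (std::size_t i = 0; i < msgs.size(); ++i) {
        std::string canon = canonicalize(msgs[i], m);
        auto& bucket = buckets[std::hash<std::string>{}(canon)];
        bool isDup = false;
        for (std::size_t j : bucket)
            if (canonicalize(msgs[j], m) == canon) { isDup = true; break; }
        if (isDup)
            dupes.push_back(i);   // step 5: caller removes or moves these
        else
            bucket.push_back(i);
    }
    return dupes;
}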
How would it decide which to remove, especially if they're not exact matches?

One possible rule, if one has header lines that the other doesn't, keep that
one.  This would happen in the case of a sent message that is later received,
either on the newsgroup it was posted to or because I send to a mailing list
that includes myself.

Of course, things get tricky if neither header list is a subset of the other, or
if the body has changed.  Of course, the latter'll usually be a signature added
by a mailing list server....
Product: MailNews → Core
I've written some code that optionally detects incoming duplicates (based on
message-id+subject) and does one of four things to the duplicate: 1) nothing
(current and default behaviour) 2) deletes the dup 3) moves dup to trash 4)
marks the dup read. The Tbird UI will probably just be a check box that lets you
choose 1) or 2), but the other options are implemented in the backend. This
feature is mainly meant to deal with duplicates that arrive because of mailing
lists (e.g., you're on overlapping mailing lists, or an e-mail is sent directly
to you and to a mailing list you're on). We're relying on the message-id to be
unique - we don't hash the message body or anything like that. We don't even
compare the message sizes because mailing lists can add extra headers that throw
off the message size. We only look at incoming headers from the current session,
so if you get one msg, shut down, and restart, and get a dup message, we won't
flag that as a dup. We're really just trying to solve the annoying mailing list
problem. 

A remove dups feature is a separate issue, and will be separate code. It's
possible that I may be able to adapt the current code to add an "isDup"
attribute for filters, so you can do what you want with incoming duplicates, but
that's lower priority.
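
In outline, the detection described here reduces to a session-scoped set keyed on message-id plus subject; the following is a simplified stand-in for the patch, not the committed nsMsgIncomingServer code:

#include <string>
#include <unordered_set>

// The four dup actions described above; only the first two get Tbird UI.
enum DupAction { KeepDups = 0, DeleteDups = 1, MoveDupsToTrash = 2, MarkDupsRead = 3 };

// Session-only memory of message-id + subject keys. Nothing is persisted,
// so (as noted above) a dup arriving after a restart is not caught.
class DupDetector {
    std::unordered_set<std::string> seen_;
public:
    // Returns true if this header was already seen in the current session.
    // Concatenation mirrors the patch's strHashKey.Append calls; no body
    // hash or message size comparison is involved.
    bool isDuplicate(const std::string& messageId, const std::string& subject) {
        return !seen_.insert(messageId + subject).second;
    }
};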
Attached patch work in progress (obsolete) — Splinter Review
I've verified that this works for pop3, but haven't tried imap yet.
Assignee: nobody → bienvenu
Status: NEW → ASSIGNED
Attached patch proposed fixSplinter Review
I've been running with this for a few days, mainly with pop3, and it seems to
be working fine.
Attachment #192758 - Attachment is obsolete: true
Attachment #193002 - Flags: superreview?(mscott)
Comment on attachment 193002 [details] [diff] [review]
proposed fix

David, is there a rationale for using 500 as the size of our duplicate table?
That seems pretty large. Although each entry looks pretty small so it probably
isn't a big deal.

Does strHashKey.Append work OK if for some reason the message doesn't have a
message id or has an empty subject?

+  aNewHdr->GetMessageId(getter_Copies(messageId));
+  strHashKey.Append(messageId);
+  aNewHdr->GetSubject(getter_Copies(subject));
+  strHashKey.Append(subject);
Attachment #193002 - Flags: superreview?(mscott) → superreview+
there's no guarantee that dups will come in next to each other, if you have list
servers munging the messages, or if you get a lot of mail, etc., and as you say,
the hash table entries are pretty small.

I think the empty strings will be "", but I can try that.
should we add code that does nothing if either the message id or the subject is
empty? Maybe we should only check the cache if we have a valid message id AND
subject.
 
or perhaps a valid message-id OR subject...but I think you're right to err on
the side of caution and check both.
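
The guard under discussion could look something like this, reusing the DupDetector sketch above and erring on the side of caution by requiring both fields (a sketch, not the committed fix):

// Skip dup detection entirely unless we have both a message-id AND a
// subject; with an empty field, unrelated messages could collide on the
// concatenated key and real mail might be discarded.
bool maybeCheckDuplicate(DupDetector& detector,
                         const std::string& messageId,
                         const std::string& subject) {
    if (messageId.empty() || subject.empty())
        return false;   // treat as unique; never risk discarding real mail
    return detector.isDuplicate(messageId, subject);
}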
I use formail on the server for doing duplicate detection, with a cache of 8192
bytes.  Does a pretty decent job for me, just wanted to share that datapoint. I
could mine my procmaillog for data since August 2002, if someone wanted me to
run some specific query over it.

Do we have to keep the message ID cache in memory?  We could allocate 64K for it
on disk, and virtually never miss a dup.
What's it caching in those 8192 bytes? message-id+message key pairs? The
interesting data is what's the maximum delta in message arrival number between
two messages with the same message-id, but I can't imagine there's a good query
for that. Re spilling out to disk, I'm reluctant to add the code, and to add
another data file per server. But, a persistent store does solve the "receive a
msg, shut down, start up again, receive a dup" problem.  And it might make it
easier to deal with the cross-server problem, which we'd pretty much already
decided isn't really worth solving...

If we wanted, we could add a pref (all together, groan) to set the max cache
size.  But I'd want hard evidence that this approach wasn't solving the 99% case.
Just message IDs, 186 of them separated by NULs.  That system is tuned for
filtering mail as it's delivered, though, and not as it's fetched, so it tends
to need a smaller window.  A slider to go between

less disk space used -------------|--- better duplicate filtering

might not be bad, if 64K of per-profile dup caching is going to offend anyone. 
Given that training.dat can easily grow over a meg, it doesn't seem likely.
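
For comparison, a formail-style bounded cache might look like this: a FIFO window of recent message IDs capped by total byte size (the 8192-byte default matches the datapoint above), oldest evicted first. Disk persistence is omitted; everything here is an assumption, not mailnews code:

#include <cstddef>
#include <deque>
#include <string>
#include <unordered_set>

// FIFO window of recent message IDs, bounded by total bytes (each ID plus
// a NUL separator, as in formail's idcache). Oldest entries are evicted
// once the cap is exceeded.
class MsgIdCache {
    std::deque<std::string> order_;
    std::unordered_set<std::string> ids_;
    std::size_t bytes_ = 0;
    const std::size_t cap_;
public:
    explicit MsgIdCache(std::size_t capBytes = 8192) : cap_(capBytes) {}

    // Returns true if 'id' was already in the window (a probable dup).
    bool checkAndAdd(const std::string& id) {
        if (ids_.count(id))
            return true;
        order_.push_back(id);
        ids_.insert(id);
        bytes_ += id.size() + 1;                 // +1 for the NUL separator
        while (bytes_ > cap_) {                  // evict oldest first
            bytes_ -= order_.front().size() + 1;
            ids_.erase(order_.front());
            order_.pop_front();
        }
        return false;
    }
};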
fixed - we'll still need UI for turning this on, per server.
Status: ASSIGNED → RESOLVED
Closed: 25 years ago → 19 years ago
Resolution: --- → FIXED
there was a problem with the pop3 delete to trash dup action - we need to set
m_curHdrOffset earlier, so I moved it from ApplyFilters to before we call
ApplyFilters, so that MoveIncorporatedMessage can use it from either
ApplyFilterHit or the dup handling code.
Attachment #195947 - Flags: superreview?(mscott)
Attachment #195947 - Flags: superreview?(mscott) → superreview+
So, now that this bug has been fixed only with respect to one message arriving
through several mailing lists (a feature I probably won't use), maybe Bug 238365
can be used for the case of downloading the same messages (from the same server
and mailing list) after a problematic download, also handling Mozilla restarts.
Comment on attachment 193002 [details] [diff] [review]
proposed fix

we're going to put this on the branch for Thunderbird 1.5 but leaving it
pref'ed OFF.
Attachment #193002 - Flags: approval1.8b5+
Comment on attachment 195947 [details] [diff] [review]
fix for problem with pop3 delete to trash dup action

we're going to put this on the branch for Thunderbird 1.5 but leaving it
pref'ed OFF.
Attachment #195947 - Flags: approval1.8b5+
clearing some obsolete status and keywords.
Keywords: helpwanted
Whiteboard: WONTFIX? (see bugs 60876 and 87853)
I've landed this on the 1.8 branch.
Keywords: fixed1.8
Since the landing of the patch, SeaMonkey Branch 1.8 boxes are burning.

c++ -o nsMsgIncomingServer.o -c  -D_IMPL_NS_MSG_BASE -DMOZILLA_INTERNAL_API
-DOSTYPE=\"Linux2.6\" -DOSARCH=\"Linux\" -DBUILD_ID=2005092701 
-I../../../dist/include/xpcom -I../../../dist/include/xpcom_obsolete
-I../../../dist/include/string -I../../../dist/include/msgbase
-I../../../dist/include/rdf -I../../../dist/include/necko
-I../../../dist/include/msgdb -I../../../dist/include/intl
-I../../../dist/include/mork -I../../../dist/include/mailnews
-I../../../dist/include/locale -I../../../dist/include/pref
-I../../../dist/include/rdfutil -I../../../dist/include/mime
-I../../../dist/include/caps -I../../../dist/include/msgcompose
-I../../../dist/include/addrbook -I../../../dist/include/docshell
-I../../../dist/include/uriloader -I../../../dist/include/appshell
-I../../../dist/include/msgimap -I../../../dist/include/msglocal
-I../../../dist/include/msgnews -I../../../dist/include/txmgr
-I../../../dist/include/uconv -I../../../dist/include/unicharutil
-I../../../dist/include/nkcache -I../../../dist/include/mimetype
-I../../../dist/include/windowwatcher -I../../../dist/include/msgbaseutil
-I../../../dist/include -I../../../dist/include/nspr   -DMNG_BUILD_MOZ_MNG 
-I/usr/X11R6/include   -fPIC  -I/usr/X11R6/include -fno-rtti -fno-exceptions
-Wall -Wconversion -Wpointer-arith -Wcast-align -Woverloaded-virtual -Wsynth
-Wno-ctor-dtor-privacy -Wno-non-virtual-dtor -Wno-long-long -pedantic
-fshort-wchar -pthread -pipe  -DNDEBUG -DTRIMMED -ffunction-sections -O2
-gstabs+  -I/usr/X11R6/include -DMOZILLA_CLIENT -include
../../../mozilla-config.h -Wp,-MD,.deps/nsMsgIncomingServer.pp
nsMsgIncomingServer.cpp
nsMsgIncomingServer.h: In constructor
‘nsMsgIncomingServer::nsMsgIncomingServer()’:
nsMsgIncomingServer.h:118: warning: ‘nsMsgIncomingServer::m_serverBusy’ will
be initialized after
nsMsgIncomingServer.h:112: warning:   ‘PRInt32
nsMsgIncomingServer::m_numMsgsDownloaded’
nsMsgIncomingServer.cpp:89: warning:   when initialized here
nsMsgIncomingServer.cpp: In member function ‘virtual nsresult
nsMsgIncomingServer::GetPasswordWithUI(const PRUnichar*, const PRUnichar*,
nsIMsgWindow*, PRBool*, char**)’:
nsMsgIncomingServer.cpp:889: warning: enumeral mismatch in conditional
expression: ‘nsIAuthPrompt::<anonymous enum>’ vs
‘nsIAuthPrompt::<anonymous enum>’
nsMsgIncomingServer.cpp: At global scope:
nsMsgIncomingServer.cpp:1756: error: extra ‘;’
{standard input}: Assembler messages:
{standard input}:1981: Error: Local symbol `.LTHUNK0' can't be equated to
undefined symbol `_ZN19nsMsgIncomingServer6AddRefEv'
{standard input}:1981: Error: Local symbol `.LTHUNK1' can't be equated to
undefined symbol `_ZN19nsMsgIncomingServer7ReleaseEv'
{standard input}:1981: Error: Local symbol `.LTHUNK2' can't be equated to
undefined symbol `_ZN19nsMsgIncomingServer14QueryInterfaceERK4nsIDPPv'
gmake[5]: *** [nsMsgIncomingServer.o] Fehler 1
I ported a build bustage fix from the trunk to the branch which hopefully fixes
that problem for you. 
Yes, both boxes building again.
(In reply to comment #1)
> I'm sorry, but there's not much we can do about this - it's fraught with peril
> to start throwing away messages.

Does this affect already delivered email, or does it just act on incoming? I
have a bunch of duplicates I'd like to at least flag from before this was
working. If I can provide more info to help, let me know.
no, just new incoming mail in the same thunderbird session.
I still have build problems on 64bit platform (x86-64):

nsMsgIncomingServer.cpp: In static member function 'static PRBool
nsMsgIncomingServer::evictOldEntries(nsHashKey*, void*, void*)':
nsMsgIncomingServer.cpp:2529: error: cast from 'void*' to 'PRInt32' loses precision
nsMsgIncomingServer.cpp: In member function 'virtual nsresult
nsMsgIncomingServer::IsNewHdrDuplicate(nsIMsgDBHdr*, PRBool*)':
nsMsgIncomingServer.cpp:2548: error: cast from 'void*' to 'PRInt32' loses precision
Depends on: 310495
I have sometimes had a problem with messages coming in truncated due to a
dropped mail session. In that case, I presume this would still delete a new
message even if it was the full message and the existing message was not
complete (particularly with attachments). Might be something to be aware of.
Here are the relevant prefs and their values:

// for the global default:
+pref("mail.server.default.dup_action", 0);

substitute <serverXX> for default if you want to change it on a per-server basis.
+
+  const long keepDups = 0;
+  const long deleteDups = 1;
+  const long moveDupsToTrash = 2;
+  const long markDupsRead = 3;
+
+  attribute long incomingDuplicateAction;
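
So, for example, to make a hypothetical second account move its dups to the trash (action 2 per the constants above), one would set in prefs.js (the "server2" key is illustrative; use the account's actual server key):

user_pref("mail.server.server2.dup_action", 2);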
Product: Core → MailNews Core
See Also: → 618809