Don't collect addresses into the Collected Address Book if the address already exists in other Address Books (PAB or user created). Current behavior: 1. create an entry for firstname.lastname@example.org in my PAB 2. send an email to or receive an email from email@example.com 3. tim.taylor is collected into CAB or 1. send an email to or receive an email from firstname.lastname@example.org 2. tim.taylor is collected into CAB 3. "move" tim.taylor from CAB to PAB 4. send/receive another email to/from email@example.com 5. tim.taylor is collected into CAB In both cases, we end up with two entries for tim.taylor in the address books: one in the PAB and another in the CAB. If I want to fill in some fields for that address, I need to enter them in both AB entries. This defeats the purpose of having a PAB at all. Expected behavior: 1. create an entry for firstname.lastname@example.org in my PAB 2. send an email to or receive an email from email@example.com 3. tim.taylor isn't collected into CAB or 1. send an email to or receive an email from firstname.lastname@example.org 2. tim.taylor is collected into CAB 3. "move" tim.taylor from CAB to PAB 4. send/receive another email to/from email@example.com 5. tim.taylor isn't collected into CAB This should also take into account the "additional email" field in the address book entry when determining whether to collect into the CAB. In other words: Expected Behavior: 1. create an entry for firstname.lastname@example.org in my PAB 2. specify email@example.com as additional email address for tim.taylor 3. send an email to or receive an email from firstname.lastname@example.org 4. tim.taylor isn't collected into CAB 5. send/receive email to/from email@example.com 6. tim_taylor_mi isn't collected into CAB
Confirming. I see the current behavior on linux trunk 2000111106 and the expected behavior would definitely be nicer. Right now I get a friend of mine in three different incarnations in autocomplete; all three end up going to the same address and two are collected while also being listed in my personal address book.
Changing qa assign
reassigning to chuang.
*** Bug 77524 has been marked as a duplicate of this bug. ***
Adding myself to CC list -- I like this enhancement idea!
reassigning to cavin.
*** Bug 75050 has been marked as a duplicate of this bug. ***
I've been dreaming of this for a while now, but only today decided to look for it in Bugzilla. What needs to be mentioned too is that only the e-mail address, case-sensitive should be checked for duplicates. Actually some people change the case-sensitivity and/or name: John Travolta <firstname.lastname@example.org> Travolta, John <John.Travolta@showbiz.com> should be considered as the same address.
*** Bug 155969 has been marked as a duplicate of this bug. ***
It would also be nice to have some way, perhaps interactive with the user at runtime, to accurately set the html preference of the new entry before it is saved.
This bug was marked as depending on bug 150360 which was duped onto bug 85344. Correcting dependency. Anyways, the target milestone was missed, so there is something to be done, but this bug is not even assigned. pi
*** Bug 199196 has been marked as a duplicate of this bug. ***
This bug was opened in Nov of 2000?? am i reading that right?
Yes, you are reading that correctly.
wow, then this is pretty old. it seems to be fairly minor though so I guess no one is breaking down the doors for this one, though there are 12 votes
*** Bug 209211 has been marked as a duplicate of this bug. ***
*** Bug 230158 has been marked as a duplicate of this bug. ***
transferring nomination from duped bug.
Not going to block for this. Who can own this?
*** Bug 240935 has been marked as a duplicate of this bug. ***
*** Bug 252008 has been marked as a duplicate of this bug. ***
*** Bug 237455 has been marked as a duplicate of this bug. ***
*** Bug 280780 has been marked as a duplicate of this bug. ***
This would be a fantastic improvement as well. I am going to use one of my votes for it.
*** Bug 281653 has been marked as a duplicate of this bug. ***
Please can we see some action on this bug? It's really annoying when I set out to clean my Collected Addresses book. I should also request that the check should be carried out on the mail id, not the same, as people write stuff like "Taylor, Jim" or "Jim Taylor" with the same mail id, or sometimes their mail clients munge their names that way. Also, the mail id check should be carried out case-insensitive.
*** Bug 294703 has been marked as a duplicate of this bug. ***
*** Bug 320993 has been marked as a duplicate of this bug. ***
I don't believe that the dependency on bug 85344 is valid (why should adding more fields help us to implement not collecting other addresses???) - therefore I'm removing it.
This really should be a bug not an enhancement. Maybe the impact is not understood, let me try to explain: - Used has 200 addresses in "Collected Addresses". - They sort them into separate address books ("Friends", "Work" etc.) - Collected Addresses is now empty. 3 months later, they have sent lots of emails, mostly to old contacts, but some to new contacts. - They open "Collected Addresses". It contains 205 addresses. How are they supposed to know which are the 5 new addresses?
Just poking around to see if I could spot where this is happening. In mozilla\mail\components\compose\content\MsgComposeCommands.js there is a function called GenericSendMessage which I would have expected to check for the preference to add the recipient's to the CAB. However its not there and there and it ends with a call to gMsgCompose.SendMsg(); However in the 30mins I spent looking I couldn't find where this was defined. As it would be the next logical place for me to look. I am expecting that the add to address book function call just needs extra checks against the address book database but until I find where they are getting added I cannot be sure. Anybody care to point me in the right direction?
I think I found it: mozilla\mailnews\addrbook\src\nsAbAddressCollecter.cpp
Re: nsAbAddressCollecter.cpp From my figuring (which could be wrong) we have a function called CollectAddress and GetCardFromAttribute (for comparing the emails). in GetCardFromAttribute thee is the following: // Please DO NOT change the 3rd param of GetCardFromAttribute() call to // PR_TRUE (ie, case insensitive) without reading bugs #128535 and #121478. return m_database->GetCardFromAttribute(m_directory, aName, aValue, PR_FALSE /* retain case */, aCard); Which indicates that the comparison is case sensitive due to the bugs listed.
According to bugzilla, those bugs (bugs #128535 and #121478) are "verified fixed". And this bug is listed as an "enhancement".... it's not. It's a bug, a severe enough bug as to make "collected addresses" useless. Jose
I agree with Jose, >>"collected addresses" useless<< without this fix.
There are times when it may be useful to have variant forms of the Display Name field. (An example is if you want to send a message with bare addresses only. At some point this became impossible for an address with an AB entry unless there is an AB entry for that address with a blank Display Name. Actually I consider that a separate bug, but for the sake of this argument let's assume it will stay that way.) On this theory any new variant should perhaps still be collected into CAB. Comment #32 could still be dealt with, by UI that indicates whether a given address is duplicated or is unique. In fact, if this alone were fixed, the remainder of the bug would not be very serious and could easily be worked around.
For clarity, another example of the above requirement is if you receive (reply to) a message from "Smith, Bob (home)" and a message from "Smith, Bob (work)"; it is useful not to discard the additional information about home vs. work just because you already have an entry for "Bob Smith" corresponding to one of them. (You may later want to tidy them up to only two entries, but the information loss would already have occurred.)
Sure, but wouldn't Bob@Home and Bob@Work be different email addresses? If you only store Collected Addresses on the email address, Bob would indeed end up with two separate entries, which is what you'd want.
re: comment 41: I can think of few reasons to have variations on the display name field automatically created by the program. Sending messages with bare addresses should be done by the SeaMonkey client itself, not "worked around" by duplicating addresses in the address book. And while it's true that email addresses are sometimes shared among several people, and people have several email addresses, the way to handle this is to make the address book into a relational database. I'm not holding my breath. I'd be happy if the program simply did not collect any Email address with a case insensitive duplication anywhere in any of my address books. I'll handle the exceptions myself by hand. Jose
I can certainly see no benefit, and can see some inconvenience, in collecting an entry into CAB that agrees in its entirety with an existing entry in some other AB (where such agreement means, I think: every non-blank field in the collected entry is the same as the corresponding field in the existing entry). But if the two entries agree in address but differ elsewhere (in a non-blank field of the collected entry) then the collection is not necessarily a bug. Two people with the same address is indeed a better example of why. Exceptions will always be dealt with by hand, but the question is what should be the exception and what should be the norm. Comment #32 is still an issue, however, and I was pointing out that UI to fix it would also enable a better workaround to the bug. Not that this would be as good as fixing it, but it would be better than what we have.
>> But if the two entries agree in address but differ elsewhere (in a non-blank field of the collected entry) then the collection is not necessarily a bug. << The only "fields" that can be collected are the Email address and the display name. Email addresses that differ only in case or leading or trailing spaces are identical and should not be considered different. Right now they are. Bug. Display names most often differ because of spelling, leading spaces, inclusion (or omission) of a middle initial, nickname, etc. These are the same person. Sometimes a display name will differ because of collision between "First name + Last name" and "display name". These are the same person too. Rarely, I'll get an Email from somebody else who is sharing an Email address. I prefer to put that in by hand. It happens once, whereas case and spelling stuff happens all the time. The "collected addresses" is a convenience if it gets ONLY the new Email addresses, and leaves the other fields to the user. It is =useless= if the computer is constantly insisting that J smith J. Smith John Smith John Smith JOhn Smith Jon S. j. smith JohnSmith and John SMith are different even though they share the same Email address. >> the question is what should be the exception and what should be the norm. << The norm should be that if the two Email addresses are identical in a case insensitive comparison of non-whitespace characters, then nothing should be collected. GO through your own collected addresses and see how many are dupes, and how many are legitimate cases of "different people". Jose
even worse, in a related case: If I click "add to address book" (defaulting to PAB), it appears to create a new card -- even when a 100% identical card ALREADY EXISTS in the SAME address book! This is not merely identical with respect to "case-insensitive, whitesapce removed": they are 100% identical duplicates, created in exactly the same way.
I understand the points being made; however, let's not exaggerate. Comment #46: CAB is not "useless" at present; it is certainly a convenience for some. One may be perfectly happy to choose between duplicates when they are offered as alternative completions. Comment #47: You are explicitly asking to do something that you then complain is done! Yes, the redundancy could potentially be detected, but this is hardly "even worse" than the other bug(s).
Rather more importantly, the bit before the @ sign of an e-mail address is in principle case sensitive. RFC 2821 says so, in the strongest way (section 2.4): The local-part of a mailbox MUST BE treated as case sensitive. Therefore, SMTP implementations MUST take care to preserve the case of mailbox local-parts. Mailbox domains are not case sensitive. In particular, for some hosts the user "smith" is different from the user "Smith". However, exploiting the case sensitivity of mailbox local-parts impedes interoperability and is discouraged. Clearly, most implementations uphold the last sentence by violating the other ones! If most really means all, then we could do the same.
Most final delivery agents, that is. In our case, ABs are used to provide addresses for SMTP, so the RFC certainly applies.
>> Comment #46: CAB is not "useless" at present; it is certainly a convenience for some. One may be perfectly happy to choose between duplicates when they are offered as alternative completions. << I have not met such a one. If duplication is meaningful, it should be under the control of the user, not the machine. That is, in the rare instances where I actually want to choose between "J. Smith" and "J Smith" (and "Jon Smith" and "JOn Smith" =every= =time= I send an Email, I'll set those cards up myself. In the much more common instance that I reply to people who, while being the same person, send with different display names, I end up with a collection of meaningless duplicates. This means... 1: that I have to choose among meaningless duplicates every time, and also means that in the uncommon (but important) instance where I actually send to a totally new person with a totally different name and a completely different Email address (for example, the first time I send to Amadeus Mozart <email@example.com> this name gets tossed not in a pile of NEW PEOPLE, but rather, in a huge collection of meaningless variants of existing people. 2: that if I have a bunch of people (say, a theater cast) to whom I am sending replies, and I may not have all of the people already in my book, but I'm not sure who I do have and who I don't have, and who has just changed Email addresses, I have to check EACH AND EVERY Email address in my CAB against the ever growing "theater people" address book, because the computer, which is supposed to be able to do this for me, won't. So, when I go to put my new collected addresses where they belong, instead of having five, I have five hundred to sort through. and four hundred ninety five need to go in the trash. Further, unless I restrict all my Email activity to this one address book, I have to go through =each= of my address books to check for dupes. (I don't know if firstname.lastname@example.org is a theater person, a music person, a friend with a new address, a clay person, a commerce person, or belongs to any of the other categories. >> Rather more importantly, the bit before the @ sign of an e-mail address is in principle case sensitive. RFC 2821 says so, in the strongest way (section 2.4): << Ouch. A Pox on RFC 2821. I always thought that internet addresses were case insensitive. They =should= be case insensitive. But Quantum Mechanics should be nothing more than a bad dream. Alas... So, I propose that we do NOT break RFC 2821, but we still need to work around the issue so that the CAB is actually useful, by adding a few bells. One workaround: First, leading and trailing whitespace would be trimmed. Then... 1: CAB would have another field... a dynamic virtual field which would indicate the kind of duplication, if any, that exists with this card. The only field to be considered would be the Email address. 2: The user would be able to hide or display the following kinds of duplications: 1: hide/display all entries which duplicate, in a case INsensitive manner, the Email address of any existing entry. The CAB entry "rSmith@abc.com would be a duplicate of the existing RSmith@abc.com, and would be affected here. The Email address would be a link, clicking on it would open a window listing the duplicates (and their locations), along with the new CAB entry. Users could edit all entries using that window. 2: hide/display all entries which duplicate, in a case =sensitive= manner, the same as above. Robert<RSmith@abc.com> and Bob<RSmith@abc.com> would be duplicates here because their Email addresses are identical, even though their NAME fields are different. IF the user chooses to hide both of these, they would only see the entries that reflect Email address that do not exist in any of their ABs. They would also not see "new" Email addresses that differ only in case... but at least they could choose to view those too if they wanted, after they dealt with the guaranteed new addresses first. Another possible workaround is a user preference for collecting: [x] DO NOT COLLECT into the CAB if the email address exactly matches an existing entry in any address book, irrespective of differences in any other field (i.e. "name") [x] DO NOT COLLECT into the CAB if the email address matches an existing entry in any address book, ignoring capitalization as well as differences in any other field. A third, similar, workaround would be to have three different CABs. One would be for guaranteed unique Email addresses. The only entries here would be those whose Email address fails to match any entry in any address book, in a case insensitive test, irrespective of any other fields. There'd be another CAB for case sensitive comparisons, and another for entries that differ anywhere (such as the name field). That way you could have the best of all worlds. Jose
RFC 2821: * is about SMTP (i.e. from address book through to destination mail server). * says that the case of local parts must be preserved. * discourages the use of case-sensitive local parts. i.e. Case MUST be preserved up to where mail is sorted into mailboxes, where it SHOULD be ignored. The decision about whether to collect an e-mail address or not is outside the scope of SMTP. If the address is collected, however, the case (of local parts) must be preserved. Collected address book is most useful where only new addresses are added. Because _people_ frequently violate RFC 2821 case preservation when writing down or typing in e-mail addresses, case variations are common and should be ignored in this instance. Likewise I get many variations of display name and only a few people who share an address (and they don't think of changing their display name). So variations in display name should not be collected. Adding options for case sensitivity and display name matching may be useful to keep the pedants quiet, but are probably not necessary. Trevor
Wanted yes, but not a blocker.
Created attachment 347399 [details] [diff] [review] Stop collecting addresses in other ABs This patch stops us collecting addresses that exist in other address books. It does not address the case-sensitivity issues. They will be addressed in a later patch/bug; the work for this patch is a big enough change that case-sensitivity needs to be thought about and addressed later. nsIAbAddressCollector::getCardFromAttribute is no longer needed (we removed the last js instance recently), hence the removal. The rest of the code is reworked so that we're not using mork-specific code (which will mean we match the UI), and hence as more address book types become writeable, we'll be able to correctly collect to those as well. Also included is a unit test to cover the new functionality.
Comment on attachment 347399 [details] [diff] [review] Stop collecting addresses in other ABs Maybe change the name of SetAbUri? It's odd to pass in a pref branch - maybe SetAbUriFromPrefs? "adding new cards and modifying existing ones" reads better. + * This tests the main collection functions for adding new cards, modifying + * existing ones. I guess we hold all local AB's open, so this won't have a huge performance impact.
(In reply to comment #56) > + * This tests the main collection functions for adding new cards, modifying > + * existing ones. > > I guess we hold all local AB's open, so this won't have a huge performance > impact. That's correct with how it is set up at the moment. If we change it in the future then we'll have to revisit this and various other areas.
Checked in: http://hg.mozilla.org/comm-central/rev/f1f7bcf2dee2 This now fixes this bug - addresses in other AB's won't be collected. It does not fix it in the LDAP situation - that is bug 129393. The case sensitivity issues will be covered by bug 446571 and or bug 446574. I'll comment in those in a while to point them back at the discussion on this bug.
(In reply to comment #58) > The case sensitivity issues will be covered by bug 446571 and or bug 446574. > I'll comment in those in a while to point them back at the discussion on this > bug. and bug 129393
->VERIFIED Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.9.1b2pre) Gecko/20081116 Lightning/1.0pre Shredder/3.0b1pre + Mac
To the best of my knowledge the case sensitivity bug still exists in the main code. To get it fixed please go to bug 129393 and click the vote button to raise its level of importance.