Open Bug 669160 Opened 13 years ago Updated 2 years ago

Search will remember address after deleting all emails and address book entries for that address

Categories

(Thunderbird :: Search, defect)

Tracking

(Not tracked)

People

(Reporter: gobzo, Unassigned)

References

Details

(Keywords: privacy)

User Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.17) Gecko/20110428 Fedora/1.9.2.17-2.fc14 Firefox 3.6.3 (.NET CLR 3.5.30729)
Build ID: 20110428205629

Steps to reproduce:

All emails to and from that address have been deleted. The address book entry for that address has been deleted from both the Collected Addresses and the Personal Address Book.
Thunderbird has been closed and started again.


Actual results:

After entering characters from the email address into the search box, auto-complete still suggested that address.


Expected results:

The email address should no longer be remembered after it and its emails have been deleted.
More of a privacy issue than a security one. I don't see any Thunderbird equivalent to Firefox's sanitization dialog, but there may be some non-obvious way to purge things from the search database.
Group: core-security
Keywords: privacy
Trying to understand how that helps.
This is probably the global database storing the email addresses.

You'll need to purge the database itself; I can't remember what the policies for retention are atm.

iirc you can do this by going into preferences (or options), selecting Advanced, then General, and unchecking "Enable Global Search and Indexer". Then restart Thunderbird, after which you can re-check that box.
did not work
(In reply to comment #4)
> did not work

So gloda is off, your address books are empty, and search still remembers the email addresses?
Turning off gloda does not purge the contents of the database.  It just stops gloda from initiating the indexing process.  Arguably this transition state should ask the user if they want to purge the database or keep it around because they are planning to turn indexing back on again soon.  The option really only works in an ideal form if it is set to the desired state when a profile is being created/upgraded from a pre-gloda version of Thunderbird.

Gloda never forgets e-mail addresses in its current state, even if the user deletes all the related e-mails and gloda purges them from its database.

It's probably important to note that if the sequence goes like this:
- Gloda indexes a bunch of messages.
- User turns off gloda (and reboots).
- User deletes said bunch of messages.
Then the messages will continue to exist in the gloda index.
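
The sequence above can be illustrated with a toy in-memory model. This is only a sketch: `ToyIndex` and its methods are hypothetical names invented for illustration, not gloda's API, and it assumes (as described above) that deletions only reach the index while indexing is enabled.

```javascript
// Toy model only: ToyIndex is a hypothetical stand-in, not gloda code.
class ToyIndex {
  constructor() {
    this.enabled = true;
    this.messages = new Set();
  }
  // Called when a message arrives; indexed only while indexing is on.
  onMessageAdded(id) {
    if (this.enabled) this.messages.add(id);
  }
  // Called when a message is deleted; ignored while indexing is off.
  onMessageDeleted(id) {
    if (this.enabled) this.messages.delete(id);
  }
}

const index = new ToyIndex();
["m1", "m2"].forEach((id) => index.onMessageAdded(id));   // messages get indexed
index.enabled = false;                                     // user turns indexing off
["m1", "m2"].forEach((id) => index.onMessageDeleted(id)); // user deletes the messages
const staleCount = index.messages.size;                    // index entries are still there
```

In this model nothing ever revisits the index once indexing is off, which is why the entries survive the deletion.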

It's worth noting that in general Thunderbird is not good at cleaning up data.  Apart from gloda issues, a user would need to force a compress commit of the address book mork db, .msf files, and trigger compaction of the (offline) mail stores to make sure stuff gets gone.  Gloda is slightly more troublesome because we never are able to trigger a VACUUM of the database and so unused pages or obsolete FTS sub-records can lie dormant with info in them.
This way you may set up some people in third-world dictatorships for arrest and abuse, even torture. Give your head a shake.
Because of the nature of how hard disks and file-systems work, Thunderbird deleting data does not reliably prevent it from being recovered either.  Anyone who needs to make sure the information is not read should never let it be written to their hard disk in the clear.

This suggests that Thunderbird's profile should be placed on an encrypted partition for those in such situations (or who want any guarantee of strong privacy should their machine/hard disk be stolen, etc.).  In the event deniability of the existence of the encrypted partition is required, deniable encryption should be used.  Various solutions exist for both purposes on multiple platforms; TrueCrypt is one example of a cross-platform implementation.
It is nice to debate this from the comfort of your house, but for some it may mean hanging by the ankles from the ceiling being beaten with power cords.
It should still be fixed.
(In reply to comment #9)
> It is nice to debate this from the comfort of your house, but for some it
> may mean hanging by the ankles from the ceiling being beaten with power
> cords.
> It should still be fixed.

Even if you're arguing for this purely from a security perspective, this is by no means the most vulnerable place for privacy violation. As asuth mentioned, local mail stores, address books, etc. all keep data around, much of which would be more dangerous if it fell into the wrong hands.

While this should be fixed, it would be more fruitful to address these other issues first. The only difference is that the stale data here is more apparent when using Thunderbird.
Component: General → Search
QA Contact: general → search
Status: UNCONFIRMED → NEW
Ever confirmed: true
From bug 1318410 comment 1 Andrew writes:

== context

Gloda Contact names are a best effort type of thing.  When an email address is first seen, an identity and its parent contact are created.  The display name parsed out at that time is set as the contact's name.  If no display name was present, the email address is used verbatim.  At the same time, an addressbook indexer job is queued.

The addressbook indexer logic at https://dxr.mozilla.org/comm-central/source/mailnews/db/gloda/modules/index_ab.js#264 is:
  // update the name
  if (card.displayName && card.displayName != aContact.name)
    aContact.name = card.displayName;

Which is part of the problem here.  I believe the original rationale was that if the user isn't assigning a preferred name to the contact, then the originally scraped display name is probably best.  This code might pre-date the "star an email address to quickly create a contact" logic, which automatically extracts and fills in the display name.  With that in place, it makes sense to treat an empty displayName as intent.

The second factor is that although a card indexing job is triggered when addressbook changes occur (https://dxr.mozilla.org/comm-central/source/mailnews/db/gloda/modules/index_ab.js#117), no job is scheduled on removal (https://dxr.mozilla.org/comm-central/source/mailnews/db/gloda/modules/index_ab.js#99).  This made sense in context because the indexing really only exists to scrape out explicitly set names and "freetags" (tagging support that never got meaningfully surfaced in the UI).

An overarching issue to be aware of is that the gloda DB representation is fully normalized but does not have reference counts on identity/contact usage.  The identity record and its owning contact can't be safely deleted without ensuring that all message/conversation/whatever references to them are gone.
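
As a rough illustration of that constraint, the check that would have to pass before an identity could be deleted looks something like the sketch below. The record shapes (`identityIds` on a message) and the function name are assumptions made up for this example; the real gloda schema differs and the check would be a database query rather than an array scan.

```javascript
// Hypothetical sketch: plain objects stand in for gloda's normalized rows.
// An identity may only be deleted when no message still refers to it.
function canDeleteIdentity(identityId, messages) {
  return !messages.some((msg) => msg.identityIds.includes(identityId));
}

const sampleMessages = [
  { id: 1, identityIds: [10, 11] }, // e.g. sender 10, recipient 11
  { id: 2, identityIds: [10] },
];

const identity11Deletable = canDeleteIdentity(11, sampleMessages);          // still referenced by message 1
const identity11AfterPurge = canDeleteIdentity(11, sampleMessages.slice(1)); // message 1 gone
```

Without reference counts, a scan like this is needed on every candidate deletion, which is part of why simply resetting the contact is the more practical option.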

The best thing that can be done is to reset the state of the contact back so it is as if the addressbook card had never existed.  The simplest action would be to reset the name to the email address of the first identity in question.  A more thorough approach would be to look up an email where the identity/contact is involved and scrape the display name back out from there.

== fixing

There are a few options.  The simplest might be to:
- Change the addressbook listener to queue a job that will cause the card to be re-indexed.  Since the indexer assumes the card exists, it might make sense to create a synthetic record that looks enough like an AB card to not blow everything up and also have a flag that indicates it's a deletion.
- Change the card indexer to treat an empty displayName as a request to purge the existing displayName and replace it with the email address.
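
A minimal sketch of those two changes, using plain objects in place of the real gloda contact and addressbook card. Everything named here (`updateContactName`, `makeDeletedCardRecord`, the `deleted` flag, and the `identities[0].value` shape) is hypothetical and only meant to show the intended behavior, not the actual index_ab.js code.

```javascript
// Hypothetical sketch; not the real index_ab.js implementation.
function updateContactName(contact, card) {
  if (card.displayName) {
    // An explicitly set name wins, as in the existing logic.
    if (card.displayName !== contact.name) contact.name = card.displayName;
  } else {
    // Empty displayName (including a synthetic deletion record): purge the
    // old name and fall back to the email address of the first identity.
    contact.name = contact.identities[0].value;
  }
  return contact;
}

// Synthetic record queued on card removal; looks enough like a card for the
// indexer and carries an assumed `deleted` flag.
function makeDeletedCardRecord(email) {
  return { primaryEmail: email, displayName: "", deleted: true };
}

const contact = {
  name: "Old Nickname",
  identities: [{ kind: "email", value: "someone@example.com" }],
};
updateContactName(contact, makeDeletedCardRecord("someone@example.com"));
const resetName = contact.name; // name falls back to the email address
```

Routing the removal through the same function as a normal card update keeps a single code path for name changes, at the cost of the indexer having to tolerate a card-like object that is not a real card.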

The existing addressbook tests at https://dxr.mozilla.org/comm-central/source/mailnews/db/gloda/test/unit/test_index_addressbook.js can be built on.  test_remove_card_cache_indication could be renamed and made to check that the display name gets reset to the email address.
OS: Other → All
Severity: normal → S3