Open Bug 519202 (qfasfailtracker) Opened 15 years ago Updated 9 months ago

[META] incomplete/incorrect/inaccurate/missing messages in "Edit > Find > Search Messages" (advanced search) or QuickFilter (QuickSearch) results

Categories

(MailNews Core :: Search, defect)

Tracking

(Not tracked)

People

(Reporter: wsmwk, Unassigned)

References

(Depends on 19 open bugs)

Details

(Keywords: meta, papercut, Whiteboard: see meta Bug 541349 for Gloda search results)

This bug is to focus on incomplete/incorrect search *results*, e.g. false positives or messages missing from the results list. The current intent is to provide focus on backend issues, but that could expand. I have not yet sought out protocol-related issues. And I have not included gloda search ATM, in part because of time, and in part because those bugs already have some recent attention.

News-specific, message count, and UI issues are specifically excluded.

I haven't double checked every bug to make sure it belongs here.  Initial list:

 Bug 404255 -  Consider using UTF-8, when searching inside message body (IMAP online search) 
 Bug 176694 -  error when search in body message for non-latin charset (koi8-r)
 Bug 270868 -  saved search folder using the content-base header fails
 Bug 92219 -  Search missing hits (imap) on both UW-IMAP and Cyrus
 Bug 124641 -  Filter or Search: does not handle multi-line (wrapped, folded) headers correctly when search term spans lines
 Bug 481616 -  Searching message fails when "=" is in the body
 Bug 363238 -  saved searches fail for searches on x-headers
 Bug 500272 -  Message search using "contains" fails to match substrings of email addresses (Gmail IMAP doesn't support SEARCH properly yet) 
 Bug 395100 -  search messages, body, fails to find text in part of mail
 Bug 37031 -  searching message body yields false positives because base64 encoded binary attachments are treated as plaintext
 Bug 389488 -  "Entire message" search fails if charset not known - should use default charset


The above is certainly not a complete list. The query I started working from is
https://bugzilla.mozilla.org/buglist.cgi?query_format=advanced&short_desc_type=allwordssubstr&short_desc=search&product=MailNews+Core&product=Thunderbird&resolution=---&bug_severity=major&bug_severity=normal&bug_severity=minor&chfieldto=Now&field0-0-0=short_desc&type0-0-0=nowordssubstr&value0-0-0=facet+ldap&field0-1-0=component&type0-1-0=nowordssubstr&value0-1-0=address&field1-0-0=short_desc&type1-0-0=anywordssubstr&value1-0-0=search+filter&field1-0-1=component&type1-0-1=anywordssubstr&value1-0-1=filter+search&field2-1-1=short_desc&type2-1-1=anywordssubstr&value2-1-1=false+fail+positiv+handle+hits+wrong

Please feel free to improve the query and update the bug list
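
For anyone who wants to tweak the query, here is a rough sketch that splits the URL above into its parameters and rebuilds it with extra words in one of the search fields. It only uses Python's standard urllib.parse. The helper names (query_params, with_extra_words) and the example word "missing" are made up for illustration and are not part of Bugzilla.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit

# The starting query from this bug, split across lines for readability.
QUERY_URL = (
    "https://bugzilla.mozilla.org/buglist.cgi?query_format=advanced"
    "&short_desc_type=allwordssubstr&short_desc=search"
    "&product=MailNews+Core&product=Thunderbird&resolution=---"
    "&bug_severity=major&bug_severity=normal&bug_severity=minor&chfieldto=Now"
    "&field0-0-0=short_desc&type0-0-0=nowordssubstr&value0-0-0=facet+ldap"
    "&field0-1-0=component&type0-1-0=nowordssubstr&value0-1-0=address"
    "&field1-0-0=short_desc&type1-0-0=anywordssubstr&value1-0-0=search+filter"
    "&field1-0-1=component&type1-0-1=anywordssubstr&value1-0-1=filter+search"
    "&field2-1-1=short_desc&type2-1-1=anywordssubstr"
    "&value2-1-1=false+fail+positiv+handle+hits+wrong"
)


def query_params(url: str) -> dict[str, list[str]]:
    """Return the query string as {name: [values...]}, with '+' decoded to spaces."""
    return parse_qs(urlsplit(url).query)


def with_extra_words(url: str, key: str, *words: str) -> str:
    """Return a copy of url with words appended to the first value of key."""
    parts = urlsplit(url)
    params = parse_qs(parts.query)
    current = params.get(key, [""])[0].split()
    # Skip words that are already present so repeated calls do not duplicate them.
    params[key] = [" ".join(current + [w for w in words if w not in current])]
    return urlunsplit(parts._replace(query=urlencode(params, doseq=True)))


if __name__ == "__main__":
    for name, values in query_params(QUERY_URL).items():
        print(f"{name}: {values}")
    # Hypothetical tweak: also match summaries containing "missing".
    print(with_extra_words(QUERY_URL, "value2-1-1", "missing"))
```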
Depends on: 532636
filed Bug 541349 for ...
 [faceted search] incomplete/incorrect/inaccurate/missing messages in "Search all messages" results for Gloda Global Search and Indexer [meta]

Please feel free to improve the query and update the bug list
Summary: incomplete/incorrect/inaccurate search or filter results [meta] → incomplete/incorrect/inaccurate/missing messages in Search Messages or filter (quick search) results [meta]
Whiteboard: see meta Bug 541349 for Gloda search results
Depends on: 574799
Wayne, having META bugs like these is very helpful. Thank you! And please CC me next time you create one, especially, but not limited to, meta bugs on search :)
Meta bugs help to reduce duplicates, if it's easy enough to find them. To that end, tweaking summary. hth
Summary: incomplete/incorrect/inaccurate/missing messages in Search Messages or filter (quick search) results [meta] → [META] incomplete/incorrect/inaccurate/missing messages in "Edit > Find > Search Messages" (advanced search) or filter (QuickSearch) results
Summary: [META] incomplete/incorrect/inaccurate/missing messages in "Edit > Find > Search Messages" (advanced search) or filter (QuickSearch) results → [META] incomplete/incorrect/inaccurate/missing messages in "Edit > Find > Search Messages" (advanced search) or quick filter (QuickSearch) results
Summary: [META] incomplete/incorrect/inaccurate/missing messages in "Edit > Find > Search Messages" (advanced search) or quick filter (QuickSearch) results → [META] incomplete/incorrect/inaccurate/missing messages in "Edit > Find > Search Messages" (advanced search) or QuickFilter (QuickSearch) results
See Also: → glodafailtracker
Alias: qffailtracker
Alias: qffailtracker → qfasfailtracker
Depends on: 338761
Bug 576994 appears to belong on this list, and I don't think others on the list duplicate it, based on their descriptions.
Depends on: 569009
Depends on: 591796
Depends on: 697021
No longer depends on: 598605
No longer depends on: 628075
No longer depends on: 532636
No longer depends on: 571421
Another bad boy:

Bug 460737 - Quickfilter ignores searches for friendly display names from address book contacts, as displayed on message header and message list by default (no or incomplete results for From, To, CC, BCC)
Depends on: 460737
Depends on: 586729
Depends on: 546925
Depends on: 721167
Depends on: 771488
Depends on: 1101474
Depends on: 1259534
Depends on: 1263783
Ticket #1280840 reports extremely unreliable results for saved searches using criteria based on the Body field (using Gmail IMAP).
Ticket #775024 reports failure of Body-based filters to match base64-encoded mails. This might be the same issue as that reported in ticket #1259534.
Ticket #521649 reports failure of Body-based quick searches to match mails containing non-ASCII characters, when search terms contain non-ASCII characters (such as letters with German umlauts).
Ticket #1282244 reports transient false negatives and/or false positives using Body-based quick filters in saved searches.
Ticket #1282346 reports numerous false negatives using Body-based quick filters on a Gmail account's Trash folder.
The following tickets appear to report advanced search / quick search bugs causing false positives or false negatives:

Ticket #37031
Ticket #338761
Ticket #379988
Ticket #404255
Ticket #460737
Ticket #481616
Ticket #500272
Ticket #507819
Ticket #521649
Ticket #569009
Ticket #576994
Ticket #586729
Ticket #628098
Ticket #667854
Ticket #700541
Ticket #721167
Ticket #1259534
Ticket #1263783
Ticket #1280840 (saved searches)
Ticket #1282244
Ticket #1282346

Some of the 21 bugs listed above are serious. I tried using Thunderbird to clean up my mailbox, eventually giving up and resorting to Gmail's webmail after losing well over 10 hours trying to understand the numerous anomalies I faced. That does not include the time I spent filing the latest tickets.

These bugs are very likely to cause indirect data loss, or other consequences worse than wasted time, for users who assume that Thunderbird search has no false negatives or false positives. I see no indication that upcoming Thunderbird versions will be much better in that regard.

I therefore recommend that Thunderbird warn users, at least the first time they perform a search, about the possibility of false negatives and false positives. This warning could link to a "sub-ticket" of this one focusing on false positives and false negatives, or to a dedicated article. I volunteer to write such an article.

I do not remember seeing such a buggy area in software Mozilla markets as stable. The saga I went through before giving up pushed the limits of my patience, and I am probably among the lucky ones if I was able to avoid data loss (which I doubt). Mozilla's reputation is at stake. A warning will not solve the bugs, but it will greatly reduce their impact.
Restrict Comments: true
For easier reference, here are the expanded bug titles from comment #11, which covers the bugs in comments #6 to #10:

> Bug 37031 - searching message body yields false positives because base64 encoded binary attachments are treated as plaintext
> Bug 338761 - searching body of emails for text doesn't match word-wrapped text
> Bug 379988 - search body should not match words in MIME headers
> Bug 404255 - Consider using UTF-8 when searching inside message body (IMAP online search) to avoid search failure
> Bug 460737 - Quickfilter ignores searches for friendly display names from address book contacts, as displayed on message header and
> Bug 481616 - Local searching message fails when = is in the body, because quoted-printable text is searched as plain text even thoug
> Bug 500272 - Message search using contains fails to match substrings of email addresses (IMAP online search, Gmail IMAP doesn't supp
> Bug 507819 - imap search does not work when file headers or body are not locally indexed and cached
> Bug 521649 - Quick Search Message body filter does not find message text with umlauts (ä,ö,ü) in saved drafts messages (Character
> Bug 569009 - Message body filter misses most manually moved messages (IMAP folder only,All Folders view,not virtual folder,Body sear
> Bug 576994 - Body quick filter option searches body and seemingly-random header values
> Bug 586729 - Quickfilterbar search lost swap Sender vs. Recipients magic for Inbox vs. Sent style folders
> Bug 628098 - Searching body searches code, rather than just text (string in HTML style is searched by Body search)
> Bug 667854 - Mails with a body containing quoted-printable-like strings (= followed by 2 hexadecimal digits) not matched (false nega
> Bug 700541 - Body search yields false positive for user(it looks for user that message header of mail is searched ), because message
> Bug 721167 - Body search of Gmail IMAP includes headers / If condition contains body, Tb always executes Online Search for IMAP fold
> Bug 1259534 - Search for a string in message body fails to find message if message parts are base64 encoded. Searches the undecoded
> Bug 1263783 - Thunderbird Quick Filter matches image content
> Bug 1280840 - Trivial false negatives/positives with Search Messages... (Ctrl+Shift+F) on Body field with fully ASCII search terms
> Bug 1282244 - Transient false negatives and/or false positives using Body-based quick filters in saved searches
> Bug 1282346 - Numerous false negatives using Body-based quick filters (trash)
Sadly the area of body search is pretty bad, that's why I asked Kent to take a look and he said ...
Read: Bug 1259534 comment #14.

Looking at this summary, I realised that bug 37031 and bug 1259534 may be related, both are about not decoding the base64 encoded body/attachment.
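
To make the decoding point concrete, here is a minimal sketch (not Thunderbird's code, just Python's standard email package) of a body search that undoes the transfer encoding before matching and never looks at binary parts. The function names are made up, and the utf-8 fallback for a missing or unknown charset is an assumption on my side, not what any of these bugs prescribe.

```python
from email import message_from_bytes
from email.message import Message


def decoded_text_parts(msg: Message):
    """Yield the decoded text of every text/* part that is not an attachment."""
    for part in msg.walk():
        if part.get_content_maintype() != "text":
            continue  # multipart containers and binary parts are not searched
        if part.get_content_disposition() == "attachment":
            continue
        raw = part.get_payload(decode=True)  # undoes base64 / quoted-printable
        if raw is None:
            continue
        charset = part.get_content_charset() or "utf-8"  # assumed default
        try:
            yield raw.decode(charset, errors="replace")
        except LookupError:  # charset label Python does not know
            yield raw.decode("utf-8", errors="replace")


def body_contains(raw_message: bytes, needle: str) -> bool:
    """Case-insensitive substring search over decoded text parts only."""
    wanted = needle.casefold()
    msg = message_from_bytes(raw_message)
    return any(wanted in text.casefold() for text in decoded_text_parts(msg))
```

Searching the decoded text instead of the raw MIME source addresses both directions of the problem: encoded text becomes findable, and strings that only occur inside the base64 of an attachment no longer match.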

Also, not joining lines before searching is reflected in bug 338761 and bug 1230815, which I just added to the collection ;-)
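
A small sketch of what "joining lines before searching" could look like, in Python and purely illustrative (the function names are made up). The header part follows the RFC 5322 unfolding rule of dropping a line break that is immediately followed by a space or tab. The body part is a cruder assumption: it treats any run of whitespace, including line breaks, as one space, which may be too lenient for some searches.

```python
import re

# A line break that is immediately followed by a space or tab marks a folded header.
_FOLD = re.compile(r"\r?\n(?=[ \t])")
_WHITESPACE = re.compile(r"\s+")


def unfold_header(value: str) -> str:
    """Join a folded header into one logical line by dropping the line breaks."""
    return _FOLD.sub("", value)


def normalize_for_search(text: str) -> str:
    """Collapse runs of whitespace (including line breaks) and ignore case."""
    return _WHITESPACE.sub(" ", text).strip().casefold()


def contains_across_lines(haystack: str, needle: str) -> bool:
    """True if needle occurs in haystack even when a line break splits it."""
    return normalize_for_search(needle) in normalize_for_search(haystack)
```

With this, a search term such as "quarterly report" still matches text where "quarterly" ends one line and "report" starts the next.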

IMAP search has its own set of problems.
Depends on: 1230815
Depends on: 1427124
Depends on: 521649
Depends on: 1235444
Depends on: 1420796
Depends on: 1501358

Adding papercut keyword so as to keep this on the radar. We should strive for some incremental improvements here. I'm aware that some of this will be non-trivial.

Keywords: papercut
Depends on: 1790793
Severity: normal → S3
Depends on: 284856
Depends on: 1839085