Closed Bug 1282244 Opened 9 years ago Closed 9 years ago

Transient false negatives and/or false positives using Body-based quick filters in saved searches

Categories

(Thunderbird :: Search, defect)

45 Branch
defect
Not set
major

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 1280840

People

(Reporter: chealer, Unassigned)

Details

Attachments

(4 files, 2 obsolete files)

I have been cleaning my mailbox of old discussions in which I was not involved and which can be found in mailing list archives. I achieved this using saved searches with several criteria. For example:
Cc contains "tikiwiki-cvs"
From, To, Cc or Bcc doesn't contain "cheal"

This yields a list which could still contain my name in the Body field. Body-based saved searches being unreliable (as reported in ticket #1280840), I have worked around this by performing Body-based exclusions using quick searches: I tag the matching mails, clear the quick filter, then delete only the untagged mails.

Unfortunately, this exposed another bug, which I fortunately noticed quickly enough. I applied a quick filter searching for either "Klutiero", "cheal" or "chealer" and tagged the matching mails. But before deleting the non-matching mails, I noticed that one of the non-matching mails did name me. I was so surprised that I cleared and re-applied the filter several times, probably trying different terms, and I obtained the same results each time. But as I was about to report it, this stopped happening. The matching mails started including the problematic mail. And despite having tried to reproduce the issue at least 10 times, I could not reproduce it again. I then tried reproducing with other mails, but I failed to reproduce with that saved search.

Today I created a new saved search of that kind and again applied an equivalent quick filter. I had the good idea to check whether the results were accurate. And sure enough, there were numerous inaccuracies. But again, these disappeared after some time. So I went to another PC and created an identical saved search, which confirmed that this different install also reproduces the problem, though the inaccuracies were different. My search for "chealer" in body should have yielded 3 mails. Instead, it showed 4 mails, 2 of which should not have matched. That means there were 2 false positives and 1 false negative.

Again, I reproduced the same inaccurate result set several times. Even after I read the false negative mail on that second install, it still did not show up after re-applying the quick filter. Eventually, that last case also went away, after I went to the inbox and came back to the saved search. Restarting Thunderbird was not needed. The bug could be seen for several minutes in each case. The false positives disappeared before the false negative appeared. The false negative failed to show even when "2 messages" appeared in green.

This is not trivial to reproduce since I do not know exactly what is happening, but I can say this is way too easy to reproduce for me. This has the potential to indirectly cause data loss.

Both of the affected machines are Windows 10 installs running Thunderbird 45 using the same Gmail account. I will attach some of the false negatives and positives which I noticed, but my feeling is that the mail contents are irrelevant.
Failed to show using search term "chealer" on Body
Listed in the results using search term "chealer" on Body
Listed in the results using search term "chealer" on Body
Listed in the results using search term "chealer" on Body
Attachment #8765189 - Attachment is obsolete: true
Listed in the results using search term "chealer" on Body
Attachment #8765190 - Attachment is obsolete: true
The false positives are base64-encoded. Saving them on this PC pseudo-corrupted them (reported in ticket #844208), so I first attached them corrupted. I have since re-uploaded them intact.
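Since the false positives are base64-encoded, one way to check outside Thunderbird whether a saved message really contains a term is to decode its text parts before searching. The following is only a minimal sketch using Python's standard `email` module; the function name `body_contains` and the sample message are hypothetical, and it does not claim to reproduce how Thunderbird's quick filter matches the Body.

```python
import base64
import email
from email import policy


def body_contains(raw_message: bytes, term: str) -> bool:
    """Return True if any decoded text part contains term (case-insensitive)."""
    msg = email.message_from_bytes(raw_message, policy=policy.default)
    needle = term.lower()
    for part in msg.walk():
        # Only text/* parts are searched; multipart containers are skipped.
        if part.get_content_maintype() != "text":
            continue
        # decode=True undoes base64 or quoted-printable transfer encoding.
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        try:
            text = payload.decode(charset, errors="replace")
        except LookupError:
            # Unknown charset label: fall back to UTF-8 (an assumption).
            text = payload.decode("utf-8", errors="replace")
        if needle in text.lower():
            return True
    return False


# Hypothetical sample message whose body is base64-encoded.
sample = (
    b"From: list@example.org\r\n"
    b"To: someone@example.org\r\n"
    b"Subject: sample\r\n"
    b"MIME-Version: 1.0\r\n"
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"Content-Transfer-Encoding: base64\r\n"
    b"\r\n"
    + base64.b64encode(b"Patch reviewed by chealer\n")
    + b"\r\n"
)
```

Running such a check over the exported mails of a saved search would give a reference list to compare against the quick filter's results when counting false positives and false negatives.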
Would you mind attaching messages as text/plain and writing "bug 844208" instead of "ticket #844208"? The former lets one view the message in Bugzilla, the latter lets you follow a link to the bug.
The saved search I provided as an example was intended to finish cleaning the tikiwiki-cvs mailing list out of my mailbox. Later yesterday, I realized my mailbox was still full of mails from that mailing list which I did not want to keep (probably as a result of a search bug in Thunderbird or Gmail). So I re-created a saved search I must have made in the last 2 weeks:
To contains "tikiwiki-cvs"
From, To, Cc or Bcc doesn't contain "cheal"

That search listed 3094 mails. Filtering for "Klutiero" on Body selected 624 of these. Yet "Klutiero" was actually found in at best 95 of these 624 mails. Needless to say, with such a majority of false positives, this quick filter is almost useless. About an hour later, I opened the same saved search in a new tab and applied the same quick filter. That new tab then displayed only the 91 mails expected.
Attachment #8765192 - Attachment mime type: message/rfc822 → text/plain
Hi Jorg,

(In reply to Jorg K (GMT+2, PTO during summer, NI me) from comment #7)
> Would you mind attaching messages as text/plain

Bugzilla says with my Firefox 47:
> Attachment is not viewable in your browser because its MIME type (message/rfc822) is not one that your browser is able to display.

So this must be a limitation of your browser. Note that I did not manually set the MIME type; it was auto-detected by Bugzilla. And detecting "message/rfc822" is in fact quite right. Bugzilla could also allow viewing these files by embedding them in an HTML document. I am not clear on whether doing what you ask is desirable, but in any case, attaching mails is not something I do frequently enough to remember next time.

> and writing "bug 844208" instead of "ticket #844208".

Writing "bug 2" does not necessarily refer to ticket 2. You could just be reporting that you "experienced the bug 2 times". Therefore, when one actually refers to ticket 1234, it is less ambiguous to write:
> bug #1234
(Not sure why you'd want to start a discussion with the developers.) text/plain is preferable, see bug 1154521 comment #7. You can write bug #1154521 if you want.
(In reply to Jorg K (GMT+2, PTO during summer, NI me) from comment #10)
[...]
> text/plain is preferable, see bug 1154521 comment #7.

The fact that you and someone else prefer text/plain does not mean that text/plain is generally preferable. If you are convinced that is the case, I suggest you file a ticket requesting Bugzilla to assign such mails the MIME type text/plain when auto-detection is requested.
A quick filter on Body for "cheal" on the same saved search as that described in Comment #8 showed 2422 mails. I did not check each result, but I would estimate that about 2000 are false positives. Unfortunately, this time the correction seems to take a long time, if it is ever going to complete. I opened the saved search in a new tab hours after I first opened the search, and it still showed the exact same number of matches. However, a few minutes later I closed all tabs for that saved search and opened a new one. The number had gone down to 2098. It appeared the number was slowly going down, but it had not changed at all several minutes later, during which Thunderbird was restarted. 2 minutes later, during which I put Thunderbird in offline mode, it had come down to 2089. ½ an hour later, 2081. Still there 10 hours later... and 2080 now. It has been 3000 mails total for about 12 hours.
Status: UNCONFIRMED → RESOLVED
Closed: 9 years ago
Resolution: --- → DUPLICATE