Corrupted .msf files (search indexes) are not auto-detected

UNCONFIRMED
Unassigned

Status

Thunderbird
Filters
UNCONFIRMED
5 years ago
5 years ago

People

(Reporter: Olivier Houdas, Unassigned)

Tracking

17 Branch
x86_64
Windows 7

Firefox Tracking Flags

(Not tracked)

Details

(Reporter)

Description

5 years ago
User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:16.0) Gecko/20100101 Firefox/16.0 (Beta/Release)
Build ID: 20121010144125

Steps to reproduce:

Sometimes, .msf files (search/filter index files) can get corrupted. This is written by the TB team in the General tab of the properties of a folder.

In such cases, filters won't work properly, not finding mails with the filtered keywords (see my entry #891305, that I closed as invalid).


Actual results:

You filter your mails for a keyword, but it doesn't find the mails, although they do exist.
You have no warning that the filter might have missed some valid mails.

Note: when I repair, it resets the sort order of the folder, this should not happen. I want my mails sorted with the most recent first, I should not have to click on the date column after repairing.


Expected results:

Thunderbird should auto-detect and auto-repair corrupted .msf files!
What's a search worth if you're not sure you have the correct results?

And systematically repairing folders "just in case" is absurd...

Updated

5 years ago
See Also: → bug 891305
(Reporter)

Comment 1

5 years ago
I noticed that the sort order lost happens only on my IMAP account folders. It does not happen when repairing POP account folders.

I'm pointing this behavior out because, if the present request is fixed, we don't want the sort order to be "randomly" lost (that is, on automatically repaired IMAP folders).
(In reply to Olivier Houdas from comment #0)
> Bug summary : Corrupted .msf files (search indexes) are not auto-detected 

Does it imply "never auto-detected, and Repair Folder is always mandatory", where "always" means "with any kind of bad .msf, with any operation"?

IIRC, at least in the "outdated .msf" condition of a local mail folder:
- Tb automatically invoked Rebuild Index when the mail folder was explicitly
  opened by a folder click in the Folder Pane.
- Upon mouseover on the mail folder, an exception (== outdated .msf condition)
  was shown. Rebuild Index was not automatically executed.
  This may have been changed.
When the ".msf" of a local mail folder is intentionally deleted and Tb is restarted:
- A null .msf or a cleared (initialized) .msf was created by Folder
  Re-Discovery.
  But Rebuild Index was not invoked until explicit folder open.
- Access by "Search Folder (== Virtual Folder)" didn't invoke Rebuild
  Index automatically.
- Sorry, but I don't remember about "Filter copy to the mail folder".
  I mention it because the following phenomenon is observed in Tb 17:
    When a message filter moves mail from IMAP ServerA to IMAP ServerB,
    if no Mbox is opened at ServerB yet (== no login to ServerB yet),
    "filter move to ServerB" does nothing on ServerB.
    i.e. "Move to ServerB" does nothing, except writing a log entry that
    the "message filter rule of Move to ServerB was applied".
  Message filter copy/move may not open the Copy/Move target Mbox of
  ServerB when IMAP, unless one Mbox is opened at ServerB (== first login
  is done at ServerB).
When data lines in the ".msf" file of a local mail folder are manually deleted carefully (simply the last part of the data is removed, without breaking the structure etc.), and Tb is restarted:
- Rebuild Index was not invoked, because the "outdated .msf condition"
  doesn't exist. It's simply "some entries were removed by manual
  editing". This is similar to "writing some data to .msf was not done
  upon termination of Tb". So, "no automatic rebuild-index" is normal.
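
To make the distinction above concrete, here is a minimal sketch of an "outdated .msf" style check. It is only an illustration: the function name `index_looks_outdated` and the rule it applies (summary file missing, empty, or older than the folder file) are assumptions, not Thunderbird's actual logic.

```python
import os


def index_looks_outdated(folder_path: str, msf_suffix: str = ".msf") -> bool:
    """Hypothetical check: True when the summary file is missing, empty,
    or has an older modification time than the mail folder file."""
    msf_path = folder_path + msf_suffix
    # A deleted .msf obviously cannot be trusted.
    if not os.path.exists(msf_path):
        return True
    # A null (zero-length) .msf carries no index data yet.
    if os.path.getsize(msf_path) == 0:
        return True
    # The folder file changed after the summary was last written.
    return os.path.getmtime(msf_path) < os.path.getmtime(folder_path)
```

Note that a check of this kind would not flag the last case above: a .msf whose entries were carefully trimmed by hand still exists, is non-empty, and is newer than the folder file.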

"Loss of sync with IMAP server" may be similar to "cleared .msf file after delete .msf in local mail folder".
If it is a Search Folder or Unified folder for an IMAP Mbox, "first login to IMAP server is done or not" may be relevant to your problem.

What do you mean by "Corrupted .msf files"?

> filters won't work properly, (snip)

"Filter" has too many meanings in Tb.
- Message Filter
- Junk Filter
- Search by Global Indexer and Search
- Saved Search, Unified Folder
- Quick Search in Quick Filter Bar
- Customized View in "View"
- Advanced Search (Edit/Find/Search Messages...)
If possible, please clearly state which "Filter" is involved for each phenomenon/problem you saw.

(In reply to Olivier Houdas from comment #0)
> Note: when I repair, it resets the sort order of the folder, (snip)
(In reply to Olivier Houdas from comment #1)
> I noticed that the sort order lost happens only on my IMAP account folders.
> It does not happen when repairing POP account folders.

This is a known bug on IMAP folders, which is still sometimes reported as a new bug, and is reported repeatedly in bugs for "lost column choice" or "lost sort order", isn't it?
Or does "lost column choice" not occur in your case, and only "lost sort order" always occurs?



(Reporter)

Comment 3

5 years ago
About filtering, I am talking about the Quick Filter button (to the left of the Search input zone in the standard configuration, indicating CTRL+SHIFT+K as the accelerator).
That means that my search fails on an open folder (so all cases of repairing when opening a folder are excluded from the case raised in this issue).

On my side, I don't care when the folders .msf files are checked (at TB's opening, on folder's opening, during scheduled tasks like daily index check, etc.), but what should not happen is that the user needs to go and click the Repair button in the folder's properties to be sure that the search is going to work.
(In reply to Olivier Houdas from comment #3)
> About filtering, I am talking about the Quick Filter button

I see. Let's focus on Quick Search only.
Quick Filter applies to the "Opened Folder" only. So, cases like "Folder is not opened by message filter" are surely ruled out.

> but what should not happen is that the user needs to go and click the
> Repair button in the folder's properties to be sure that the search is going to work.

Is the actual cause of your problem the "Corrupted .msf" which you wrote in the bug summary?

An IMAP folder consists of a ".msf" file, plus an "offline-store file" if the folder is Offline-Use=On and auto-sync is enabled.
For Body search, the search target is the "offline-store file".
And, "Repair Folder" is slightly misleading.
- "Repair Folder" simply "re-creates the .msf and offline-store file
  from scratch" when IMAP.
  - Erase .msf, re-fetch all message headers from the server,
    and write the mail's meta data such as Subject:, From:, To:, to the .msf file.
  - Erase the offline-store file, re-fetch entire mail data from the server
    if auto-sync is enabled and the folder is Offline-Use=On.
  - When IMAP, Tb fails to save non-mail related data in the .msf file
    such as "column choice" before "erase .msf" by "Repair Folder".
    So, loss of "column choice" occurs by "Repair Folder" if IMAP.
And, there are known problems like bug 823838 (see all duped bugs) which produce broken mail data in the offline-store file. With a problem like that bug, all other data is pretty consistent; the only "bad data" is the data held in the offline-store file.
So, "Repair Folder was needed to recover from your problem" doesn't always mean "the .msf file was actually corrupted".

Is auto-sync enabled? (default is Enabled)
Is the folder Offline-use=On? (Folder Properties/Synchronization, default=On)
Is "Body Search" involved?
When your problem occurs, does the same problem (not found) occur in Advanced Search too? (Edit/Find/Search Messages..., or Search in context menu)
How about "Global Search"?
(Reporter)

Comment 5

5 years ago
Thank you very much for your time to analyse the situation.
Yes, Body search is involved.
Yes, offline-use is checked in the synchronization tab of the folder properties.
I'm not sure about the auto-sync (couldn't find where this is set), but I have never played with such options,so I guess it's on.

Global search (CTRL+K) finds the mail.
Advanced search (CTRL+SHIFT+F) does not find the mail, even if "Search on server" is checked.
On the computer where I repaired the folder, the Quick search lists the mail. On the computer where I did not repair, it does not.

Note that I moved the mail from Inbox to 'My custom folder' on the computer where the issue occurs, and on the other computer (where I find the mail in quick searches), the mail is now both in Inbox and in My custom folder. However, I don't think this is related to my issue.
(Reporter)

Comment 6

5 years ago
Note: this bug might be a duplicate of https://bugzilla.mozilla.org/show_bug.cgi?id=569009.

I entered a separate item for the last issue noted, that the deletions of mails are not reflected on TB IMAP clients: https://bugzilla.mozilla.org/show_bug.cgi?id=894818.
(In reply to Olivier Houdas from comment #5)
> I'm not sure about the auto-sync (couldn't find where this is set),
> but I have never played with such options,so I guess it's on.

If so, auto-sync is enabled, because auto-sync is enabled by default.

> Global search (CTRL+K) finds the mail.

If so, and the searched/found string is actually a "string in the message body", it indicates that the mail is already synced by auto-sync (entire mail data is downloaded to the offline-store file).

> Advanced search (CTRL+SHIFT+F) does not find the mail,

If it is newly arrived mail, the entire mail data may not be downloaded to the offline-store file yet when the "Body Search" of Quick Search or Advanced Search is executed.
If the entire mail data is already downloaded to the offline-store file, View/Message Source in Work Offline mode shows the message source, but if the entire mail data is not downloaded to the offline-store file yet, View/Message Source in Work Offline mode cannot show the message source.
When your problem occurs, is the mail's message source shown in Work Offline mode?

> even if "Search on server" is checked.

For Online Search, the IMAP "search" command is sent to the server.
And, the charset used in the "search" command by Tb depends on the "default character encoding" which is set in Folder Properties/General.
This charset affects "search at server".
What charset is set? Is the result different after changing to utf-8, windows-1252 etc.?
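
As an illustration of why the charset matters for an online search, the sketch below builds an IMAP body search command following the `SEARCH [CHARSET ...]` syntax of RFC 3501. The helper name `build_body_search` and the tag `a1` are hypothetical, and this is not necessarily the exact command Tb sends; the point is only that the same search term becomes different bytes under different charsets.

```python
def build_body_search(term: str, charset: str = "UTF-8", tag: str = "a1") -> bytes:
    """Build the bytes of a 'UID SEARCH CHARSET <charset> BODY <term>' command.

    ASCII terms are sent as a quoted string; anything else is sent as a
    literal ({n} CRLF data), because quoted strings must be 7-bit.
    (A real client waits for the server's continuation before the literal.)
    """
    data = term.encode(charset)
    if data.isascii() and not any(c in data for c in b"\r\n"):
        # Escape backslash and double quote inside the quoted string.
        quoted = data.replace(b"\\", b"\\\\").replace(b'"', b'\\"')
        arg = b'"' + quoted + b'"'
    else:
        arg = b"{%d}\r\n" % len(data) + data
    head = b"%s UID SEARCH CHARSET %s BODY " % (
        tag.encode("ascii"),
        charset.encode("ascii"),
    )
    return head + arg + b"\r\n"
```

For a plain ASCII term such as "1000" the charset makes no difference to the bytes of the term, so a charset mismatch would be expected to matter mainly for non-ASCII search strings.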
(Reporter)

Comment 8

5 years ago
I'm not sure to get your comment about CTRL+SHIFT+F : as the message is found with the Global search, and you say that this means that the message's body is fully downloaded, then isn't it fully downloaded also for the Message search?
At any rate, the message source is shown when I open the mail and choose Display\Show message source, and I can see my string "1000" in the mail and in its source.
A Quick search on "1000" run after checking that this displays fine still does not list the mail.

Changing the charset to UTF-8 instead of ISO-8859-1 does not change the CTRL+SHIFT+F results on searching for "1000" in the body. Note that searching for another word ("encore") lists 4 mails but not my test mail (which does contain this string).
(In reply to Olivier Houdas from comment #8)
> Note that searching for another word ("encore") lists 4 mails
> but not my test mail (which does contain this string).

Is "false positive" (it shouldn't be found, but is listed in the search result) involved in your problem?
Because "Body Search" has a "false positive" problem like bug 697021, please clearly separate the "false positive" issue from the "false negative" (it should be found, but is not listed in the search result) issue.

> I'm not sure to get your comment about CTRL+SHIFT+F : as the message is found with the Global search, and you say that this means that the message's
> body is fully downloaded, then isn't it fully downloaded also for the
> Message search?

"Moved mail" case and "New mail" case may be different when Offline-use=On.
  When mail is moved between Offline-Use=On folders, "Move" is done in
  multiple steps.
  - Copy mail data to the move target locally with a FakedKey, and pass the
    data to Gloda (Global Indexer) for quick indexing.
    => Found by Global Search, but the entire mail data of the moved/copied
       mail is not downloaded to the offline-store file yet.
  - After that, "uid copy xx MoveTargetFolder" is issued.
  - After that, the "new mail in MoveTargetFolder produced by copy" is
    fetched, the local data with the FakedKey is deleted, the entire mail
    data of the "new mail in MoveTargetFolder produced by copy" is
    fetched to the offline-store file, and Gloda updates its index data with
    the actual UID/MessageKey of the copied/moved mail.
So, for "moved mail", "found by Global Search" doesn't always mean "entire mail data is actually downloaded to the offline-store file".
For "new mail", the entire mail data is not always downloaded just after the new mail is detected, because auto-sync has a timer pop mechanism.
These are the reasons why I asked you to check "entire mail is actually downloaded to offline-store or not" by "Work Offline mode" and "View/Message Source".
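
The same check can be approximated outside Tb. The sketch below assumes the offline-store file is a plain mbox file (an assumption about the on-disk format) and lists the subjects of messages whose text parts contain a given string, using Python's standard `mailbox` module. The function name `messages_containing` is hypothetical. Run it on a copy of the file with Tb closed.

```python
import mailbox


def messages_containing(mbox_path: str, term: str) -> list:
    """Return the Subject of every message in an mbox file whose decoded
    text parts contain `term`."""
    hits = []
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for msg in box:
            for part in msg.walk():
                # Only look at text/* parts; skip containers and attachments.
                if part.get_content_maintype() != "text":
                    continue
                payload = part.get_payload(decode=True)
                if payload is None:
                    continue
                charset = part.get_content_charset() or "utf-8"
                try:
                    text = payload.decode(charset, errors="replace")
                except LookupError:
                    # Unknown charset label: fall back to UTF-8.
                    text = payload.decode("utf-8", errors="replace")
                if term in text:
                    hits.append(msg.get("Subject", ""))
                    break
    finally:
        box.close()
    return hits
```

If a mail shows up here but not in Quick Search or Advanced Search, the body is present in the offline-store file and the "false negative" lies elsewhere.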

> Changing the charset to UTF-8 instead of ISO-8859-1 does not change
> the CTRL+SHIFT+F results on searching for "1000" in the body.

Online search is also affected by the server-side implementation of the IMAP "search" command. See bug 404255 and bug 721167 for issues around "Online Search".
Please get an IMAP log and check the "search" command issued by Tb and the search result returned from the IMAP server.
(Reporter)

Comment 10

5 years ago
Hi,
The case is about false negative. Emails matching the searched string are not all found.
I activated the IMAP log, but when I check the box Search on server, nothing is written to the IMAP log when I search, although the log is updated in real time (no write delay) when starting TB or when closing TB. Searching the log for the searched string after closing TB finds no match. As results are immediately shown when I press Search, my conclusion is that it does not even try to contact the IMAP server (gmail) with that option in CTRL+SHIFT+F.

Regarding the mail that I can't find with my search on one PC, I opened the folder file and the .msf files on both machines, and in all 4 files, the mail's body text is stored in plain text ("encore\r\n1000\r\nencore").

I can provide files if needed (logs, folder contents, etc.) but only privately.