Closed Bug 1952259 Opened 7 months ago Closed 1 month ago

Global indexing stalls on corrupted mailboxes with no warning nor recovery

Categories

(Thunderbird :: Untriaged, defect)

Thunderbird 128
Desktop
macOS
defect

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: davethenerd, Unassigned)

References

(Blocks 1 open bug)

Details

(Whiteboard: [datalossy])

Attachments

(4 files)

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/18.3 Safari/605.1.15

Steps to reproduce:

Enabled Settings > General > Enable Global Search and Indexer.
Message Store Type is mbox.
Hardware acceleration is enabled.

Actual results:

On many (most, even) of my mailboxes on a ~2-year-old Thunderbird installation, it stops mid-way through "Indexing messages" and just sits there. It doesn't gracefully fail (not even after a restart), nor does it warn the user in any way. Even looking at Activity Manager it seems like things are fine (in that there's no error reported). One must look over time and realize that the number hasn't increased.

Restarting Thunderbird does NOT remedy this. The only remedy is to first Repair Folder on the mailbox and then restart Thunderbird. The repaired mailbox alone does not cause the indexing on that folder to restart.

Once it hits whatever the issue/corruption/problem is, Thunderbird's indexer dies and doesn't do anything else.

The effect is that search results are massively incomplete, without the user knowing to expect them to be incomplete.

Expected results:

One of many things could happen:

Best that I can think of: a failed indexing prompts the user to Repair Folder that mailbox, and then Thunderbird knows to restart that folder's index from scratch. In the meantime, it should move on to another folder (after alerting the user).

At the very least: it should alert the user that there is a problem and instruct them on what to do (as I've detailed in the "Actual results" section above). Simply dying quietly is dangerous from the standpoint of users expecting their search to work and believing the faulty results.
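
For illustration only, the behaviour requested above (flag the failing folder, warn the user, move on to the next folder) could be sketched roughly as follows. This is not Thunderbird's indexer code: `FolderIndexError`, `SweepReport`, `sweep`, and `demo_index` are hypothetical names invented for this sketch, and the failure in `demo_index` is simulated.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable


class FolderIndexError(Exception):
    """Hypothetical error raised when indexing a folder cannot continue."""


@dataclass
class SweepReport:
    indexed: list[str] = field(default_factory=list)
    needs_repair: dict[str, str] = field(default_factory=dict)


def sweep(folders: Iterable[str], index_folder: Callable[[str], None]) -> SweepReport:
    """Index each folder in turn; a failure in one folder does not stop the rest."""
    report = SweepReport()
    for name in folders:
        try:
            index_folder(name)
        except FolderIndexError as exc:
            # Remember why this folder failed so the user can be warned,
            # then carry on with the next folder instead of stalling.
            report.needs_repair[name] = str(exc)
            continue
        report.indexed.append(name)
    return report


def demo_index(name: str) -> None:
    # Stand-in for real indexing: pretend "Sent" contains a bad message.
    if name == "Sent":
        raise FolderIndexError("stalled at message 462 of 4059")


report = sweep(["Inbox", "Sent", "Archive"], demo_index)
for folder, reason in report.needs_repair.items():
    print(f"Search results are incomplete: {folder} ({reason}). Try Repair Folder.")
```

The point of the sketch is only that the failure is recorded and surfaced per folder, so one bad mailbox cannot silently block every folder queued behind it.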

Summary: Global indexing stalls on corrupted mailboxes with no warning or recovery → Global indexing stalls on corrupted mailboxes with no warning nor recovery
OS: Unspecified → macOS
Hardware: Unspecified → Desktop

I should add that I have Thunderbird running on 3 Apple Silicon Macs (esr on two of them, beta on one of them, all kept up-to-date), and this issue is rampant amongst all three. In fact, it was because of the issue being on all three that I noticed it: search queries were yielding different-yet-still-incomplete results amongst the three, and that's what caused me to dig in and discover this issue.

My theory that the issue is corrupted mailboxes is just that: a theory...even perhaps only a hypothesis. Certainly, repairing and then re-indexing (usually) works, and supports this hypothesis, but it's possible the mailboxes themselves are fine, and there's a deeper issue with the indexer itself, causing it to stall out for a different reason.

Do all three Macs access all the same account(s)?

Flags: needinfo?(orders)

(In reply to Wayne Mery (:wsmwk) from comment #2)

Do all three Macs access all the same account(s)?

Yes, for the most part. I have the following accounts:

Fastmail (accessed from all 3)
Gmail (accessed from all 3)
Google Apps for Domains, effectively gmail (accessed from 2)
Synology Mail Server (accessed from all 3)

I will note that mailboxes from ALL of these have been affected by this issue, and it's not necessarily the same mailboxes that are affected from machine to machine.

Flags: needinfo?(orders)

I have also encountered this problem, currently on:
Ubuntu 24.04 (64-bit)
Snap version of Thunderbird: 128.8.0esr-5
I have on the order of 400 folders across six mail accounts, and about 14GB of mail. All accounts are IMAP, a mix mostly of Rackspace Email and GMail.

Earlier this evening, I deleted the global sqlite db because I noticed the search index was very incomplete. After restarting, Thunderbird indexed a number of folders, but is now encountering broken folders.

According to Activity Manager, indexing will either

  1. stall at some message in a folder, or
  2. get stuck at "Determining which messages to index...".

Doing Repair Folder, waiting for all the messages in the folder to re-download, and then restarting Thunderbird allows it to proceed, as described by the OP. I am currently doing that across all folders, a number of which are stalling.

OP here. Just confirming that this problem also exists in 136.0.1 in the release channel. Happening on all my Macs.

Can you identify the message(s) on which it stalls?
Might they have calendar data or attachments?

Flags: needinfo?(orders)
Whiteboard: [datalossy]

(In reply to Wayne Mery (:wsmwk) from comment #6)

Can you identify the message(s) on which it stalls?
Might they have calendar data or attachments?

Would there be something in the logs that would indicate the message upon which it's stalling? Right now all I see is the original screenshot where it stalls ("Indexing 462 of 4059 messages...") or the one I have newly attached today ("Determining which messages to index in..."). It doesn't give me any obvious indication as to which message that is, though.
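
Until something better is available, a stall can only be spotted by noting the Activity Manager text at intervals and checking whether the count moves. A minimal sketch of that check, assuming the status lines are copied by hand (Thunderbird exposes no API for this that the thread mentions; `parse_progress` and `looks_stalled` are hypothetical helpers):

```python
import re
from typing import Optional

# Matches status text such as "Indexing 462 of 4059 messages..."
PROGRESS = re.compile(r"Indexing (\d+) of (\d+) messages")


def parse_progress(status: str) -> Optional[tuple[int, int]]:
    """Return (done, total) from a status line, or None if it has no count."""
    match = PROGRESS.search(status)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def looks_stalled(samples: list[str], min_repeats: int = 3) -> bool:
    """True if the last min_repeats samples show the same unfinished count."""
    if len(samples) < min_repeats:
        return False
    recent = [parse_progress(s) for s in samples[-min_repeats:]]
    if any(p is None for p in recent):
        return False
    done, total = recent[0]
    return done < total and all(p == recent[0] for p in recent)


samples = ["Indexing 462 of 4059 messages..."] * 3
print(looks_stalled(samples))
```

The "Determining which messages to index in..." state carries no count, so this sketch cannot judge it; it only covers the numbered case.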

Flags: needinfo?(orders)

If you install https://addons.thunderbird.net/en-us/thunderbird/addon/glodaquilla-ng-index-ondisk/ you can add a column which shows which messages have not been indexed. It can also show whether a message has been downloaded. I think if the folder is ordered by "Received" then the next message(s) in the list that has not been indexed will be the one(s) causing the problem.

Ok, the documentation for Glodaquilla-NG is confusing (and incomplete?), but I believe the g_dirty column here is the one that shows whether a message has been indexed or not(?). The g_Offline column is confusing, because these messages are clearly all available offline AND the mailbox is marked for offline use.

In any event, this Mailbox got to message 2796 before it stalled on indexing. Sorting by "Received", the highlighted message is number 2796. You can see that many messages before it are a mix of both 2 and 0 in the g_dirty column.

Sorting by the g_dirty column, there are also 1,375 messages with NO entry in this column (not 2 nor 0, simply empty, and g_id is ALSO empty for these same messages). For reference, there are 2,616 messages with a 0 in g_dirty, and then 8,515 with a 2.

To be thorough, there are also 15 messages with contents in g_dirty and a 2 in g_id. Everything else with contents in g_dirty has an entry like 5bd07 in g_id. Hopefully this helps, and obviously happy to test/report more.

Just because a folder is marked for offline doesn't mean all its messages are offline. If g_Offline is 0 it should mean that the message body has in fact not been downloaded. NOTE: A message can be indexed even if the message body is not downloaded.

g_dirty indicates the state of g_id for g_id>10 as mentioned below (if I remember correctly, gloda ids <=10 have special meaning). My notes from 2009:

<wsm0> asuth: what are gloda dirty values 0,1,2?
<altsuth> wsm0, not dirty (gloda-id valid), dirty (gloda-id valid but reindexing required), filthy (gloda-id not valid)

<wsm0> interesting - i have a lot of "2". maybe I should blow away the db.
<altsuth> blowing away the database turns everything to a 2
<altsuth> gloda gives up when it sees bad messages in its indexing sweep
<altsuth> the only way to force things to get indexed is to cause the event-driven indexing mechanism to fire
<altsuth> which means modifying messages in groups of 20 or less

* wsm0 must have several bad msg
<altsuth> which I personally wouldn't bother with, but if you really want a message indexed
<altsuth> it only takes 1 to make a folder unhappy

<wsm0> altsuth: does sweep die on a per folder basis?
<altsuth> wsm0, yes
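
As a reading aid, the three dirty values from the log above can be written down as an enum and applied to the g_dirty counts reported earlier in this thread. This is a sketch, not gloda code: `GlodaDirty` and `needs_indexing` are hypothetical names, and treating an empty g_dirty column as "never indexed" is an assumption that the thread does not confirm.

```python
from enum import IntEnum
from typing import Optional


class GlodaDirty(IntEnum):
    NOT_DIRTY = 0  # gloda-id valid
    DIRTY = 1      # gloda-id valid but reindexing required
    FILTHY = 2     # gloda-id not valid


def needs_indexing(g_dirty: Optional[int]) -> bool:
    """True if a message with this g_dirty value still needs (re)indexing."""
    if g_dirty is None:
        # Assumption: an empty column means the message was never indexed.
        return True
    return GlodaDirty(g_dirty) is not GlodaDirty.NOT_DIRTY


# Counts per g_dirty value as reported for the stalled mailbox in this thread.
counts = {None: 1375, 0: 2616, 2: 8515}
pending = sum(n for state, n in counts.items() if needs_indexing(state))
total = sum(counts.values())
print(f"{pending} of {total} messages still need indexing")
```

Under that assumption, only the messages with a 0 in g_dirty would be reliably searchable in that mailbox.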

(In reply to Wayne Mery (:wsmwk) from comment #11)

Just because a folder is marked for offline doesn't mean all its messages are offline. If g_Offline is 0 it should mean that the message body has in fact not been downloaded. NOTE: A message can be indexed even if the message body is not downloaded.

Huh. I went offline (disabled my Wi-Fi connection) and all of those messages appeared just fine. Also, clicking on the folder resulted in the note that indicated that there were no additional messages to download.

Be that as it may, the fact that this exchange goes back 14 years to 2009 is... interesting:

<altsuth> which I personally wouldn't bother with, but if you really want a message indexed
<altsuth> it only takes 1 to make a folder unhappy

<wsm0> altsuth: does sweep die on a per folder basis?
<altsuth> wsm0, yes

Seems this issue I've reported has been known for well over a decade. I wonder how many users think that Thunderbird's search results are reliable?

In any event, I'm glad we're re-surfacing it now. Hopefully this can be addressed in a way that makes Thunderbird robust and reliable as we all would expect?

Bug 1916063 may have fixed this.

See Also: → 1916063

(In reply to Magnus Melin [:mkmelin] from comment #13)

Bug 1916063 may have fixed this.

That's possible. My Mac running 138.0b3 seems to not have any hung "Indexing messages" items in Activity Manager.

My primary Mac running 137.0.2 still shows this bug, but I believe that's likely attributable to the fact that Bug 1916063 is only fixed in the 138 betas and perhaps the 128esr release, near as I can tell.

As soon as the release channel gets this fix (presumably when 138 is rolled out of beta), then I'll have a definitive way to confirm. But yeah, the fact that my 138.0b3 Mac isn't showing any signs of this is a good indication, indeed. Thanks for the heads-up!

Well, good news and bad news. The good news is that some of the indexing stalls have been resolved, in that I have watched 138.0 complete the indexing of previously-stalled mailboxes. But I'm also seeing it still stall on some. On one stalled mailbox, I was able to Repair Folder, quit-and-relaunch Thunderbird and it eventually indexed that folder, too.

And then it hit another one, and I've attached that here. I'm in the process of repairing this folder now, too, and hopefully that will resolve it.

So, despite the fact that the .ics issue has been fixed, the primary issue still remains: Thunderbird's indexer will stop when it hits a message it doesn't like, and won't proceed forward nor does it warn the user that their search results are incomplete (or worse, incorrect).

Seems like there's still some work to do.

I was seeing this problem also. Thunderbird was stuck indexing one folder (which it claimed had about 250 emails but it really had 20000).
I updated to v138 and rebuilt the mailbox and the problem was resolved. It indexed that mailbox and then several behind it that it never got to because it was stuck.

I also had indexing issues. Losing the ability to "open in conversation" was my clue that there was an issue.

This was still failing to index fully or report the issue/problematic messages with 140.0.1 (Linux / AMD_X64). There were two failure conditions: in one, attempting to find messages to index would just hang; in the other, indexing a folder would get through a number of messages and then hang.

Repairing the folders (e.g. right click folder, click repair button for each) did fix this for me as well.

It may be noteworthy that literally all my sent mail folders had an issue/needed repair. This was in addition to a random handful of other folders seeming to follow no particular pattern.

(In reply to Dave Hamilton [:davethenerd] from comment #15)

Well, good news and bad news. The good news is that some of the indexing stalls have been resolved, in that I have watched 138.0 complete the indexing of previously-stalled mailboxes. But I'm also seeing it still stall on some. On one stalled mailbox, I was able to Repair Folder, quit-and-relaunch Thunderbird and it eventually indexed that folder, too.

So it seems at least part of the problem has been fixed by bug 1916063. Repairing a folder with indexing issues appears to be a still needed but effective workaround for the rest. Closing as WFM.

Status: UNCONFIRMED → RESOLVED
Closed: 1 month ago
Resolution: --- → WORKSFORME

(In reply to Hartmut Welpmann [:welpy-cw] from comment #18)

Repairing a folder with indexing issues appears to be a still needed but effective workaround for the rest. Closing as WFM.

Indeed, there is (now) a way to make it work. But... I'd still lobby for a warning (a big, huge one IMHO) telling users that their search results will be incomplete (at best) until they reindex a specific mailbox. Because right now a user would need to either (a) check Activity Manager to see if there's an issue or (b) notice that certain things were missing from the index, neither of which is a great scenario. For example, right now in 142.0 I just happened to check and I see that my INBOX is stuck at 72 of 124 messages. Who knows how long it's been like that? (And I'm obviously aware that this bug exists).
