Gloda warning repeated every half-second causing high cpu and slow response. "Observed header that claims to be gloda indexed but that gloda has never heard of during compaction" in error console and "Determining which messages to index""

NEW
Unassigned

Status

defect
2 years ago
5 days ago

People

(Reporter: mgoldey, Unassigned)

Tracking

({perf, regression, regressionwindow-wanted})

Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [regression:TB5x?], )

Attachments

(1 attachment)

User Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:56.0) Gecko/20100101 Firefox/56.0
Build ID: 20170926190823

Steps to reproduce:

Open Thunderbird.  Delete a message.  

Some metrics:  I have a significant amount of e-mail stored in TB using several accounts.  Probably 4GB or more spread over a few dozen folders.  So, lots of complexity.  


Actual results:

Thunderbird is consuming 12-18% of CPU at all times and is laggy and difficult to use.  The reason seems to be reflected in the error console.  The console repeats the following message once or twice a second, continually, without stop, 24/7, so long as the program is running.

2017-10-06 17:01:15	gloda.index_msg	WARN	Observed header that claims to be gloda indexed but that gloda has never heard of during compaction. In folder: mailbox://[[username]]@pop.gmail.com/Inbox sketchy key: 119174057 subject: your  server rejected

This message is followed by a number in a red circle ranging between 200 and about 1200.  Other than the number in the circle, the message never changes.  

In this case, there is no inbox message with a subject containing "your server rejected" although I'd sure I sent such messages in the past.  

In other instances, the specific e-mail item reflected in the e-mail changes, but the structure of the message is the same.  In some cases, there is a message or series of messages with the subject shown in the console, but I cannot tell if the key matches an existing message or one that has been deleted or moved to a folder.  

Compacting the mailbox does not change the behavior of TB.   Compacting, and then restarting TB seems to stop the behavior until messages are again deleted.  Then, the CPU use and console messages return, although not instantly.  My instinct is that the messages start after global indexing kicks off.

Closing the main TB window but leaving the console open does not stop the warning messages from appearing, or reduce CPU use.

Clicking "Repair Folder" for the account and folder in question seems to stop the high CPU use as well.  


Expected results:

Although there may be an underlying bug (or file corruption?) relating to global indexing, this bug report is about the high CPU use and ceaseless error logging.  Whatever the underlying problem is, the fact that TB seems to constantly repeat whatever step generates the warning, instead of moving on or alerting the user or something, is itself a problem.
Inbox sketchy key is also mentioned in Bug 520115 and Bug 1278977
Component: Untriaged → Backend
Product: Thunderbird → MailNews Core
Version: 56 Branch → 56
Additional behavior observed since initial report:

I opened TB, entered passwords for my various accounts, deleted the most recent item in my inbox of my primary account, and the compacted the inbox.  The non-stop error reporting immediately started in the browser console.

The WARN message began with sketchy key: 2771 but without the number in the red oval, and then continued sequentially until sketchy key 3429 which is where it is now stuck, repeating the same message a few times a second, with varying numbers in a red oval.  

As far as I can see, every key is shown through 3427.  There was no error message involved key 3428, however.  

Restarting TB stops the error messages.

Then, starting to use the program, none of these actions cause the gloda errors to resume:   receiving mail; sending mail; deleting mail; moving messages to another folder.

But, compact the inbox, and the moment that the green progress bar vanishes, the errors kick off.

Any ideas about what's causing the underlying key issues?  I am concerned for the safety of my mail folders.  But, again, the error is that the WARN message keeps repeating incessantly, consuming 10-15% of my CPU.
Another data point:  if I delete global-messages-db.sqlite and let TB regenerate it, and then do a few message delete/moves and a compaction, there is no error.  At some point, however that changes, and  and from that point onward, compaction results in the endless error messages and CPU use described above.  

Maybe I have confused cause and effect, but it seems like the underlying problem is that the sqlite database gets messed up in a way that TB cannot handle gracefully after a compaction.  It's not the compaction process itself that screws up the global-messages database, because it is possible to compact without a problem if that database is newly generated.  But once the database is wonky, it cannot be compacted without kicking off the errors once compaction completes.
Another data point, please see discussion of bug 947771:

I have 25 or 30 subfolders of my inbox.  When I drag/drop a message from the inbox to a folder, I get the following error. 

TypeError:  this.view.dbView is null.   The offending code is SearchDialog.js which is the middle of these three lines:


  updateStatusResultText: function() {
    let rowCount = this.view.dbView.rowCount;
    let statusMsg;

I surmise that somewhere along the line of moving messages around my inbox folders, the sqlite database isn't being updated properly, which is causing both errors.
See Also: → 1362483, 1348662, 632791, 520115
See Also: → 947771
I migrated my Thunderbird profile to a brand new Windows 10 computer.  Same behavior as before, although the number of repeats on each gloda error message sometimes exceeds 3000!

I recently regenerated global-messages-db.sqlite and on my next compaction I had the following two errors (but no "sketchy key" errors):

2018-04-18 20:37:14	gloda.index_msg	ERROR	Exception while attempting to mark message with gloda state afterdb commit [Exception... "Component returned failure code: 0x80070057 (NS_ERROR_ILLEGAL_VALUE) [nsIMsgDBHdr.getUint32Property]"  nsresult: "0x80070057 (NS_ERROR_ILLEGAL_VALUE)"  location: "JS frame :: resource:///modules/gloda/index_msg.js :: PendingCommitTracker_commitCallback :: line 156"  data: no]

2018-04-18 20:37:14	gloda.index_msg	ERROR	Exception while attempting to mark message with gloda state afterdb commit [Exception... "Component returned failure code: 0x80070057 (NS_ERROR_ILLEGAL_VALUE) [nsIMsgDBHdr.getUint32Property]"  nsresult: "0x80070057 (NS_ERROR_ILLEGAL_VALUE)"  location: "JS frame :: resource:///modules/gloda/index_msg.js :: PendingCommitTracker_commitCallback :: line 156"  data: no]

Don't know if they are related to this bug.
(In reply to mgoldey from comment #5)
> I migrated my Thunderbird profile to a brand new Windows 10 computer.  Same
> behavior as before, although the number of repeats on each gloda error
> message sometimes exceeds 3000!

With version 60?
Flags: needinfo?(mgoldey)
60.0b5 to be precise.

Just picked up 60.0b6.  Compacted without any problems.  This time . . . .
Flags: needinfo?(mgoldey)
Example of this bug in action.  These errors repeat so quickly that the entry at the bottom of the screen is pushed off the top of the Error Console in less than 1 second.
Even after restarting TB, these messages continue, 8,000 to 10,000 per second, consuming about 12% of a Coffee Lake i5-8600 processor.  Not good.

The TB status window is stuck on "Deciding which messages to index in folder Flowers" (that is the the folder containing the message that the gloda has never heard off 10,000,000 times an hour).  That's new information.  It seems like the compact routine gets stuck when it comes across a message that isn't in the gloda, and does GOTO 10 instead of gracefully handling it.
If you can find a regression range for when this starts, it would be a fantastic help. See bug 1362483 comment 27.
Summary: Indexing / Compaction Gloda warning repeated every half-second → Indexing / Compaction Gloda warning repeated every half-second causing high cpu. Seeing "Observed header that claims to be gloda indexed but that gloda has never heard of during compaction"
Also, do you have saved search folders?
No saved search folder.  

I will try the regression as soon as time permits.  Compacting my gmail folder causes the errors, so I will run the regression tool against that.  I have made a copy of my profile to run the tool against.  I will start w/ v. 56 and work backwards.  

A data point:  the Activity Manager displayed "deciding which messages to index" at the time the error messages kicked off in the console.
Flags: needinfo?(mgoldey)
Summary: Indexing / Compaction Gloda warning repeated every half-second causing high cpu. Seeing "Observed header that claims to be gloda indexed but that gloda has never heard of during compaction" → Indexing / Compaction Gloda warning repeated every half-second causing high cpu. Seeing "Observed header that claims to be gloda indexed but that gloda has never heard of during compaction" in error console and "Determining which messages to index""
Whiteboard: [regression:TB5x?]
See Also: → 1438317
Summary: Indexing / Compaction Gloda warning repeated every half-second causing high cpu. Seeing "Observed header that claims to be gloda indexed but that gloda has never heard of during compaction" in error console and "Determining which messages to index"" → Gloda warning repeated every half-second causing high cpu and slow response. "Observed header that claims to be gloda indexed but that gloda has never heard of during compaction" in error console and "Determining which messages to index""
I have not been able to find a regression range but will keep trying.

Yesterday, I deleted global-messages-db.sqlite because the previous 3 or 4 compactions had triggered this bug.  

Today, I compacted again against a newly-build sqlite database, and received the following messages in the Browser Console:  I don't know if they are related but here they are:

2018-08-19 11:18:59	gloda.index_msg	ERROR	Exception while attempting to mark message with gloda state afterdb commit [Exception... "Component returned failure code: 0x80070057 (NS_ERROR_ILLEGAL_VALUE) [nsIMsgDBHdr.getUint32Property]"  nsresult: "0x80070057 (NS_ERROR_ILLEGAL_VALUE)"  location: "JS frame :: resource:///modules/gloda/index_msg.js :: PendingCommitTracker_commitCallback :: line 156"  data: no]
  log4moz.js:680
2018-08-19 11:18:59	gloda.index_msg	ERROR	Exception while attempting to mark message with gloda state afterdb commit [Exception... "Component returned failure code: 0x80070057 (NS_ERROR_ILLEGAL_VALUE) [nsIMsgDBHdr.getUint32Property]"  nsresult: "0x80070057 (NS_ERROR_ILLEGAL_VALUE)"  location: "JS frame :: resource:///modules/gloda/index_msg.js :: PendingCommitTracker_commitCallback :: line 156"  data: no]

2018-08-19 11:18:59	gloda.index_msg	ERROR	Exception while attempting to mark message with gloda state afterdb commit [Exception... "Component returned failure code: 0x80070057 (NS_ERROR_ILLEGAL_VALUE) [nsIMsgDBHdr.getUint32Property]"  nsresult: "0x80070057 (NS_ERROR_ILLEGAL_VALUE)"  location: "JS frame :: resource:///modules/gloda/index_msg.js :: PendingCommitTracker_commitCallback :: line 156"  data: no]
  log4moz.js:680
2018-08-19 11:18:59	gloda.index_msg	ERROR	Exception while attempting to mark message with gloda state afterdb commit [Exception... "Component returned failure code: 0x80070057 (NS_ERROR_ILLEGAL_VALUE) [nsIMsgDBHdr.getUint32Property]"  nsresult: "0x80070057 (NS_ERROR_ILLEGAL_VALUE)"  location: "JS frame :: resource:///modules/gloda/index_msg.js :: PendingCommitTracker_commitCallback :: line 156"  data: no]

2018-08-19 11:18:59	gloda.index_msg	ERROR	Exception while attempting to mark message with gloda state afterdb commit [Exception... "Component returned failure code: 0x80070057 (NS_ERROR_ILLEGAL_VALUE) [nsIMsgDBHdr.getUint32Property]"  nsresult: "0x80070057 (NS_ERROR_ILLEGAL_VALUE)"  location: "JS frame :: resource:///modules/gloda/index_msg.js :: PendingCommitTracker_commitCallback :: line 156"  data: no]
  log4moz.js:680
2018-08-19 11:18:59	gloda.index_msg	ERROR	Exception while attempting to mark message with gloda state afterdb commit [Exception... "Component returned failure code: 0x80070057 (NS_ERROR_ILLEGAL_VALUE) [nsIMsgDBHdr.getUint32Property]"  nsresult: "0x80070057 (NS_ERROR_ILLEGAL_VALUE)"  location: "JS frame :: resource:///modules/gloda/index_msg.js :: PendingCommitTracker_commitCallback :: line 156"  data: no]
Thank you for the reminder.   

It does seem like 54.0a1 does NOT exhibit the compaction bug.  Impossible to prove a negative, of course.  But I deleted and moved (from folder to folder) lots of messages, and mixed it up pretty well.  I did not see this compaction bug, even once.
Flags: needinfo?(mgoldey)
(In reply to mgoldey from comment #16)
> Thank you for the reminder.   
> 
> It does seem like 54.0a1 does NOT exhibit the compaction bug. 

Still true?
Flags: needinfo?(mgoldey)
Still true.  Not that I've been using 54.0a1 for the past three weeks, but I fired it up again today and deleted and moved a lot of e-mail.  No messages in the error console related to this bug.
Flags: needinfo?(mgoldey)

I was able to do a solid test today and can confirm that this behavior is not present in TB 54.0a1.

I ran into the compaction bug today using TB 65.0a1. I immediately made three copies of my profile folder, and then ran each of 54.0a1, 65.0a1 and 66.0a1 against one of the folders dedicated to each build. Each time, within TB, I moved one and the same message from my Inbox to a subfolder, and then right clicked on the Inbox > Compact. Here were the results:

54.0a1 -- no error reported in the error console
65.0a1 -- error re: gloda as noted in bug report
66.0a1 -- error re: gloda as noted in bug report

I consider this a firm answer to the question posed: this behavior does not occur in 54.0a1.

hello there, i recently noticed that thunderbird ran into high cpu usage (like 30% of my cpu computer) and draning my battery. It seems that i have the exact same problem, activity window show indefinitly "Determining which messages to index" and the error log console is looping on "gloda.index_msg WARN Observed header that claims to be gloda indexed but that gloda has never heard of during compaction. In folder: mailbox://nobody@Local%20Folders/Inbox sketchy key: 84 subject: subjectmessage" where "subjectmessage" is the subject of an email i received this summer.

When i'm searching for this subjet on the global search bar i do have one result but when i clic on the result to open it the newly opened tab is empty (i mean no message in the list as it should be).

i dont understand why "nobody@" but anyway i repared the inbox folder whitout success, error still occured.

i just tried to deleted the global-messages-db.sqlite (~228mo) file as suggested on https://support.mozilla.org/en-US/kb/rebuilding-global-database and for now thunderbird is still reindexing all the messages.

is still reindexing all the messages.

and here we are again, after it worked fine for like an our thunderbird ran again into high cpu usage and the following message "Observed header that claims to be gloda indexed but that gloda has never heard of during compaction." looping into the error console.

Guillaume:

You may be seeing this over and over if Thunderbird is set up to compact folders without asking you for permission. Every time it compacts folders, this gloda problem will (usually) start again.

You can control this somewhat by telling Thunderbird not to compact automatically, so it has to ask you first. Also, tell it to compact only when it will save 50MB or 100 MB or whatever you like, so it will ask less often.

Hope that helps.

Duplicate of this bug: 1517647

Benjamin, if bug 1362483 does not make immediate progress, could you work with reporter(s) to find a regression range for this and bug 1362483? (fair chance they are the same issue - reference comment 19)

Status: UNCONFIRMED → NEW
Depends on: 1362483
Ever confirmed: true
Flags: needinfo?(benjamin)
See Also: 1362483

They do seem the same. I'll see what I can do to help.

Flags: needinfo?(benjamin)

Running a patched version of TB 60.8.1 that incorporates a patch fixing the endless repetition of gloda error messages as described in Bug 1362483.

On compaction, the error console reports the following:

uncaught exception: 2147746065 autosync.js:206:13

2019-07-16 14:05:03 gloda.index_msg WARN Observed header that claims to be gloda indexed but that gloda has never heard of during compaction. In folder: mailbox://[user]@mail.[domain].[tld]/Inbox/purchases sketchy key: 49818944 subject: Your AmazonSmile order of Chocolate Kennedy Half... and 2 more items.

... and about 30 similar messages. But each message is reported once, with a unique "sketchy key." But the endless repetition does not occur.

Going back my original bug report, the primary issue appears to have been resolved. (Huzzah!) The underlying gloda problems remain.

Per above, the endless gloda message loop reported here is a dup of 1362483, now resolved. However, also as stated, the underlying gloda issue (indexing failure) is not resolved. Need to either (1) restate this bug to address the underlying gloda issue, or (2) mark this issue as a dup of 1362483 (resolved) and create a new bug report focused on the underlying gloda issue. Not my call, but the underlying bug needs to be in the list.

You need to log in before you can comment on or make changes to this bug.