Open
Bug 745769
Opened 13 years ago
Updated 1 year ago
After quick search can't open search results / listed messages
Categories
(Thunderbird :: Search, defect)
Tracking
(Not tracked)
NEW
People
(Reporter: devotip, Unassigned)
Details
Attachments
(5 files, 1 obsolete file)
User Agent: Mozilla/5.0 (Windows NT 6.0; rv:11.0) Gecko/20100101 Firefox/11.0
Build ID: 20120312181643
Steps to reproduce:
Searched for a keyword so that the search results list appeared, then clicked on one of the listed messages.
Actual results:
The message appeared for a split second, then vanished and was replaced by the "Welcome to Thunderbird" page showing "tip of the week" and "test pilot survey launched on march 27th 2012".
Expected results:
The clicked message stays in view.
Comment 1•13 years ago
(In reply to Devoti Paolo from comment #0)
I don't have this problem. Can you check Tools ➯ Error Console just after you see the welcome page?
Reporter
Comment 2•13 years ago
Good hint. The last message in the Error Console is:
too much recursion
resource:///modules/gloda/mimemsg.js line:172
which points to:
function stripEncryptedParts(aPart) {
  if (aPart.parts && aPart.isEncrypted) {
    aPart.parts = []; // Show an empty container.
  } else if (aPart.parts) {
    aPart.parts = aPart.parts.map(stripEncryptedParts);
  }
  return aPart;
}
Reporter
Comment 3•13 years ago
Reporter
Comment 4•13 years ago
Reporter
Comment 5•13 years ago
Comment 6•13 years ago
This sounds like the quick filter bar is being used, a collapsed thread is being clicked on, and selectionsummaries.js is using MsgHdrToMimeMessage to stream some messages. One of the messages sounds like it has a messed-up MIME structure, either ridiculously deep, or structured in a way that tricks us into creating a loop in our data-structure.
Is this problem reproducible AKA do you know which messages this is happening for? I think we'd be interested in looking at the mime structure of the messages. Does it happen for all messages? If so, maybe you are using an extension that messes with libmime?
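The "loop in our data-structure" theory could be checked with something like the following minimal sketch. It walks a parsed part tree with an explicit stack and reports any part object that is reached twice. The `parts` field matches the snippet quoted in comment 2; the helper name `findRepeatedPart` is hypothetical and not part of gloda.

```javascript
// Hypothetical helper (not gloda code): returns the first part object that is
// reachable more than once from the root, or null if the structure is a tree.
function findRepeatedPart(root) {
  const seen = new Set();
  const pending = [root];
  while (pending.length > 0) {
    const part = pending.pop();
    if (seen.has(part)) {
      return part; // shared node or cycle: not a proper tree
    }
    seen.add(part);
    for (const child of part.parts || []) {
      pending.push(child);
    }
  }
  return null;
}
```

A non-null result would point to a cycle or a shared node; a null result would leave sheer nesting depth as the more likely explanation.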
Reporter
Comment 7•13 years ago
It happens for most searches but not all, and it happens even if the selected message is not part of a thread. A malformed message is possible, and probably common, given the variety of sources, spam included. Some folders hold emails dating back to 2000.
The messages that fail to open from the search results can be opened by going to them directly in their folder, so some bad state is picked up during the search process.
There is also an error about "aFolder", which is unexpected because I never set such a name.
How can I find out which message is being processed when the "too much recursion" error occurs? Is there an optional deep trace/warning logging feature?
Reporter
Comment 8•13 years ago
Still open with version 12.0.1.
Any hint on how to trace why the aFolder variable is an empty object?
Reporter
Updated•13 years ago
Summary: After search can't open listed messages in thunderbird 11.0.1 → After search can't open listed messages in thunderbird 11.0.1 and 12.0.1 too
Comment 9•13 years ago
Does it happen if you start Thunderbird in safe mode?
http://support.mozillamessaging.com/en-US/kb/safe-mode
Summary: After search can't open listed messages in thunderbird 11.0.1 and 12.0.1 too → After quick search can't open search results / listed messages
Reporter
Comment 10•13 years ago
Comment 11•13 years ago
Andrew, I think the code for the search results uses MsgHdrToMimeSnippetAndMeta for displaying the search results. If MsgHdrToMimeSnippetAndMeta uses MsgHdrToMimeMessage, then it could be what's making the entire tab fail.
A way to debug this would be to
- locate each one of these messages (that are listed in the search results) in their respective folders
- use my glodebug addon (see third paragraph at https://github.com/protz/GMail-Conversation-View/wiki/Debugging-Conversations) to inspect each one of these messages
- find the one that can't be successfully inspected
- attach it to the bug here.
Does that sound like a legit theory?
Reporter
Comment 12•13 years ago
The hint in the second paragraph of the linked page was too interesting to skip.
Deleting global-message-db.sqlite solved the issue. There should be a "repair all indexes" button somewhere.
Comment 13•13 years ago
It's possible deleting the database may only provide a temporary reprieve since gloda may merely not know about the broken message yet.
Can you see if the problem comes back once gloda finishes indexing? If you look in the activity manager ("tools" menu, "activity manager" entry), you can see when gloda has finished re-indexing. Then you can retry the search and see if it fails.
If it fails, then doing what protz proposed would be great.
Reporter
Comment 14•13 years ago
OK, if the issue surfaces again I will follow the given directions.
In the meantime, on opening Thunderbird I see that indexing looks stuck on one old folder, and the error console shows a "too much recursion" error without me doing anything.
I am starting to think that the issue is related to the thread retrieval process.
Any hint? I can't publish my whole folders on Mozilla for you to check. How can I arrange a bulk verification, an indexing log, or some way to know where the indexer stopped?
Reporter
Comment 15•13 years ago
One message (of many) for which glodebug reports "Gloda found 0 items".
The whole output is:
/---------------------------\
| Mime results |
\---------------------------/
Size of the message: 834
Structure of the message:
Message (834 bytes): Re: segnalazione informatica
1 Body: text/plain (834 bytes)
Number of attachments: 0
This message is from: poltel.mi@poliziadistato.it
/---------------------------\
| Gloda results |
\---------------------------/
Gloda found 0 items
Comment 16•13 years ago
You can have the gloda indexer log to the system console (not the error console) as it indexes, to see what it is up to:
https://wiki.mozilla.org/Thunderbird:Debugging_Gloda
You can also install the Glodaquilla extension which lets you add a "gloda id" column. In the event gloda decides that a message is bad because it causes internal failures, to stop itself from indexing that message it will set the value to something less than 32 decimal (hex 0x20, which is how the column is encoded). It sounds like gloda is not detecting the self-failure here, however, so the indexer should just go off the rails every time it looks at the message that is killing it.
I don't think the thread retrieval process is a likely cause; gloda does not construct thread graphs itself, so loops are not possible. The header storage (.msf files) does build thread graphs, but bugs related to cycles from messed up messages were fixed some time ago.
Reporter
Comment 17•13 years ago
With the information from the console I found one killer message. On it, the gloda debug button dies and the log shows the following:
2012-06-04 10:05:33 gloda.datastore DEBUG QUERY FROM QUERY: SELECT * FROM messages INNER JOIN messagesText ON messages.id = messagesText.rowid WHERE (id IN (SELECT id FROM messages WHERE (folderID IN (73)) AND (messageKey IN (12634612))) AND +deleted = 0 AND +folderID IS NOT NULL AND +messageKey IS NOT NULL) ARGS:
2012-06-04 10:05:33 gloda.indexer INFO Queue-ing job for indexing: message
The MsgHdrToMimeMessage callback threw an exception: InternalError: too much recursion
The specific message is a report from a bogus antispam system: it quarantined a mail and forwarded the quarantined email as an attachment. It then analyzed the forwarded message, quarantined that too, and the whole mess was repeated 15 times before the message was actually forwarded.
This means the message I received includes one attached .eml message, which in turn includes 210 nested replicas from the bogus quarantine process.
This kills gloda; an exception handler is missing.
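To spot messages like this one before they reach a recursive function, the nesting depth of a parsed part tree can be measured without recursion. The following is a minimal sketch, assuming the `{ parts: [...] }` shape seen in the mimemsg.js snippet from comment 2; `maxPartDepth` is a hypothetical name, not an existing gloda function.

```javascript
// Hypothetical helper: maximum nesting depth of a part tree, computed with an
// explicit stack so that measuring a pathological message cannot itself
// exhaust the call stack.
function maxPartDepth(root) {
  let deepest = 0;
  const pending = [{ part: root, depth: 0 }];
  while (pending.length > 0) {
    const { part, depth } = pending.pop();
    if (depth > deepest) {
      deepest = depth;
    }
    for (const child of part.parts || []) {
      pending.push({ part: child, depth: depth + 1 });
    }
  }
  return deepest;
}
```

A caller could compare the result against a threshold and skip or flag the message instead of handing it to the indexer.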
Comment 18•13 years ago
Wow, that makes sense, but it definitely was not my first guess. Thanks for the detective work!
Is it possible for you to attach or zip the message up and forward it to myself and protz (using the e-mails we use for bugzilla)? It's okay if it's sensitive and you cannot send it at all, but it's possible that part of the reason the stack gets exhausted so quickly is due to some quirk of the message which affects the rate of stack growth or also results in an actual infinite loop occurring.
Comment 19•13 years ago
Andrew, do you have any idea about the order of magnitude authorized for the recursion depth? It may be that the code is running under SpiderMonkey and is not being traced, which would explain the surprisingly low level of recursion allowed.
Paolo, congratulations for the detective work!
Comment 20•13 years ago
(In reply to Jonathan Protzenko [:protz] from comment #19)
> Andrew, do you have any idea about the order of magnitude authorized for the
> recursion depth? It may be that the code is running under SpiderMonkey and
> is not being traced, which would explain the surprisingly low level of
> recursion allowed.
http://mxr.mozilla.org/mozilla-central/source/js/xpconnect/src/XPCJSRuntime.cpp#2024
2024 JS_SetNativeStackQuota(mJSRuntime, 128 * sizeof(size_t) * 1024);
So it's blowing through 512k on x86 or 1MiB on x86_64. (The related pointer arithmetic does not seem to result in a multiplicative factor, but I could be wrong.) Unless it's not the XPCJSRuntime, but that seems like the right runtime.
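The arithmetic behind those two figures can be checked with a short sketch. It assumes `sizeof(size_t)` is 4 bytes on 32-bit x86 and 8 bytes on x86_64; `nativeStackQuotaBytes` is a hypothetical name used only for illustration.

```javascript
// Evaluates the quoted expression 128 * sizeof(size_t) * 1024 for a given
// size_t width in bytes.
function nativeStackQuotaBytes(sizeofSizeT) {
  return 128 * sizeofSizeT * 1024;
}

const KIB = 1024;
console.log(nativeStackQuotaBytes(4) / KIB); // 512  (KiB on 32-bit x86)
console.log(nativeStackQuotaBytes(8) / KIB); // 1024 (KiB, i.e. 1 MiB, on x86_64)
```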
And yes, I could see the interpreter burning through that stack pretty easily, although in theory the Method JIT should be active.
Any thoughts on whether we should keep our code recursive and just declare that at some level of depth we refuse to process it, or rejigger things to use iterative logic? We could probably also be more diligent about catching the exceptions...
Status: UNCONFIRMED → NEW
Ever confirmed: true
Comment 21•13 years ago
Er, and it may be worth noting that the stack limits are relative to the base of the native stack, as far as I can see:
http://mxr.mozilla.org/mozilla-central/source/js/src/jsapi.cpp#3035
So it could also be the situation where the native C++ stack is already burning through a lot of the stack and leaving us with precious little headroom. In fact, if something jerky is spinning a nested event loop, we could really be in trouble. Hopefully I'm missing something about the stack check or our accounting...
Comment 22•13 years ago
So from a quick glance at the code, I would say the stripEncryptedParts function is the only recursive part that looks suspicious (I wrote that function). I could be wrong, of course, because it's late here. So assuming that function is recursive, we could:
i) make it tail-recursive, but alas, none of the js engines have TCO,
ii) maintain a stack by hand but that would burn memory as well and potentially fragment the JS stack,
iii) emulate a continuation-style function using a custom stack and a zipper-like algorithm (yay for readability).
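Option ii) might look like the sketch below. It is not the code in mimemsg.js; it is an assumed rewrite that keeps the in-place behavior of the original function but replaces the call stack with an array. The name `stripEncryptedPartsIterative` is hypothetical.

```javascript
// Sketch of option ii): same effect as the recursive stripEncryptedParts
// (encrypted containers are emptied in place), but pending parts are kept in
// an array instead of on the JS call stack.
function stripEncryptedPartsIterative(aRoot) {
  const pending = [aRoot];
  while (pending.length > 0) {
    const part = pending.pop();
    if (!part.parts) {
      continue; // leaf part, nothing to strip
    }
    if (part.isEncrypted) {
      part.parts = []; // Show an empty container.
    } else {
      for (const child of part.parts) {
        pending.push(child);
      }
    }
  }
  return aRoot;
}
```

The array grows with the number of parts waiting to be visited, so memory use is still proportional to the message, but nesting depth no longer counts against the native stack quota. A structure containing a real cycle would still loop forever here, so this only addresses the deep-nesting case.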
I would say the sane solution here is to cut off anything below an arbitrary depth, say, 100. I'm saying "cut" because that function is supposed to strip off any encrypted parts, so we have to be conservative. The rationale being that any message with nesting depth > 100 is insanely formed, and we don't want to be indexing such things...
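A minimal sketch of that conservative cutoff follows, assuming the same part shape as the function quoted in comment 2. The constant `MAX_MIME_DEPTH`, the extra `aDepth` parameter, and the function name are assumptions for illustration; the value 100 is the arbitrary figure suggested above.

```javascript
// Arbitrary cutoff, as suggested above; not a value taken from gloda.
const MAX_MIME_DEPTH = 100;

// Like stripEncryptedParts, but any container at or beyond the cutoff depth
// is emptied as if it were encrypted, which bounds the recursion depth.
function stripEncryptedPartsWithLimit(aPart, aDepth = 0) {
  if (aPart.parts && (aPart.isEncrypted || aDepth >= MAX_MIME_DEPTH)) {
    aPart.parts = []; // Show an empty container.
  } else if (aPart.parts) {
    aPart.parts = aPart.parts.map(child =>
      stripEncryptedPartsWithLimit(child, aDepth + 1)
    );
  }
  return aPart;
}
```

Treating over-deep containers the same way as encrypted ones keeps the function conservative: nothing below the cutoff can leak through unstripped.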
How does that sound? I can also try to implement iii) because that sounds fun, so don't hesitate to tell me if the conservative solution seems too harsh :).
Cheers,
jonathan
Reporter
Comment 23•13 years ago
I would vote for a conservative, low-risk solution.
Everything is fine unless there is too much trash, and in that case it is OK to just toss the trash.
Reporter
Comment 24•12 years ago
It would be nice to limit the exception handling scope to the specific email allowing the search to be usable, or to have a tool to find all the unmanageable emails.
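Scoping the exception handling to a single message could look roughly like this sketch. It is generic JavaScript, not gloda's actual control flow; `processEach` and the `processOne` callback are hypothetical names standing in for whatever per-message work is being done.

```javascript
// Hypothetical driver: runs processOne on every message, isolating failures
// so that one unmanageable message does not abort the whole batch.
function processEach(messages, processOne) {
  const results = [];
  const failures = [];
  for (const message of messages) {
    try {
      results.push(processOne(message));
    } catch (e) {
      // Record which message failed and why, then keep going.
      failures.push({ message, error: e });
    }
  }
  return { results, failures };
}
```

The `failures` list doubles as the requested tool for finding the unmanageable emails, since it names each message that threw along with the error.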
This issue is becoming a very bad and common annoyance. I keep finding people saying "I use Thunderbird but recently the search is too bad".
Reporter
Updated•12 years ago
Severity: normal → major
OS: Windows Vista → Windows 7
Version: 11 → 17
Updated•1 year ago
Attachment #9386419 - Attachment is obsolete: true