gloda does not flag non-offline messages as having attachments [was: faceted search ignores attachments flag]

NEW
Unassigned

Status

--
enhancement
9 years ago
3 years ago

People

(Reporter: peterp, Unassigned)

Tracking

Trunk
x86
Mac OS X

Firefox Tracking Flags

(Not tracked)

Details

(Reporter)

Description

9 years ago
User-Agent:       Opera/9.80 (Macintosh; Intel Mac OS X; U; en) Presto/2.2.15 Version/10.10
Build Identifier: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.1.5) Gecko/20091121 Thunderbird/3.0

Fresh profile, a couple of IMAP folders, approx. 5 GB of email, Inbox with 18,000 messages. After indexing the folder, a faceted search for mail from person X with attachments doesn't work: the results are filtered to person X, but Attachments shows (0), even though I can see emails with attachments in the default view.

Reproducible: Always

Steps to Reproduce:
1. choose Search all messages
2. enter person's name, who you know has sent you some attachment(s)
3. in new tab, 'Filters' misses Attachments entirely
4. Filter From Me or To Me are both (0) <-- problem #1
5. click your name in People (must involve...)
6. Attachments Filter appears, but stays (0)
Actual Results:  
Attachments filter could not be selected.

Expected Results:  
Attachments filter to contain correct number of emails found.

Could this be related to the fact I have sqlite3 installed on my system as a part of macports.org?
Are you sure all your emails are indexed?

When you filter by attachment, does this generate anything in Tools -> Error Console?
(Reporter)

Comment 2

9 years ago
All the emails in the given folder are indexed. Other folders are still being indexed.

Just doing faceted search gives me the following repeated 8 times:

Warning: Cannot specify value for internal property.  Error in parsing value for '-x-system-font'.  Declaration dropped.
Source File: chrome://messenger/content/glodaFacetView.xhtml
Line: 0

As for filtering for attachment, that is not doable - Attachments (0) is grayed out.
(Reporter)

Comment 3

9 years ago
The bug should probably be something else, since the problem also appears in step 4 of the steps to reproduce.
(Reporter)

Comment 4

9 years ago
With all the messages indexed, the problem still persists. How can I assist you with debugging this problem?
If you install glodaquilla (https://addons.mozilla.org/en-US/thunderbird/addon/9873), we can make sure that your messages are indexed. You can also dump gloda by following https://wiki.mozilla.org/Thunderbird:Using_Gloda.

With this we'll be sure that your emails are indexed.
(Reporter)

Comment 6

9 years ago
Will do. One thing though: I cannot see browser.dom.window.dump.enabled in the config editor. Has it been replaced? Removed?
You need to add it :-)
(Reporter)

Comment 8

9 years ago
Getting somewhere :)
I reverted to my old profile, installed glodaquilla. I will provide details on
Inbox only, let me know if that's enough or if I should work with other folders
as well (got about 40 of them).

OnDisk is off for all the messages (stored on IMAP server).
All old messages (old = received more than 4 hours ago) have their GlodaID,
however some messages have the same GlodaID.
Next, the old messages (same definition as above) have 0 in Gloda Dirty. Some,
received after 09:50 today, have this field empty and newly received messages
have Gloda Dirty flag 1.
Still, if I see the message has valid GlodaID and I try to search specifically
for it, I can not reach it. For example: I search for NAME, I get new tab with
search results, but filter only contains From Me and To Me. Only when I click
either of those, rest of the filters appear (Starred, Attachments). Both
Starred and Attachments have (0) near them, even though I know that NAME sent me
some emails with attachment(s).
What else can I do? I admit I have not tried the dump.enabled yet (scared of the load it will generate). Would that move us forward?
Peter, sorry for the delay :-(

Today, what would move us forward is to try 3.1b2pre, as we have fixed plenty of issues in gloda; it might work now. That build will force a reindex and blow away your gloda db, so if you do not intend to keep using it, back up your current db first.

If it doesn't, and you still see the issue and identical gloda IDs, then we will need to use dump.
(Reporter)

Comment 10

9 years ago
Ludovic,
thank you for the response. I have tried using 3.1b2pre (http://ftp.mozilla.org/pub/mozilla.org/thunderbird/nightly/latest-comm-1.9.2/, thunderbird-3.1b2pre.en-US.mac.dmg    15-Apr-2010 04:58     26M).

After reindexing all the folders I confirm that the issue is not fixed
and stays the same as described originally :-(
As for the messages with identical glodaID, I tried saving them to a file,
deleting them, and re-adding them, and they got the same glodaID.

So: what exactly would you need me to do to move forward?
Thank you,
Peter
Andrew, how do we go about getting more information on this?
No information required.  Without the mime body available, gloda does not archive any data about the presence of attachments.

This is a potentially annoying problem because my impression of the 'has attachment' bit on the message header is that it might only get set when the message gets displayed.  It's possible the code that led me to believe that pre-dates the spam filter; maybe the spam filter's streaming of the message gets the bit set?

In any event, that's generally the problem and it's certainly possible to reflect that bit into gloda somehow, but the semantics potentially get a bit ugly.
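
To make the "semantics get a bit ugly" point concrete, here is a minimal sketch of what reflecting the header bit into the index could look like. It is not gloda's indexing code; `AttachmentState` and `attachment_state` are hypothetical names, and the sketch assumes the indexer either has the streamed MIME body (and so a real list of attachment names) or only the header flag.

```python
from enum import Enum
from typing import List, Optional


class AttachmentState(Enum):
    """Hypothetical states an index entry could record for attachments."""
    NONE = "none"              # nothing indicates an attachment
    CONFIRMED = "confirmed"    # MIME body was streamed and attachments were found
    UNVERIFIED = "unverified"  # no body available; only the header flag says "maybe"


def attachment_state(attachment_names: Optional[List[str]],
                     header_flag: bool) -> AttachmentState:
    """Pick a state from the MIME body if present, else from the header flag.

    attachment_names is None when the body is not available (for example,
    an IMAP message that is not stored offline).
    """
    if attachment_names is not None:
        # Body was available: trust what was actually found in it.
        return AttachmentState.CONFIRMED if attachment_names else AttachmentState.NONE
    # No body: the header flag is only a guess, so mark it as such.
    # Note that NONE here is also a guess, since nothing was verified.
    return AttachmentState.UNVERIFIED if header_flag else AttachmentState.NONE


print(attachment_state(["report.pdf"], header_flag=True))  # AttachmentState.CONFIRMED
print(attachment_state(None, header_flag=True))            # AttachmentState.UNVERIFIED
```

The awkward part is then deciding whether an Attachments facet should count the unverified entries, and what to do when a later display of the message changes the header flag.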
Severity: major → enhancement
Status: UNCONFIRMED → NEW
Component: Search → Database
Ever confirmed: true
Product: Thunderbird → MailNews Core
QA Contact: search → database
Summary: faceted search ignores attachments flag → gloda does not flag non-offline messages as having attachments [was: faceted search ignores attachments flag]
Version: unspecified → Trunk

Comment 13

9 years ago
(In reply to comment #12)
 
> This is a potentially annoying problem because my impression of the 'has
> attachment' bit on the message header is that it might only get set when the
> message gets displayed.

It's much more likely to get set when headers are downloaded, and cleared when the message is displayed, because we treat multipart (non-alternative, iirc) messages as likely having attachments, but sometimes discover they don't once we've streamed them.
(In reply to comment #13)
> It's much more likely to get set when headers are downloaded, and cleared when
> the message is displayed, because we treat multipart (non-alternative, iirc)
> messages as likely having attachments, but sometimes discover they don't once
> we've streamed them.

That definitely makes more sense.  Can you elaborate on what the IMAP server is telling us?  Are we finding out that the mime type of the root message is multipart/* and setting the flag and then when we see that there aren't really any attachments we clear it?

Comment 15

9 years ago
(In reply to comment #14)

> That definitely makes more sense.  Can you elaborate on what the IMAP server is
> telling us?  Are we finding out that the mime type of the root message is
> multipart/* and setting the flag and then when we see that there aren't really
> any attachments we clear it?

For any message (POP3 or IMAP), we look at the content type when parsing the header, and if it's multipart/mixed, we set the has-attachment flag. When displaying the message, we update the flag based on whether we were told of any attachments (and we ignore certain ones, like v-cards).

http://mxr.mozilla.org/comm-central/source/mailnews/local/src/nsParseMailbox.cpp#1658
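
The two-stage behaviour described above can be modelled roughly as follows. This is a minimal sketch in Python, not the code in nsParseMailbox.cpp: `MessageHeader`, `flag_on_header_parse`, `flag_on_display`, and the exact set of ignored v-card content types are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable

# Assumption: the content types treated as "v-cards" and therefore ignored.
IGNORED_ATTACHMENT_TYPES = {"text/x-vcard", "text/vcard"}


@dataclass
class MessageHeader:
    """Hypothetical stand-in for a message header database entry."""
    content_type: str
    has_attachment: bool = False


def flag_on_header_parse(header: MessageHeader) -> None:
    """Header-time guess: multipart/mixed is treated as 'likely has attachments'."""
    main_type = header.content_type.split(";", 1)[0].strip().lower()
    header.has_attachment = main_type == "multipart/mixed"


def flag_on_display(header: MessageHeader, attachment_types: Iterable[str]) -> None:
    """Display-time correction: keep the flag only if a non-ignored attachment was seen."""
    real = [t for t in attachment_types if t.lower() not in IGNORED_ATTACHMENT_TYPES]
    header.has_attachment = bool(real)


hdr = MessageHeader('multipart/mixed; boundary="abc"')
flag_on_header_parse(hdr)
print(hdr.has_attachment)  # True: guessed from the content type alone

flag_on_display(hdr, ["text/x-vcard"])
print(hdr.has_attachment)  # False: only a v-card was found once the message was streamed
```

The example shows where a false positive would come from: until the message has been displayed, the flag reflects only the header-time guess.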
Thanks for the reference.  Whoever takes this on will find the information very useful.

At the current time I am not going to be pursuing a fix for this based on the high potential for false positives and the complexity of solutions even if false positives weren't an issue.