Last Comment Bug 732372 - Search fails - a.contact is undefined in gloda.js
: Search fails - a.contact is undefined in gloda.js
Status: RESOLVED FIXED
[datalossy]
:
Product: Thunderbird
Classification: Client Software
Component: Search (show other bugs)
: Trunk
: All All
: -- major (vote)
: Thunderbird 13.0
Assigned To: Andrew Sutherland [:asuth]
:
:
Mentors:
Depends on:
Blocks: 734507
  Show dependency treegraph
 
Reported: 2012-03-02 04:11 PST by Gareth Hughes
Modified: 2012-04-04 03:09 PDT (History)
12 users (show)
standard8: in‑testsuite+
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
+
fixed
+
fixed


Attachments
A fix for this. (897 bytes, patch)
2012-03-07 10:33 PST, Blake Winton (:bwinton) (:☕️)
bugmail: review-
Details | Diff | Splinter Review
layer 1 - logging fixes (23.41 KB, patch)
2012-03-07 19:30 PST, Andrew Sutherland [:asuth]
mozilla: review+
standard8: approval‑comm‑aurora+
standard8: approval‑comm‑beta+
Details | Diff | Splinter Review
layer 2: tests that fail (7.41 KB, patch)
2012-03-07 19:32 PST, Andrew Sutherland [:asuth]
mozilla: review+
standard8: approval‑comm‑aurora+
standard8: approval‑comm‑beta+
Details | Diff | Splinter Review
layer 3 - don't pass the card as a raw rep; wrap it in an object (3.24 KB, patch)
2012-03-07 19:36 PST, Andrew Sutherland [:asuth]
mozilla: review+
standard8: approval‑comm‑aurora+
standard8: approval‑comm‑beta+
Details | Diff | Splinter Review

Description Gareth Hughes 2012-03-02 04:11:12 PST
User Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/535.11 (KHTML, like Gecko) Chrome/17.0.963.56 Safari/535.11

Steps to reproduce:

searched for some words in the standard search bar


Actual results:

Goes to the search page and the Searching animation just sits there.
Went to Error Console and got

Timestamp: 02/03/12 12:09:07
Error: a.contact is undefined
Source File: resource:///modules/gloda/gloda.js
Line: 1345

Timestamp: 02/03/12 12:09:03
Error: uncaught exception: [Exception... "Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIObserverService.removeObserver]"  nsresult: "0x80004005 (NS_ERROR_FAILURE)"  location: "JS frame :: chrome://messenger/content/search.xml ::  :: line 156"  data: no]
Source File: resource:///modules/gloda/gloda.js
Line: 1345



Ubuntu 10.04 64 using Thunderbird Daily

  Application Basics

    Name: Thunderbird-Trunk
    Version: 13.0a1
    User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:13.0) Gecko/20120229 Thunderbird/13.0a1
    Profile Directory: Open Directory

              (Local drive)
    Application Build ID: 20120229053756
    Enabled Plugins: about:plugins
    Build Configuration: about:buildconfig
    Crash Reports: about:crashes
    Memory Use: about:memory

  Mail and News Accounts
    account1:
      INCOMING: account1, , (imap) imap.gmail.com:993, SSL, passwordCleartext
      OUTGOING: smtp.gmail.com:465, SSL, passwordCleartext, true

    account2:
      INCOMING: account2, , (none) Local Folders, plain, passwordCleartext

  Extensions
    British English Dictionary, 1.19.1, true, en-GB@dictionaries.addons.mozilla.org
    Lightning, 1.5a1, true, {e2fda1a4-762b-4020-b5ad-a41df1933103}
    Lightning Nightly Updater (Unofficial), 0.11.110714, true, lightning-nightly-updater@olive
    Provider for Google Calendar, 0.13pre, true, {a62ef8ec-5fdc-40c2-873c-223b8a6925cc}

  Important Modified Preferences

    Name: Value

      browser.cache.disk.capacity: 1048576
      browser.cache.disk.smart_size_cached_value: 1048576
      browser.cache.disk.smart_size.first_run: false
      extensions.lastAppVersion: 13.0a1
      font.size.variable.x-western: 15
      mail.openMessageBehavior.version: 1
      network.cookie.prefsMigrated: true

  Graphics

    Adapter Description: NVIDIA Corporation -- Quadro NVS 285/PCI/SSE2
    Vendor ID: tion
    Device ID: SSE2
    Driver Version: 2.1.2 NVIDIA 195.36.24
    WebGL Renderer: Blocked for your graphics driver version. Try updating your graphics driver to version NVIDIA 257.21 or newer.
    GPU Accelerated Windows: 0/3. Blocked for your graphics driver version. Try updating your graphics driver to version NVIDIA 257.21 or newer.

  


Expected results:

Presents search results
Comment 1 Blake Winton (:bwinton) (:☕️) 2012-03-07 10:31:03 PST
I got a report of this from shorlander, too.
Comment 2 Blake Winton (:bwinton) (:☕️) 2012-03-07 10:33:02 PST
Created attachment 603781 [details] [diff] [review]
A fix for this.

So, this checks for the things we're getting properties on, but doesn't tell us why it's failing.  Still, it does give us the place to hook in, if we wanted to dig further, and it means that we'll let the user see their search results.

Thanks,
Blake.
Comment 3 Blake Winton (:bwinton) (:☕️) 2012-03-07 10:54:29 PST
I can repro this, and dumping the failing contact gives me:
Identity:email:subscribe@dandyhorsemagazine.com

Looking at it in the address book, it has no Display Name, but that doesn't seem like a fatal error to me.
Comment 4 Andrew Sutherland [:asuth] 2012-03-07 13:51:38 PST
Comment on attachment 603781 [details] [diff] [review]
A fix for this.

It's a strong invariant that identities have contacts associated with them.  This is way too much of a hack.

Since you know the identity that it happens for, you would probably want to look in the database and look for obvious troubles.  For example, after invoking "sqlite3 global-messages-db.sqlite" in your profile directory:

sqlite> select * from identities where value = 'bwinton@mozilla.com';
37|33|email|bwinton@mozilla.com||0

sqlite> select * from contacts where id = 33;
33|||121|0|Blake Winton|{"68":[]}

Obvious trouble would be there not being a contact for that identity.  You might also want to then try the reverse lookup for paranoia reasons since the schema does allow multiple identities per contact.  (The addressbook just can't provide us with the data.)

sqlite> select * from identities where contactID = 33;
37|33|email|bwinton@mozilla.com||0


Assuming that turns out fine, it's possible that the query subsystem is dying or such.
Comment 5 Mark Banner (:standard8) 2012-03-07 14:26:22 PST
I just saw this error a couple of times when sending emails and getting updates for ab cards. This might help reproducing.
Comment 6 Andrew Sutherland [:asuth] 2012-03-07 15:49:49 PST
Forget the SQL from comment 4.  The cool SQL is:

select identities.*, contacts.* FROM identities LEFT JOIN contacts ON identities.contactID = contacts.id WHERE contacts.id IS NULL;

and bienvenu ran it and found 110 angry e-mail addresses.
Comment 7 Andrew Sutherland [:asuth] 2012-03-07 18:22:25 PST
I have a failing unit test for message indexing; I'm going to try and make the addressbook indexing test fail too, as it sounds like it needs test coverage.

The problem looks like this:
2012-03-07 18:15:24	gloda.indexer	DEBUG	Exception in batch processing: [Exception... "Cannot modify properties of a WrappedNative"  nsresult: "0x80570034 (NS_ERROR_XPC_CANT_MODIFY_PROP_ON_WN)"  location: "JS frame :: resource:///modules/gloda/gloda.js :: gloda_ns_grokNounItem :: line 1978"  data: no]

Presumably there has been a (reasonable) change in wrapped native behaviour which broke us.  In a nutshell, some of the code I wrote expected the aRawReps passed to Gloda.grokNounItem to be a JS object it could mutate, but the AB logic I added just passes the card object through as aRawReps.  I was probably trying to be efficient :(
Comment 8 Andrew Sutherland [:asuth] 2012-03-07 19:30:10 PST
Created attachment 603961 [details] [diff] [review]
layer 1 - logging fixes

Trying to figure out what was going on revealed issues with our logging code; I'm fixing a bunch of them at once here.  The changes fall into the categories:
- Fixing not throwing an Error object.  (We were throwing strings; some of this code was written when I was new to JS...)
- Fixing using log.warning (which does not exist) instead of log.warn
- Fixing assumptions that exceptions always have 'stack'/whatever properties.  Strings don't, and the error that was causing this problem didn't either.
- Adding logging where we didn't have any.
- Removing useless logging that has a performance overhead.
Comment 9 Andrew Sutherland [:asuth] 2012-03-07 19:32:52 PST
Created attachment 603962 [details] [diff] [review]
layer 2: tests that fail

Here's the message indexing test patch that fails.

I also moved some test setup logic (imports) into glodaTestHelper that had been living outside it.

Unfortunately, it's an order of magnitude more work to add a test for the address book indexing test because it would require some generalization in glodaTestHelper.js that I don't have time to undertake.
Comment 10 Andrew Sutherland [:asuth] 2012-03-07 19:36:04 PST
Created attachment 603964 [details] [diff] [review]
layer 3 - don't pass the card as a raw rep; wrap it in an object

This is the actual fix.  I have no messed-up contacts after indexing 86k messages with this fix stack.

I have pushed this patch series to try, but won't be around to check the results before tomorrow.  All gloda xpcshell-tests do pass for me with the patch stack.

Standard8, if you want to steal the reviews from bienvenu and/or land them, that's fine with me.

I have not ascertained what release of Thunderbird this problem appears in.  It's pretty serious, so we should land the fix on anything previous, but we might have to go so far as to bump the schema again.
Comment 11 Andrew Sutherland [:asuth] 2012-03-07 19:39:46 PST
Er, and for the logging patch, it's worth noting that log.log("blah:", ex) is indeed valid.  The changes I made to support fancy pants logging made that valid.  The idea being that it's pretty easy to just stick commas in there and reference objects that will toString() nicely (but can also be serialized in a richer form as well.)
Comment 12 David :Bienvenu 2012-03-07 20:21:37 PST
yeah, I kinda think we want to bump the schema, unless we haven't shipped a product with this bug.
Comment 13 David :Bienvenu 2012-03-07 20:26:11 PST
Comment on attachment 603961 [details] [diff] [review]
layer 1 - logging fixes

this looks reasonable - I'll have to build and run with it, but assuming that's ok, r=me.
Comment 14 David :Bienvenu 2012-03-07 21:17:51 PST
Comment on attachment 603962 [details] [diff] [review]
layer 2: tests that fail

This code was in the compaction test because the test only makes sense with berkeley store, but the other tests should work with any store type. Is there a reason you moved it? Other than that, it looks OK. My build is still going, but as long as the try server build results look OK, this looks good to me.

-Services.prefs.setCharPref("mail.serverDefaultStoreContractID",
-                           "@mozilla.org/msgstore/berkeleystore;1");
-
Comment 15 Jonathan Protzenko [:protz] 2012-03-08 02:58:22 PST
There's a schema bump in Thunderbird 11 already, so that means another one in Thunderbird 13?
Comment 16 Jonathan Protzenko [:protz] 2012-03-08 02:58:38 PST
Sorry, meant Thunderbird 10.
Comment 17 Mark Banner (:standard8) 2012-03-08 03:19:01 PST
The issue I've been seeing - which is that gloda is not indexing newly received emails, but marking them as bad, is present in TB 10 & later. It seems to be fine in TB 11.

Why I didn't notice this during the TB 10 cycle, I am not quite sure.

At the moment, the fixes on this bug seem to fix the issue when I'm testing with the try server build.

Therefore I think we need these to land for 11 and do another schema bump, unless there would be a way to avoid it (like could we update the definition of 'bad' to be an id of '2' and treat a gloda id of '1' as dirty, or could that impact add-ons too much?).
Comment 19 Mark Banner (:standard8) 2012-03-08 08:30:08 PST
Comment on attachment 603961 [details] [diff] [review]
layer 1 - logging fixes

[Triage Comment]
Running backported tests shows that this only affects back to TB 11. Not 10 as my tests suggested.

Therefore we're taking it back just that far at the moment.
Comment 20 Mark Banner (:standard8) 2012-03-08 08:30:22 PST
Comment on attachment 603962 [details] [diff] [review]
layer 2: tests that fail

[Triage Comment]
Comment 22 Andrew Sutherland [:asuth] 2012-03-08 12:51:46 PST
(In reply to David :Bienvenu from comment #14)
> Comment on attachment 603962 [details] [diff] [review]
> layer 2: tests that fail
> 
> This code was in the compaction test because the test only makes sense with
> berkeley store, but the other tests should work with any store type. Is
> there a reason you moved it? Other than that, it looks OK. My build is still
> going, but as long as the try server build results look OK, this looks good
> to me.
> 
> -Services.prefs.setCharPref("mail.serverDefaultStoreContractID",
> -                           "@mozilla.org/msgstore/berkeleystore;1");
> -

It broke when I moved the import that got us "Services" into glodaTestHelper, which brought my attention to it.  I figured it was a fix to make the test pass, and that there was presumably no harm in centralizing it, especially as we might want to make that bit configurable.

Would it be worth having the gloda tests run parameterized so that they try both store types (when they make sense)?  If all the event firing and sources of potential asynchronous behaviour happen above the pluggable storage layer, we probably don't need to...
Comment 23 Andrew Sutherland [:asuth] 2012-03-08 13:10:54 PST
(In reply to Mark Banner (:standard8) from comment #17)
> Therefore I think we need these to land for 11 and do another schema bump,
> unless there would be a way to avoid it (like could we update the definition
> of 'bad' to be an id of '2' and treat a gloda id of '1' as dirty, or could
> that impact add-ons too much?).

The leftover side-effects of this bug were:

1) contact objects are not created, so any access to someIdentity.contact would return undefined, and then possibly throw.
2) Message indexing dies in the 'optimize' stage because of #1, resulting in messages being marked bad.

We could do a soft-schema upgrade as you say, and it's a good idea, but #2 would recur unless we also addressed #1.  We can address #1 more easily by detecting identities without contacts and just inserting them during the schema upgrade transition.  Since the failures are correlated with contacts, the lack of information on the display name at that point from gloda is offset by the presence of display name information on the address book card.

The other option is to just do a hard schema bump and leverage the schema bump to also call setGrowthIncrement on the database which should potentially provide significant gloda database performance improvements.  (It causes SQLite to grow the database file by many pages at once, rather than one at a time, thereby reducing disk fragmentation.)

It's unfortunate that the schema bump happened in 10 rather than 11, otherwise I think it would be an easy call to do the hard schema bump.
Comment 24 David :Bienvenu 2012-03-08 13:12:05 PST
(In reply to Andrew Sutherland (:asuth) from comment #22)
> 
> It broke when I moved the import that got us "Services" into
> glodaTestHelper, which brought my attention to it.
ah, I see.

  I figured it was a fix
> to make the test pass, and that there was presumably no harm in centralizing
> it, especially as we might want to make that bit configurable.
It makes it pass if the default store type is maildir :-) I haven't figured out how I'm going to test multiple store types. Ideally, there would be a set of tests that would run once for each store type, and others that would just run with the default store type, but it was often surprising which tests broke for maildir store. I'd rather not double the time it takes to run tests, but the list of tests that don't use mail storage is pretty small.
> 
> Would it be worth having the gloda tests run parameterized so that they try
> both store types (when they make sense)?  If all the event firing and
> sources of potential asynchronous behaviour happen above the pluggable
> storage layer, we probably don't need to...

No, there's some event firing that happens in the pluggable store, in some cases, if the store handles the operation itself. I'd like to move that up a level when possible, but event timing could still vary.
Comment 25 Andrew Sutherland [:asuth] 2012-03-08 14:20:09 PST
(In reply to David :Bienvenu from comment #24)
> It makes it pass if the default store type is maildir :-)

Oh, okay, so my moving it just lost you some (manual) test coverage.  I've just put it back (with r=bienvenu since your review pointed it out as potentially undesirable):

http://hg.mozilla.org/comm-central/rev/1e2084994598
Comment 26 Gareth Hughes 2012-03-12 08:47:51 PDT
It's been enjoyable to just be a tester and watch someone else fix the bugs.

But, I just retested and got

Timestamp: 12/03/12 15:42:53
Error: uncaught exception: [Exception... "Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIObserverService.removeObserver]"  nsresult: "0x80004005 (NS_ERROR_FAILURE)"  location: "JS frame :: chrome://messenger/content/search.xml ::  :: line 156"  data: no]

No doubt this is a separate bug, let me know if you want it raised elsewhere

Application Build ID 20120311052325
Comment 27 Andrew Sutherland [:asuth] 2012-03-12 10:00:36 PDT
Yeah, that's an unrelated bug.  If you could file a bug under Thunderbird Search if you don't see one with that signature already, that would be fantastic.  I expect that to either be a tabs-on-top fallout or something that's always been that way.  Please cc me (":asuth") when you file.

Thanks!
Comment 28 Ludovic Hirlimann [:Usul] 2012-04-04 03:02:02 PDT
Gareth did you raise the new bug ?
Comment 29 Gareth Hughes 2012-04-04 03:09:40 PDT
No, I found a matching one. Added comments to it as well and I swear I linked it back to here but I see no evidence. (I use Jira)
Also, I've not seen it happen since. Consider closed.

Note You need to log in before you can comment on or make changes to this bug.