Last Comment Bug 740499 - IM conversations aren't indexed in gloda on the fly
: IM conversations aren't indexed in gloda on the fly
Status: RESOLVED FIXED
:
Product: Thunderbird
Classification: Client Software
Component: Instant Messaging (show other bugs)
: Trunk
: All All
: -- normal (vote)
: Thunderbird 16.0
Assigned To: Florian Quèze [:florian] [:flo]
:
:
Mentors:
Depends on: 776566
Blocks:
  Show dependency treegraph
 
Reported: 2012-03-29 11:03 PDT by Philipp Kewisch [:Fallen]
Modified: 2012-07-23 09:26 PDT (History)
4 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
fixed


Attachments
Patch (10.42 KB, patch)
2012-07-13 12:35 PDT, Florian Quèze [:florian] [:flo]
bugmail: review+
Details | Diff | Splinter Review
Patch v2 (11.04 KB, patch)
2012-07-13 16:45 PDT, Florian Quèze [:florian] [:flo]
clokep: review+
standard8: approval‑comm‑aurora+
Details | Diff | Splinter Review

Description Philipp Kewisch [:Fallen] 2012-03-29 11:03:27 PDT
So this is what I did, maybe it can be reproduced differently too:


1. Start a conversation with an offline gtalk contact
2. Write "something"
3. Close the conversation
4. Use "Search all conversations" and search for "something"

Results:

There are no search results

Expected:

The just created conversation where I wrote "something" should show up.
Comment 1 Florian Quèze [:florian] [:flo] 2012-03-29 11:56:24 PDT
Currently IM conversations are only indexed at startup, on the fly indexing hasn't been implemented yet, so the new conversation containing "something" won't be found until you restart Thunderbird.
Comment 2 Florian Quèze [:florian] [:flo] 2012-07-13 12:35:23 PDT
Created attachment 642003 [details] [diff] [review]
Patch

Andrew, I hesitated between requesting review or feedback, because (if there aren't issues that I haven't seen) I think this is good enough to take it (and I know Jb would really like this bug to be fixed for Thunderbird 15 before it goes to beta), but I'm not really satisfied with the code. This patch triggers a re-indexing of the IM conversation each time a new message is received. I wonder if this isn't a bit excessive; for busy IRC channels I would prefer if we could reindex a bit less frequently.

Is there an easy way to schedule an indexer job to be performed in a few seconds instead of asap, but to have it performed immediately if the user initiates a search?
Comment 3 Andrew Sutherland [:asuth] 2012-07-13 13:32:31 PDT
(In reply to Florian Quèze from comment #2)
> Andrew, I hesitated between requesting review or feedback, because (if there
> aren't issues that I haven't seen) I think this is good enough to take it
> (and I know Jb would really like this bug to be fixed for Thunderbird 15
> before it goes to beta), but I'm not really satisfied with the code. This
> patch triggers a re-indexing of the IM conversation each time a new message
> is received. I wonder if this isn't a bit excessive; for busy IRC channels I
> would prefer if we could reindex a bit less frequently.

Yeah, that doesn't sound particularly desirable.  (Even if you did the fulltext index on a per-message basis, that still would not be great from an overhead perspective...)

> Is there an easy way to schedule an indexer job to be performed in a few
> seconds instead of asap, but to have it performed immediately if the user
> initiates a search?

Nothing exists right now; the use-case doesn't really exist for e-mail.  I think it's totally reasonable to do the following however:

1) Create a gloda method that runs through the list of indexers and calls upcomingSearch() and maybe one for userDoneSearching().

2) Have the gloda search box call those methods on focus and blur.  And your chat search box if that does the same thing, maybe specifying the noun types it will search over?

3) Have your logic mark the conversations dirty (or whatever would cause an indexing sweep to index them next time at startup in event of a crash, etc.) and put them in a list that will get flushed to be indexing jobs when that method gets called.

The theory is that you can cram the data into the indexer faster than the user can type their search and hit enter.  It's worth noting that this is in some ways backwards from previous thoughts about having gloda pause the indexer when the user is performing a search in order to make the search results return more quickly...

If there's a better way to predict that the search is coming, I'm open to that.


A more idealized solution would be to perform an in-memory search of your conversations so that we don't have to bounce things off disk, but that would make everything more complicated and be a lot of work, etc.
Comment 4 Andrew Sutherland [:asuth] 2012-07-13 13:44:58 PDT
Comment on attachment 642003 [details] [diff] [review]
Patch

This all looks reasonable, and I'm indeed sure people will appreciate on-the-fly indexing.

Your observer names are pretty generic; I'd namespace them a little more, even just with "im-".

Not that gloda's use of generators is ever intuitive, but I was a little confused by the interplay of indexIMConversation and pushAndGo.  Please save off the return value of pushAndGo and return that as the return value of indexIMConversation to make it more clear what's going on there.  Likewise, please place a comment before your invocation of indexIMConversation that notes that it's initiating an async sub-job internally.  It's my hope this makes things less terrifying for newcomers; I'm sure you scratched your head a lot while first starting to deal with gloda ;)
Comment 5 Florian Quèze [:florian] [:flo] 2012-07-13 16:45:24 PDT
Created attachment 642141 [details] [diff] [review]
Patch v2

(In reply to Andrew Sutherland (:asuth) from comment #4)

> Your observer names are pretty generic;

Right, it's quite unfortunate.

> I'd namespace them a little more, even just with "im-".

I would if I added them now, but they were added a long time ago in Instantbird, way before we thought the IM back-end would be shared with Thunderbird. Changing them now would break most Instantbird add-ons.


> Not that gloda's use of generators is ever intuitive, but I was a little
> confused by the interplay of indexIMConversation and pushAndGo.

Sorry :-/.

> Please save
> off the return value of pushAndGo and return that as the return value of
> indexIMConversation to make it more clear what's going on there.  Likewise,
> please place a comment before your invocation of indexIMConversation that
> notes that it's initiating an async sub-job internally.

Done.
I also added a comment explaining the usage of the 'strange' aGlodaConv parameter and got rid of the shouldIndex variable and replaced it with an early return, which makes the code a little bit more readable. The early return wasn't practical at the time indexIMConversation was a generator and not a simple function.

> I'm sure you scratched your head
> a lot while first starting to deal with gloda ;)

I still do. Especially when dealing with generators and debugging my mistakes. Having "exception StopIterator" as the only error message without real location information in the error console when yield/return keywords have been mixed invalidly, or having absolutely no feedback when calling a generator like if it was a regular function (and not doing anything with the returned iterator) is really annoying.


Requesting another review from Patrick, for the chat/ changes (he already looked and said on IRC that they looked acceptable).
Comment 6 Patrick Cloke [:clokep] 2012-07-13 18:07:49 PDT
Comment on attachment 642141 [details] [diff] [review]
Patch v2

They still look OK. ;) Not ideal, but the changes are fine.
Comment 7 Florian Quèze [:florian] [:flo] 2012-07-14 06:43:41 PDT
https://hg.mozilla.org/comm-central/rev/90e7d05fd84f
Comment 8 Florian Quèze [:florian] [:flo] 2012-07-14 06:48:22 PDT
Comment on attachment 642141 [details] [diff] [review]
Patch v2

[Approval Request Comment]
User impact if declined: We commonly received the feedback about Gloda searches of IM conversations that it "doesn't work" because users test it by typing in the search box a few words that are in the conversation they have in front of them, and they are surprised it isn't found. Fixing this is important to avoid this perception that searching IM conversations is broken.

Risk to taking this patch (and alternatives if risky): Unfortunately, this patch could have a significant performance impact for users in very busy IRC channels or users receiving lots of tweets per minute (probably because they are tracking a common keyword). I think we still want this for Thunderbird 15, but I would prefer if it could have as much time as possible to be tested, so that people have time to send us feedback if the performance impact turns out to be more severe than expected.
Comment 9 Florian Quèze [:florian] [:flo] 2012-07-16 05:38:08 PDT
https://hg.mozilla.org/releases/comm-aurora/rev/a3b20101b19f

Note You need to log in before you can comment on or make changes to this bug.