Closed Bug 1103647 Opened 8 years ago Closed 8 years ago

Chat logs no longer being indexed by gloda

Categories

(Thunderbird :: Instant Messaging, defect)

defect
Not set
major

Tracking

(thunderbird38+ fixed)

RESOLVED FIXED
Thunderbird 38.0
Tracking Status
thunderbird38 + fixed

People

(Reporter: aleth, Assigned: nhnt11)

References

Details

(Keywords: regression)

Attachments

(1 file, 4 obsolete files)

Not sure since when, but the last indexed logs I have on my profile are from June, so this may (or may not) be related to bug 955292 which landed in early July.

When starting up TB, after an async delay (no need to open the chat tab!) gloda starts up and spits out numerous warnings of the form 

2014-11-23 23:18:14	gloda.indexer	WARN	Problem during [job:convFolderSweep id:null items:0 offset:0 goal:null], bailing: TypeError: can't convert undefined to object

thrown by indexer.js::1099. These are indexing jobs that fail.

Further investigation reveals the typeerror is produced by index_im.js:443. It appears it's the aCache parameter that is undefined, though not on every call.


It's urgent to check whether this bug exists in TB31 (whether it was caused by async logging or not).
A quick glance at the code suggests the calls where aCache is undefined are from index_im.js#569. This line was not touched by bug 955292 but likely should have been.
Flags: needinfo?(nhnt11)
Keywords: regression
Attached patch Patch (obsolete) — Splinter Review
It seems I did a rather shoddy job with the gloda changes in bug 955292.
This patch redoes those changes. I've tried to go back to the flow that existed before the patch from async logs, and adapted it to work with the Logs Service's Promise API.

Sorry about this, hopefully this patch takes care of it.
Assignee: nobody → nhnt11
Status: NEW → ASSIGNED
Flags: needinfo?(nhnt11)
Attachment #8529297 - Flags: review?(aleth)
Comment on attachment 8529297 [details] [diff] [review]
Patch

Review of attachment 8529297 [details] [diff] [review]:
-----------------------------------------------------------------

First pass - looks pretty good.

::: mail/components/im/modules/index_im.js
@@ -263,5 @@
>  
>      // Obtain a converter to convert our data to a UTF-8 encoded input stream.
>      let converter = Cc["@mozilla.org/intl/scriptableunicodeconverter"].createInstance(Ci.nsIScriptableUnicodeConverter);
>      converter.charset = "UTF-8";
> -

Nit: no need to remove this line

@@ -348,5 @@
> -      let fileInfo = yield OS.File.stat(conv.logFile);
> -      let lastModifiedTime = fileInfo.lastModificationDate.valueOf();
> -      if (Object.prototype.hasOwnProperty.call(cache, fileName) &&
> -          cache[fileName] == lastModifiedTime)
> -        return;

Why did you move this? It means an indexing job gets scheduled for each file. I realize the job itself now checks if it needs doing, but...

@@ +423,5 @@
>     * the conversation in the database. This id is required to later update the
>     * conversation in the database, so the caller dealing with ongoing
>     * conversation has to provide the aGlodaConv parameter, while the caller
>     * dealing with old conversations doesn't care. */
> +  indexIMConversation: Task.async(function* (aCallbackHandle, aLogPath, aCache, aGlodaConv) {

Please expand the comment to explain what exactly the aCache parameter is, as at first glance it appears odd first that it's actually the conversation object, and second that (given that) we set properties on it.
Attachment #8529297 - Flags: review?(aleth) → review-
Nihanth, any chance we can get this in for TB 38?
Blocks: 783619
Blocks: 1069845
Attached patch Patch v2 (obsolete) — Splinter Review
Move the last modified time checking stuff as requested and add comments.
Sorry for the horrible delay in getting this up.
Attachment #8529297 - Attachment is obsolete: true
Attached patch Patch v2.1 (obsolete) — Splinter Review
Forgot to fix that nit, sorry.
Attachment #8565773 - Attachment is obsolete: true
Attachment #8565774 - Flags: review?(aleth)
Comment on attachment 8565774 [details] [diff] [review]
Patch v2.1

Review of attachment 8565774 [details] [diff] [review]:
-----------------------------------------------------------------

Overall, this looks almost ready! Thanks for taking care of this blocker.

In testing, I encounter the following bug:
*************************
A coding exception was thrown and uncaught in a Task.

Full message: TypeError: this._knownConversations[convId] is undefined
Full stack: GlodaIMIndexer._beginIndexingJob/<@resource:///modules/index_im.js:353:9
TaskImpl_run@resource://gre/modules/Task.jsm:314:40
*************************
STR:
Click on a contact in the sidebar.
Click on "start a conversation".
Send a message.
Close the conversation from the context menu of the contact.

::: mail/components/im/modules/index_im.js
@@ +251,5 @@
> +   * object that maps a protocol name to an object representing the subtree
> +   * for that protocol. The structure is:
> +   * _knownFiles   -> protoObj     -> accountObj -> convObj
> +   * The corresponding keys in the above objects are:
> +   * protocol name -> account name -> conv name  -> file name -> last modified time

Please add how we deal with grouped logs, i.e. can a convObj contain multiple file names or not.

@@ +360,5 @@
> +      let logPath = conv.logFile;
> +      let fileName = OS.Path.basename(logPath);
> +      let lastModifiedTime =
> +        (yield OS.File.stat(logPath)).lastModificationDate.valueOf();
> +      if (conv.convObj[fileName] == lastModifiedTime)

This assumes conv.convObj.fileName always already exists. It doesn't.

@@ +365,1 @@
>          return;

Please add a comment above this early return. ("No need to index because...")

@@ +493,5 @@
>    _worker_indexIMConversation: function(aJob, aCallbackHandle) {
>      let glodaConv = {};
>      if (aJob.conversation.glodaConv)
>        glodaConv.value = aJob.conversation.glodaConv;
>      // indexIMConversation may initiate an async grokNounItem sub-job.

see below

@@ +585,5 @@
>      while (sessions.hasMoreElements()) {
>        let file = sessions.getNext().QueryInterface(Ci.nsIFile);
>        let fileName = file.leafName;
> +      if (!file.isFile() || !file.isReadable() || !fileName.endsWith(".json") ||
> +          aJob.convObj[fileName] == file.lastModifiedTime)

again, assumes aJob.convObj.fileName exists.

@@ +590,2 @@
>          continue;
>        // indexIMConversation may initiate an async grokNounItem sub-job.

I don't understand this comment, but it doesn't seem correct any more (indexImConversation is always async, right?)

@@ +590,5 @@
>          continue;
>        // indexIMConversation may initiate an async grokNounItem sub-job.
> +      this.indexIMConversation(aCallbackHandle, file.path,
> +                               file.lastModifiedTime, aJob.convObj)
> +          .then(() => GlodaIndexer.callbackDriver());

Is indexIMConversation ever called *without* a subsequent GlodaIndexer.callbackDriver call? If it isn't, just do it at the end of indexIMConversation.
Attachment #8565774 - Flags: review?(aleth) → feedback+
Attached patch Patch v3 (obsolete) — Splinter Review
Address review comments.
Explanation for comments not addressed: http://log.bezut.info/instantbird/150219/#m177
Attachment #8565774 - Attachment is obsolete: true
Attachment #8566454 - Flags: review?(aleth)
Comment on attachment 8566454 [details] [diff] [review]
Patch v3

Review of attachment 8566454 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good!

::: mail/components/im/modules/index_im.js
@@ +252,5 @@
> +   * for that protocol. The structure is:
> +   * _knownFiles    -> protoObj      -> accountObj  -> convObj
> +   * The corresponding keys of the above objects are:
> +   * protocol names -> account names -> conv names  -> file names -> last modified time
> +   * convObj maps ALL previously indexed log files of a conv to their last modified times.

What I would like clarified in the last line of this comment is that "conv" isn't a single conversation, but (all the conversations with) a particular contact or MUC. Because in other places in the same file (e.g. "let conv = this._knownConversations[convId];") that's not the case. Also add something like "Note gloda knows nothing about the log grouping done by logger.js" (I'm asking for this since the question of how gloda treats split logs came up previously in another patch).

@@ +362,5 @@
> +      let logPath = conv.logFile;
> +      let fileName = OS.Path.basename(logPath);
> +      let lastModifiedTime =
> +        (yield OS.File.stat(logPath)).lastModificationDate.valueOf();
> +      if (conv.convObj[fileName] && conv.convObj[fileName] == lastModifiedTime) {

Please use hasOwnProperty as in the original code (or turn convObj into a Map :P).

@@ +589,5 @@
>      while (sessions.hasMoreElements()) {
>        let file = sessions.getNext().QueryInterface(Ci.nsIFile);
>        let fileName = file.leafName;
> +      if (!file.isFile() || !file.isReadable() || !fileName.endsWith(".json") ||
> +          (aJob.convObj[fileName] && aJob.convObj[fileName] == file.lastModifiedTime))

hasOwnProperty
Attachment #8566454 - Flags: review?(aleth) → review-
Attached patch Patch v4Splinter Review
Addressed review comments.
Attachment #8566454 - Attachment is obsolete: true
Attachment #8566657 - Flags: review?(aleth)
Comment on attachment 8566657 [details] [diff] [review]
Patch v4

Review of attachment 8566657 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for fixing this in time for TB 38!

This needs an additional review from a mailnews peer.
rkent: Since the gloda submodule currenty has no active peer, I hope you can take a quick look at this.
Attachment #8566657 - Flags: review?(rkent)
Attachment #8566657 - Flags: review?(aleth)
Attachment #8566657 - Flags: review+
Comment on attachment 8566657 [details] [diff] [review]
Patch v4

Actually, it doesn't need an additional review after all! It's for gloda, but the file isn't in the gloda submodule. :-)
Attachment #8566657 - Flags: review?(rkent)
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 38.0
Depends on: 1135588
You need to log in before you can comment on or make changes to this bug.