Closed Bug 1135588 Opened 10 years ago Closed 10 years ago

New conversations get indexed twice by gloda, leading to duplicate search results

Tracking

(thunderbird38+ fixed)

Status:

RESOLVED FIXED

Tracking Flags:

Tracking

Status

thunderbird38

fixed

People

(Reporter: aleth, Assigned: aleth)

References

Details

Attachments

(1 file, 5 obsolete files)

Patch 10 years ago (inactive) Nihanth Subramanya [:nhnt11] 1.15 KB, patch		Details \| Diff \| Splinter Review
cachesave.diff 10 years ago aleth [:aleth] 3.23 KB, patch		Details \| Diff \| Splinter Review
cachesave.diff v2 10 years ago aleth [:aleth] 3.29 KB, patch	aleth : review-	Details \| Diff \| Splinter Review
cachesave.diff v3 10 years ago aleth [:aleth] 5.24 KB, patch		Details \| Diff \| Splinter Review
cachesave.diff v4 10 years ago aleth [:aleth] 4.85 KB, patch	clokep : review-	Details \| Diff \| Splinter Review
glodafix.diff v5 10 years ago aleth [:aleth] 13.35 KB, patch	clokep : review+ rkent : approval-comm-aurora+	Details \| Diff \| Splinter Review

aleth [:aleth]

Assignee

Description

•

10 years ago

STR Start a conversation Say something (X) Close the conversation (context menu of the sidebar) Run a search for X: Everything looks good, a result for X appears. Now close and restart TB. Run a search for X: The result for X appears twice. Looks like the file is indexed again on restart, so some flag isn't getting set?

aleth [:aleth]

Assignee

Updated

•

10 years ago

tracking-thunderbird_esr38: ? → ---

aleth [:aleth]

Assignee

Updated

•

10 years ago

status-thunderbird38: --- → affected

status-thunderbird_esr38: affected → ---

Kent James (:rkent)

Comment 1

•

10 years ago

What is the plan to move forward with this?

(inactive) Nihanth Subramanya [:nhnt11]

Comment 2

•

10 years ago

There has been some progress on this. Some debugging has revealed that the source of the bug is a timer that is never reset. Seems like the issue has been around since chat landed in Tb. I'll be working on this and bug 1135291 this weekend and through next week.

(inactive) Nihanth Subramanya [:nhnt11]

Comment 3

•

10 years ago

Attached patch Patch (obsolete) — Details — Splinter Review

I haven't tested this yet, clobber build running at the moment.

Assignee: nobody → nhnt11

Status: NEW → ASSIGNED

aleth [:aleth]

Assignee

Comment 4

•

10 years ago

I remember testing this and it seemed a shutdown handler was required here so pending saves are ensured to haopen.

aleth [:aleth]

Assignee

Comment 5

•

10 years ago

Attached patch cachesave.diff (obsolete) — Details — Splinter Review

Since I'd already discussed this with nhnt11, it seemed easy to finish this patch so nhnt11 can focus on the chat reindexing bug.

Assignee: nhnt11 → aleth

Attachment #8577712 - Attachment is obsolete: true

Attachment #8578882 - Flags: review?(clokep)

aleth [:aleth]

Assignee

Comment 6

•

10 years ago

Not actually a regression - this bug has existed since gloda IM indexing was added.

Keywords: regression

aleth [:aleth]

Assignee

Comment 7

•

10 years ago

Attached patch cachesave.diff v2 (obsolete) — Details — Splinter Review

I guess we should cancel the timeout too, just in case, though I don't think they fire any more at that stage in the shutdown.

Attachment #8578882 - Attachment is obsolete: true

Attachment #8578882 - Flags: review?(clokep)

Attachment #8578896 - Flags: review?(clokep)

aleth [:aleth]

Assignee

Comment 8

•

10 years ago

Comment on attachment 8578896 [details] [diff] [review] cachesave.diff v2 This won't do as _saveCacheNow uses NetUtil.asyncCopy.

Attachment #8578896 - Flags: review?(clokep) → review-

aleth [:aleth]

Assignee

Comment 9

•

10 years ago

Attached patch cachesave.diff v3 (obsolete) — Details — Splinter Review

Apart from having to convert the save to OS.File so the shutdown blocker could wait upon its completion, there was another bug causing duplicates: if autojoined channels exist, the first save is scheduled on startup before the existing cache file has been read. If this save overwrites the cache file, we end up with duplicate entries in the gloda index.

Attachment #8578896 - Attachment is obsolete: true

Attachment #8578982 - Flags: review?(clokep)

Patrick Cloke [:clokep]

Comment 10

•

10 years ago

Comment on attachment 8578982 [details] [diff] [review] cachesave.diff v3 Review of attachment 8578982 [details] [diff] [review]: ----------------------------------------------------------------- This looks good! I wrote a couple of nits asking for more comments. :) I'm going to do a build and test this now. ::: mail/components/im/modules/index_im.js @@ +268,3 @@ > return; > this._cacheSaveTimer = setTimeout(this._saveCacheNow, 5000); > + if (this._shutdownBlockerAdded) Nit: Add an empty line here and add a comment about what this is doing? @@ +285,5 @@ > knownFiles: GlodaIMIndexer._knownFiles, > datastoreID: Gloda.datastoreID, > }; > > + let path = OS.Path.join(OS.Constants.Path.profileDir, "logs", kCacheFileName); Keep the comment "Asynchronously copy the data to the file." :)

aleth [:aleth]

Assignee

Comment 11

•

10 years ago

Attached patch cachesave.diff v4 (obsolete) — Details — Splinter Review

adds comments!

Attachment #8578982 - Attachment is obsolete: true

Attachment #8578982 - Flags: review?(clokep)

Attachment #8579020 - Flags: review?(clokep)

Patrick Cloke [:clokep]

Comment 12

•

10 years ago

Comment on attachment 8579020 [details] [diff] [review] cachesave.diff v4 I was able to still get duplicated indices with this. aleth was able to reproduce and has some ideas.

Attachment #8579020 - Flags: review?(clokep) → review-

aleth [:aleth]

Assignee

Comment 13

•

10 years ago

Attached patch glodafix.diff v5 — Details — Splinter Review

This was a little tricky to debug. First I fixed * autojoined channels got added to the cache before the cache was read from disk (which happened together with the log sweeping), which overwrote these new entries so they go missing * cache save should work when there is no existing file * we were receiving every new-text notification twice, scheduling twice as many indexing jobs as needed * some minor inconsistencies which didn't really matter but the problem was still there. It turned out to be a consequence of lastModifiedTime not being set on the indexing jobs of newly opened conversations (as opposed to old conversations found on log sweep). This led to weird behaviour and duplicates because if you set obj.property = undefined, hasOwnProperty.call(obj, "property") is true, but !!obj.property is false, and JSON.stringify(obj) doesn't include property at all. That way you could get conversations that were indexed but only sometimes found in the cache, and they would disappear on save.

Attachment #8579020 - Attachment is obsolete: true

Attachment #8580304 - Flags: review?(clokep)

Patrick Cloke [:clokep]

•

10 years ago

Status: ASSIGNED → RESOLVED

Closed: 10 years ago

Resolution: --- → FIXED

You need to log in before you can comment on or make changes to this bug.