Closed Bug 1135291 Opened 9 years ago Closed 9 years ago

Update chat log entries added to Gloda since bug 955292 to use relative paths

Categories

(Thunderbird :: Search, defect)

x86
All
defect
Not set
normal

Tracking

(thunderbird36 affected, thunderbird37 affected, thunderbird38+ fixed, thunderbird39 fixed, thunderbird40 fixed, thunderbird_esr31 unaffected)

RESOLVED FIXED
Thunderbird 40.0
Tracking Status
thunderbird36 --- affected
thunderbird37 --- affected
thunderbird38 + fixed
thunderbird39 --- fixed
thunderbird40 --- fixed
thunderbird_esr31 --- unaffected

People

(Reporter: nhnt11, Assigned: nhnt11)

References

Details

Attachments

(1 file, 9 obsolete files)

Bug 955292 introduced a regression where chat log files were indexed against their full paths instead of path relative to logs directory.
Bug 1069845 fixes this for new files, but we need a way to remove all the now-garbage entries and re-index them.
Assignee: nobody → nhnt11
OS: Mac OS X → All
Version: unspecified → Trunk
How does chat hook into gloda? If this was a mail folder, there is a way to invalidate a folder and it will get reindexed.
What is the plan to make progress on this?
Note there's already a point in the existing code where we should really be throwing away the existing index:

https://dxr.mozilla.org/comm-central/source/mail/components/im/modules/index_im.js#542

If the gloda datastore ID changed, then because knownFiles isn't read from the cache, we'll reindex everything, which currently creates duplicate entries.
Attached patch Initial patch (obsolete) — Splinter Review
This patch finds entries in the database with full paths and simply removes them.

Once I figure out the best way to update these entries to use the correct path instead, I'll upload a new patch. I can't remember at the moment if logs that were indexed against their full paths are now being reindexed against their relative paths after bug 1135588. I'll do some research and follow up.

I haven't looked much into the datastore ID change thing yet. Should be as simple as executing a |DELETE * FROM imConversations| to clear out the table.

This patch uses a simple version flag in the cache to trigger the clearing out of bad entries. I haven't messed with Gloda's schema version yet (and am not sure if it's really necessary).
Attachment #8588268 - Flags: review?(aleth)
Comment on attachment 8588268 [details] [diff] [review]
Initial patch

Review of attachment 8588268 [details] [diff] [review]:
-----------------------------------------------------------------

For the record, why did you decide not to use the gloda version update mechanism for this? (I guess that would change the datastore ID?)

::: mail/components/im/modules/index_im.js
@@ +249,5 @@
>      this._shutdownBlockerAdded = true;
>      AsyncShutdown.profileBeforeChange.addBlocker("GlodaIMIndexer cache save",
>        () => {
> +        return Task.spawn(function* () {
> +          yield this.clearBadEntriesPromise;

What does yield null do?

@@ +680,5 @@
> +  clearBadEntriesPromise: null,
> +  clearBadEntries: function() {
> +    if (this.clearBadEntriesPromise)
> +      return;
> +    this.clearBadEntriesPromise = Task.spawn(function* () {

Does this block other gloda jobs from happening while this is going on? Should it?

As this is called from enable(), did you check the datastore is fully initialized and ready to use at that point?

@@ +687,5 @@
> +      let store = GlodaDatastore;
> +      let selectStatement = store._createAsyncStatement(
> +        "SELECT id, path FROM imConversations");
> +      let deleteStatement = store._createAsyncStatement(
> +        "DELETE FROM imConversations WHERE id = ?1");

Does this also delete any references to that id? i.e. somewhere I assume there is an index of words -> id...

Maybe sshagarwal already had that in his patch?

@@ +716,5 @@
> +                  handleResult: () => {},
> +                  handleError: aError =>
> +                    Cu.reportError("Error deleting bad entry:\n" + aError),
> +                  handleCompletion: asyncCompletionHandler
> +                });

Don't you have to remove the file from the cache too, so it gets picked up for a reindex on the next log sweep?
(In reply to aleth [:aleth] from comment #5)
> Comment on attachment 8588268 [details] [diff] [review]
> Does this also delete any references to that id? i.e. somewhere I assume
> there is an index of words -> id...

As discussed on IRC, it may be easier to simply update the path rather than deleting the entry and its references.
Attached patch Patch v2 (obsolete) — Splinter Review
This patch updates the entries to use the correct path instead of deleting them.

> For the record, why did you decide not to use the gloda version update
> mechanism for this? (I guess that would change the datastore ID?)

Haven't decided yet, just haven't dug around with that. Current way of doing it was easy, so that I could focus on the main issue first.

> What does yield null do?

Works fine, there's nothing to wait for is all. An event loop spin will happen.

> @@ +680,5 @@
> > +  clearBadEntriesPromise: null,
> > +  clearBadEntries: function() {
> > +    if (this.clearBadEntriesPromise)
> > +      return;
> > +    this.clearBadEntriesPromise = Task.spawn(function* () {
> 
> Does this block other gloda jobs from happening while this is going on?
> Should it?

No, and I don't see any reason why it should.

> As this is called from enable(), did you check the datastore is fully
> initialized and ready to use at that point?

I'm finding it really difficult to prove it from code, but from my testing, GlodaDatastore is available in enable(). I followed the code through a trail of 5+ files and got tired. The datastore is initialized when gloda initialized. I think we're fine.

> @@ +687,5 @@
> > +      let store = GlodaDatastore;
> > +      let selectStatement = store._createAsyncStatement(
> > +        "SELECT id, path FROM imConversations");
> > +      let deleteStatement = store._createAsyncStatement(
> > +        "DELETE FROM imConversations WHERE id = ?1");
> 
> Does this also delete any references to that id? i.e. somewhere I assume
> there is an index of words -> id...

Yes.

> Maybe sshagarwal already had that in his patch?

Probably. Deleting a row deletes all the full-text content as well.

> @@ +716,5 @@
> > +                  handleResult: () => {},
> > +                  handleError: aError =>
> > +                    Cu.reportError("Error deleting bad entry:\n" + aError),
> > +                  handleCompletion: asyncCompletionHandler
> > +                });
> 
> Don't you have to remove the file from the cache too, so it gets picked up
> for a reindex on the next log sweep?

Not relevant after this new patch since we aren't deleting the entries anymore.
Attachment #8588268 - Attachment is obsolete: true
Attachment #8588268 - Flags: review?(aleth)
Attachment #8588333 - Flags: review?(aleth)
Attached patch Patch v2.1 (obsolete) — Splinter Review
Update a couple of comments and change the "delete" statement error message to "update"
Attachment #8588333 - Attachment is obsolete: true
Attachment #8588333 - Flags: review?(aleth)
Attachment #8588334 - Flags: review?(aleth)
Comment on attachment 8588333 [details] [diff] [review]
Patch v2

Review of attachment 8588333 [details] [diff] [review]:
-----------------------------------------------------------------

This fixes the paths for me, making those search result hits viewable.

::: mail/components/im/modules/index_im.js
@@ +249,5 @@
>      this._shutdownBlockerAdded = true;
>      AsyncShutdown.profileBeforeChange.addBlocker("GlodaIMIndexer cache save",
>        () => {
> +        return Task.spawn(function* () {
> +          yield this.clearBadEntriesPromise;

Not sure you need to wait on this - won't it simply continue on the next restart as the cache file doesn't get saved with the new version set until it completes?

@@ +682,5 @@
> +  clearBadEntries: function() {
> +    if (this.clearBadEntriesPromise)
> +      return;
> +    this.clearBadEntriesPromise = Task.spawn(function* () {
> +      this._cacheSaveTimer = 1; // Prevent a cache save from getting scheduled.

Do you actually need this? The cache file doesn't change after all, until the very end.

Also, you never clear it, so it's preventing all cache saves at the moment, including your version flag.

@@ +713,5 @@
> +              let pathComponents = OS.Path.split(row.getString(1)).components;
> +              if (pathComponents.length > 4) {
> +                updateStatement.bindInt64Parameter(1, row.getInt64(0)); // id
> +                updateStatement.bindStringParameter(0,
> +                  pathComponents.slice(-4).join("/")); // Last 4 path components

Should this use OS.File.join?
Attachment #8588333 - Attachment is obsolete: false
Attachment #8588333 - Flags: feedback+
What will it take to move this forward?
Attached patch Patch v3 (obsolete) — Splinter Review
(In reply to aleth [:aleth] from comment #9)
> Comment on attachment 8588333 [details] [diff] [review]
> Patch v2
> 
> Review of attachment 8588333 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> This fixes the paths for me, making those search result hits viewable.
> 
> ::: mail/components/im/modules/index_im.js
> @@ +249,5 @@
> >      this._shutdownBlockerAdded = true;
> >      AsyncShutdown.profileBeforeChange.addBlocker("GlodaIMIndexer cache save",
> >        () => {
> > +        return Task.spawn(function* () {
> > +          yield this.clearBadEntriesPromise;
> 
> Not sure you need to wait on this - won't it simply continue on the next
> restart as the cache file doesn't get saved with the new version set until
> it completes?

This makes sense, I've removed the yield.

> @@ +682,5 @@
> > +  clearBadEntries: function() {
> > +    if (this.clearBadEntriesPromise)
> > +      return;
> > +    this.clearBadEntriesPromise = Task.spawn(function* () {
> > +      this._cacheSaveTimer = 1; // Prevent a cache save from getting scheduled.
> 
> Do you actually need this? The cache file doesn't change after all, until
> the very end.
> 
> Also, you never clear it, so it's preventing all cache saves at the moment,
> including your version flag.

Okay, I guess it's unnecessary. I've removed it. For the record though, it wasn't preventing the cache save, since there's a direct call to saveCacheNow() after the db updates - that sets it back to null.
The version flag wasn't getting saved because I was using |this| in saveCacheNow instead of GlodaIMIndexer. That's fixed in this patch.

> @@ +713,5 @@
> > +              let pathComponents = OS.Path.split(row.getString(1)).components;
> > +              if (pathComponents.length > 4) {
> > +                updateStatement.bindInt64Parameter(1, row.getInt64(0)); // id
> > +                updateStatement.bindStringParameter(0,
> > +                  pathComponents.slice(-4).join("/")); // Last 4 path components
> 
> Should this use OS.File.join?

No, the paths in the database always use a forward slash - probably for portability reasons.

(In reply to Kent James (:rkent) from comment #10)
> What will it take to move this forward?

I'm sorry I've not been able to look at this, but hopefully it can be checked in soon.
Attachment #8588333 - Attachment is obsolete: true
Attachment #8588334 - Attachment is obsolete: true
Attachment #8588334 - Flags: review?(aleth)
Attachment #8596254 - Flags: review?(aleth)
Attached patch Patch v3.1 (obsolete) — Splinter Review
Split paths in the database on "/" instead of using OS.Path. All db paths use forward slashes.
Attachment #8596254 - Attachment is obsolete: true
Attachment #8596254 - Flags: review?(aleth)
Attachment #8596262 - Flags: review?(aleth)
Comment on attachment 8596262 [details] [diff] [review]
Patch v3.1

Review of attachment 8596262 [details] [diff] [review]:
-----------------------------------------------------------------

Looks almost ready! Just possibly some cleanup to be done.

::: mail/components/im/modules/index_im.js
@@ +257,5 @@
> +          if (!this._cacheSaveTimer)
> +            return;
> +          clearTimeout(this._cacheSaveTimer);
> +          return this._saveCacheNow();
> +        }.bind(this));

This change to the existing code makes no sense to me. Dead code?

@@ +677,5 @@
> +
> +  // Due to bug 1069845, some logs were indexed against their full paths instead
> +  // of their path relative to the logs directory. These garbage entries are
> +  // removed below.
> +  clearBadEntriesPromise: null,

Looks like this promise isn't being used any more.

@@ +678,5 @@
> +  // Due to bug 1069845, some logs were indexed against their full paths instead
> +  // of their path relative to the logs directory. These garbage entries are
> +  // removed below.
> +  clearBadEntriesPromise: null,
> +  clearBadEntries: function() {

Can we give this a more specific name for future clarity (e.g fixEntriesWithAbsolutePaths or something like that)

@@ +681,5 @@
> +  clearBadEntriesPromise: null,
> +  clearBadEntries: function() {
> +    if (this.clearBadEntriesPromise)
> +      return;
> +    this.clearBadEntriesPromise = Task.spawn(function* () {

If there is nothing waiting on the result of this Task, and it isn't blocking anything else, you can probably simplify this by removing the Task.

@@ +707,5 @@
> +            let row;
> +            while ((row = aResultSet.getNextRow())) {
> +              // If the path has more than 4 components, it is not relative to
> +              // the logs folder. Update it to use only the last 4 components.
> +              let pathComponents = row.getString(1).split("/");

Can you double check that the *incorrect* path entries we are trying to find used "/"-separated paths on all OS? I know that's the intended behaviour, but was it also true for the buggy behaviour? (where did the full paths come from?)

@@ +731,5 @@
> +        });
> +      });
> +
> +      this.cacheVersion = 1;
> +      yield this._saveCacheNow();

I think you can just call this._scheduleCacheSave, no need to yield. Or was there a particular reason for this?
Attachment #8596262 - Flags: review?(aleth) → review-
Attached patch Patch v4 (obsolete) — Splinter Review
I've addressed your review comments, but ran out of battery before I could re-test the patch. It should be fine though.

> If there is nothing waiting on the result of this Task, and it isn't
> blocking anything else, you can probably simplify this by removing the Task.

Considering that we need to wait till all the async SQL queries complete to do a _scheduleCacheSave(), it's probably cleanest to leave it as a task. I've simplified it to use Task.async() though.

> Can you double check that the *incorrect* path entries we are trying to find
> used "/"-separated paths on all OS? I know that's the intended behaviour,
> but was it also true for the buggy behaviour? (where did the full paths come
> from?)

Double checked, and it turns out the incorrect paths used OS-specific separators. For this patch I've made the assumption that the profile dir was not ported to a different OS - which means OS.Path.split() will work. I don't think this assumption is unfair, please let me know your thoughts.
Attachment #8596262 - Attachment is obsolete: true
Attachment #8600631 - Flags: review?(aleth)
Comment on attachment 8600631 [details] [diff] [review]
Patch v4

Review of attachment 8600631 [details] [diff] [review]:
-----------------------------------------------------------------

::: mail/components/im/modules/index_im.js
@@ +681,5 @@
> +
> +    // Number of async statements pending at any given time.
> +    let asyncStatementCount = 0;
> +
> +    yield new Promise((resolve, reject) => {

Fwiw, when I suggested getting rid of the Task, what I was thinking of was putting the cache save stuff into asyncCompletionHandler instead of the resolve() call (and do without the promise altogether) if that makes more sense.
Comment on attachment 8600631 [details] [diff] [review]
Patch v4

Review of attachment 8600631 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for getting this done!

asuth: I'd be very grateful if you could give this a quick look too as I'm not very familiar with the gloda datastore.
Attachment #8600631 - Flags: review?(bugmail)
Attachment #8600631 - Flags: review?(aleth)
Attachment #8600631 - Flags: review+
Please update the commit message and the bug title to better reflect what this patch actually does.
Comment on attachment 8600631 [details] [diff] [review]
Patch v4

This looks good in general, but you're going to want to issue a call to "store._beginTransaction()" before you issue any calls to updateStatement.excuteAsync(...) and a call to "store._commitTransaction()" after you've issued all the executeAsync calls.  (You do not need to wait for the executeAsync calls to complete.)

This is needed because since you aren't doing the fix-up as an indexer task, you aren't having your database manipulations wrapped by gloda's existing transaction management logic.  This matters because if your N executeAsync calls are issued outside a transaction, you are going to cause N fsync() calls to happen, which is going to be a nightmare from a performance perspective.  (Of course, since you aren't operating within the indexing task framework, there's a good chance you might overlap the indexing sweep mechanism or something and luck out.)

Note that I don't really have a strong feeling about this not being a strictly managed indexing task.  If gloda was being actively developed and used by other things, I'd probably care (or whoever was developing could care or not care as they chose :), but based on the current gloda code, this is harmless/fine and definitely an improvement over my earlier suggestion of doing this synchronously at startup as an update and messing with the gloda schema versions :)
Attachment #8600631 - Flags: review?(bugmail) → review+
Attached patch Patch v5 (obsolete) — Splinter Review
asuth: Thanks a lot for your input! This patch adds _begin/_commitTransaction().
aleth: I'm requesting review again for a sanity check of the change above, and the change below:

(In reply to aleth [:aleth] from comment #15)
> Comment on attachment 8600631 [details] [diff] [review]
> Patch v4
> 
> Review of attachment 8600631 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: mail/components/im/modules/index_im.js
> @@ +681,5 @@
> > +
> > +    // Number of async statements pending at any given time.
> > +    let asyncStatementCount = 0;
> > +
> > +    yield new Promise((resolve, reject) => {
> 
> Fwiw, when I suggested getting rid of the Task, what I was thinking of was
> putting the cache save stuff into asyncCompletionHandler instead of the
> resolve() call (and do without the promise altogether) if that makes more
> sense.

Good idea, thanks.
Attachment #8600631 - Attachment is obsolete: true
Attachment #8600673 - Flags: review?(aleth)
Summary: Re-index chat log files since bug 955292 → Update chat log entries added to Gloda since bug 955292 to use relative paths
Attached patch Patch v5.01 (obsolete) — Splinter Review
Update commit message
Attachment #8600673 - Attachment is obsolete: true
Attachment #8600673 - Flags: review?(aleth)
Attachment #8600676 - Flags: review?(aleth)
Comment on attachment 8600676 [details] [diff] [review]
Patch v5.01

Review of attachment 8600676 [details] [diff] [review]:
-----------------------------------------------------------------

::: mail/components/im/modules/index_im.js
@@ +692,5 @@
> +      }
> +    };
> +
> +    asyncStatementCount++;
> +    store._beginTransaction();

Shouldn't you move this inside the handleResult of the select statement? Not sure the select statement can be part of the same transaction.
Comment on attachment 8600676 [details] [diff] [review]
Patch v5.01

Review of attachment 8600676 [details] [diff] [review]:
-----------------------------------------------------------------

I'm a bit concerned about a single transaction for all the statements which is only committed when SELECT completes, because gloda allows transactions to be nested, but rolls them all back if the outermost transaction doesn't complete [1]. All this is async, there's a lot going on after startup and the SELECT might not be that fast for people with large databases, so if I understand this correctly I'm not confident about this approach as it may catch unrelated stuff if e.g. TB shuts down early.

Looking at a test run on my dev profile, it appears that the result sets from SELECT are either empty or contain O(10) entries, likely as the entries with incorrect paths are grouped together in time.

So if you agree, I think it's probably a better tradeoff to have a single transaction per SELECT result set.

[1] https://dxr.mozilla.org/comm-central/source/mailnews/db/gloda/modules/datastore.js#1726

::: mail/components/im/modules/index_im.js
@@ +235,5 @@
>  });
>  
>  var GlodaIMIndexer = {
>    name: "index_im",
> +  cacheVersion: null,

Should this be 1 so we don't run the path fixer on new profiles.
Flags: needinfo?(bugmail)
Ah, so, for transaction correctness in the face of shutdown before this fully runs to completion, yes, what you want is to do do the following in handleCompletion:

  store.runPostCommit(asyncCompletionHandler);
  store._commitTransaction();

instead of the current:

  store._commitTransaction();
  asyncCompletionHandler();


This ensures that you don't erroneously update your version until we're sure the transaction has durably hit disk.  In the case of early shutdowns, we will err on the side of thinking things have not completed when they have in fact completed.  However, this process is idempotent, so things turn out okay.


In terms of row set results, etc., the constants that control the executeAsync result batching are:
#define MAX_MILLISECONDS_BETWEEN_RESULTS 75
#define MAX_ROWS_PER_RESULT 15

Practically speaking, the query in question is going to run very quickly; gloda uses a 32k page-size which means that hundreds of results will be available in-memory in one go for every page access and it's just that batching occurring.  Adding more transactions is just more likely to add I/O pauses due to fsync which can impact Thunderbird overall.  So I would suggest leaving the transactions as they are, except for the commit callback.
Flags: needinfo?(bugmail)
(In reply to Andrew Sutherland [:asuth] from comment #23)
> Ah, so, for transaction correctness in the face of shutdown before this
> fully runs to completion, yes, what you want is to do do the following in
> handleCompletion:
> 
>   store.runPostCommit(asyncCompletionHandler);
>   store._commitTransaction();

Great, this should fix that problem.

> In terms of row set results, etc., the constants that control the
> executeAsync result batching are:
> #define MAX_MILLISECONDS_BETWEEN_RESULTS 75
> #define MAX_ROWS_PER_RESULT 15
> 
> Practically speaking, the query in question is going to run very quickly;
> gloda uses a 32k page-size which means that hundreds of results will be
> available in-memory in one go for every page access and it's just that
> batching occurring.  Adding more transactions is just more likely to add I/O
> pauses due to fsync which can impact Thunderbird overall.  So I would
> suggest leaving the transactions as they are, except for the commit callback.

Thanks for the explanation!
(In reply to Andrew Sutherland [:asuth] from comment #23)
> Ah, so, for transaction correctness in the face of shutdown before this
> fully runs to completion, yes, what you want is to do do the following in
> handleCompletion:
> 
>   store.runPostCommit(asyncCompletionHandler);
>   store._commitTransaction();
> 
> instead of the current:
> 
>   store._commitTransaction();
>   asyncCompletionHandler();
> 
> 
> This ensures that you don't erroneously update your version until we're sure
> the transaction has durably hit disk.  In the case of early shutdowns, we
> will err on the side of thinking things have not completed when they have in
> fact completed.  However, this process is idempotent, so things turn out
> okay.

This is much better than the current approach that keeps a count of the pending async calls. Thanks! I'll upload a new patch soon.
Attached patch Patch v6 (obsolete) — Splinter Review
Use runPostCommit() to simplify things.
Attachment #8600676 - Attachment is obsolete: true
Attachment #8600676 - Flags: review?(aleth)
Attachment #8600925 - Flags: review?(aleth)
Comment on attachment 8600925 [details] [diff] [review]
Patch v6

Review of attachment 8600925 [details] [diff] [review]:
-----------------------------------------------------------------

::: mail/components/im/modules/index_im.js
@@ +694,5 @@
> +          if (pathComponents.length > 4) {
> +            updateStatement.bindInt64Parameter(1, row.getInt64(0)); // id
> +            updateStatement.bindStringParameter(0,
> +              pathComponents.slice(-4).join("/")); // Last 4 path components
> +            updateStatement.executeAsync({

Doesn't look like those handle methods are optional?
http://mxr.mozilla.org/mozilla-central/source/storage/public/mozIStorageStatementCallback.idl
Attached patch Patch v6.1Splinter Review
Default cacheVersion to 1, add back the handleCompletion and handleResult callbacks.
Attachment #8600925 - Attachment is obsolete: true
Attachment #8600925 - Flags: review?(aleth)
Attachment #8600971 - Flags: review?(aleth)
Comment on attachment 8600971 [details] [diff] [review]
Patch v6.1

Review of attachment 8600971 [details] [diff] [review]:
-----------------------------------------------------------------

Great, thanks!
Attachment #8600971 - Flags: review?(aleth) → review+
Attachment #8600971 - Flags: approval-comm-beta?
Attachment #8600971 - Flags: approval-comm-aurora?
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 40.0
You need to log in before you can comment on or make changes to this bug.