Closed Bug 894793 Opened 11 years ago Closed 10 years ago

Many open message tabs are gone after a crash or restart of Thunderbird

Categories

(Thunderbird :: General, defect)

17 Branch
x86
Windows 7
defect
Not set
critical

Tracking

(Not tracked)

RESOLVED FIXED
Thunderbird 30.0

People

(Reporter: sdave, Assigned: alta88)

References

Details

(Keywords: dataloss, Whiteboard: [notacrash])

Attachments

(2 files, 6 obsolete files)

User Agent: Mozilla/5.0 (Windows NT 6.1; rv:22.0) Gecko/20100101 Firefox/22.0 (Beta/Release)
Build ID: 20130618035212

Steps to reproduce:

I open email messages in new tabs, but too often (once every couple of weeks or so) most (but not all) open message tabs (I generally have at least 40) are gone, simply gone(!), after a crash or restart of Thunderbird.  And I can't remember which messages I had left open for future response or action!    

Sometimes this has been preceded by erroneous message contents appearing in tabs with correct names.  That was good, though, because it tipped me off to write down the list of open tabs before a crash occurs. 

Am using latest release TBird 17 under Windows 7 Pro 32 Bit Service Pack 1.

Compacting folders seems to raise the risk of this problem happening.  (So I don't manually compact folders anymore when messages are open, but the problem still happens.)  (Now I'm going to turn automatic compacting off for all my [10 or so] email accounts as soon as I finish posting this bug.  Maybe that will help prevent the problem.)


Actual results:

Once every couple of weeks or so, most (but not all) open message tabs (I generally have at least 40) are gone, simply gone(!), after a crash or restart of Thunderbird.  And I can't remember which messages I had left open for future response or action! 




Expected results:

All open (email) message tabs (and windows, for that matter) should be restored upon a restart of Thunderbird.  

To help guard against the "I can't remember which messages I had open" consequence of the problem, TBird should have a function that automatically saves a list of the subjects/senders/(& other fields?) of all open messages, or lets the user manually save such a list.  That would allow users to find the messages that they were keeping open for future response/action.  (It would be super helpful even for messages that were manually closed.  And for those, Thunderbird should get something like Firefox's "Recently Closed Tabs" list).

Also, TBird should automatically tag all open messages with a tag called "open", and remove the tag when a message is manually closed.  That way, messages closed by crashes or during restarts would still have the "open" tag, and users could find them by searching for the tag.  Or Thunderbird could perform that search and re-open the messages automatically.
Keywords: crash
I am having the same issue.  I get the failed restore sometimes, but not all of the time.  At first I thought it was dependent on how many tabs I had open; this does not seem to be the case however.

I, too, organize my workflow around my open message tabs and need this fixed if at all possible.  I like the idea of adding an "open" tag to any open message that can then be hooked to an improved functionality.  

Also, it seems that in the past, session-savers/tabsaver plug ins had been available, but not for 17x.
(In reply to threemoons from comment #1)
> I am having the same issue.  

is this only after a crash? 
and are the only tabs lost, those that were opened within 5 minutes of the crash?

> I get the failed restore sometimes, but not all
> of the time.  At first I thought it was dependent on how many tabs I had
> open; this does not seem to be the case however.

can you be more specific. the current design is thunderbird t save state on 5 minute intervals, and normal shutdown.  ref: bug 408338 comment 18, comment 19


> I, too, organize my workflow around my open message tabs and need this fixed
> if at all possible.  I like the idea of adding an "open" tag to any open
> message that can then be hooked to an improved functionality.  

perhaps an addon is interfering? 
 
> Also, it seems that in the past, session-savers/tabsaver plug ins had been
> available, but not for 17x.

all thunderbird addon of this sort have been superseded by what is currently in the product.
Blocks: 408338
Severity: normal → critical
Status: UNCONFIRMED → NEW
Ever confirmed: true
Flags: needinfo?(threemoons)
Flags: needinfo?(sdave)
Keywords: crashdataloss
Whiteboard: [notacrash]
(I did not intend to confirm - but I'll leave it there for now to avoid churn)
Wayne,

It happens even when I close the program normally; when I restart TB some of the previously-opened tabs just are not there.  

I am running it vanilla, no plug-ins.
Flags: needinfo?(threemoons)
This is easy to reproduce in a crash scenario.  Get Nightly Tester Tools extension, create some tabs, restart, then Crash Me before 5 minutes are up.  The reason is bizarre review passing code that deletes the session.json file after it's read.  There may be other invisible fails, due to bad json parse/stringify.

This patch removes the delete, catches/reports errors, saves a failed session.json file, uses async file writes.  I also think the interval can be safely decreased (the cutesy comment is low value).
Attached patch session.patch (obsolete) — Splinter Review
passes mozmill tests.
Assignee: nobody → alta88
Attachment #8376730 - Flags: review?(mconley)
m_kato, i just noticed your Bug 810172.  it seems we're doing similar things, perhaps they can be combined/resolved.  (removing the delete is important imo).

the test failures with async writes are solved as per the patch here..
Flags: needinfo?(m_kato)
Attached patch session.patch (obsolete) — Splinter Review
this patch combines the utf8 fix in m_kato's patch, and makes sure |this| isn't mistaken on the writeAtomic promise.then result.
Attachment #8376730 - Attachment is obsolete: true
Attachment #8376730 - Flags: review?(mconley)
Attachment #8376751 - Flags: review?(mconley)
Comment on attachment 8376751 [details] [diff] [review]
session.patch


shopping for smaller review queue.  ;)
Attachment #8376751 - Flags: review?(mconley) → review?(irving)
Comment on attachment 8376751 [details] [diff] [review]
session.patch

Review of attachment 8376751 [details] [diff] [review]:
-----------------------------------------------------------------

We're using async saving of state in a bunch of places in mozilla-central now, and we've found that failures caused by the application shutting down during async writes happened often enough that we needed to add a way of holding shutdown until the write completed.

See https://developer.mozilla.org/en-US/docs/Mozilla/JavaScript_code_modules/AsyncShutdown.jsm, and use AsyncShutdown.profileBeforeChange.addBlocker() to make sure that (a) we trigger a write at shutdown time if the state has changed and (b) the most recent OS.File.writeAtomic promise resolves before the shutdown process goes past profileBeforeChange. Ping me on IRC if you have any questions.

::: mail/base/modules/sessionStoreManager.js
@@ +119,5 @@
> +      // bad file.
> +      let errorFile = "session_error_" +
> +                      (new Date().toISOString()).replace(/\D/g, "") + ".json";
> +      let errorFilePath = OS.Path.join(OS.Constants.Path.profileDir, errorFile);
> +      OS.File.move(this.sessionFile.path, errorFilePath);

This returns a promise and may fail, so:

OS.File.move(...).then(null, error => CU.reportError("Failed to rename " + this.sessionFile.path + " to " + errorFilePath + ": " + error));

ES6 arrow-function syntax tastes great with Promise.then(), but you can use function(error){...} if you prefer.

@@ +153,5 @@
>      if (data == this._currentStateString)
>        return;
>  
> +    OS.File.writeAtomic(this.sessionFile.path,
> +                        TextEncoder("utf-8").encode(data),

The OS.File documentation on MDN doesn't say this, but it encodes string arguments to UTF-8 by default; you don't need to do it in your code. See http://dxr.mozilla.org/mozilla-central/source/toolkit/components/osfile/modules/osfile_shared_front.jsm#390

/me wanders off to update MDN...

::: mail/test/mozmill/session-store/test-session-store.js
@@ +52,5 @@
>    return null;
>  }
>  
>  function waitForFileRefresh() {
> +  controller.sleep(acyncFileWriteDelay);

Define a smaller value for CONST asyncFileWriteDelay and then use sessionStoreManager._sessionAutoSaveTimerIntervalMS + asyncWriteDelay here.

I'm not a big fan of sleeping in tests like this, because it often leads to intermittent failures, but rewriting the code & tests to have the right hooks so that we're not timing dependent is likely to be far too big a job (see for example http://dxr.mozilla.org/mozilla-central/source/toolkit/mozapps/extensions/test/xpcshell/test_DeferredSave.js where I mock out OS.File)
Attachment #8376751 - Flags: review?(irving) → review-
Status: NEW → ASSIGNED
Component: Untriaged → General
Attached patch session.patch (obsolete) — Splinter Review
updated.  tests work consistently with asyncFileWriteDelayMS of 300.
Attachment #8376751 - Attachment is obsolete: true
Attachment #8383113 - Flags: review?(irving)
Attached patch session.patch (obsolete) — Splinter Review
Attachment #8383113 - Attachment is obsolete: true
Attachment #8383113 - Flags: review?(irving)
Attachment #8383116 - Flags: review?(irving)
Comment on attachment 8383116 [details] [diff] [review]
session.patch

Review of attachment 8383116 [details] [diff] [review]:
-----------------------------------------------------------------

This looks great over all. A couple of white space nits to clean up, declare _promise (sorry for not catching that in my first review), and it will be ready to land.

::: mail/base/modules/sessionStoreManager.js
@@ +121,5 @@
> +      let errorFile = "session_error_" +
> +                      (new Date().toISOString()).replace(/\D/g, "") + ".json";
> +      let errorFilePath = OS.Path.join(OS.Constants.Path.profileDir, errorFile);
> +      OS.File.move(this.sessionFile.path, errorFilePath)
> +             .then(null, error => CU.reportError("sessionStoreManager: failed to rename " +

Cu.reportError (lower case 'u')

@@ +122,5 @@
> +                      (new Date().toISOString()).replace(/\D/g, "") + ".json";
> +      let errorFilePath = OS.Path.join(OS.Constants.Path.profileDir, errorFile);
> +      OS.File.move(this.sessionFile.path, errorFilePath)
> +             .then(null, error => CU.reportError("sessionStoreManager: failed to rename " +
> +                                                 this.sessionFile.path + " to " + 

trailing white space

@@ +159,5 @@
>  
> +    this._promise = OS.File.writeAtomic(this.sessionFile.path, data,
> +                                        { tmpPath: this.sessionFile.path + ".tmp",
> +                                          flush: true })
> +                           .then(() => sessionStoreManager._currentStateString = data,

It's hard to decide exactly what indentation level makes semantic sense with promise.then() calls, but indenting them this far doesn't leave much room for accept/reject actions without lots of ugly line wrapping. I'd prefer to just indent .then( two spaces from the line below, like:

this._promise = whatever(parameters)
  .then(() => success path;
        () => failure path);

@@ +322,5 @@
>  };
> +
> +AsyncShutdown.profileBeforeChange.addBlocker(
> +  "sessionStoreManager: session.json",
> +  () => { return sessionStoreManager._promise; }

Document and initialize _promise to null in sessionStoreManager{}
Attachment #8383116 - Flags: review?(irving) → review-
Attached patch session.patch (obsolete) — Splinter Review
Attachment #8383116 - Attachment is obsolete: true
Attachment #8384804 - Flags: review?(irving)
Attached patch session.patchSplinter Review
Attachment #8384804 - Attachment is obsolete: true
Attachment #8384804 - Flags: review?(irving)
Attachment #8384828 - Flags: review?(irving)
Attachment #8384828 - Flags: review?(irving) → review+
Flags: needinfo?(sdave)
Flags: needinfo?(m_kato)
Keywords: checkin-needed
Blocks: 810172
https://hg.mozilla.org/comm-central/rev/b0b81ad1198d
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Flags: in-testsuite+
Keywords: checkin-needed
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 30.0
Attached patch session-test.patch (obsolete) — Splinter Review
some tests on some platforms appear to need a larger delay.
Attachment #8386173 - Flags: review+
Keywords: checkin-needed
https://hg.mozilla.org/comm-central/rev/2238918d2c5c

FWIW, timeouts and waits are inherently unreliable in automation. IMO, you should try to find a better way to test this. Is there an event you can listen for instead?
Keywords: checkin-needed
sure, async is a bear and these tests were inherently written not for the real life scenario, but to make them go fast in a sync situation.

see comment 10.  it may be necessary to redo the tests for async, to not be intermittent fails.  but the core fix should remain imo.
@irving, perhaps you can advise: the must haves of this bug are 1)session.json non data loss, 2)utf8.  async write is merely a nice to have, but if it becomes a blocker due to tests, then what is the right resolution?
tests adjusted to wait for the several places the 3pane is closed and writes the session file.

@ryanvm, is it possible to do this on try?  or with a dontbuild?
Attachment #8386173 - Attachment is obsolete: true
Attachment #8386272 - Flags: review+
I think for now we should backout and you can iterate on Try.
https://hg.mozilla.org/comm-central/rev/1cdabf9d6479
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: Thunderbird 30.0 → ---
what is the mechanism for requesting an all platform mozmill try run?
I've pushed your two patches to try: https://tbpl.mozilla.org/?tree=Thunderbird-Try&rev=e81d218b1872

If you do more work on Thunderbird or Mozilla products in general in the future, you may want to get a LDAP (try) account: http://www.mozilla.org/hacking/commit-access-policy/
mcsmurf, thanks for the try push.

alta88, here's a possible approach for making the tests deterministic. It's a little tricky, because we need to take into account both the delay timer and the actual async operations (both OS.File.writeAtomic and OS.File.

The basic strategy is to have sessionStoreManager.js have a single accessor that will always return a promise. If there are no async operations in progress (either none have started or all have completed), that promise will already be resolved. If an OS.File.move() or OS.File.writeAtomic() are in progress, the promise we return should resolve when that operation completes.

The test can use Components.utils.import to load sessionStoreManager.js and then call an accessor to get access to the promise; the test can then wait for the promise rather than using a timer.

If that explanation makes sense to you, and you want to put together a patch for it, go ahead; find me in irc #maildev if you have any questions.
@mcsmurf - thanks for the try push.  it looks like the wait for a window close does the trick.

@irving - yes i see how that can work, i guess the 'wait for the promise' is a yield?  anyway, it's a bit of work, and i'm rather closing out my Tb work.

requesting re-land.
Keywords: checkin-needed
https://hg.mozilla.org/comm-central/rev/5f23be6ef89c
https://hg.mozilla.org/comm-central/rev/367c80034370
Status: REOPENED → RESOLVED
Closed: 10 years ago10 years ago
Keywords: checkin-needed
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 30.0
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: