Don't resume/recover session when crashing after quit-application-granted

ASSIGNED
Assigned to

Status

()

Firefox
Session Restore
ASSIGNED
2 years ago
3 months ago

People

(Reporter: Kestrel, Assigned: Kestrel)

Tracking

(Blocks: 1 bug)

Trunk
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment)

(Assignee)

Description

2 years ago
It doesn't make sense to resume/recover a session when there was a crash in the shutdown procedure after the user has quit and before the final session write. There can be a significant delay between the checkpoints quit-application-granted and sessionstore-final-state-write-complete (11 seconds in some cases) which may be interrupted and registered as a crash. The shutdown phase is particularly susceptible to crashes including the more recent Bug 1260461 and there's no need for it to interfere with session restoration.

From the user's perspective, once they have quit the session they don't want to see it again unless they have specifically configured the browser to resume it on startup. They don't care if it crashed when it last shutdown, they don't want to see about:sessionrestore's "This is embarrassing" message which is confusing in this context and they really don't want to accidentally click the wrong button and open up a compromising session or lose their session and not know how to get it back. Sessions that always resume can get very big and consequently take longer to shutdown, increasing the likelihood of being interrupted with a greater risk of data loss if the user doesn't restore it properly, so the fewer unnecessary about:sessionrestore's the better.

The final session write only saves the last few seconds of browsing data so in the case of a crash there is minimal data loss. It is not worth triggering about:sessionrestore over which doesn't help to indicate data loss or get it back anyway.

Looking at the past session crash detection logic before Crash Monitor (Bug 887780) it seems a crash was not detected if the session state was in the process of quitting.
(Assignee)

Updated

2 years ago
Depends on: 887780
(Assignee)

Comment 1

2 years ago
Created attachment 8742723 [details] [diff] [review]
bug1265654.patch

This patch changes the checkpoint used for session crash detection from sessionstore-final-state-write-complete to quit-application-granted so that post-quit crashes do not resume/recover the session.

Telemetry SHUTDOWN_OK is unchanged, still using the sessionstore-final-state-write-complete checkpoint.

(I will need help with a try push)
Attachment #8742723 - Flags: review?(mconley)
Comment on attachment 8742723 [details] [diff] [review]
bug1265654.patch

I'm very hesitant to do this, especially since I'm only partially familiar with the intricacies of SessionStore shutdown.

ttaubert, do you have a second to give some thoughts on this proposal?
Attachment #8742723 - Flags: feedback?(ttaubert)
Comment on attachment 8742723 [details] [diff] [review]
bug1265654.patch

Review of attachment 8742723 [details] [diff] [review]:
-----------------------------------------------------------------

I think this might work, but in general I'm not super excited about changing something rather fundamental to the shutdown process at a point where sessionstore isn't properly maintained, or at least has someone able to dedicate enough time to it. Another problem is that we have basically no startup/shutdown/restart tests for sessionstore so any failures here are very likely not covered by any of our test suites.

::: browser/components/sessionstore/nsSessionStartup.js
@@ +178,3 @@
>          // If the previous session finished writing the final state, we'll
> +        // assume shutdown was successful.
> +        shutdownSuccess = Boolean(checkpoints["sessionstore-final-state-write-complete"]);

nit: we don't use Boolean() anywhere, !! would be better.

@@ -173,3 @@
>          // If the previous session finished writing the final state, we'll
> -        // assume there was no crash.
> -        this._previousSessionCrashed = !checkpoints["sessionstore-final-state-write-complete"];

The "sessionstore-final-state-write-complete" notification was introduced for that single purpose, should remove it too if we take that patch.

@@ +212,5 @@
>  
>        // Report shutdown success via telemetry. Shortcoming here are
>        // being-killed-by-OS-shutdown-logic, shutdown freezing after
>        // session restore was written, etc.
> +      Services.telemetry.getHistogramById("SHUTDOWN_OK").add(shutdownSuccess);

This should probably not change. It's sessionstore telemetry anyway and if the definition of a "clean shutdown" changes then why not change it too?
Attachment #8742723 - Flags: feedback?(ttaubert)
Thinking about it, I guess this is a matter of informing the user about dataloss or not.

Like, if we get into a situation where the browser crashes or doesn't exit properly before hitting sessionstore-final-state-write-complete, I think we can safely assume that any updated session state since the last write has not been written, and anything in there has been lost.

Currently, what we do is show the user about:sessionrestore, and give them the ability to restore the last snapshot of the session that we had.

What this patch seems to do is to cause us to _ignore_ that data loss, and just use the last session write that we have available to us.

The only advantage that about:sessionrestore gives us, beyond alerting the user about the failed final write, is the ability to selectively restore some tabs or windows.

I'm not sure what to do here - and ttaubert is right, SessionStore is kind of un-owned at the moment. I'm going to rope in Yoric for his opinion too.

If Yoric can help us come to a conclusion, perhaps I can escalate and see how UX feels about the dataloss handling experience here.
Flags: needinfo?(dteller)
(Assignee)

Comment 5

2 years ago
(In reply to Tim Taubert [:ttaubert] from comment #3)
> I think this might work, but in general I'm not super excited about changing
> something rather fundamental to the shutdown process at a point where
> sessionstore isn't properly maintained, or at least has someone able to
> dedicate enough time to it. Another problem is that we have basically no
> startup/shutdown/restart tests for sessionstore so any failures here are
> very likely not covered by any of our test suites.

I appreciate your concerns but it sounds like another reason to make this change to shield users from problems with sessionstore shutdown due to any maintenance issues or lack of tests.

> The "sessionstore-final-state-write-complete" notification was introduced
> for that single purpose, should remove it too if we take that patch.

Will be removed in Part 2 if telemetry doesn't use it.

> This should probably not change. It's sessionstore telemetry anyway and if
> the definition of a "clean shutdown" changes then why not change it too?

I kept telemetry the same as I thought there was considerable value in knowing when a session really does shutdown properly, past bug discussion seemed to imply there was and without it bugs like Bug 1260461 would be less noticeable. The name "SHUTDOWN_OK" does not signify it is limited to session shutdown so using a later checkpoint in the shutdown procedure is more reflective of its name. It also has some statistical history that would be adversely affected by any change.

(In reply to Mike Conley (:mconley) - Needinfo me! from comment #4)
> What this patch seems to do is to cause us to _ignore_ that data loss, and
> just use the last session write that we have available to us.

Data loss is already ignored (ie about:sessionrestore is not shown) when resuming a crashed session that crashed once within the last six hours. Data loss is also minimal according to Bug 914581, comment 1:
> Note that data will not be corrupted.
> In the worst case scenario I can imagine, we will lose ~15 seconds of
> browsing.

The about:sessionrestore page does not inform the user about data loss, instead being more concerned with troubleshooting startup problems. It is an unnecessary obstacle and leaves a bad impression, especially when it appears every startup due to a consistent shutdown crash. 

Sessions crashing after quit and being resumed against the wishes of the user fails the Principle of Least Surprise.

When the original choice of checkpoint was considered, the consequences for session restore did not appear to be covered:

Bug 887780, Comment 4
> Remind me, which crash information exactly do we need for this feature? Does
> it need to be profile-before-change or would e.g. quit-application be
> sufficient?

Bug 887780, Comment 5
> Session store is using AsyncShutdown to add its final write as a blocker to
> profile-before-change, so we need to know that profile-before-change fully
> completed, not just started. quit-application would not be sufficient.

This logic makes sense for more thorough crash detection and is consistent with past behavior but overlooks issues with it being used to decide when to recover a session.
> I'm not sure what to do here - and ttaubert is right, SessionStore is kind of un-owned at the moment. I'm going to rope in Yoric for his opinion too.
>
> If Yoric can help us come to a conclusion, perhaps I can escalate and see how UX feels about the dataloss handling experience here.

Sorry about the delay, I'm just back from PTO.

I also agree with ttaubert that we need someone in charge of Session Restore before we make deep changes – and that we need have startup/shutdown tests. Benjamin, do you know if it would be feasible to allocate someone to this? Session Restore is safety-critical (as well as performance-critical) but basically unmanned.

Now, for this specific problem. If I recall correctly, the only difference between the final Session Restore write and the regular writes is that the final one makes more efforts to erase private data. In terms of experience, in almost all cases, there should be no dataloss (or at least no more than 15 seconds data loss) due to a shutdown crash. So I would be entirely satisfied with a flow that replaces about:sessionrestore, with a non-intrusive indication that Firefox has recovered from a crash.
Flags: needinfo?(dteller) → needinfo?(benjamin)

Comment 7

2 years ago
I am looking into who owns this long-term, but there isn't an immediately-clear answer. Expect more info soon.

Comment 8

2 years ago
Mike de Boer has stepped up to learn/take long-term ownership of session restore. He'll be learning from ttaubert and mconley along the way, and I suspect he won't be able to make a firm decision on something of this magnitude until a bit later. Redirecting NEEDINFO to him.
Flags: needinfo?(benjamin) → needinfo?(mdeboer)
Comment on attachment 8742723 [details] [diff] [review]
bug1265654.patch

Okay, I'm going to clear my review request here until we hear back from mikedeboer's take on how we want to handle this.
Attachment #8742723 - Flags: review?(mconley)
I'm agreeing with Yoric here and Tim's review comments to the patch still apply. I'm happy to accept a patch that makes this work IF there's a unit test attached like the one Mike Conley wrote in bug 1228120: http://searchfox.org/mozilla-central/source/testing/firefox-ui/tests/functional/sessionstore/test_restore_windows_after_restart.py
Blocks: 1330633
Status: UNCONFIRMED → NEW
Ever confirmed: true
Flags: needinfo?(mdeboer)
Assignee: nobody → kestrel
Status: NEW → ASSIGNED
Flags: needinfo?(kestrel)
Flags: needinfo?(kestrel)
You need to log in before you can comment on or make changes to this bug.