Closed Bug 1696397 Opened 3 years ago Closed 3 years ago

Avoid shutdown hangs caused by Prompter.jsm

Categories

(Toolkit :: General, defect)

defect

Tracking

()

RESOLVED FIXED
88 Branch
Tracking Status
firefox88 --- fixed

People

(Reporter: jstutte, Assigned: jstutte)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

It seems that modal dialogs can block the shutdown progress.

And indeed openPromptSync spins the event loop without further condition.

We should investigate different options here.

Blocks: 1678330

It looks like this promise isn't being resolved or rejected during the quit-application shutdown phase, meaning that the application hangs when trying to quit: https://searchfox.org/mozilla-central/rev/002023eb262be9db3479142355e1675645d52d52/toolkit/components/prompts/src/Prompter.jsm#1054

I'm not entirely certain which promise step in the openPrompt code is the one which is hanging, as it's a bit difficult to tell right now, but I don't see any code which makes a conscious effort to ensure that if quit-application is fired, but I'm guessing we're hanging here (https://searchfox.org/mozilla-central/rev/002023eb262be9db3479142355e1675645d52d52/toolkit/components/prompts/src/Prompter.jsm#1174-1177) due to being in the parent process, but we could theoretically also be here: https://searchfox.org/mozilla-central/rev/002023eb262be9db3479142355e1675645d52d52/browser/base/content/browser.js#9379, waiting for the window modal dialog to be closed.

An "easy" option here might be to leave the nested event loop early with an error when we enter the quit-application phase, and perhaps put the entire Prompter.jsm into a new state where it can't open new prompts anymore due to being late enough we don't want to be showing the user more prompts, so we don't open a new prompt after this point.

ni? :pbz because I remember you recently touching this code.

Flags: needinfo?(pbz)

Thanks!
If openPrompt is hanging and doesn't resolve for some reason, I'd still expect spinEventLoopUntilOrShutdown to handle the shutdown and return?
Even for the case where a prompt is spawned in the shutdown phase it should exit the loop: https://searchfox.org/mozilla-central/rev/002023eb262be9db3479142355e1675645d52d52/xpcom/threads/nsThreadManager.cpp#762-767
Perhaps I'm misunderstanding spinEventLoopUntilOrShutdown here?

EDIT: Ok I've just seen https://bugzilla.mozilla.org/show_bug.cgi?id=1678330#c12 , so we can't rely on that method to always return on shutdown? In that case we should either try to fix that or add a comment to warn about this unexpected behavior.

In Nightly I can reproduce the shutdown hang if I close the browser window via the window close button, exit out of the resulting tab close warning dialog via esc and hit the close button again in a very short timeframe.

Flags: needinfo?(pbz) → needinfo?(nika)

(In reply to Paul Zühlcke [:pbz] from comment #2)

EDIT: Ok I've just seen https://bugzilla.mozilla.org/show_bug.cgi?id=1678330#c12 , so we can't rely on that method to always return on shutdown? In that case we should either try to fix that or add a comment to warn about this unexpected behavior.

Yeah, the method is unfortunately not reliable for that kind of thing because it won't cause the loop to exit when quit-application has been notified, spinEventLoopUntilOrShutdown only stops spinning at xpcom-shutdown, which is much later.

It looks like the method is only used in 3 places:

All of these places (perhaps with the exception of the browser-custom-element.js?) seem like they would benefit from having spinEventLoopOrShutdown aborting the event loop in quit-application, so perhaps we should fix this bug by changing the behaviour of that method? If we want to do that we'll just need to add more topics to the ShutdownObserveHelper used by that method here: https://searchfox.org/mozilla-central/rev/eeb8cf278192d68b3977d0adb4d43f1463439269/xpcom/threads/nsThreadManager.cpp#269-304

In Nightly I can reproduce the shutdown hang if I close the browser window via the window close button, exit out of the resulting tab close warning dialog via esc and hit the close button again in a very short timeframe.

Good to know we have a reliable STR here!

Flags: needinfo?(nika)

Sounds good! In order to avoid any problems with the order of notifications during quit-application we might want to react on quit-application-granted here.

Severity: -- → S2

Hi, I just gave it a try to make some more cleanup (see WIP patch), in the hope to harmonize things a bit more with AppShutdown and ShutdownPhases.

Ideally this would give us the possibility to remove the ShutdownObserveHelper entirely. But unfortunately there is also the "content-child-will-shutdown" notification case which still needs it? I know too little about a content process's shutdown sequence and if AppShutdown is involved there at all.

Thoughts?

BTW: There is also KillClearOnShutdown which has its own static status variable for the current shutdown phase, this could be further harmonized.

Flags: needinfo?(nika)
Flags: needinfo?(dothayer)
See Also: → 1425559

I spent some more time on code reading, my current understanding is:

But besides the conceptual similarities it still seems we have process-type-specific sets of shutdown phases and very different code paths to reach them. Thus it would be probably not really helping the ease of understanding if we just add some more phases to ShutdownPhase (breaking also the steadily increasing order of phases).

My conclusion would be then to choose between the following two options:

  • something on the lines of the WIP patch already attached here, without trying to further align the child process handling, too
  • just add topics to the original handling as suggested by Nika
Assignee: nobody → jstutte
Attachment #9207716 - Attachment description: Bug 1696397: WIP Add SpinEventLoopUntilOrQuit → Bug 1696397: Add SpinEventLoopUntilOrQuit and use it in Prompter and LoginManagerAuthPrompter. r?nika
Status: NEW → ASSIGNED

(In reply to Jens Stutte [:jstutte] from comment #7)

My conclusion would be then to choose between the following two options:

  • something on the lines of the WIP patch already attached here, without trying to further align the child process handling, too
  • just add topics to the original handling as suggested by Nika

So there is always a third way. I propose to remove the ShutdownObserveHelper entirely from nsThreadManager in favor of faking a ShutdownPhase::AppShutdownConfirmed from within child process shutdown. This feels less scary to me than having child process specific notification messages as a hard dependency of SpinEventLoopUntilOrQuit and maybe paves the way for more thorough harmonization work here.

Patch is up for review, thus clearing the ni's (but feel free to dissent on the patch, of course).

I also launched a try push FWIW.
:pbz, maybe you want to try it out with your STR (I was not able to reproduce here)?

Flags: needinfo?(nika)
Flags: needinfo?(dothayer)
See Also: → 1697745
Attachment #9207716 - Attachment description: Bug 1696397: Add SpinEventLoopUntilOrQuit and use it in Prompter and LoginManagerAuthPrompter. r?nika → Bug 1696397: Move SpinEventLoopUntilOrShutdown to -Quit and move the current shutdown state logic from nsThreadManager to AppShutdown; r?kmag,#xpcom-reviewers

(In reply to Jens Stutte [:jstutte] from comment #8)

:pbz, maybe you want to try it out with your STR (I was not able to reproduce here)?

Thanks for working on this! With the patch I still managed to end up in some weird modal state, but it no longer hangs on shutdown.

(In reply to Paul Zühlcke [:pbz] from comment #9)

With the patch I still managed to end up in some weird modal state, but it no longer hangs on shutdown.

Hmm, we might want to return from SpinEventLoopUntilOrQuit with an error in that case? Otherwise the calling code might continue to try something weird?

:pbz, I made builds on try that include the return of an NS_ERROR_ILLEGAL_DURING_SHUTDOWN (not in the patch on phabricator yet) in case we abort the loop. Maybe you want to check if this makes behave the calling code better?

Flags: needinfo?(pbz)

Thanks! I've tested with the latest try build you linked. I still managed to produce a shutdown hang:
https://crash-stats.mozilla.org/report/index/06b06173-0fc1-4301-884a-111f00210315
Looks like it's still hanging in spinEventloopUntil.

Not sure if related, but testing with the previous try build I ran into another crash (can't reproduce it anymore):
https://crash-stats.mozilla.org/report/index/7d10935c-4e37-442c-b882-de15e0210315

Flags: needinfo?(pbz) → needinfo?(jstutte)

Sorry, I might have switched up the builds. I've tested again with the latest build from try and the patch seems to fix the shutdown hang issue.

I can still get into a state where the browser window is partially disabled, we can track that separately though. There is Bug 1696793 which seems related. Maybe there are still issues that we haven't fixed yet.

Flags: needinfo?(jstutte)
See Also: → 1698526

Thanks for checking! So to remain consistent with the old behavior of the former SpinEventLoopUntilOrShutdown the patch now does not return an NS_ERROR_ILLEGAL_DURING_SHUTDOWN error in case of abort, but just NS_OK. However, I left a comment there to indicate that we might want to prefer a more explicit handling here.

Summarizing the status of this bug:

  1. The patch D107619 mitigates the shutdown hang as such, it seems. There remains only a question whether we should report an explicit error here (which would semantically fit with SpinEventLoopUntilOrQuit, as a caller probably wants to have a way to know, what caused the exit). There might be a wish to convert as much javascript SpinEventLoopUntil calls to SpinEventLoopUntilOrQuit as possible, too, but I would consider this out of scope here (and rather look out for other XPCOMSpinEventLoopStack annotations on shutdown hangs first).

  2. There still should be some action on the lines of

put the entire Prompter.jsm into a new state where it can't open new prompts anymore due to being late enough we don't want to be showing the user more prompts, so we don't open a new prompt after this point."

as of nika from comment 1. This might maybe just be an explicit handling of the aforementioned error code? :pbz, I'd propose, that this is handled by a separate patch, as patch D107619 can land as such and already improves things (and I am not capable of doing the needed javascript magic, too).

Flags: needinfo?(pbz)

If we don't throw an error the prompt return value will fall back to the default. For example if a browser component shows a confirmation prompt, upon shutdown it would be closed and the result would be "cancel" (not accept). I think that's a reasonable approach and I don't expect any behavior change switching from spinEventLoopUntilOrShutdown to SpinEventLoopUntilOrQuit.

If we throw an error it would still up to the prompt callers (all consumers of nsIPromptService) to catch it. So ensuring compatibility would be ensuring all consumers can handle this specific exception. Prompter.jsm may already throw for "Prompt aborted by user" (for example upon navigation), so I wouldn't expect that much breakage.

Landing patch D107619 sounds good to me and I think it can close this bug, since we're concerned with shutdown hangs here.

Flags: needinfo?(pbz)

(In reply to Paul Zühlcke [:pbz] from comment #16)

If we don't throw an error the prompt return value will fall back to the default. For example if a browser component shows a confirmation prompt, upon shutdown it would be closed and the result would be "cancel" (not accept). I think that's a reasonable approach and I don't expect any behavior change switching from spinEventLoopUntilOrShutdown to SpinEventLoopUntilOrQuit.

That sounds like a reasonable flow also to me, and probably we should not raise an error then and I can remove the XXX comments.

Landing patch D107619 sounds good to me and I think it can close this bug, since we're concerned with shutdown hangs here.

OK, obviously first I need an r+.

Depends on: 1699041
No longer depends on: 1699041
See Also: → 1699041
Blocks: 1699041
Pushed by jstutte@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/0294b90300aa
Move SpinEventLoopUntilOrShutdown to -Quit and move the current shutdown state logic from nsThreadManager to AppShutdown; r=kmag,xpcom-reviewers
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 88 Branch
See Also: → 1610213
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: