Closed Bug 212316 Opened 22 years ago Closed 17 years ago

Mozilla must handle WM_ENDSESSION message to cleanly unload in case of exiting or restarting windows

Categories

(Core :: Widget: Win32, defect, P3)

x86
Windows XP
defect

Tracking

()

RESOLVED FIXED

People

(Reporter: wd, Assigned: emaijala+moz)

References

(Blocks 1 open bug)

Details

Attachments

(3 files, 3 obsolete files)

User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.5a) Gecko/20030709 Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.5a) Gecko/20030709 I've spawned this bug from bug 212251 When Microsoft Windows is exiting, it will send a WM_ENDSESSION message to the running applications. It is up to the application to handle the message and terminate cleanly. Due to the symptoms in bug 212251 (cache is reset) and a quick search on lxr, it appears that Mozilla is not handling the WM_ENDSESSION message that is being sent to it. Reproducible: Always Steps to Reproduce: 1.Open Mozilla 2.Restart MS Windows 3.Open Mozilla when windows starts back up 4.Go to: about:cache Actual Results: Disk cache has been reset to 0. Expected Results: Disk cache should be OK, since mozilla should handle the WM_ENDSESSION message and shut itself down cleanly. Since virtually all Windows applications handle WM_ENDSESSION properly, I know of few users who actually manually shut down every application before restarting Windows. Mozilla, though, does not seem to be one of these applications.
Blocks: 212251
.
Assignee: jaggernaut → law
Depends on: 14807
This could use a real owner, guys....
I think it's actually all handled in the WM_QUERYENDSESSION processing. http://lxr.mozilla.org/mozilla/source/xpfe/bootstrap/nsNativeAppSupportWin.cpp#959
I think that the WM_QUERYENDSESSION is sent first, which can let an application cancel the shutdow/logoff process. But then after that, a WM_ENDSESSION is called, where the applications should actually perform their own clean shutdown. It appears that mozilla is not performing that second part. Or are you saying that mozilla is unloading itself uncleanly when it receives the WM_QUERYENDSESSION? There's some more info about WM_ENDSESSION at: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/sysinfo/base/wm_endsession.asp
I understand what WM_ENDSESSION is for, yes. I think, though, that from looking at the WM_QUERYENDSESSION handling that it's actually unloading there. If the application wasn't exiting properly when shutting down Windows, wouldn't you see the "This application is not responding, close it now?" prompt? I never see that.
I'm not much of a coder, but it does look like Mozilla is unloading itself when it receives a WM_QUERYENDSESSION message. But I believe that's incorrect behavior. WM_QUERYENDSESSION is used to ask applications if it's OK to exit windows, and that's it. For example, if you have unsaved work in MS Word... it'll halt the shutdown/restart process until you answer the dialog that's open asking if you want to save your work. Now, if you don't answer that question for a certain amount of time (60 seconds?), you will get a Windows dialog saying that the application is not responding, and if you want to wait, cancel logoff, or terminate the application. From what I can gather, Mozilla is prematurely unloading when it receives the WM_QUERYENDSESSION, and it's doing it uncleanly. (Thus the disk cache size of zero upon restart). Mozilla should unload itself upon receiving the WM_ENDSESSION, and it should do it cleanly.
Note: When restarting windows when mozilla is open, I get the following Assert and then mozilla crashes: ###!!! ASSERTION: LocalStoreImpl not thread-safe: '_mOwningThread.GetThread() == PR_GetCurrentThread()', file f:/mozilla_source/mozilla/rdf/datasource/src/nsLoc alStore.cpp, line 285 Break: at file f:/mozilla_source/mozilla/rdf/datasource/src/nsLocalStore.cpp, li ne 285 I believe the crash is causing the disk cache to be reset, rather than the killall routine being performed on WM_QUERYENDSESSION
.... please stop believing, it's a waste of my time. if mozilla crashes before disk cache politely shuts down then the cache will kill itself the next time it starts. that's just the way it's designed. it has nothing to do with anything. if mozilla crashes or otherwise decides to run its shutdown procedure on the wrong thread then you'll get the assert you saw. it's a symptom of something else going wrong (e.g. using ctrl-c to kill your mozilla). and i'm willing to bet that it too has nothing to do with anything (here). of course a crash will cause mozilla to skip whatever graceful shutdown plans it has. -- What would be useful? well at the very least how about adding some logging code to see if WM_ENDSESSION/WM_QUERYENDSESSION are reached in the places where our code tries to handle them. this would be a good exercise for you.
Sorry if I'm wasting your time. Just trying to help... Immediately after the assert, I get a dialog: mozilla.exe - Application Error The exception Breakpoint A breakpoint has been reached. (0x80000003) occurred in the application at location 0x77f9180c. Click OK to terminate the program The console output is: Break: at file f:/mozilla_source/mozilla/rdf/datasource/src/nsLocalStore.cpp, li ne 285 Which is the same line as the assert. For me, Mozilla never gets to the code that handles WM_QUERYENDSESSION. (And there is no code that handles WM_ENDSESSION, but I'm thinking that might be a trivial issue now that I've discovered the breakpoint)
This really isn't helping. How about spy++ logs?
Summary: Mozilla must handle WM_ENDSESSION message to cleanly unload in case of exiting or restarting windows → lMozilla must handle WM_ENDSESSION message to cleanly unload in case of exiting or restarting windows
Summary: lMozilla must handle WM_ENDSESSION message to cleanly unload in case of exiting or restarting windows → Mozilla must handle WM_ENDSESSION message to cleanly unload in case of exiting or restarting windows
More info: The section of code here: http://lxr.mozilla.org/seamonkey/source/xpfe/bootstrap/nsNativeAppSupportWin.cpp#959 Which is supposed to handle WM_QUERYENDSESSION is never called upon Mozilla receiving that very WM message. Mozilla does see the message in nsWindow::ProcessMessage, but it never makes it to the code in the lxr URL above.
To facilitate testing this bug without having to actually exit or restart windows, I've made a quick and dirty VB6 app (with source) to send a WM_QUERYENDSESSION message to mozilla. It makes the assumption that the window title is "Mozilla {Build ID: 0000000000}"
out of curiosity, do you have MOZ_NO_REMOTE=1 in your env?
No, I don't have MOZ_NO_REMOTE set
nsNativeAppSupportWin creates a window for itself. AppShell handles a different window. nsNativeAppSupportWin gets the messages sent to MozillaMessageWindow. My current build actually crashes before getting WM_QUERYENDSESSION in either place.
Yes, if I actually go through the logoff/restart process, mozilla does end up crashing. (thus bug 212251 ) If I use the VB app to simulate WM_QUERYENDSESSION, I get no crash in Mozilla, but it doesn't handle the message at all.
Looks like Windows won't send WM_QUERYENDSESSION to command line apps, and a Mozilla debug build is considered one. With an optimized build the message is received and handled in nsNativeAppSupportWin. Try for example opening Composer, writing something and doing logoff. We shouldn't need to handle WM_ENDSESSION explicitly.
Correct, see bug 14807 comment 44 - note also that goQuitApplication may irrevocably close several windows before finding one that prompts.
bug 235220 may be another thing going wrong here - dup?
Product: Core → Mozilla Application Suite
(In reply to comment #0) > Actual Results: > Disk cache has been reset to 0. My actual results: Firefox displays a « Firefox hasn't been properly shutdown » and I'm prompt to restore the session or not. I can't try to capture the dialog next time it happens, say tomorrow, if needed. As the problem always occur I took the bad habit of exiting Firefox manually.
(In reply to comment #20) > My actual results: > Firefox displays a « Firefox hasn't been properly shutdown » and I'm prompt > to restore the session or not. > > I can't try to capture the dialog next time it happens, say tomorrow, if > needed. As the problem always occur I took the bad habit of exiting Firefox > manually. Here is the message of the attached Firefox "Restore Previous Session" dialog screenshot : « Your last Firefox session closed unexpectedly. You can restore the tabs and windows from your previous session, or start a new session if you think the problem was related to a page you were viewing. ».
Dupe of bug 333907?
Depends on: 333907
I suspect the current problem is that while WM_ENDSESSION is handled, the window proc is returning before shutdown is complete (enough). After WM_ENDSESSION has been processed, Windows may kill the app at will. nsIAppStartup::Quit will initiate the quit, but it takes a while until it's actually complete.
It seems I was wrong and we just shoot ourselves down when trying to process WM_ENDSESSION. I'll try to find how it actually happens.
Assignee: law → emaijala
Status: NEW → ASSIGNED
Component: UI Design → Widget: Win32
OS: Windows 2000 → Windows XP
Product: SeaMonkey → Core
Attached patch Patch v1 (obsolete) — Splinter Review
Here is patch that should do it (once again). According to some extensive research this is what happens during logoff/shutdown: 1. Windows sends WM_QUERYENDSESSION to all windows. It stops only if a window returns FALSE. 2. Windows sends WM_ENDSESSION to all windows. wParam is TRUE if no FALSE was returned in the first step, indicating the session is ending. In this case the window is doomed when this message is handled and could be closed at any time by Windows. The message window in nsNativeAppSupportWin seems to be getting the messages last. This means it's way too late to try to handle WM_QUERYENDSESSION there because all windows would be dead already. Therefore it's now done in nsWindow when the first window gets WM_QUERYENDSESSION. When WM_ENDSESSION is received, the corresponding window is destroyed right away. This will probably make us exit already, but if that for some reason doesn't happen, nsNativeAppSupportWin will call nsAppStartup::Quit() that will make it happen. To aid nsAppStartup do that, WindowWatcher will now refuse to open new windows after receiving quit-application message.
Attachment #333073 - Flags: superreview?(roc)
Attachment #333073 - Flags: review?(neil)
Flags: wanted1.9.1?
Forgot to mention that this won't work in a debug build. Windows doesn't send WM_QUERYENDSESSION or WM_ENDSESSION to console programs.
(In reply to comment #25) > Here is patch that should do it (once again). According to some extensive > research this is what happens during logoff/shutdown: > > 1. Windows sends WM_QUERYENDSESSION to all windows. It stops only if a window > returns FALSE. > > 2. Windows sends WM_ENDSESSION to all windows. wParam is TRUE if no FALSE was > returned in the first step, indicating the session is ending. In this case the > window is doomed when this message is handled and could be closed at any time > by Windows. Minor correction here, not that it should affect the patch. This is how it used to work in either Win 3.1 or Win 95 (I forget). It used to be that WM_QUERYENDSESSION was sent to all top-level windows, and if any returned false then the shutdown would be halted. The behavior now is WM_QUERYENDSESSION is sent to one app, and if it respond true it is immediately sent WM_ENDSESSION. "When an application returns TRUE for this message, it receives the WM_ENDSESSION message, regardless of how the other applications respond to the WM_QUERYENDSESSION message."
I thought so too, but I was seeing WM_QUERYENDSESSION being sent to all windows one by one.
Comment on attachment 333073 [details] [diff] [review] Patch v1 So the cache is being destroyed even with this patch. Will need some more work..
Attachment #333073 - Attachment is obsolete: true
Attachment #333073 - Flags: superreview?(roc)
Attachment #333073 - Flags: review?(neil)
Dean is actually correct. I must have screwed up something in my previous tests. WM_QUERYENDSESSION is only sent once if the first window handles is.
I'll take that back. It seems WM_QUERYENDSESSION is _sometimes_ sent to multiple windows. Strange but needs to be accounted for.
After a lot of further debugging, Windows seems so eager to kill us that we have to resort to some ugly things to do a clean-enough shutdown.
Attached patch Patch with a different approach (obsolete) — Splinter Review
It's pretty difficult to make AppShell quit in time for a clean shutdown before Windows kills the process. Trying to keep windows alive while performing a clean shutdown tends to lead to crashes and Windows terminates the process in the middle anyway. Therefore taking a different approach by just faking a shutdown and not really trying at all. To accomplish this: 1. nsWindow sends the required notifications including a new quit-application-forced which tells nsAppStartup to not try to close any windows upon profile-change-teardown. 2. nsNativeAppSupportWin doesn't try anything. 3. A bit of cleanup including removal of bogus comments from nsAppStartup::Quit.
Attachment #333734 - Flags: review?(neil)
Comment on attachment 333734 [details] [diff] [review] Patch with a different approach >+ obsServ->NotifyObservers(nsnull, "profile-change-net-teardown", nsnull); >+ obsServ->NotifyObservers(nsnull, "profile-change-teardown", nsnull); >+ obsServ->NotifyObservers(nsnull, "profile-before-change", nsnull); You shouldn't pass nsnull to these notifications (SeaMonkey has some old code which assumes you're following the spec), did you mean "shutdown-persist"?
Attachment #333734 - Flags: review?(neil) → review+
Comment on attachment 333734 [details] [diff] [review] Patch with a different approach >+ // If we can quit, do it right away. Let's fake a shutdown sequence without >+ // actually closing windows etc. to avoid Windows killing us in the >+ // middle. A proper shutdown would require having a chance to pump some >+ // messages. Unfortunately Windows won't let us do that. Bug 212316. >+ if (sCanQuit) >+ { >+ nsCOMPtr<nsIObserverService> obsServ = >+ do_GetService("@mozilla.org/observer-service;1"); >+ obsServ->NotifyObservers(nsnull, "quit-application-granted", nsnull); >+ obsServ->NotifyObservers(nsnull, "quit-application-forced", nsnull); >+ obsServ->NotifyObservers(nsnull, "quit-application", nsnull); >+ obsServ->NotifyObservers(nsnull, "profile-change-net-teardown", nsnull); >+ obsServ->NotifyObservers(nsnull, "profile-change-teardown", nsnull); >+ obsServ->NotifyObservers(nsnull, "profile-before-change", nsnull); >+ obsServ->NotifyObservers(nsnull, "xpcom-shutdown", nsnull); >+ } >+ } The thing is, this doesn't actually quit. If something else aborts the quit we'll be left in an inconsistent state.
Attachment #333734 - Flags: review+ → review-
(In reply to comment #31) > I'll take that back. It seems WM_QUERYENDSESSION is _sometimes_ sent to > multiple windows. Strange but needs to be accounted for. Is it sent to multiple windows if you cancel the quit in the first window?
>>+ obsServ->NotifyObservers(nsnull, "profile-change-net-teardown", nsnull); >>+ obsServ->NotifyObservers(nsnull, "profile-change-teardown", nsnull); >>+ obsServ->NotifyObservers(nsnull, "profile-before-change", nsnull); >You shouldn't pass nsnull to these notifications (SeaMonkey has some old code >which assumes you're following the spec), did you mean "shutdown-persist"? Ok, will fix the params. What's "shutdown-persist"? I was just following the docs at <http://developer.mozilla.org/en/docs/Observer_Notifications>? > The thing is, this doesn't actually quit. If something else aborts the quit > we'll be left in an inconsistent state. Since we return TRUE from WM_QUERYENDSESSION, Windows will kill us regardless of whether another program aborts the shutdown. It's no problem to move it to the first WM_ENDSESSION though, if you think that's better. >> I'll take that back. It seems WM_QUERYENDSESSION is _sometimes_ sent to >> multiple windows. Strange but needs to be accounted for. >Is it sent to multiple windows if you cancel the quit in the first window? Yes, I believe that happens too, though not always which is puzzling.
Oops, got the shutdown-persist thing.
Attached patch Patch v2 (obsolete) — Splinter Review
This patch should fix the comments. I moved the actual shutdown simulation to WM_ENDSESSION to be sure and more logical.
Attachment #333734 - Attachment is obsolete: true
Attachment #334112 - Flags: review?(neil)
(In reply to comment #37) >>>+ obsServ->NotifyObservers(nsnull, "profile-change-net-teardown", nsnull); >>>+ obsServ->NotifyObservers(nsnull, "profile-change-teardown", nsnull); >>>+ obsServ->NotifyObservers(nsnull, "profile-before-change", nsnull); >>You shouldn't pass nsnull to these notifications (SeaMonkey has some old code >>which assumes you're following the spec), did you mean "shutdown-persist"? >Ok, will fix the params. What's "shutdown-persist"? I was just following the >docs at <http://developer.mozilla.org/en/docs/Observer_Notifications>? You're supposed to pass either "shutdown-persist" or "shutdown-cleanse". I guess someone needs to update the docs at some point. >>The thing is, this doesn't actually quit. If something else aborts the quit >>we'll be left in an inconsistent state. >Since we return TRUE from WM_QUERYENDSESSION, Windows will kill us regardless >of whether another program aborts the shutdown. It's no problem to move it to >the first WM_ENDSESSION though, if you think that's better. Oh, really? The docs certainly suggest you should defer cleanup.
Comment on attachment 334112 [details] [diff] [review] Patch v2 >+ *aRetValue = 0; I just noticed that nsWindow::ProcessMessage does this early on.
Attachment #334112 - Flags: review?(neil) → review+
(In reply to comment #40) > >Since we return TRUE from WM_QUERYENDSESSION, Windows will kill us regardless > >of whether another program aborts the shutdown. It's no problem to move it to > >the first WM_ENDSESSION though, if you think that's better. > Oh, really? The docs certainly suggest you should defer cleanup. MSDN states that "When an application returns TRUE for this message, it receives the WM_ENDSESSION message, regardless of how the other applications respond to the WM_QUERYENDSESSION message.". It doesn't say that wParam will be TRUE, but that's how it seems to be. Anyway, the new version eliminates any possibility of trouble here.
Comment on attachment 334112 [details] [diff] [review] Patch v2 I'll remove the extraneous retvalue setting.
Attachment #334112 - Flags: superreview?(roc)
Comment on attachment 334112 [details] [diff] [review] Patch v2 I'm going to throw the review to bsmedberg since he knows a lot more about shutdown than I do. Benjamin, see comments #25 and #33 ...
Attachment #334112 - Flags: superreview?(roc) → superreview?(benjamin)
(In reply to comment #42) > MSDN states that "When an application returns TRUE for this message, it > receives the WM_ENDSESSION message, regardless of how the other applications > respond to the WM_QUERYENDSESSION message.". And then it says "Each application should return TRUE or FALSE immediately upon receiving this message, and defer any cleanup operations until it receives the WM_ENDSESSION message."
(In reply to comment #45) > And then it says "Each application should return TRUE or FALSE immediately upon > receiving this message, and defer any cleanup operations until it receives the > WM_ENDSESSION message." > And that's where it differs from the MSDN version (Visual Studio 2005) I have installed (should have checked online).
We should fix this; sounds like we're close.
Flags: wanted1.9.1? → wanted1.9.1+
Priority: -- → P3
Comment on attachment 334112 [details] [diff] [review] Patch v2 I think that this approach suffers the same problem as bug 333907 comment 13 and a couple following. If we need additional safety here, I would like to consider the approaches I mentioned in bug 333907 comment 17 and the patch as attachment 267716 [details] [diff] [review], that is, we never dispatch the WM_ENDSESSION message, so Windows doesn't forcefully shut down the app.
Attachment #334112 - Flags: superreview?(benjamin) → superreview-
(In reply to comment #48) > (From update of attachment 334112 [details] [diff] [review]) > I think that this approach suffers the same problem as bug 333907 comment 13 > and a couple following. Not really. XPCOM isn't really shut down, but we just tell others it is. Right after that Windows kills us. Definitely. If you think it's better, we can probably commit a suicide by calling exit(). > If we need additional safety here, I would like to consider the approaches I > mentioned in bug 333907 comment 17 and the patch as attachment 267716 [details] [diff] [review], that is, > we never dispatch the WM_ENDSESSION message, so Windows doesn't forcefully shut > down the app. I'm afraid that won't work. The UI thread seems to be getting WM_ENDSESSION first and I'm not sure anything good happens if we don't dispatch it there (I believe Windows would consider the app hung). And because of that we can't make AppShell quit in time. If we do dispatch it, we're dead as soon as it's handled and AppShell doesn't receive anything or if it does, it won't have time to handle it.
It's definitely better to commit suicide than to rely on Windows terminating us. But I'm still seriously concerned about firing xpcom-shutdown from this code at all, because lots of code makes state assumptions that won't hold up. I think we should stop after the profile-before-change notification and abort at that point.
Attached patch Patch v2.1Splinter Review
We went through a couple of ideas with bsmedberg but couldn't find more elegant solutions. Here's the patch with removal of xpcom-shutdown notification and addition of _exit(0). Assuming neil is still ok with this..
Attachment #334112 - Attachment is obsolete: true
Attachment #335421 - Flags: superreview?
Attachment #335421 - Flags: review+
Attachment #335421 - Flags: superreview? → superreview?(benjamin)
Attachment #335421 - Flags: superreview?(benjamin) → superreview+
Pushed to mozilla-central.
Status: ASSIGNED → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
(In reply to comment #51) >Assuming neil is still ok with this.. JFTR that's fine by me.
No longer blocks: 484442
Depends on: 484442
I have tested for the issue several times on Windows XP Professional with Service Pack 3 with Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1b5pre) Gecko/20090428 Shiretoko/3.5b5pre by following the steps posted on the first post and I can confirm that the issue does not occur anymore. However, I am new into this and I did not find how to change the keyword to "verified1.9.1"...
Blocks: 1771443
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: