Last Comment Bug 662444 - Shutdown by exit(0), skip GCs and other unneeded operations
: Shutdown by exit(0), skip GCs and other unneeded operations
Status: NEW
[Snappy:p1]
: addon-compat
Product: Core
Classification: Components
Component: XPCOM (show other bugs)
: unspecified
: All All
: -- normal with 44 votes (vote)
: ---
Assigned To: Rafael Ávila de Espíndola (:espindola) (not reading bugmail)
:
Mentors:
: 407981 625875 688912 709873 (view as bug list)
Depends on: 722243 758343 826377 834945 AsyncShutdown 1056130 696399 697988 702815 702848 706647 711076 711447 728653 731741 732173 732368 752642 753461 826029
Blocks: zombieproc 444915 823262 826143 1129455
  Show dependency treegraph
 
Reported: 2011-06-06 17:58 PDT by (dormant account)
Modified: 2015-07-25 21:40 PDT (History)
97 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments

Description (dormant account) 2011-06-06 17:58:44 PDT
Firefox shutdown is slow and sometimes deadlocks. Explicit exit is encouraged by microsoft. We should only do the full shutdown in debug builds.
Comment 1 Boris Zbarsky [:bz] 2011-06-06 18:44:42 PDT
Except for the parts where we save user data during shutdown, right?
Comment 2 Ben Turner (not reading bugmail, use the needinfo flag!) 2011-06-06 18:53:31 PDT
Remove profile locks, etc. :)
Comment 3 Benjamin Smedberg [:bsmedberg] 2011-06-07 07:14:55 PDT
Given the current architecture, we at least need to get through profile-before-change. We also need to audit statics so that destructors don't expect a full shutdown sequence.
Comment 4 Frank Nestel 2011-06-11 05:37:25 PDT
A completely non technical user remark. Shutting down seems to get slower the longer Firefox (using the latest 5 from the beta channel) is open: A 1 day old firefox takes 950MB virtual memory and about 60 seconds to shutdown on my machine. During this shutdown, virtual memory usage peaks to 1020 MB (all monitored with Process Explorer on windows). Restarting Firefox with the same set of tabs, is about 660 MB virtual, takes only 15 seconds for shutdown and peaks only to 680 MB.
Comment 5 Frank Nestel 2011-06-11 05:40:04 PDT
Sorry, wandering through the buglists, I realize, that my comment could also be a unqualified duplicate of bug 661820.
Comment 6 Alex Faaborg [:faaborg] (Firefox UX) 2011-09-23 17:25:34 PDT
*** Bug 688912 has been marked as a duplicate of this bug. ***
Comment 7 (dormant account) 2011-09-28 11:58:05 PDT
Reassigning to Rafael since Kevin moved on to other projects. I'm hoping to have at-least a proof of concept implementation by end of the year.
Comment 8 Ted Mielczarek [:ted.mielczarek] 2011-09-28 12:09:48 PDT
Related, OS X 10.6 supports an API called "Sudden Termination", where you can indicate to the OS that it's okay to SIGKILL your process instead of asking for a graceful shutdown:
http://developer.apple.com/library/mac/#documentation/Cocoa/Reference/Foundation/Classes/NSProcessInfo_Class/Reference/Reference.html#//apple_ref/doc/uid/20000316-SW3

You can toggle it on and off at any time, so that if you have outstanding data that needs to be written you aren't killed while that's happening.
Comment 9 (dormant account) 2011-11-18 14:02:38 PST
*** Bug 407981 has been marked as a duplicate of this bug. ***
Comment 10 Martijn Wargers [:mwargers] (gone per 2016-05-31 :-( ) 2011-11-18 16:12:22 PST
Isn't bug 407981 about saving user data on shutdown being slow?
This seems to be about something else.
Comment 11 (dormant account) 2011-11-18 16:31:40 PST
(In reply to Martijn Wargers [:mw22] (QA - IRC nick: mw22) from comment #10)
> Isn't bug 407981 about saving user data on shutdown being slow?
> This seems to be about something else.

No that bug is about doing "unneeded operations" on shutdown which this bug aims to solve. In particular it described symptoms of running GC/Cycle-collection when firefox is paged out... That wont happen once this bug is fixed.
Comment 12 Terrell Kelley 2011-11-19 02:40:59 PST
Would it be possible to save some user data ahead of time? Perhaps during idle? That might also help a little. And is there any way to discourage user data from being cached out in the first place?

I've had ideas on this for quite a while, but, since I'm not actually a programmer, I wasn't sure if I should say anything.
Comment 13 (dormant account) 2011-11-28 11:29:52 PST
*** Bug 625875 has been marked as a duplicate of this bug. ***
Comment 14 :aceman 2012-01-05 12:21:46 PST
If the shutdown can deadlock (per comment 0), fixing this could help those tons of bugs filed about firefox/thunderbird not properly exiting after shutdown (users finding that out by seeing the process running or subsequest start telling them profile is locked (ff running)).
Comment 15 Siddharth Agarwal [:sid0] (inactive) 2012-01-05 23:06:53 PST
Yes, quite possibly.
Comment 16 Joe Drew (not getting mail) 2012-02-19 12:40:34 PST
*** Bug 709873 has been marked as a duplicate of this bug. ***
Comment 17 Jorge Villalobos [:jorgev] 2012-02-28 13:25:53 PST
Several add-ons depend on xpcom* shutdown observers, so I'm flagging this for addon-compat. Is there an expected milestone for this to land?
Comment 18 (dormant account) 2012-02-28 14:01:05 PST
(In reply to Jorge Villalobos [:jorgev] from comment #17)
> Several add-ons depend on xpcom* shutdown observers, so I'm flagging this
> for addon-compat. Is there an expected milestone for this to land?

I optimistically aim for Firefox 14.
Comment 19 Rafael Ávila de Espíndola (:espindola) (not reading bugmail) 2012-02-29 12:34:42 PST
I was checking if the code needed to be refactored to allow us to quit after profile-before-change, but it looks like it is already in a good shape from the notification side at least.

The shutdown looks like:

~ScopedXPCOMStartup():
   calls nsXREDirProvider::DoShutdown()
   calls NS_ShutdownXPCOM()

nsXREDirProvider::DoShutdow():
   sends the profile-* messages
   NS_ShutdownXPCOM sends the xpcom-* messages and does its own shutdown

So I guess we can call this bug done when ~ScopedXPCOMStartup() looks like

~ScopedXPCOMStartup():
   calls nsXREDirProvider::DoShutdown()
   if (FastExit)
     _exit(0);
   calls NS_ShutdownXPCOM()

I will open a bug about what exactly the "if (FastExit)" should be and try to start moving it up.
Comment 20 Willy_ Foo_Foo 2012-05-11 07:40:31 PDT
IMHO
can we save bookmarks when updated/changed(saves a lot of shutdown time)
With this we can increase the bookmarks(nested)to unlimited(yes some people use it & currently i have over 1k so too 50+ of my colleagues which makes FF more slow)
Moreover, the file should NOT include:
a. Favicons
b. Bookmark Description, which is typically very long and useless! & saving them at shutdown causes more slowness
& preferably without favicons or bookmark description field, both of which dramatically increase the size of the bookmarks file.
Comment 21 Marco Bonardo [::mak] 2012-05-11 07:56:14 PDT
(In reply to magnumarchonbasileus from comment #20)
> IMHO
> can we save bookmarks when updated/changed(saves a lot of shutdown time)

We are already doing this, unless you mean the json or html backup. The latter is mostly deprecated and not enabled by default, the former usually happens at idle.
Comment 22 Willy_ Foo_Foo 2012-05-13 00:15:07 PDT
(In reply to Marco Bonardo [:mak] from comment #21)
> (In reply to magnumarchonbasileus from comment #20)
> > IMHO
> > can we save bookmarks when updated/changed(saves a lot of shutdown time)
> 
> We are already doing this, unless you mean the json or html backup. The
> latter is mostly deprecated and not enabled by default, the former usually
> happens at idle.

Yes precisely
Thank-you for the info :)
Comment 23 Willy_ Foo_Foo 2012-05-13 00:17:01 PDT
This also needs to be done?

Moreover, the file should NOT include:
a. Favicons
b. Bookmark Description, which is typically very long and useless! & saving them at shutdown causes more slowness
& preferably without favicons or bookmark description field, both of which dramatically increase the size of the bookmarks file.
Comment 24 Benjamin Smedberg [:bsmedberg] 2012-05-24 10:49:30 PDT
Jorge, the changes addon authors may need to make for this bug are based on the ordering at https://wiki.mozilla.org/XPCOM_Shutdown:

* The following observer topics will usually not happen in release builds:
** xpcom-will-shutdown
** xpcom-shutdown
** xpcom-shutdown-threads
** xpcom-shutdown-gc
** xpcom-shutdown-loaders

Extension authors *should* make sure that all user data persists to disk by the end of the profile-before-change notification.

Extension authors *should not* perform memory-cleanup activities during profile-before-change, but can do so later in the xpcom-shutdown-* notifications.
Comment 25 Marco Bonardo [::mak] 2012-05-24 12:24:13 PDT
Rafael, we should probably make the Storage threads merge (the spinning you added recently) earlier, right now they merge at xpcom-shutdown-threads, so we may lose important work. Should not be a big deal, especially once we have bug 722243.
The problem is that we'd need to do those just after profile-before-change, but the first notification there is xpcom-will-shutdown, that looks like is going away.
Comment 26 Rafael Ávila de Espíndola (:espindola) (not reading bugmail) 2012-05-24 12:36:35 PDT
(In reply to Marco Bonardo [:mak] from comment #25)
> Rafael, we should probably make the Storage threads merge (the spinning you
> added recently) earlier, right now they merge at xpcom-shutdown-threads, so
> we may lose important work. Should not be a big deal, especially once we
> have bug 722243.
> The problem is that we'd need to do those just after profile-before-change,
> but the first notification there is xpcom-will-shutdown, that looks like is
> going away.

So, the spinning as is is a debug check.

The idea is that db users are still responsible for calling AsyncClose and making sure any scheduled important writes hit the disk before or during profile-before-change.

If there is some part of the code that is not waiting for its writes, the spinning will make sure the writes do happen late and write poisoning (once enabled) will catch them. In other words, the spinning prevents us from exiting without even starting executing an async statement that would write to disk.

Your proposal, if I understand it correctly, is to change the semantics to say that a db user is only responsible for calling AsyncClose. Firefox will guarantee that all async statements not explicitly marked as cancellable will be executed.

I agree that that is a much easier to use API, but would like some more comments on it if possible. Would you mind opening a bug for just this change?
Comment 27 Marco Bonardo [::mak] 2012-05-24 12:54:21 PDT
(In reply to Rafael Ávila de Espíndola (:espindola) from comment #26)
> So, the spinning as is is a debug check.

I meant the previous one, in bug 744294.

> The idea is that db users are still responsible for calling AsyncClose and
> making sure any scheduled important writes hit the disk before or during
> profile-before-change.

I partially disagree here, in the sense I don't think we should ask each user to spin the events loop on his own.  What they should do is mark work as cancelable.

> Your proposal, if I understand it correctly, is to change the semantics to
> say that a db user is only responsible for calling AsyncClose. Firefox will
> guarantee that all async statements not explicitly marked as cancellable
> will be executed.

Right.

> I agree that that is a much easier to use API, but would like some more
> comments on it if possible. Would you mind opening a bug for just this
> change?

Sure.
Comment 28 Jo Hermans 2012-05-29 04:06:03 PDT
*** Bug 759206 has been marked as a duplicate of this bug. ***
Comment 29 Wayne Mery (:wsmwk, NI for questions) 2012-06-24 09:19:47 PDT
*** Bug 759206 has been marked as a duplicate of this bug. ***
Comment 30 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-09-12 16:59:09 PDT
I removed bug 751899 from the dependencies because it is OK to skip shutting down NSS.
Comment 31 Rafael Ávila de Espíndola (:espindola) (not reading bugmail) 2013-01-02 18:10:45 PST
Some comments on this bug talk about exit(0), some about _exit(0). As bug 826029 shows, _exit(0) can be quiet a bit harder, so lets restrict this bug to exit(0).

I have reported 826143 to track _exit(0).
Comment 32 Benjamin Smedberg [:bsmedberg] 2013-01-03 10:11:43 PST
I don't understand comment 31. That bug seems to be a write from exit() not from _exit(). _exit should be the more abrupt (and therefore safer) function which does not call C++ static destructors or module _fini functions. I think we should only target _exit and should not target exit ever.
Comment 33 :Ehsan Akhgari (busy, don't ask for review please) 2013-01-03 10:35:03 PST
(In reply to comment #32)
> I don't understand comment 31. That bug seems to be a write from exit() not
> from _exit(). _exit should be the more abrupt (and therefore safer) function
> which does not call C++ static destructors or module _fini functions. I think
> we should only target _exit and should not target exit ever.

_exit will not cause C++ destructors, atexit functions, etc to be run, therefore code that relies on such mechanisms for persisting data to disk will not run and the result could be a dataloss.
Comment 34 Mike Hommey [:glandium] 2013-01-03 10:36:56 PST
Note we currently rely on atexit to delete the profile lock file, so using _exit would effectively prevent re-running firefox.
Comment 35 Rafael Ávila de Espíndola (:espindola) (not reading bugmail) 2013-01-03 10:46:42 PST
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #32)
> I don't understand comment 31. That bug seems to be a write from exit() not
> from _exit(). _exit should be the more abrupt (and therefore safer) function
> which does not call C++ static destructors or module _fini functions. I
> think we should only target _exit and should not target exit ever.

Bug 826029 found a write that happens during exit() but not during _exit(). The point of having write poisoning is precisely to make sure we don't skip writes, and that one looks a hard to move earlier.
Comment 36 Rafael Ávila de Espíndola (:espindola) (not reading bugmail) 2013-01-03 10:47:58 PST
(In reply to Mike Hommey [:glandium] from comment #34)
> Note we currently rely on atexit to delete the profile lock file, so using
> _exit would effectively prevent re-running firefox.

Interesting! I wonder if we should poison unlink too.
Comment 37 :Ehsan Akhgari (busy, don't ask for review please) 2013-01-03 10:55:05 PST
(In reply to comment #36)
> (In reply to Mike Hommey [:glandium] from comment #34)
> > Note we currently rely on atexit to delete the profile lock file, so using
> > _exit would effectively prevent re-running firefox.
> 
> Interesting! I wonder if we should poison unlink too.

I think we should, it's a write like any other!
Comment 38 Mike Hommey [:glandium] 2013-01-03 11:22:30 PST
(In reply to Mike Hommey [:glandium] from comment #34)
> Note we currently rely on atexit to delete the profile lock file, so using
> _exit would effectively prevent re-running firefox.

Bad recollection, it's a static destructor now (it used to be atexit, but that made problems with dlclosing libxul). But that doesn't change much from the perspective of _exit.
Comment 39 Benjamin Smedberg [:bsmedberg] 2013-01-03 11:30:46 PST
Which static destructor is that? Under normal (non-exit) operation, the profile should be unlocked explicitly: http://hg.mozilla.org/mozilla-central/annotate/6955309291ee/toolkit/xre/nsAppRunner.cpp#l3920 and there shouldn't be any subsequent I/O from static destructors or atexit handlers related to the profile lock.

Any patches to exit() early should also make sure that the profile is explicitly unlocked and that we're not relying on static destructors/atexit handlers.

I'm really nervous about calling C++ static destructors while we're in the C++ stack... that's why I've been talking about _exit() or the Windows equivalent TerminateProcess, specifically so that we *will* skip C++ static destructors which potentially assume things about the stack being unwound.
Comment 40 Mike Hommey [:glandium] 2013-01-03 12:52:29 PST
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #39)
> Which static destructor is that? Under normal (non-exit) operation, the
> profile should be unlocked explicitly:
> http://hg.mozilla.org/mozilla-central/annotate/6955309291ee/toolkit/xre/
> nsAppRunner.cpp#l3920 and there shouldn't be any subsequent I/O from static
> destructors or atexit handlers related to the profile lock.

http://hg.mozilla.org/mozilla-central/annotate/6955309291ee/profile/dirserviceprovider/src/nsProfileLock.cpp#l374
Comment 41 Benjamin Smedberg [:bsmedberg] 2013-01-03 12:57:08 PST
Yes, but... why is mPidLockList non-empty by the time we get to that destructor? It should be a no-op (it should be being removed at http://hg.mozilla.org/mozilla-central/annotate/6955309291ee/profile/dirserviceprovider/src/nsProfileLock.cpp#l659)
Comment 42 Ben Turner (not reading bugmail, use the needinfo flag!) 2013-09-06 07:01:32 PDT
Hey folks, what's the status here? Is this project still prioritized?
Comment 43 Ben Turner (not reading bugmail, use the needinfo flag!) 2013-09-06 07:03:27 PDT
Also, anyone know why bug 911820 (worker doing IO via ctypes) doesn't trigger some kind of abort with write poisoning?
Comment 44 David Rajchenbach-Teller [:Yoric] (please use "needinfo") 2013-09-18 08:07:09 PDT
I suspect that bug 916078 will be necessary to make exit(0) work consistently. Please inform me if you need to prioritize some components.
Comment 45 Terrell Kelley 2013-09-18 18:11:40 PDT
Not quite sure where to stick this, so I'm putting it here on the meta bug.

I just recently learned that Firefox does not write directly to places.sqlite, but to other files. Perhaps we could save some time on shutdown by merging those files not during shutdown but at idle.
Comment 46 David E. Ross 2014-08-09 10:44:59 PDT Comment hidden (typo)
Comment 47 David E. Ross 2014-08-09 10:45:46 PDT Comment hidden (typo)

Note You need to log in before you can comment on or make changes to this bug.