Closed Bug 782072 Opened 12 years ago Closed 2 years ago

Regression in browser process shutdown time

Categories

(Core :: JavaScript: GC, defect, P3)

17 Branch
x86_64
All
defect

Tracking

()

RESOLVED WORKSFORME
Tracking Status
firefox17 + fixed

People

(Reporter: kolubinowicki, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: regression, reproducible, Whiteboard: [Snappy][js:p2])

Attachments

(4 files)

Regression in browser process shutdown time.

Steps to reproduce:
Close browser window with large number of tabs, wait until process disappear in task manager.

Browser closing time (process disappear in task manager) for test session with 1000 tabs:

OK: 13 seconds:

17.0a1 (2012-08-01) http://hg.mozilla.org/mozilla-central/rev/582d4c67b3a7

BAD: 1 min 16 seconds:

17.0a1 (2012-08-02) http://hg.mozilla.org/mozilla-central/rev/588424024294

PUSHLOG:
http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=582d4c67b3a7&tochange=588424024294

Bug also confirmed by user on: http://forums.mozillazine.org/viewtopic.php?f=23&t=2513069
Keywords: regression
Huh, nothing in that window sticks out as an obvious culprit. Could you provide a list of tabs to open that makes this reproducible for you, so we can bisect the range?
It is easily reproducible for me, with as low as one number of tab, even if I close this bug tab (which is only opened now) and close Firefox, I can see in Task Manager that it takes about more than 20 seconds before firefox.exe *32 process vanishes. I does not tried to reproduce this behaviour yet on new profile but I can confirm this happening with my main profile.
I am using latest 08/11 Nightly and I have only Addons installed:
1. Adblock Plus Development Build 2.1.3a3534
2. Element Hiding Helper Development Build 1.2.4a429

And I had turned off Hardware Acceleration (if relevant piece of info). Any further info or help require you can ask me.
I tried uninstalling all addons and run the Nightly in safemode. Still I can reproduce this behavior on Windows 7 x86_64 .
Its not only that the time taken is more, the memory usage also increases in this time.
When only one tab opened, Fx uses around 230 mb, when I close Firefox, the memory increases to 370 mb in around 30 seconds and then the firefox.exe ends.
Agree with Optimizer that before vanishing from Task Manager, Firefox process memory increases. In my case, 132 MB raised to 183 MB and then process ended.
If somebody who can reproduce the problem could run an hg bisect on the range in comment 0, that would be super helpful.
Bug 782117 suggests this could be related to telemetry (although the regression ranges are different by about 10 days).
To confirm that this was due to Telemetry, I unchecked both the send <> data options from the Options > Advanced tab.

Then I closed Firefox, the memory still increased, and after increasing by around 100 mb, the firefox.exe just got stuck at that memory neither increasing, nor decreasing, nor closing. I had to kill it manually.

But when I restarted, and after that everytime, the shutdown became instantaneous. So I think the first time it got confused as the options were unchecked but the service for telemetry was still running (as the change requires a startup, maybe)
Turning off the Telemetry aka Performance Data fixed the issue for me, but I don't have to manually kill process like Optimizer mentioned in Comment 7, it closed automatically after increasing memory for a while.
Component: General → Telemetry
Product: Core → Toolkit
Please retest with Telemetry turned on once Bug 782117 landed.
Depends on: 782117
Attached image Browser shutdown time
Finally I found the reason of regression.

First I spotted huge regression of nightly shutdown in my daily 'dirty' working profile (lot of tabs, many add-ons) then found a regression range. Regression range was confirmed on clean profile/install but additionally with large session restored not as usual by copying sessionstore.js file, but by importing session by add-on Session Manager: 

https://addons.mozilla.org/en-US/firefox/addon/session-manager/

***
So..Regression hurts browser with addon, but why process shutdown time was faster with addon than a stock clean browser ?! 
***


Results (restored 1000 tabs, clean profile, telemetry on):

17.0a1 (2012-08-01) http://hg.mozilla.org/mozilla-central/rev/582d4c67b3a7
	
	With addon enabled!:

		4346 ms (Telemetry.ShutdownTime.txt from profile directory)

		9 s (Process exit in task manager)


	Addon disabled:

 		46769 ms (Telemetry.ShutdownTime.txt from profile directory)

		51 s (Process exit in task manager)

17.0a1 (2012-08-02) http://hg.mozilla.org/mozilla-central/rev/588424024294

	With add-on enabled:

		45376 ms (Telemetry.ShutdownTime.txt from profile directory)

		50 s (Process exit in task manager)


	Add-on disabled:

 		42291 ms (Telemetry.ShutdownTime.txt from profile directory)

		48 s (Process exit in task manager)



To verify that patch from bug 782117  don't fix regression (Comment 9)
18.0a1 (2012-08-28) http://hg.mozilla.org/mozilla-central/rev/8af2ff9c6018
	
	With add-on enabled:

		43057 ms (Telemetry.ShutdownTime.txt from profile directory)

		47 s (Process exit in task manager)


	Add-on disabled:

 		44863 ms (Telemetry.ShutdownTime.txt from profile directory)

		48 s (Process exit in task manager)
Whiteboard: [Snappy]
We're about a week and a half from merging Aurora to Beta so (because of bug 782117) putting Nathan on this bug with Tara as QA contact so we can get the right eyes on this and determine whether this is still something we should be tracking for FF 17.
Assignee: nobody → nfroyd
QA Contact: taras.mozilla
If I'm reading the comments right, this has nothing to do with Telemetry and bug 782117, given the regression range in comment 10.

kolubinowicki, can you try capturing a profile with the built-in profiler:

https://developer.mozilla.org/en-US/docs/Performance/Profiling_with_the_Built-in_Profiler

at the beginning and end of the regression range?  I really don't see anything in the regression range that should affect shutdown time, particularly a regression when certain addons are *enabled*.  A profile would be most helpful to see what's going on.
(In reply to Nathan Froyd (:froydnj) from comment #13)
>(..)can you try capturing a profile with the built-in profiler:
> 
> https://developer.mozilla.org/en-US/docs/Performance/
> Profiling_with_the_Built-in_Profiler
> 

I hope that during browser restarts Gecko profiler works as expected (no needs of using some external profiling programs?)

Steps to reproduce profiling:

-Download and unpack archive that contains: portable firefox + test session + addons (files corresponds to regression range date).

http://dl.dropbox.com/u/23326175/bug/bug782072.7z

-Close any opened firefox window
-Run FirefoxPortable.exe from selected directory
-Install: https://addons.mozilla.org/eN-US/firefox/addon/gecko-profiler/ 
-Restart browser

	Timeframe of profiling:
	-start profiling
	-close browser window
	-wait until firefox process is closed in Task manager
	-start browser 
	-after session is restored stop collecting data

Results:

17.0a1 (2012-08-01) http://hg.mozilla.org/mozilla-central/rev/582d4c67b3a7
Session manager disabled/ no addon:

	Time to close with profiling enabled: 79161 ms (Telemetry.ShutdownTime.txt)
	http://people.mozilla.com/~bgirard/cleopatra/?report=f352da1534305e1c13d0f2697f990c0e48f35b8e


17.0a1 (2012-08-01) http://hg.mozilla.org/mozilla-central/rev/582d4c67b3a7
With add-on enabled

	Time to close with profiling enabled: 7128 ms (Telemetry.ShutdownTime.txt)
	http://people.mozilla.com/~bgirard/cleopatra/?report=d553f5a4dd03ea0219b5aae5fa2d53d22a071c15


17.0a1 (2012-08-02) http://hg.mozilla.org/mozilla-central/rev/588424024294
With add-on enabled

	Time to close with profiling enabled: 83759 ms (Telemetry.ShutdownTime.txt)
	http://people.mozilla.com/~bgirard/cleopatra/?report=d81338ce9b0dba180b02159189621159646d2de5
or
	Time to close with profiling enabled: 80415 ms (Telemetry.ShutdownTime.txt)
	http://people.mozilla.com/~bgirard/cleopatra/?report=ef7086681a4c18b91c0076dbc118d7991b50a23b
STR
 1. Start with newly created clean profile and exit Firefox
 2. Copy sessionstore.js(in bug782072.7z of comment #14) to the profile
 3. Start Firefox with the profile
 4. Click "Restore Previous Session"on about:home tab and wait until the utilization of the CPU is settled down
 5. Exit Firefox by clicking x button
     --- observe process in Task Manager

Actual results:
  It takes about 20-30 sec to close Firefox process

Expected results:
  Process should disappear immediately

Regression window(cached m-c)
Good:
http://hg.mozilla.org/mozilla-central/rev/e5000bb5fa03
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/17.0 Firefox/17.0 ID:20120801011538
Bad:
http://hg.mozilla.org/mozilla-central/rev/bcfaabb6bac8
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/17.0 Firefox/17.0 ID:20120801025038
Pushlog:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=e5000bb5fa03&tochange=bcfaabb6bac8

Regression window(cached m-i)
Good:
http://hg.mozilla.org/integration/mozilla-inbound/rev/3fc0a0b7fdda
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/17.0 Firefox/17.0 ID:20120731182538
Bad:
http://hg.mozilla.org/integration/mozilla-inbound/rev/9abc2abb27b3
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/17.0 Firefox/17.0 ID:20120731205637
Pushlog:
http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=3fc0a0b7fdda&tochange=9abc2abb27b3

In local build
Last good: 3fc0a0b7fdda
First bad: c894763d1bbc
Triggered by:
c894763d1bbc	Bill McCloskey — Bug 778993 - Separate runtime's gcMallocBytes from compartment's gcMallocBytes Bug 778993 - Separate runtime's gcMallocBytes from compartment's gcMallocBytes (r=gwagner)
Blocks: 778993
Status: UNCONFIRMED → NEW
Ever confirmed: true
Component: Telemetry → JavaScript Engine
Product: Toolkit → Core
Version: Trunk → 17 Branch
Bill, do you have an idea of what could be going on here?
------------------!
Please first check GRAPH from attachment! 
------------------!

Original regression?:

03.05.2012 
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:15.0) Gecko/15.0 Firefox/15.0a1
Built from http://hg.mozilla.org/mozilla-central/rev/807403a04a6a

 Session manager disabled

  6s  (Test session) 
  23s (Other session+add-ons)

 Session manager enabled 

  11s (Test session)
  17s (Other session+add-ons)

04.05.2012
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:15.0) Gecko/15.0 Firefox/15.0a1
Built from http://hg.mozilla.org/mozilla-central/rev/9ebf3dc839c5

 Session manager disabled

  23s (Test session) 
  58s (Other session+add-ons)

 Session manager enabled 
  25s (Test session)
  63s (Other session+add-ons)

Pushlog:
http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=807403a04a6a&tochange=9ebf3dc839c5
It doesn't make a lot of sense that that patch would have regressed anything. I thought we killed the browser before doing any final GCs.
(In reply to Bill McCloskey (:billm) from comment #18)
> It doesn't make a lot of sense that that patch would have regressed
> anything. I thought we killed the browser before doing any final GCs.

That is the plan in bug 662444, but at the moment we still call GC, etc just like we do to leak-check debug builds.
(In reply to Bill McCloskey (:billm) from comment #18)
> It doesn't make a lot of sense that that patch would have regressed
> anything. I thought we killed the browser before doing any final GCs.

It looks like the patch (comment 15) regressed browser shutdown time only with Session manager enabled, but from the graph:

https://bug782072.bugzilla.mozilla.org/attachment.cgi?id=668859

the 'original' regression range(see comment 17)is different and increased shutdown time of  browser with or without Session manager and is caused by another patch.
(In reply to kolubinowicki from comment #20)
> (In reply to Bill McCloskey (:billm) from comment #18)
> > It doesn't make a lot of sense that that patch would have regressed
> > anything. I thought we killed the browser before doing any final GCs.
> 
> It looks like the patch (comment 15) regressed browser shutdown time only
> with Session manager enabled, 

I tested in comment #15 without add-ons (Please read STR carefully)
Bill/Nathan - is bug 778993 worth this regression, regardless of the reason it's tickling this bug?
Keywords: reproducible
I cannot reproduce the results given in comment 17.

I tried with two builds: one before bug 778993 landed and one after it.  I set up two profiles, |before| and |after|, and quit Firefox after creating them.  I copied the sessionstore.js file into the relevant profile directories ($HOME/AppData/Roaming/Mozilla/Firefox/Profiles/$PROFILE).  Each profile was started up separately (not simultaneously), allowed to quiesce, and then shutdown.  In both cases, the Firefox window disappeared almost instantly, but Task Manager showed quite some activity for over a minute.  Since I started things from the command line, I timed until the command prompt returned.  (I verified that this is the same as waiting for it to disappear from Task Manager.)  Both builds took about ~1:12 to return to the command prompt.  Even if there was a difference of a few seconds, it's hardly the sort of shutdown regression that's being claimed in the bug.

This was all done on a Asus EeePC, 2x 1.6GHz Atom processor, Windows 7, 2GB RAM, mechanical hard drive.  So it's not as though the computer is winning any speed awards.

The builds were done with a standard nightly config, though they were built on try.  I don't know whether that could potentially affect the timings or not.

So it doesn't look like Bill's changes were responsible.  Unfortunately, that puts us back at square one for determining which changes are responsible.
Umm,
I can reproduce almost same result of comment#15 again and again.
I am sure that c894763d1bbc Bug 778993 regressed this.
Oops!  My comment 23 refers to the steps given in comment 15, not comment 17.  So we have a problem.

Bill, can you please try to reproduce with the steps given in comment 15?
Flags: needinfo?(wmccloskey)
I can also reproduce on ubuntu12.04.1.
I confirmed that regression range is as same as  comment #15.
And in clocal build(32bit ubuntu12.04.1), c894763d1bbc Bug 778993 causes this.
OS: Windows 7 → All
That's strange, because I see no difference here on a 64-bit Ubuntu build.
I tried this on a 64-bit Linux opt build. It took 13 seconds to shut down in both cases. I'll try to measure if we do more GCs before versus after.
Flags: needinfo?(wmccloskey)
(In reply to Bill McCloskey (:billm) from comment #28)
> I tried this on a 64-bit Linux opt build. It took 13 seconds to shut down in
> both cases. I'll try to measure if we do more GCs before versus after.

You should try with new profile(do not install addons , do not change preferences).
You should measure time to disappear process with System Monitor and stopwatch.
Yes, I used a fresh profile. One difference from your STR is that, after clicking the close box, it put up a dialog box and I had to click "Save and Quit". I timed how long it took between clicking "Save and Quit" and the process exiting.

I also checked how many GCs we do, and how many compartments we collect in each of them. It was basically the same.
I just tried this again with a 32-bit Linux opt build and I got about 12 seconds in both cases.
(In reply to Bill McCloskey (:billm) from comment #31)
> I just tried this again with a 32-bit Linux opt build and I got about 12
> seconds in both cases.


32-bit Linux opt build? builds from before and after landing c894763d1bbc?
(In reply to Alice0775 White from comment #32)
> 32-bit Linux opt build? builds from before and after landing c894763d1bbc?

Yes, I tried revisions 3fc0a0b7fdda and c894763d1bbc.

Does Firefox shut down instantly with 3fc0a0b7fdda on Linux, Alice?

On Linux, I just measured the time between quitting firefox and when I get a fresh command line prompt (i.e., when the process exits).
(In reply to Bill McCloskey (:billm) from comment #33)
> (In reply to Alice0775 White from comment #32)

> Does Firefox shut down instantly with 3fc0a0b7fdda on Linux, Alice?
Yes. I can start Firefox almost instantly after the Browser(3fc0a0b7fdda) window disappeared.


Build from 3fc0a0b7fdda: 
Elapsed Time: topic
-: Click X to close Firefox
0sec: Click "Close Tabs" of "Confirm close" dialog
3Sec: Browser Window disappeared
4-5Sec: Process disappeared in System Monitor

Build from c894763d1bbc
Elapsed Time: topic
-: Click X to close Firefox
0sec: Click "Close Tabs" of "Confirm close" dialog
3Sec: Browser Window disappeared
15-20Sec: Process disappeared in System Monitor
Alice, what does about:buildconfig say for your Linux and Windows builds?  And which sessionstore.js file are you using from the .7z file?  I count three, two of which are the same.  We're clearly getting very different answers; perhaps there's something in the test setups that are quite different.
(In reply to Nathan Froyd (:froydnj) from comment #35)
> Alice, what does about:buildconfig say for your Linux and Windows builds? 


[Linux]
Build platform
target
i686-pc-linux-gnu
Build tools
Compiler 	Version 	Compiler flags
gcc 	gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5) 	-pedantic -Wall -Wpointer-arith -Wdeclaration-after-statement -Werror=return-type -Wtype-limits -Wempty-body -Wno-unused -Wno-overlength-strings -Wcast-align -Wno-long-long -fno-strict-aliasing -ffunction-sections -fdata-sections -pthread -pipe -DNDEBUG -DTRIMMED -g -Os -freorder-blocks -fomit-frame-pointer
c++ 	gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5) 	-pedantic -Wall -Wpointer-arith -Woverloaded-virtual -Werror=return-type -Wtype-limits -Wempty-body -Wno-ctor-dtor-privacy -Wno-overlength-strings -Wno-invalid-offsetof -Wno-variadic-macros -Wcast-align -Wno-long-long -fno-exceptions -fno-strict-aliasing -fno-rtti -ffunction-sections -fdata-sections -fno-exceptions -std=gnu++0x -pthread -pipe -DNDEBUG -DTRIMMED -g -Os -freorder-blocks -fomit-frame-pointer
Configure arguments

--disable-crashreporter --enable-application=browser --enable-libxul --disable-xpconnect-idispatch --disable-activex --disable-activex-scripting --disable-accessibility


[Windows7]
Build platform
target
i686-pc-mingw32
Build tools
Compiler 	Version 	Compiler flags
d:/srcinbound/ff-dbg/_virtualenv/Scripts/python.exe -O /d/srcinbound/build/cl.py cl 	16.00.40219.01 	-TC -nologo -W3 -Gy -Fdgenerated.pdb -we4553 -DNDEBUG -DTRIMMED -Zi -UDEBUG -DNDEBUG -O1 -Oy
d:/srcinbound/ff-dbg/_virtualenv/Scripts/python.exe -O /d/srcinbound/build/cl.py cl 	16.00.40219.01 	-TP -nologo -W3 -Gy -Fdgenerated.pdb -wd4800 -we4553 -GR- -DNDEBUG -DTRIMMED -Zi -UDEBUG -DNDEBUG -O1 -Oy
Configure arguments

--enable-application=browser --disable-tests --enable-optimize --disable-mochitest --disable-xpconnect-idispatch --disable-activex --disable-activex-scripting --disable-accessibility --enable-jemalloc

> And which sessionstore.js file are you using from the .7z file?  I count
> three, two of which are the same.

bug782072.7z\bug782072\bad_17.0a1 (2012-08-02) 588424024294\FirefoxPortable\Data\profile\sessionstore.js in bug782072.7z of comment #14

>  We're clearly getting very different
> answers; perhaps there's something in the test setups that are quite
> different.

Steps to reproduce
1. Start Firefox -P
2. Create a profile as usual procedure
3. Start Nightly -P  and choose the created Profile above, Choose "No" on  "Default browser" dialog
4. Quit Nightly: Click "x" button at top of browser, Click "Close tabs"  on  "Confirm close" dialog

5. Copy(overwrite) sessionstore.js into the profile folder
6. Start Nightly -P  and choose the profile folder, Choose "No" on  "Default browser" dialog
7. Switch "Nightly Start Page"(2nd tab) and Click "Restore Previous Session"
8. Wait to settle down CPU usage 

9. Quit Nightly: Click "x" button at top of browser, Click "Close tabs"  on  "Confirm close" dialog
10.Observe browser window disappear and Process disappear in System Monitor
Attached file WinDbg Log
FYI

This is WinDbg Log of Firefox17Beta(http://hg.mozilla.org/releases/mozilla-beta/rev/957d1e48ef80)

Log when I killed Firefox from WinDbg while the process is still running after the browser window disappeared.
I backed out bug 778993 in bug 806087 because of some regressions related to sync. Also, I accidentally referred to this bug in the commit message.
And by bug 806087, you really meant bug 806644, right? :-P
Nevermind, got a little mixed up myself.
This should be fixed by the backout in bug 806644
I had to back out bug 778993, but now I've relanded it with some fixes.

Alice, could you see if the problem went away with the backout, and then see if it reappears now with the relanding?
(In reply to Bill McCloskey (:billm) from comment #42)
> I had to back out bug 778993, but now I've relanded it with some fixes.
> 
> Alice, could you see if the problem went away with the backout, and then see
> if it reappears now with the relanding?

(In reply to Bill McCloskey (:billm) from comment #19)
> Relanded:
> https://hg.mozilla.org/integration/mozilla-inbound/rev/63defe9bc7d5
> 
> I also decided to change the malloc quota for an individual compartment to
> be half that of the runtime. This seems to help in practice with sync.

The problem came back again :(
Good build:  4-5Sec
Bad build: 15-20Sec

Good:
http://hg.mozilla.org/integration/mozilla-inbound/rev/7c56dd925c25
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:19.0) Gecko/19.0 Firefox/19.0 ID:20121102145145

Bad:
http://hg.mozilla.org/integration/mozilla-inbound/rev/63defe9bc7d5
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:19.0) Gecko/19.0 Firefox/19.0 ID:20121102150845
Whiteboard: [Snappy] → [Snappy][js:p2]

The bug assignee didn't login in Bugzilla in the last 7 months, so the assignee is being reset.

Assignee: froydnj+bz → nobody

Looks like this was fixed by a backout almost a decade ago.

Status: NEW → RESOLVED
Closed: 2 years ago
Component: JavaScript Engine → JavaScript: GC
Resolution: --- → WORKSFORME
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: