Open Bug 1445020 Opened 7 years ago Updated 2 years ago

Slow shutdown of debug Firefox (~30s) builds after loading http://talend.surge.sh

Categories

(Core :: JavaScript: GC, defect, P2)

60 Branch
defect

Tracking

()

Performance Impact none
Tracking Status
firefox61 --- fix-optional

People

(Reporter: whimboo, Unassigned)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

Attached file marionette test
While investigating a geckodriver issue (https://github.com/mozilla/geckodriver/issues/962) I noticed a very slow shutdown of my local debug Firefox (artifact build) when just a page gets loaded, and Firefox is immediately quit afterward. Steps to reproduce: 1) Build a debug version of Firefox (if necessary use artifact build) 2) Save the attachment to the local disk 3) Run: `mach marionette-test --gecko-log - -vv path_to_test Here an excerpt from the log: [Parent 13676, Main Thread] WARNING: '!mMainThread', file /builds/worker/workspace/build/src/xpcom/threads/nsThreadManager.cpp, line 517 [Parent 13676, Main Thread] WARNING: '!mMainThread', file /builds/worker/workspace/build/src/xpcom/threads/nsThreadManager.cpp, line 517 [Parent 13676, Main Thread] WARNING: '!mMainThread', file /builds/worker/workspace/build/src/xpcom/threads/nsThreadManager.cpp, line 517 [Parent 13676, Main Thread] WARNING: '!mMainThread', file /builds/worker/workspace/build/src/xpcom/threads/nsThreadManager.cpp, line 517 [Parent 13676, Main Thread] WARNING: '!mMainThread', file /builds/worker/workspace/build/src/xpcom/threads/nsThreadManager.cpp, line 517 [Parent 13676, Main Thread] WARNING: '!mMainThread', file /builds/worker/workspace/build/src/xpcom/threads/nsThreadManager.cpp, line 517 [Parent 13676, Main Thread] WARNING: '!mMainThread', file /builds/worker/workspace/build/src/xpcom/threads/nsThreadManager.cpp, line 517 [Parent 13676, Main Thread] WARNING: '!mMainThread', file /builds/worker/workspace/build/src/xpcom/threads/nsThreadManager.cpp, line 517 --DOMWINDOW == 7 (0x11b21b800) [pid = 13676] [serial = 2] [outer = 0x0] [url = about:blank] --DOMWINDOW == 6 (0x1168c7c00) [pid = 13676] [serial = 7] [outer = 0x0] [url = about:blank] --DOMWINDOW == 5 (0x128e78000) [pid = 13676] [serial = 8] [outer = 0x0] [url = about:blank] --DOMWINDOW == 4 (0x12714ec00) [pid = 13676] [serial = 5] [outer = 0x0] [url = about:blank] --DOMWINDOW == 3 (0x124577800) [pid = 13676] [serial = 3] [outer = 0x0] [url = chrome://browser/content/browser.xul] --DOMWINDOW == 2 (0x10e278800) [pid = 13676] [serial = 1] [outer = 0x0] [url = chrome://browser/content/hiddenWindow.xul] --DOMWINDOW == 1 (0x12e69a000) [pid = 13676] [serial = 12] [outer = 0x0] [url = chrome://extensions/content/dummy.xul] --DOMWINDOW == 0 (0x12e69c400) [pid = 13676] [serial = 9] [outer = 0x0] [url = chrome://extensions/content/dummy.xul] --DOMWINDOW == 4 (0x114884000) [pid = 13678] [serial = 1] [outer = 0x0] [url = http://talend.surge.sh/components/?selectedKind=ActionDropdown&selectedStory=default&full=0&addons=1&stories=1&panelRight=0&addonPanel=storybook%2Factions%2Factions-panel] --DOMWINDOW == 3 (0x1159d6000) [pid = 13678] [serial = 5] [outer = 0x0] [url = http://talend.surge.sh/components/iframe.html?selectedKind=ActionDropdown&selectedStory=default] The hang actually happens after the `dummy.xul` log entry is printed. So something in that page seem to cause a slow shutdown. I was not able to reproduce it outside of the test harness yet.
Flags: needinfo?(amarchesini)
Sorry, you don't need Marionette for that. For testing manually I accidentally used the wrong build. So here the updated steps: 1) Start a debug build of Firefox 2) Load http://talend.surge.sh/components/?selectedKind=ActionDropdown&selectedStory=default&full=0&addons=1&stories=1&panelRight=0&addonPanel=storybook%2Factions%2Factions-panel 3) Quit Firefox
It seems that the content process is busy doing GC. Here some stack traces: https://pastebin.mozilla.org/9079864 https://pastebin.mozilla.org/9079865 https://pastebin.mozilla.org/9079866 Eventually, the content process is killed by the parent one. This is the last stack trace on the main thread: https://pastebin.mozilla.org/9079869
Flags: needinfo?(amarchesini)
Flags: needinfo?(continuation)
The shutdown CC we do for leakchecking purposes is finishing an incremental GC in progress, which is apparently taking a really long time. Jon, maybe you could take a look? The page itself doesn't look ridiculous so I wouldn't expect a GC to take so long.
Component: DOM → JavaScript: GC
Flags: needinfo?(continuation) → needinfo?(jcoppeard)
Sure. A couple of those stack traces are in AtomIsPinnedInRuntime() and in particular in some debug assertions that might be quite heavy (AtomIsPinnedInRuntime is itself only used in an assertion). Bug 1445196 should improve this case considerably.
Flags: needinfo?(jcoppeard)
Summary: Slow shutdown of debug Firefox (~30s) with Marionette test → Slow shutdown of debug Firefox (~30s) builds after loading http://talend.surge.sh
Priority: -- → P2
Does this still happen now that bug 1445196 has landed?
Flags: needinfo?(hskupin)
No, the hang is still present with: mozversion application_buildid: 20180319144107 mozversion application_changeset: 4f1014eb5039bdfdd7a39fb7785d102df1994a6f
Flags: needinfo?(hskupin)
(In reply to Henrik Skupin (:whimboo) from comment #6) OK, can you get some stack traces for this?
(In reply to Jon Coppeard (:jonco) from comment #7) > (In reply to Henrik Skupin (:whimboo) from comment #6) > OK, can you get some stack traces for this? I will forward to Andrea given that I don't know how to do it. Also please note that this is kinda easy to reproduce. See comment 1.
Flags: needinfo?(amarchesini)
With the last nighthly debug build, I see this crash: WARNING: YOU ARE LEAKING THE WORLD (at least one JSRuntime and everything alive inside it, that is) AT JS_ShutDown TIME. FIX THIS! [Parent 15252, Main Thread] ###!!! ASSERTION: Component Manager being held past XPCOM shutdown.: 'cnt == 0', file /home/baku/Sources/m/barfoo/src/xpcom/build/XPCOMInit.cpp, line 1034 firefox: fccache.c:534: FcCacheFini: Assertion `fcCacheChains[i] == NULL' failed.
Flags: needinfo?(amarchesini) → needinfo?(jcoppeard)
(In reply to Andrea Marchesini [:baku] from comment #9) That's a separate issue: we recently made this crash where it previously printed a warning.
Flags: needinfo?(jcoppeard)
Whiteboard: [qf]
Tentatively qf-'ing because this is for a debug build. Hey whimboo, do we have any suspicion here that there's any opt-build / real-end-user impact here?
Flags: needinfo?(hskupin)
(In reply to Mike Conley (:mconley) (:⚙️) (Totally backlogged on reviews and needinfos) from comment #11) > Tentatively qf-'ing because this is for a debug build. Hey whimboo, do we > have any suspicion here that there's any opt-build / real-end-user impact > here? This is happening in the shutdown GCs we only do in debug builds for leak checking purposes, IIRC.
So is it fair to say that there's likely a real leak here that our opt-users will be suffering?
Flags: needinfo?(continuation)
Or am I interpreting comment 12 incorrectly?
(In reply to Mike Conley (:mconley) (:⚙️) (Totally backlogged on reviews and needinfos) from comment #13) > So is it fair to say that there's likely a real leak here that our opt-users > will be suffering? If you close the browser without closing a tab first (basically for any page), we leak the page. This has been true for a long time, and I don't think it corresponds to anything that will affect a user, as we only really care if a page goes away when you close a tab and then continue to use the browser.
Flags: needinfo?(continuation)
It looks like that there is nothing I can add here.
Flags: needinfo?(hskupin)
Whiteboard: [qf] → [qf-]
Blocks: 1633342
No longer blocks: 1633342
Blocks: 1633342
No longer blocks: 1435343
Performance Impact: --- → -
Whiteboard: [qf-]
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: