Closed Bug 1401379 Opened 7 years ago Closed 7 years ago

Synchronize outer window and docshell object lifetimes

Categories

(Core :: DOM: Core & HTML, enhancement, P3)

enhancement

Tracking

()

RESOLVED FIXED
mozilla58
Tracking Status
firefox58 --- fixed

People

(Reporter: nika, Assigned: nika)

References

Details

Attachments

(7 files, 10 obsolete files)

4.84 KB, patch
Details | Diff | Splinter Review
6.79 KB, patch
Details | Diff | Splinter Review
3.11 KB, patch
Details | Diff | Splinter Review
1.13 KB, patch
Details | Diff | Splinter Review
1.73 KB, patch
Details | Diff | Splinter Review
6.69 KB, patch
Details | Diff | Splinter Review
3.39 KB, patch
smaug
: review+
Details | Diff | Splinter Review
khuey's plan for bug 558192 was to move the outer window's methods and state into the nsDocShell, and change consumers of the code to using nsDocShell. This should work as there is a 1-1 relationship between nsDocShell and outer window objects. Unfortunately, they can have different lifetimes, meaning outer nsGlobalWindow objects can outlive their docshells and vice-versa. It would be much easier to move methods from nsGlobalWindow into nsDocShell if we didn't have to worry about checking for these disconnected states. To solve this, I am currently planning to make outer nsGlobalWindows share a reference count with their nsDocShell, such that they have the same lifetime, which should make it much easier to move methods over.
What is your timeline for this? I was hoping to get bug 1293277 reviewed and landed in the next few weeks. I'm worried significant docshell/window changes might make this difficult. Or maybe its not that big a change?
Priority: -- → P3
The idea here is that if we're going to be moving outer window data from nsGlobalWindow into the DocShell we're going to need to make sure we can hold strong references to the docshell in places where we currently hold the outer window, which means that nsDocShell needs to be cycle collected. This patch set does the work to make nsDocShell cycle collected, and then adds a strong edge from nsGlobalWindow to the docshell which will need cycle collection to break.
Attachment #8911975 - Flags: review?(bugs)
See Also: → 372107
Comment on attachment 8911975 [details] [diff] [review] Part 1: Make nsDocShell and nsDocLoader cycle collected >+NS_IMPL_CYCLE_COLLECTION_INHERITED(nsDocShell, >+ nsDocLoader, >+ mSessionStorageManager, >+ mScriptGlobal, >+ mContentViewer, >+ mParentWidget) mParentWidget sure isn't cycle collectable, nor is mContentViewer. >+NS_IMPL_CYCLE_COLLECTION(nsDocLoader, >+ mDocumentRequest, >+ mLoadGroup, >+ mChildrenInOnload) >+ Are mDocumentRequest and mLoadGroup really cycle collectable? I don't think so. Cycle collect only cycle collectable member variables.
Attachment #8911975 - Flags: review?(bugs) → review+
Comment on attachment 8911976 [details] [diff] [review] Part 2: Add more cycle collector edges for nsDocShell >+NS_IMPL_CYCLE_COLLECTION_CLASS(TabChild) >+ >+NS_IMPL_CYCLE_COLLECTION_UNLINK_BEGIN_INHERITED(TabChild, TabChildBase) >+ NS_IMPL_CYCLE_COLLECTION_UNLINK(mWebNav) >+NS_IMPL_CYCLE_COLLECTION_UNLINK_END >+ >+NS_IMPL_CYCLE_COLLECTION_TRAVERSE_BEGIN_INHERITED(TabChild, TabChildBase) >+ NS_IMPL_CYCLE_COLLECTION_TRAVERSE(mWebNav) >+NS_IMPL_CYCLE_COLLECTION_TRAVERSE_END >+ >+NS_IMPL_CYCLE_COLLECTION_TRACE_BEGIN_INHERITED(TabChild, TabChildBase) >+NS_IMPL_CYCLE_COLLECTION_TRACE_END I don't understand all these changes. mWebNav doesn't point to a cycle collectable object, so why all this work?
Attachment #8911976 - Flags: review?(bugs) → review-
Attachment #8911977 - Flags: review?(bugs) → review+
Attachment #8911978 - Flags: review?(bugs) → review+
(In reply to Olli Pettay [:smaug] from comment #7) > I don't understand all these changes. mWebNav doesn't point to a cycle > collectable object, so why all this work? mWebNav is a nsIWebNavigation. I'd believe you if you told me that the mWebNav on TabChild is always, say, a nsWebBrowser, but that interface is also on nsDocShell, and I was being paranoid and trying to add every possible edge.
Well, you can easily check the only place where value is assigned to mWebNav. Another thing, need to think if DocLoader/Shell should be skippable.
(In reply to Ben Kelly [:bkelly] from comment #1) > What is your timeline for this? I was hoping to get bug 1293277 reviewed > and landed in the next few weeks. I'm worried significant docshell/window > changes might make this difficult. Or maybe its not that big a change? FYI bkelly, The scope of this bug ended up changing a lot and I doubt this will be a big disruptive change for you. This bug now simply makes nsDocShell cycle collected, and adds a strong edge from nsGlobalWindow back to nsDocShell.
Attachment #8911975 - Attachment is obsolete: true
Attachment #8911976 - Attachment is obsolete: true
I removed the unnecessary changes to TabChild
Attachment #8912303 - Flags: review?(bugs)
(In reply to Olli Pettay [:smaug] from comment #9) > Another thing, need to think if DocLoader/Shell should be skippable. It wouldn't hurt, though I'd guess that only the docloader/shell would be in the graph. Wouldn't it just be "is the window I hold onto skippable?"
Comment on attachment 8912303 [details] [diff] [review] Part 2: Add more cycle collector edges for nsDocShell We don't have more docshell member variables on cycle collectable objects?
Attachment #8912303 - Flags: review?(bugs) → review+
(In reply to Olli Pettay [:smaug] from comment #14) > Comment on attachment 8912303 [details] [diff] [review] > Part 2: Add more cycle collector edges for nsDocShell > > We don't have more docshell member variables on cycle collectable objects? I thought so before I tried doing the next thing which I wanted to do, which was remove the mDocShell = nullptr and mScriptGlobal = nullptr calls from nsDocShell/nsGlobalWindow, which caused us to start leaking windows/docshells like mad. Not sure why yet. This patch by itself doesn't seem to cause leaks, but to be safe I figure I'll hold off from landing it until I can fix the case where we don't break the link manually.
MozReview-Commit-ID: 8Ze3KrgpaTQ
Attachment #8913819 - Flags: review?(bugs)
MozReview-Commit-ID: 1hteVsTlTvd
Attachment #8913820 - Flags: review?(bugs)
MozReview-Commit-ID: CnAiOaAOTkw
Attachment #8913821 - Flags: review?(bugs)
MozReview-Commit-ID: CyNanVzyyNp
Attachment #8913822 - Flags: review?(bugs)
Comment on attachment 8913819 [details] [diff] [review] Part 5: Don't check for a docshell to determine if a window is alive >+ bool IsCleanedUp() { { to its own line
Attachment #8913819 - Flags: review?(bugs) → review+
Attachment #8913820 - Flags: review?(bugs) → review+
Attachment #8913821 - Flags: review?(bugs) → review+
Attachment #8913822 - Flags: review?(bugs) → review+
Pushed by michael@thelayzells.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/e83481bd6963 Part 1: Make nsDocShell and nsDocLoader cycle collected, r=smaug https://hg.mozilla.org/integration/mozilla-inbound/rev/9b5b03b2114d Part 2: Add more cycle collector edges for nsDocShell, r=smaug https://hg.mozilla.org/integration/mozilla-inbound/rev/4249adc2593a Part 3: Make nsGlobalWindow's reference to nsDocShell strong, r=smaug https://hg.mozilla.org/integration/mozilla-inbound/rev/ef2dfdd829bc Part 4: Be consistent with nsDocLoader's canonical ISupports, r=smaug https://hg.mozilla.org/integration/mozilla-inbound/rev/0eb4a7009f43 Part 5: Don't check for a docshell to determine if a window is alive, r=smaug https://hg.mozilla.org/integration/mozilla-inbound/rev/8a26cdedeb58 Part 6: Cycle collect nsWebBrowser, r=smaug https://hg.mozilla.org/integration/mozilla-inbound/rev/436ecd081ffa Part 7: Don't try to break nsDocShell <-> nsGlobalWindow references when cleaning up, r=smaug https://hg.mozilla.org/integration/mozilla-inbound/rev/318b85f85682 Part 8: Avoid using mDocShell to check if the window has been closed, r=smaug
https://hg.mozilla.org/integration/mozilla-inbound/rev/4249adc2593a + // This reference is used by nsGlobalWindow. + // nsIDocShell* mDocShell; + nsCOMPtr<nsIDocShell> mDocShell; Is there a reason for the middle line?
Backed out for failing wpt-reftest /infrastructure/reftest-wait.html and unexpected passes in /webvtt/ and leaks in browser-chrome, all on OS X and Windows: https://hg.mozilla.org/integration/mozilla-inbound/rev/abf8e429f50d041ad747820df96f4e9a671d75c0 https://hg.mozilla.org/integration/mozilla-inbound/rev/09bed3f1e1adc2070648fee56fb5230c9018bfca https://hg.mozilla.org/integration/mozilla-inbound/rev/b04f52783ed5e7dc39882a95c2d0db48b2fd8cce https://hg.mozilla.org/integration/mozilla-inbound/rev/9d008ed9e7e4a8b0e0c9264ce0e7125a959a7e2e https://hg.mozilla.org/integration/mozilla-inbound/rev/4dea37e579a793624f1c9d4068c7b80f23853035 https://hg.mozilla.org/integration/mozilla-inbound/rev/200ae65803f738eec82653428f3a7dd54f045fc3 https://hg.mozilla.org/integration/mozilla-inbound/rev/643c9e4af60dadd89bbe794769a01c3ae968c8fc https://hg.mozilla.org/integration/mozilla-inbound/rev/7fd443a4bc1472a32986bf2c1517b2d1ecce7251 Follow-up push which ran many of the failing tests (at least Wr also show up on the initial one): https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=ddbf0b0a03d8a8ff27170c59654f1445bce4437d&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=retry&filter-resultStatus=usercancel&filter-resultStatus=runnable Failure log Wr: https://treeherder.mozilla.org/logviewer.html#?job_id=134429428&repo=mozilla-inbound TEST-UNEXPECTED-FAIL | /infrastructure/reftest-wait.html | Testing http://web-platform.test:8000/infrastructure/reftest-wait.html == http://web-platform.test:8000/infrastructure/reftest-wait-ref.html TEST-UNEXPECTED-PASS | /webvtt/rendering/cues-with-video/processing-model/basic.html | Testing http://web-platform.test:8000/webvtt/rendering/cues-with-video/processing-model/basic.html == http://web-platform.test:8000/webvtt/rendering/cues-with-video/processing-model/basic-ref.html TEST-UNEXPECTED-PASS | /webvtt/rendering/cues-with-video/processing-model/bidi/bidi_ruby.html | Testing http://web-platform.test:8000/webvtt/rendering/cues-with-video/processing-model/bidi/bidi_ruby.html == http://web-platform.test:8000/webvtt/rendering/cues-with-video/processing-model/bidi/bidi_ruby-ref.html TEST-UNEXPECTED-PASS | /webvtt/rendering/cues-with-video/processing-model/bidi/u0041_first.html | Testing http://web-platform.test:8000/webvtt/rendering/cues-with-video/processing-model/bidi/u0041_first.html == http://web-platform.test:8000/webvtt/rendering/cues-with-video/processing-model/bidi/u0041_first-ref.html TEST-UNEXPECTED-PASS | /webvtt/rendering/cues-with-video/processing-model/bidi/u06E9_no_strong_dir.html | Testing http://web-platform.test:8000/webvtt/rendering/cues-with-video/processing-model/bidi/u06E9_no_strong_dir.html == http://web-platform.test:8000/webvtt/rendering/cues-with-video/processing-model/bidi/u06E9_no_strong_dir-ref.html Failure log browser-chrome: https://treeherder.mozilla.org/logviewer.html#?job_id=134430319&repo=mozilla-inbound > TEST-UNEXPECTED-FAIL | leakcheck | default process: 610618 bytes leaked (Annotators, Array, AsyncLatencyLogger, CSPService, CSSMediaRule, ...)
Flags: needinfo?(nika)
(In reply to :Ms2ger (⌚ UTC+1/+2) from comment #22) > Is there a reason for the middle line? Nope - I'll fix that when I re-land.
The XPCshell failures like https://treeherder.mozilla.org/logviewer.html#?job_id=134451633&repo=mozilla-inbound are also from the patches for this bug: 18:15:56 INFO - TEST-START | xpcshell.ini:toolkit/components/extensions/test/xpcshell/test_ext_contentscript_xrays.js 18:15:56 INFO - mozcrash Saved minidump as Z:\task_1506967341\build\blobber_upload_dir\18579176-5d0a-4e2b-93b1-3ce091a0452e.dmp 18:15:56 INFO - mozcrash Saved app info as Z:\task_1506967341\build\blobber_upload_dir\18579176-5d0a-4e2b-93b1-3ce091a0452e.extra 18:15:56 WARNING - PROCESS-CRASH | xpcshell-remote.ini:toolkit/components/extensions/test/xpcshell/test_ext_permissions.js | application crashed [@ js::`dynamic atexit destructor for 'sProtectedRegions'' + 0x45] 18:15:56 INFO - Crash dump filename: c:\users\genericworker\appdata\local\temp\xpc-other-yaltfs\18579176-5d0a-4e2b-93b1-3ce091a0452e.dmp 18:15:56 INFO - Operating system: Windows NT 18:15:56 INFO - 10.0.15063 18:15:56 INFO - CPU: amd64 18:15:56 INFO - family 6 model 63 stepping 2 18:15:56 INFO - 8 CPUs 18:15:56 INFO - GPU: UNKNOWN 18:15:56 INFO - Crash reason: EXCEPTION_BREAKPOINT 18:15:56 INFO - Crash address: 0x7ff8a8b2b0d5 18:15:56 INFO - Assertion: Unknown assertion type 0x00000000 18:15:56 INFO - Process uptime: 19 seconds 18:15:56 INFO - Thread 0 (crashed) 18:15:56 INFO - 0 xul.dll!js::`dynamic atexit destructor for 'sProtectedRegions'' + 0x45 18:15:56 INFO - rax = 0x0000000000000000 rdx = 0x0000000000000000 18:15:56 INFO - rcx = 0x00000000ffffffff rbx = 0x000002d0fa83f7a8 18:15:56 INFO - rsi = 0x000002d0fa83f310 rdi = 0x0000fa63dde6bfd7 18:15:56 INFO - rbp = 0x000002d0fa83f870 rsp = 0x000000f5d19ff460 18:15:56 INFO - r8 = 0x000000f5d19fde18 r9 = 0x000002d0fc3566ed 18:15:56 INFO - r10 = 0x0000000000000000 r11 = 0x000000f5d19ff2e0 18:15:56 INFO - r12 = 0x0000000000000000 r13 = 0x0000000000000040 18:15:56 INFO - r14 = 0x000000f5d19ff568 r15 = 0x000002d0fa83f310 18:15:56 INFO - rip = 0x00007ff8a8b2b0d5 18:15:56 INFO - Found by: given as instruction pointer in context 18:15:56 INFO - 1 ucrtbase.dll!SHATransform + 0x5a3 18:15:56 INFO - rbx = 0x000002d0fa83f7a8 rbp = 0x000002d0fa83f870 18:15:56 INFO - rsp = 0x000000f5d19ff490 r12 = 0x0000000000000000 18:15:56 INFO - r13 = 0x0000000000000040 r14 = 0x000000f5d19ff568 18:15:56 INFO - r15 = 0x000002d0fa83f310 rip = 0x00007ff8d2dc8673 18:15:56 INFO - Found by: call frame info 18:15:56 INFO - 2 ucrtbase.dll!SHATransform + 0x6ae 18:15:56 INFO - rbx = 0x000002d0fa83f7a8 rbp = 0x000002d0fa83f870 18:15:56 INFO - rsp = 0x000000f5d19ff520 r12 = 0x0000000000000000 18:15:56 INFO - r13 = 0x0000000000000040 r14 = 0x000000f5d19ff568 18:15:56 INFO - r15 = 0x000002d0fa83f310 rip = 0x00007ff8d2dc877e 18:15:56 INFO - Found by: call frame info 18:15:56 INFO - Thread 1 18:15:56 INFO - 0 ntdll.dll!NtWaitForWorkViaWorkerFactory + 0x14 18:15:56 INFO - rax = 0x000000f5d1bff922 rdx = 0x000002d0fa841c30 18:15:56 INFO - rcx = 0x000000f5d1bffa40 rbx = 0x000002d0fa841c30 18:15:56 INFO - rsi = 0x0000000000000010 rdi = 0x000002d0fa841fb0 18:15:56 INFO - rbp = 0x0000000000000000 rsp = 0x000000f5d1bff8e8 18:15:56 INFO - r8 = 0x0000000000000001 r9 = 0x0000000000000010 18:15:56 INFO - r10 = 0x0000000000000000 r11 = 0x0000000000000246 18:15:56 INFO - r12 = 0x0000000000000000 r13 = 0x000002d0fa814570 18:15:56 INFO - r14 = 0x00007ff8d6252a60 r15 = 0x00007ff8d6252f00 18:15:56 INFO - rip = 0x00007ff8d62b8c44 18:15:56 INFO - Found by: given as instruction pointer in context 18:15:56 INFO - 1 ntdll.dll!TppWorkerThread + 0x293 18:15:56 INFO - rbx = 0x000002d0fa841c30 rbp = 0x0000000000000000 18:15:56 INFO - rsp = 0x000000f5d1bff8f0 r12 = 0x0000000000000000 18:15:56 INFO - r13 = 0x000002d0fa814570 r14 = 0x00007ff8d6252a60 18:15:56 INFO - r15 = 0x00007ff8d6252f00 rip = 0x00007ff8d6251563 18:15:56 INFO - Found by: call frame info 18:15:56 INFO - 2 kernel32.dll!LdrpLoadResourceFromAlternativeModule + 0x27c 18:15:56 INFO - rbx = 0x000002d0fa841c30 rbp = 0x0000000000000000 18:15:56 INFO - rsp = 0x000000f5d1bffc00 r12 = 0x0000000000000000 18:15:56 INFO - r13 = 0x000002d0fa814570 r14 = 0x00007ff8d6252a60 18:15:56 INFO - r15 = 0x00007ff8d6252f00 rip = 0x00007ff8d5042774 18:15:56 INFO - Found by: call frame info 18:15:56 INFO - Thread 2 18:15:56 INFO - <no frames> 18:15:56 INFO - Thread 3 18:15:56 INFO - <no frames> 18:15:56 INFO - Thread 4 18:15:56 INFO - <no frames> 18:15:56 INFO - Thread 5 18:15:56 INFO - <no frames> 18:15:56 INFO - Thread 6 18:15:56 INFO - <no frames> 18:15:56 INFO - Thread 7 18:15:56 INFO - <no frames> 18:15:56 INFO - Thread 8 18:15:56 INFO - <no frames> 18:15:56 INFO - Thread 9 18:15:56 INFO - <no frames> 18:15:56 INFO - Thread 10 18:15:56 INFO - <no frames>
I'm pretty stuck with fixing the leaks caused by this patch (e.g. https://treeherder.mozilla.org/logviewer.html#?job_id=137655664&repo=try&lineNumber=4264). The main reason is bug 1409466 which makes this very intermittent leak much harder to debug. ni? :mccr8 who might have some ideas of how I could attack this.
Flags: needinfo?(nika) → needinfo?(continuation)
One approach would be to have some global table that tracks the set of live nsGlobalWindows. Then, late in shutdown (like in nsTraceRefCnt::Shutdown or some such) print out this table along with the pid. Then you can use the ++DOMWINDOW stuff to figure out which test is creating the window. Hopefully it is consistently the same test. Then disable every test except the one that is leaking, maybe copying it 3 or 4 times, and run it locally to see if you can reproduce it more easily.
Flags: needinfo?(continuation)
I want to land this so I can move on with the nsGlobalWiudow splitting, so I'm moving the part which stops us from breaking the link between nsGlobalWindow and nsDocShell into a different bug. MozReview-Commit-ID: Bui924n8Kvn
Attachment #8911977 - Attachment is obsolete: true
Attachment #8911978 - Attachment is obsolete: true
Attachment #8912302 - Attachment is obsolete: true
Attachment #8912303 - Attachment is obsolete: true
Attachment #8913819 - Attachment is obsolete: true
Attachment #8913820 - Attachment is obsolete: true
Attachment #8913821 - Attachment is obsolete: true
Attachment #8913822 - Attachment is obsolete: true
MozReview-Commit-ID: 3j9jfLv7MO2
MozReview-Commit-ID: 1hteVsTlTvd
MozReview-Commit-ID: AsyFAgKJkxq
Attachment #8924159 - Flags: review?(bugs)
Pulling this down from the comment on Part 1 so it's visible. -- I want to land this so I can move on with the nsGlobalWiudow splitting, so I'm moving the part which stops us from breaking the link between nsGlobalWindow and nsDocShell into a different bug.
Comment on attachment 8924159 [details] [diff] [review] Part 7: Make nsBrowserStatusFilter cycle collected >-NS_IMPL_ISUPPORTS(nsBrowserStatusFilter, >- nsIWebProgress, >- nsIWebProgressListener, >- nsIWebProgressListener2, >- nsISupportsWeakReference) >+NS_IMPL_CYCLE_COLLECTION(nsBrowserStatusFilter, >+ mListener, >+ mTarget, >+ mTimer) nsITimer impl isn't cycle collectable, so no need to add mTimer.
Attachment #8924159 - Flags: review?(bugs) → review+
Pushed by michael@thelayzells.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/20de4dbd790d Part 1: Make nsDocShell and nsDocLoader cycle collected, r=smaug https://hg.mozilla.org/integration/mozilla-inbound/rev/0d8cf3a730a9 Part 2: Add more cycle collector edges for nsDocShell, r=smaug https://hg.mozilla.org/integration/mozilla-inbound/rev/ea6d85ed8e47 Part 3: Make nsGlobalWindow's reference to nsDocShell strong, r=smaug https://hg.mozilla.org/integration/mozilla-inbound/rev/f2f5d6a4c558 Part 4: Be consistent with nsDocLoader's canonical ISupports, r=smaug https://hg.mozilla.org/integration/mozilla-inbound/rev/affdafd5d1ae Part 5: Don't check for a docshell to determine if a window is alive, r=smaug https://hg.mozilla.org/integration/mozilla-inbound/rev/d8d3d54dea29 Part 6: Cycle collect nsWebBrowser, r=smaug https://hg.mozilla.org/integration/mozilla-inbound/rev/9715db005f4a Part 7: Make nsBrowserStatusFilter cycle collected, r=smaug
See Also: 372107
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: