Closed Bug 932898 Opened 11 years ago Closed 11 years ago

Bring back the shutdown leak detector

Categories

(Testing :: General, defect)

defect
Not set
critical

Tracking

(Not tracked)

RESOLVED FIXED
mozilla28

People

(Reporter: emorley, Assigned: ttaubert)

References

Details

(Whiteboard: [MemShrink:P1])

Attachments

(3 files, 6 obsolete files)

The cycle collection analysis added by bug 728294 should have caught the recent issues in bug 932781 and dependents, but didn't. My money is on bug 863447 regressing it: https://hg.mozilla.org/mozilla-central/rev/bed3081376ca Either way, we must fix it & then the inevitable hoards of leaks we have accumulated in the meantime, before the trees can reopen.
(This bug is one of three holding the trees closed - bug 932781 comment 11)
Severity: critical → blocker
Whiteboard: [MemShrink]
(In reply to Ed Morley [:edmorley UTC+1] from comment #3) > The cycle collection analysis added by bug 728294 should have caught the > recent issues in bug 932781 and dependents, but didn't. > > My money is on bug 863447 regressing it: > https://hg.mozilla.org/mozilla-central/rev/bed3081376ca Try push w/ backout: https://tbpl.mozilla.org/?tree=Try&rev=b86779999d73 I'll try this locally as well but am not sure that'll actually be faster than this try push...
(In reply to :Gijs Kruitbosch from comment #3) > (In reply to Ed Morley [:edmorley UTC+1] from comment #3) > > The cycle collection analysis added by bug 728294 should have caught the > > recent issues in bug 932781 and dependents, but didn't. > > > > My money is on bug 863447 regressing it: > > https://hg.mozilla.org/mozilla-central/rev/bed3081376ca > > Try push w/ backout: https://tbpl.mozilla.org/?tree=Try&rev=b86779999d73 > > I'll try this locally as well but am not sure that'll actually be faster > than this try push... FWIW, locally at least this doesn't help
Tim and I are looking at this
Assignee: nobody → ttaubert
Status: NEW → ASSIGNED
I wonder, isn't bug 932867 a different case of leak than what the CC graph analyzer is supposed to catch? I thought the analyzer catches when there is some JS property that holds a reference to a DOM Element or Window that has been removed from the DOM Tree. How ever the FrameWorker stuff inserts an iframe into the hiddenDOMWindow and just never uses it again. So it's a valid DOM Element kept alive but the hiddenWindow, we just never use it again. Is that a leak we should catch?
That is not a "leak". We just end up creating more and more iframes, but never remove them from DOM. We could and should perhaps count window objects which are alive, and make sure that count doesn't increase much over the time.
Bug 839193 (which turns up in khuey's log [1] as well) seems to have introduced a test leak by never removing observers, right? Shouldn't that be detected by our leak detection? It also doesn't show up in about:cc. [1] https://khuey.pastebin.mozilla.org/3378934
(In reply to Tim Taubert [:ttaubert] from comment #8) > Bug 839193 (which turns up in khuey's log [1] as well) seems to have > introduced a test leak by never removing observers, right? Shouldn't that be > detected by our leak detection? It also doesn't show up in about:cc. > > [1] https://khuey.pastebin.mozilla.org/3378934 Status update: Tim and I found that enabling allTraces() did make the test suite report a bunch of stuff as leaking, but that is likely overreporting (ie reporting things which aren't actually leaks). Just blanket using that doesn't seem like it'd help us here. Tim also noticed that when http://mxr.mozilla.org/mozilla-central/source/dom/base/nsGlobalWindow.cpp#1629 runs, if (MOZ_UNLIKELY(cb.WantDebugInfo())) is false. This by itself may not affect things too much, but it's strange that that flag is not present, as AFAICT it shouldn't be.
(In reply to Tim Taubert [:ttaubert] from comment #8) > Bug 839193 (which turns up in khuey's log [1] as well) seems to have > introduced a test leak by never removing observers, right? Shouldn't that be > detected by our leak detection? It also doesn't show up in about:cc. What kind of observers? We try to mark strong observers black "certainly alive", since we have tons of stuff using observer service and we need to get the normal CC graphs small. It is a thin line between a leak and valid observer service use.
I guess we should put back part 1 from bug 728294. That would catch some leaks. But we need something where we check the number of windows more often.
fwiw, I don't think bug 863447 has blame here, it is not faking or hiding anything, we care to detect things that are indefinitely leaked, not things that are "leaked" for some milliseconds while async work completes.
(In reply to Marco Bonardo [:mak] from comment #12) > fwiw, I don't think bug 863447 has blame here, it is not faking or hiding > anything, we care to detect things that are indefinitely leaked, not things > that are "leaked" for some milliseconds while async work completes. I agree. This appears to be a different "type" of leak that we weren't checking before. Based on that I don't think we should block reopening the tree on this.
+1. This is an old feature we should bring back but this could take a little while to get right with all the try runs etc. I don't think we should block trees any further.
This seems to work fine and is mostly code that we removed in bug 728294. I'm not sure how to detect if we're in a debug build (which isn't actually necessary as we just don't have any lines to parse) and we should not run with webapprtChrome=true.
Attachment #825175 - Flags: feedback?(ted)
Both tests are leaking according to the shutdown leak detector. They're not anymore with these fixes applied.
Attachment #825177 - Flags: review?(bugs)
Attachment #825177 - Flags: review?(bugs) → review+
Whiteboard: [MemShrink] → [MemShrink] [leave open]
Here's a try run with most of the recent leak fixes and the leak detection patch: https://tbpl.mozilla.org/?tree=Try&rev=2a58c0f29f88
Added some code that removes the false positives caused by the sidebar and the BrowserNewTabPreloader.
Attachment #825175 - Attachment is obsolete: true
Attachment #825175 - Flags: feedback?(ted)
Attachment #825296 - Flags: feedback?(ted)
Comment on attachment 825296 [details] [diff] [review] Bring back the shutdown leak detector, v2 Review of attachment 825296 [details] [diff] [review]: ----------------------------------------------------------------- ::: build/automationutils.py @@ +558,5 @@ > + if line[2:11] == "DOMWINDOW": > + self._logWindow(line) > + elif line[2:10] == "DOCSHELL": > + self._logDocShell(line) > + elif line.startswith("TEST-START"): Kinda wish this was using the existing currentTest stuff, but it's not that big of a deal. ::: testing/mochitest/runtests.py @@ +833,5 @@ > args.append('-foreground') > if testUrl: > args.append(testUrl) > > + #if debug build and not webapprtChrome You can say if mozinfo.info["debug"] and not options.webapprtChrome: @@ +1052,5 @@ > ### output processing > > class OutputHandler(object): > """line output handler for mozrunner""" > + def __init__(self, harness, utilityPath, symbolsPath=None, dump_screen_on_timeout=True, shutdownLeaks = None): nit: style here seems to be no spaces around =
Attachment #825296 - Flags: feedback?(ted) → feedback+
Here's a new patch that fixes the false positive for the TabView window. It doesn't address ted's comments, yet.
Attachment #825296 - Attachment is obsolete: true
Updating the summary to what seems to be the current goal.
Summary: Work out why the cycle collection leak analysis from bug 728294 is no longer working → Bring back the shutdown leak detector
Severity: blocker → critical
Depends on: 933699
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #20) > > + #if debug build and not webapprtChrome > > You can say if mozinfo.info["debug"] and not options.webapprtChrome: 'options' isn't available in runApp() so I think I'll just pass a 'webapprtChrome' argument?
Ah. Yeah, do that.
Depends on: 932880
Patch addresses Ted's comments and should fix some more false positives. Needs fx-team tip (until that is merged).
Attachment #825366 - Attachment is obsolete: true
Depends on: 933551
Depends on: 934091
Latest patch that needs the patch from bug 934091 to work.
Attachment #826089 - Attachment is obsolete: true
Some more leak fixes.
Attachment #826307 - Flags: review?(jaws)
Comment on attachment 826307 [details] [diff] [review] Fix leaks in /browser/base/content/test/general/ >+ function windowClosed(subject) { >+ if (subject == win) { >+ Services.obs.removeObserver(newDocumentShown, "document-shown"); >+ Services.obs.removeObserver(windowClosed, "domwindowclosed"); >+ } >+ } >+ >+ // Make sure to remove the 'document-shown' observer in case the window >+ // is being closed right after it was opened to avoid leaking. > Services.obs.addObserver(newDocumentShown, "document-shown", false); >+ Services.obs.addObserver(windowClosed, "domwindowclosed", false); Can you use the unload event?
(In reply to Dão Gottwald [:dao] from comment #29) > >+ // Make sure to remove the 'document-shown' observer in case the window > >+ // is being closed right after it was opened to avoid leaking. > > Services.obs.addObserver(newDocumentShown, "document-shown", false); > >+ Services.obs.addObserver(windowClosed, "domwindowclosed", false); > > Can you use the unload event? I just tried and it doesn't work. Looks like we're closing the window way too early to even receive an unload event.
Depends on: 934206
Attachment #826307 - Flags: review?(jaws) → review+
Comment on attachment 826283 [details] [diff] [review] Bring back the shutdown leak detector, v5 Review of attachment 826283 [details] [diff] [review]: ----------------------------------------------------------------- ::: build/automationutils.py @@ +603,5 @@ > + self.leakedWindows[id] = self._parseValue(line, "url") > + > + def _logDocShell(self, line): > + created = line[:2] == "++" > + id = self._parseValue(line, "pid") + "." + self._parseValue(line, "id") FWIW, I was using this and found a problem here - self._parseValue may return None, causing a TypeError trying to add a string and None. This was probably caused by me applying just this patch - there's no "pid" in docShell lines in my output. (The symptoms were also really bizarre - exceptions weren't written anywhere and it seemed to cause the Fx process to simply hang writing to stdout - I doubt they are related to this patch though, but it sure confused me and took me quite some time to work out the problem was here...)
(In reply to Mark Hammond [:markh] from comment #31) > FWIW, I was using this and found a problem here - self._parseValue may > return None, causing a TypeError trying to add a string and None. This was > probably caused by me applying just this patch - there's no "pid" in > docShell lines in my output. Good point, I will address this in the next version of the patch and add some better error handling.
Together with the patches from bug 934091 and bug 934206 we have a green try run without any leaks \o/. I modified the error handling to explicitly print an error message should we somehow ever get interleaved log lines again. I added some cleanup to browser-test.js to avoid false positives for features that explicitly may keep DOMWindows and DocShells alive until shutdown. https://tbpl.mozilla.org/?tree=Try&rev=571ad96c3957
Attachment #826283 - Attachment is obsolete: true
Attachment #827536 - Flags: review?(ted)
Small fix for Android mochitests: added a default value for the new |webapprtChrome| argument in build/automation.py.in. All running fine now: https://tbpl.mozilla.org/?tree=Try&rev=6dc945b3212e
Attachment #827536 - Attachment is obsolete: true
Attachment #827536 - Flags: review?(ted)
Attachment #829885 - Flags: review?(ted)
Latest try push with the leak detector: https://tbpl.mozilla.org/?tree=Try&rev=21a5d107d49f
Whiteboard: [MemShrink] [leave open] → [MemShrink:P1] [leave open]
Blocks: 937997
ttaubert, how close is this to landing?
No longer blocks: 937997
(In reply to Nicholas Nethercote [:njn] from comment #40) > ttaubert, how close is this to landing? This is waiting for review and should be good to go then.
ted: eight day review ping. This is an important patch. Thanks.
Comment on attachment 829885 [details] [diff] [review] Bring back the shutdown leak detector, v7 Review of attachment 829885 [details] [diff] [review]: ----------------------------------------------------------------- Sorry, I was travelling last week and have been generally behind on reviews.
Attachment #829885 - Flags: review?(ted) → review+
Whiteboard: [MemShrink:P1] [leave open] → [MemShrink:P1]
Sorry about that. I pushed to try before but ran only mochitest-bc tests.
Pushed to try again with some minor fixes to account for metro: https://tbpl.mozilla.org/?tree=Try&rev=04ab4ad2f6ba
Pushed to try again with a smaller and better fix: https://tbpl.mozilla.org/?tree=Try&rev=db126ca7ee5a Seems all green, waiting a little more to make sure.
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla28
Depends on: 943474
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: