Closed Bug 1329836 Opened 8 years ago Closed 7 years ago

Intermittent leakcheck | default process: 2432027 bytes leaked (AbstractThread, AbstractWatcher, AddonPathService, Animation, AnimationEffectReadOnly, ...)

Categories

(DevTools :: about:debugging, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 1348547

People

(Reporter: intermittent-bug-filer, Assigned: gbrown)

References

Details

(Keywords: intermittent-failure, memory-leak, Whiteboard: [stockwell fixed])

Keywords: mlk
This happens on OSX and Windows, always e10s, always debug (of course).

There were a few isolated failures in February and early March, then a sudden increase in frequency March 17.

This leak is consistently reported at the end of devtools/client/aboutdebugging/test test runs. Many leaked URLs and leaked objects are reported.
This is now becoming frequent enough that we should look into it.

Looking at the first failure on March 17th [0], I see this:
09:51:37     INFO - GECKO(1740) | WARNING: YOU ARE LEAKING THE WORLD (at least one JSRuntime and everything alive inside it, that is) AT JS_ShutDown TIME.  FIX THIS!
09:51:37     INFO - GECKO(1740) | [Parent 1740] ###!!! ASSERTION: Component Manager being held past XPCOM shutdown.: 'cnt == 0', file /home/worker/workspace/build/src/xpcom/build/XPCOMInit.cpp, line 1045


Looking at this, I did some retriggers to help narrow it down:
https://treeherder.mozilla.org/#/jobs?repo=autoland&filter-searchStr=osx%20debug%20dt&tochange=5e664d9cc14d3331264be607ae27c650352066b6&fromchange=bb5c698bff0a4d73ee015230818faacc587b07d5&group_state=expanded&selectedJob=84636043

It will take a while to get the full data, but I can follow up here once we have it all.


[0] https://archive.mozilla.org/pub/firefox/tinderbox-builds/autoland-macosx64-debug/autoland_yosemite_r7-debug_test-mochitest-e10s-devtools-chrome-1-bm106-tests1-macosx-build197.txt.gz
Product: Core → Firefox
Whiteboard: [stockwell needswork]
Component: General → Developer Tools: about:debugging
:jdescottes, can you take a look at this?  Since this regression is recent and frequent, it should be easier to debug.  I suspect it will turn out to have the same root cause as bug 1348547.
Flags: needinfo?(jdescottes)
Possibly this came from bug 1348112? I'm collecting more retriggers to help prove that.
With more retriggers, that is the likely suspect, although this failure is hard to reproduce. Let's see whether looking at that change and/or the tests helps us figure this out.
It is very unlikely that Bug 1348112 is responsible for this regression.
That bug only landed new files in devtools, which are not used anywhere yet.

The other suspect would be Bug 1345932, which is one of the only recent bugs touching about:debugging.
It landed on March 16, right before this spiked.
Flags: needinfo?(jdescottes)
:gbrown, as this is our #1 failure on the tree, can you take a look at this?
Flags: needinfo?(gbrown)
I don't have much experience with leaks, so I'd be happy to hand this off to someone else, but I'll give it a shot.

I'll try to reproduce on try, then see if backing out suspect changesets helps. (Bug 1345932, perhaps https://hg.mozilla.org/integration/autoland/rev/5e664d9cc14d3331264be607ae27c650352066b6, ...)
Assignee: nobody → gbrown
Flags: needinfo?(gbrown)
For the record, I did some retriggers on autoland before and after the changesets from Bug 1345932 were merged:
- https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=eb818ab6e5797c51bd7efabaa378ba54ae596529&selectedJob=86915238
- https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=5f4fd0bbc9a0797cd87e3b47a4845c99f2dfcc7c&selectedJob=86915281

And I didn't get any failures in either set of retriggers, so maybe that bug is unrelated to this one.
It looks like the increase in content processes is responsible for the increase in frequency of this leak.

https://treeherder.mozilla.org/#/jobs?repo=try&author=gbrown@mozilla.com&tochange=101430a5866394f8cac4e7943c08d707a54f125a&fromchange=ac47424769e7ac4c043846809a189f3c02466088

The top-most push is 1 content process, the middle 4 content processes, the bottom 2 content processes. I got 3 leaks with 4 content processes, 0 leaks with 1 or 2 content processes. (I manipulated dom.ipc.processCount in all.js for this).
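For reference, the local tweak was just a pref change; roughly (a sketch, the value and comments shown are illustrative of the variations pushed to try, not the exact patch):

// modules/libpref/init/all.js
// Sketch of the dom.ipc.processCount variations used for the try pushes above.
pref("dom.ipc.processCount", 4);    // 4 content processes: 3 leaks observed
// pref("dom.ipc.processCount", 2); // 2 content processes: no leaks
// pref("dom.ipc.processCount", 1); // 1 content process: no leaks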

https://bugzilla.mozilla.org/show_bug.cgi?id=1348547#c17 is worth quoting here:

(Quoting Blake Kaplan (:mrbkap) from bug 1348547 comment #17)
> That is very surprising. Service worker debugging doesn't work in e10s-multi
> and these tests attempt to force a single content process [1][2]. So bug
> 1336398 really should have little to no effect on these tests. That suggests
> a bug in our process selector.
> 
> [1] http://searchfox.org/mozilla-central/rev/72fe012899a1b27d34838ab463ad1ae5b116d76b/devtools/client/aboutdebugging/test/browser_service_workers_status.js#15
> [2] http://searchfox.org/mozilla-central/rev/72fe012899a1b27d34838ab463ad1ae5b116d76b/devtools/client/aboutdebugging/test/head.js#425,433
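To make that concrete, forcing a single content process from a browser-chrome test is done by flipping the same pref from the harness; roughly (a sketch assuming SpecialPowers.pushPrefEnv, not the exact head.js helper referenced in [2]):

// Sketch: force a single content process for the duration of the test.
// The real aboutdebugging helper lives in head.js (see [2] above); the
// function name here is illustrative.
function forceSingleContentProcess() {
  return SpecialPowers.pushPrefEnv({
    set: [["dom.ipc.processCount", 1]],
  });
}

add_task(function* () {
  yield forceSingleContentProcess();
  // ... run the service worker debugging steps with one content process ...
});

If the process selector ignores that pref in some cases, as the quoted comment suggests, these tests would end up exercising more content processes than they expect.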
See Also: → 1348547
:gbrown, do you know what is causing these leaks?  This is quite frequent, and I would like to see us fix the issue or disable the offending tests this week. Do you need help bisecting the leak to the root test failures?
Flags: needinfo?(gbrown)
I do not know the root cause of the leaks. 

I have not bisected to find a test or group of tests that is correlated with the leak.

The leak increased in frequency with additional content processes and goes away with fewer content processes (see comment 21).

There is a potential fix under discussion in bug 1348547. I'll be working on that today.
Flags: needinfo?(gbrown)
(In reply to Geoff Brown [:gbrown] from comment #26)
> I have not bisected to find a test or group of tests that is correlated with
> the leak.

Except, of course, that we know it is in the aboutdebugging tests, and there's https://bugzilla.mozilla.org/show_bug.cgi?id=1348547#c17.
I am hoping that the change landing in bug 1348547 will have a big impact on this bug.
(In reply to Geoff Brown [:gbrown] from comment #29)
> I am hoping that the change landing in bug 1348547 will have a big impact on this bug.

And indeed it is happening much less frequently...but leaks continue, and look just the same. I'll bisect the aboutdebugging tests...
https://treeherder.mozilla.org/#/jobs?repo=try&revision=2cac2be219d351ea504c03d45f4c7a35a9d7a2f7 has 3 such instances. In every case, the leaks follow bug 1256544 - "browser_service_workers_start.js | Test timed out".
:gbrown, can we disable browser_service_workers_start.js to mitigate this failure?
Flags: needinfo?(gbrown)
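If we decide to disable it, the usual mechanism is a skip-if annotation on the test's entry in the directory's browser.ini manifest; roughly (a sketch, the condition and comment shown are illustrative):

# devtools/client/aboutdebugging/test/browser.ini (sketch)
[browser_service_workers_start.js]
skip-if = debug # Bug 1329836 - frequent leaks; see also bug 1256544 (test timeouts)

Since the failures here are always e10s debug, a debug-only skip would keep opt coverage while mitigating the leak.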
All recent failures have been on try. I'm just keeping this open to monitor for a few more days, but I'm hoping I can dup this to bug 1348547.
Flags: needinfo?(gbrown)
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → DUPLICATE
Whiteboard: [stockwell needswork] → [stockwell fixed]
Product: Firefox → DevTools