Closed Bug 1317290 Opened 3 years ago Closed 3 years ago

Intermittent leakcheck | default process: 16970 bytes leaked (BackstagePass, CallbackObject, CondVar, ConsoleReportCollector, DOMEventTargetHelper, ...)

Categories

(Core :: DOM: Core & HTML, defect)

Tracking

RESOLVED WORKSFORME
Tracking Status
firefox51 --- unaffected
firefox52 --- unaffected
firefox53 + fix-optional

People

(Reporter: intermittent-bug-filer, Unassigned)

References

Details

(Keywords: intermittent-failure, memory-leak, regression, Whiteboard: [MemShrink][stockwell fixed])

Keywords: mlk
This error is going to spike when we start surfacing leaks on their own; right now we only report a leak when another test fails in the same job run.
From one of the logs:

INFO - Leaked URLs:
INFO - http://127.0.0.1:8888/dummy-gmp-manager.xml

Lots of "GC found live XXX" messages too.
Bug 1298027, which touches CDM shutdown, landed just before this started happening on trunk. That seems plausibly related given comment 11.
Blocks: 1298027
Component: General → Audio/Video: Playback
Flags: needinfo?(bvandyk)
Retriggers confirm that it isn't bug 1298027; sorry for the false alarm. In another run, it was http://127.0.0.1:8888/experiments-dummy/manifest leaking instead.

For the mochitest-10 failures, they're all happening in the toolkit/components/extensions/test/mochitest directory. Looks like more retriggers are in order...
Component: Audio/Video: Playback → General
Flags: needinfo?(bvandyk)
No longer blocks: 1298027
I appear to have hit a bit of a retrigger stalemate here :(. Retriggers on Autoland show it starting on a merge from m-c:
https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=f9c01ced5685b4a1e8af1de4d3089b298afc0507

However, when I try retriggering over the range on inbound represented by that merge, I can't hit it! Even *more* frustrating is that literally the push after Luke's does hit the leak! But that patch was backed out and didn't re-land until 53, and this clearly landed prior to the merge and was uplifted to Aurora as well.
https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&filter-searchStr=debug%20tc-M(10)&tochange=5d9a785a37c4&fromchange=41f004509646&group_state=expanded

If anybody else has ideas, I'm all ears. I could try retriggers on m-c to see if they start on the same merge from inbound, but I'm not sure what that'll prove even if they do.
Retriggers on m-c have been equally useless. We could try setting XPCOM_MEM_LOG_CLASSES to get a stack for leaking objects as well, but I'm worried it'll be difficult to ascertain which leaked object is relevant here vs. just coming along for the ride.
See Also: → 1326136
I did a Try push with ConsoleReportCollector stack tracing enabled, and I'm getting the stack below:

INFO - 370 @0x7f1327641ec0 (1 references; 0 from COMPtrs)
INFO - allocation stack:
INFO - #00: mozilla::ConsoleReportCollector::AddRef [dom/console/ConsoleReportCollector.cpp:15]
INFO - #01: nsCOMPtr<nsIConsoleReportCollector>::nsCOMPtr [xpcom/glue/nsCOMPtr.h:466]
INFO - #02: mozilla::net::HttpBaseChannel::HttpBaseChannel [xpcom/string/nsTString.h:33]
INFO - #03: mozilla::net::nsHttpChannel::nsHttpChannel [netwerk/protocol/http/nsHttpChannel.cpp:280]
INFO - #04: mozilla::net::nsHttpHandler::NewProxiedChannel2 [mfbt/RefPtr.h:191]
INFO - #05: mozilla::net::nsIOService::NewChannelFromURIWithProxyFlagsInternal [netwerk/base/nsIOService.cpp:773]
INFO - #06: mozilla::net::nsIOService::NewChannelFromURIWithProxyFlags2 [netwerk/base/nsIOService.cpp:883]
INFO - #07: mozilla::net::nsIOService::NewChannelFromURI2 [netwerk/base/nsIOService.cpp:687]
INFO - #08: NS_NewChannelInternal [netwerk/base/nsNetUtilInlines.h:179]
INFO - #09: NS_NewChannel [netwerk/base/nsNetUtilInlines.h:273]
INFO - #10: mozilla::dom::XMLHttpRequestMainThread::CreateChannel [dom/xhr/XMLHttpRequestMainThread.cpp:2525]
INFO - #11: mozilla::dom::XMLHttpRequestMainThread::Open [dom/xhr/XMLHttpRequestMainThread.cpp:1603]
INFO - #12: mozilla::dom::XMLHttpRequestMainThread::Open [xpcom/string/nsTSubstring.h:95]
INFO - #13: mozilla::dom::XMLHttpRequestBinding::open [obj-firefox/dom/bindings/XMLHttpRequestBinding.cpp:406]
INFO - #14: mozilla::dom::GenericBindingMethod [dom/bindings/BindingUtils.cpp:2916]
INFO - #15: js::CallJSNative [js/src/jscntxtinlines.h:240]
INFO - #16: js::InternalCallOrConstruct [js/src/vm/Interpreter.cpp:457]
INFO - #17: Interpret [js/src/vm/Interpreter.cpp:2928]
INFO - #18: js::RunScript [js/src/vm/Interpreter.cpp:374]
INFO - #19: js::InternalCallOrConstruct [js/src/vm/Interpreter.cpp:475]
INFO - #20: js::Call [js/src/vm/Interpreter.cpp:521]
INFO - #21: js::PromiseObject::create [js/src/builtin/Promise.cpp:1326]
INFO - #22: PromiseConstructor [js/public/RootingAPI.h:773]
INFO - #23: js::CallJSNative [js/src/jscntxtinlines.h:240]
INFO - #24: js::CallJSNativeConstructor [js/src/jscntxtinlines.h:272]
INFO - #25: InternalConstruct [js/src/vm/Interpreter.cpp:554]
INFO - #26: Interpret [js/src/vm/Interpreter.cpp:2920]
INFO - #27: js::RunScript [js/src/vm/Interpreter.cpp:374]
INFO - #28: js::InternalCallOrConstruct [js/src/vm/Interpreter.cpp:475]
INFO - #29: js::Call [js/src/vm/Interpreter.cpp:521]
INFO - #30: js::Wrapper::call [js/src/proxy/Wrapper.cpp:165]
INFO - #31: js::CrossCompartmentWrapper::call [js/src/proxy/CrossCompartmentWrapper.cpp:333]
INFO - #32: js::Proxy::call [js/src/proxy/Proxy.cpp:400]
INFO - #33: js::proxy_Call [js/src/proxy/Proxy.cpp:689]
INFO - #34: js::CallJSNative [js/src/jscntxtinlines.h:240]
INFO - #35: js::InternalCallOrConstruct [js/src/vm/Interpreter.cpp:445]
INFO - #36: Interpret [js/src/vm/Interpreter.cpp:2928]
INFO - #37: js::RunScript [js/src/vm/Interpreter.cpp:374]
INFO - #38: js::InternalCallOrConstruct [js/src/vm/Interpreter.cpp:475]
INFO - #39: Interpret [js/src/vm/Interpreter.cpp:2928]
INFO - #40: js::RunScript [js/src/vm/Interpreter.cpp:374]
INFO - #41: js::InternalCallOrConstruct [js/src/vm/Interpreter.cpp:475]
INFO - #42: js::Call [js/src/vm/Interpreter.cpp:521]
INFO - #43: js::Wrapper::call [js/src/proxy/Wrapper.cpp:165]
INFO - #44: js::CrossCompartmentWrapper::call [js/src/proxy/CrossCompartmentWrapper.cpp:333]
INFO - #45: js::Proxy::call [js/src/proxy/Proxy.cpp:400]
INFO - #46: js::proxy_Call [js/src/proxy/Proxy.cpp:689]
INFO - #47: js::CallJSNative [js/src/jscntxtinlines.h:240]
INFO - #48: js::InternalCallOrConstruct [js/src/vm/Interpreter.cpp:445]
INFO - #49: js::jit::DoCallFallback [js/src/jit/BaselineIC.cpp:4482]

https://public-artifacts.taskcluster.net/DO4NuR8ERQ65wKzU-4UqKA/0/public/logs/live_backing.log

At bz's suggestion, I'm doing another Try push with XMLHttpRequestMainThread stack tracing.
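
The interesting part of that stack is that the leaked nsIConsoleReportCollector is allocated in the HttpBaseChannel constructor while a channel is being created for an XHR open(). In other words, something roughly like the following is enough to create the object that ends up leaking if the channel is never torn down (a minimal sketch, not the actual caller; only the URL is taken from the leak reports):

// Minimal sketch of the allocation path in the stack above, not the real
// caller.  open() creates an nsHttpChannel; its HttpBaseChannel base-class
// constructor allocates the nsIConsoleReportCollector seen at frame #00.
// If the channel is still alive when the leak check runs, the collector
// leaks along with it.
const xhr = new XMLHttpRequest();
xhr.open("GET", "http://127.0.0.1:8888/dummy-gmp-manager.xml", true);
xhr.send();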
Component: General → DOM
I'll try taking a look at this, once I get a working machine with a Firefox build environment.
I believe bug 1325438 might be the same underlying issue, so please coordinate with Shawn Huang, who has been taking a look over there.
See Also: → 1325438
Whiteboard: [MemShrink]
This is very rare on Aurora, so I'm just going to mark it unaffected.

[Tracking Requested - why for this release]: very frequent leak regression.
The tricky thing here is that while the leak shows up quite frequently overall, it rarely shows up in any one test. It looks like in bug 1325149 comment 8 kmag was able to reproduce what might be a similar leak using chaos mode.
That leak happened to be triggered by a background update request. I suspect some of the other leaks that show up in random tests may be related as well, so it might help to log and modify the timing of those requests (see the sketch below).

I'm particularly starting to suspect nsSearchService.js, based on bug 1325947, since those leaks seem to consistently include both RegExp statics and XHR prototypes, and that was the only code running in a BackstagePass that I found using both.
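
A hypothetical way to do that logging (not existing harness code) would be to wrap XMLHttpRequest.prototype.open from a privileged or test scope so every background request records its URL and start time:

// Hypothetical instrumentation, assuming a scope where XMLHttpRequest is
// available (e.g. a test window or the shared module global).  Every
// request logs its method, URL, and timestamp so background update
// requests can be correlated with test boundaries and shutdown markers
// in the job log.
const origOpen = XMLHttpRequest.prototype.open;
XMLHttpRequest.prototype.open = function(method, url, ...rest) {
  dump("[xhr-log] " + Date.now() + " " + method + " " + url + "\n");
  return origOpen.call(this, method, url, ...rest);
};

Wrapping send() the same way would make it possible to delay the requests and see whether the leak tracks their timing.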
The half dozen or so logs I looked at all had http://127.0.0.1:8888/dummy-gmp-manager.xml as their leaked URL, which seems suspiciously similar.

Maybe one of these requests somehow fires off after we've begun to shut down, so we fail to clean up the request properly? I've seen that kind of thing cause a few intermittent leaks.
_addObservers does have an OS.File.profileBeforeChange.addBlocker call, so maybe that could be the problem? Though you'd hope the blocker would produce an error in that case.
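
For reference, the pattern being described would look roughly like this; only the OS.File.profileBeforeChange.addBlocker call is taken from the comment above, and the blocker name, promise, and update function below are hypothetical:

// Hypothetical sketch of the suspected pattern.  The shutdown blocker only
// waits for a pending settings write; nothing aborts an update XHR that is
// still in flight when profile-before-change fires, so the channel (and the
// console report collector it owns) can survive into the shutdown leak check.
let pendingWritePromise = Promise.resolve();  // placeholder for the real write
const updateURL = "http://127.0.0.1:8888/dummy-gmp-manager.xml";  // URL seen in the leak reports

OS.File.profileBeforeChange.addBlocker(
  "Search service: finish writing settings",
  () => pendingWritePromise
);

function checkForUpdates() {
  const xhr = new XMLHttpRequest();
  xhr.open("GET", updateURL, true);
  xhr.send();  // no corresponding xhr.abort() if shutdown has already begun
}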
(In reply to Andrew McCreight [:mccr8] from comment #29)
> The half dozen or so logs I looked at all had
> http://127.0.0.1:8888/dummy-gmp-manager.xml as their leaked URL, which seems
> suspiciously similar.

Hm. Those requests should have been disabled in bug 1325149. It looks like that change got backed out by a merge botch in bug 1325464 and then fixed in bug 1326305 today. So this issue may go away if that's the only cause.

Either way, though, it would be good to know why it's actually causing leaks. I thought it was only because of bug 1325158, but this makes it look like there's another cause.
It looks like this stopped on the 6th, so I think the relanding mentioned in comment 31 fixed this. Thanks for working on these leaks, Kris.
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → WORKSFORME
I suspect that this is the same as bug 1325947, and both were fixed by disabling those update requests in tests.

But even though the intermittent failures are gone, I still think we need to find out why the leaks were happening in the first place. It's possible that this is still causing real problems in production.
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
You should file a separate bug for that, then. (See also bug 1326136 which may be happening still.)
OK, we can fix the underlying issue in bug 1326136.
Status: REOPENED → RESOLVED
Closed: 3 years ago → 3 years ago
Resolution: --- → WORKSFORME
Adjusting tracking flags for the release health dashboard. We can track this issue using the bug in comment 36.
Whiteboard: [MemShrink] → [MemShrink][stockwell fixed]
Component: DOM → DOM: Core & HTML