remote/test/browser/ browser-chrome tests cause leakcheck to fail

NEW
Unassigned

Status

defect
P3
normal
4 months ago
2 days ago

People

(Reporter: ato, Unassigned)

Tracking

(Blocks 1 bug, {leave-open, memory-leak})

Details

()

Attachments

(3 obsolete attachments)

After I enabled the remote agent in the default Nightly build, we
started seeing leaks in the browser-chrome tests as seen here:

https://treeherder.mozilla.org/#/jobs?repo=autoland&group_state=expanded&searchStr=debug%2Cbrowser-chrome&fromchange=b9a625eff9e301f700196de977f916e7f77a895f&tochange=afb947a236befa6eb6f23783405bd9c334aa4129

I will submit a new patch to unhook the remote agent from the default
build, until we have had time to resolve this. Backing out the
original patchset would be detrimental to investigating this
effectively on try.

Assignee: nobody → ato
Status: NEW → ASSIGNED
Depends on: 1533831
Priority: -- → P1

As detailed in https://bugzilla.mozilla.org/show_bug.cgi?id=1546945,
enabling the remote agent in the default Firefox Nightly build
(https://bugzilla.mozilla.org/show_bug.cgi?id=1533831) caused a
leakcheck in M-bc to fail.

This patch cuts the remote agent off from the build, by stopping
to imply --enable-cdp, until we have had time to fix the leak.

Pushed by atolfsen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/004d050a8ec4
remote: temporarily stop implying --enable-cdp; r=aryx
Keywords: leave-open
Blocks: 1540655

All browser-chrome jobs have been requested on debug for
https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=004d050a8ec44987f44fd86c62097266b22b915e,
so it should confirm it was the remote agent that caused the leaks.

Because we don’t run all test jobs on all pushes, the working theory
is that the push I linked to in comment #0 saw the effect of the
remote agent M-bc jobs starting to leak.

As detailed in https://bugzilla.mozilla.org/show_bug.cgi?id=1546945,
there is a memory leak in the remote agent that causes it to
intermittently fail brwoser-chrome leakchecks.

It is possible the memory leak is related to holding onto a reference
of the XPConnect C++ object nsSocketTransportService, but this has
yet to be confirmed.

This patch disables the browser-chrome tests on debug builds,
where we run reference counting leakchecks, in order to get the
remote agent enabled in default Firefox Nightly builds.

Thanks-to: Alexandre Poirot <poirot.alex@gmail.com>

This patch negates the effect of 004d050a8ec4 (git 6dd9496ac0f9).

OK, I talked to ochameau and in order to make some meaningful
progress on enabling the remote agent in default Firefox Nightly
builds, we are going to disable the browser-chrome tests on debug
builds where reference counting leakchecks are run. I’ve submitted
the necessary patches for doing that as well as turning the remote
agent back on.

This bug is still relevant and now tracks re-enabling the M-bc
tests in remote/test/browser/*.js by first plugging the leak.

ochameau and I did some wild-goose-chase debugging on IRC, and we
suspect that we’re holding on to an XPConnect reference to a C++
instance of nsSocketTransportService. There are only a handful
of places we use sockets, which inculde for WebSocket client
connections, and for the HTTPD.

I narrowed it down to remote/test/browser/browser_cdp.js. On my
workstation it is failing the leakcheck consistently, which is
actually good news for debugging it.

This means I will revisit the attached patches and change them to
only ignore browser_cpd.js on debug.

Keywords: memory-leak
Summary: Remote agent browser-chrome tests causes leakcheck to fail intermittently → remote/test/browser/browser_cdp.js causes leakcheck to fail
Attachment #9060764 - Attachment description: bug 1546945: remote: disable browser-chrome tests on debug; → bug 1546945: remote: disable browser_cdp.js browser-chrome test on debug;
Pushed by atolfsen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/f04e1e9c54aa
remote: disable browser_cdp.js browser-chrome test on debug; r=ochameau
https://hg.mozilla.org/integration/autoland/rev/868b1bf043dc
remote: re-enable remote agent in Firefox Nightly; r=ochameau

The remote agent has now been re-enabled, so relinquishing this
bug. We still need to fix this memory leak, but I don’t consider
it a top priority.

Status: ASSIGNED → NEW
Priority: P1 → P3
Assignee: ato → nobody
Duplicate of this bug: 1546850
Duplicate of this bug: 1547303

Over the weekend two other intermittent leaks were detected in the
remaining browser-chrome tests:

I’ve filed bug 1547679 to temporarily disable all browser-chrome
tests on debug until we’ve had time to investigate (this bug).

Depends on: 1547679
Summary: remote/test/browser/browser_cdp.js causes leakcheck to fail → remote/test/browser/ browser-chrome tests cause leakcheck to fail

I can help you at some point with investigating this leak, though I don't think I'll have a ton of time this week.

No longer blocks: 1540655

The failures in comment 19 are in accessible/tests/browser/events/. I don't know why this bug is being starred. There's another bug on file for it.

Attachment #9060696 - Attachment is obsolete: true
Attachment #9060764 - Attachment is obsolete: true
Attachment #9060765 - Attachment is obsolete: true

Removing the disabled tag per comment 20 (misclassification)

Whiteboard: [stockwell disable-recommended]

Misclassification, see above.

Whiteboard: [stockwell disable-recommended]
Whiteboard: [stockwell disable-recommended]

The leaks are being misclassified. In the two logs I was looking at, they were happening in toolkit/components/certviewer/tests/browser/

You need to log in before you can comment on or make changes to this bug.