Closed Bug 1352355 Opened 4 years ago Closed 2 years ago

Enable leak checking for web-platform-tests


(Testing :: web-platform-tests, enhancement)

Version 3
Not set


(firefox65 fixed)

Tracking Status
firefox65 --- fixed


(Reporter: jgraham, Assigned: jgraham)


(Depends on 3 open bugs)


(Whiteboard: [MemShrink:P1][wptsync upstream])


(7 files)

Because this should have been done some time ago.

One problem is how to deal with tests from upstream that leak on import. We can't delay the import in order to fix them, so we might need to add a metadata mechansim for skipping leak checking during those tests, or similar.
Depends on: 1333114
Depends on: 1352351
No longer depends on: 1333114
Depends on: 1352357
Depends on: 1333114
Whiteboard: [MemShrink]
Depends on: 1356498
Whiteboard: [MemShrink] → [P1:MemShrink]
Whiteboard: [P1:MemShrink] → [MemShrink:P1]
James, I'd be happy to have the service worker test expected fail if we can turn this on.
James, can we get this enabled soon even if it means disabling tests or marking them expected fail/leak?  We are depending more and more on WPT for sole coverage of features and leak testing is essential.
Flags: needinfo?(james)
I will pick this up once we have the new import process running. The central difficulty here is that if we import tests that start to leak it's hard to figure out what we need to do in terms of disabling tests or disabling leak checks until things are fixed; because the leak checks only run on shutdown the level of granularity is very high (basically nearest directory) so the obvious thing to do is start disabling leak checks for entire directories on import. But doing smaller more regular imports should help since it will be easier to isolate the problematic tests and easier to inform developers that something bad is happening.

Agreed on the priority; I will ensure I pick this up before the end of the year.
Flags: needinfo?(james)
James, is there still work to be done here? Are there any plans to continue work on the blocking bugs?
Flags: needinfo?(james)
There is a patch, but I dropped the ball on getting it landed, mostly I think because in my last try run it looked quite (but not entirely) green even with a deliberate leak. I'll give it another go now.
Flags: needinfo?(james)
So there are a few leaks, one in /media-source/ and one in referrer-policy/ and a few more "negative leaks caught" which I don't really know what to do with; at the moment I don't think there's anything in the patch that allows marking that as expected for any given directory. Maybe I need to add it (it also looks like some of the wpt expectation integration in that patch is outdated, so that needs to be fixed too).
Flags: needinfo?(erahm)
What kind of information are you looking for? The negative leak is interesting. I can try investigating it.
Flags: needinfo?(erahm) → needinfo?(continuation)
I guess I should actually needinfo you. I'm happy to help out with investigating leaks that this has turned up so that we can get it landed. Is what I meant. I'll file a bug for the negative leaks thing.
Flags: needinfo?(continuation) → needinfo?(james)
Depends on: 1500200
I should have been more specific, sorry. The negative leak thing was what I was looking for info on, so "it's interesting, I can try investigating is" is the perfect outcome, thanks :)
Flags: needinfo?(james)
Getting there slowly:

The only thing left that I don't know how to deal with is "Missing output line for total leaks"; that looks like it's associated with something bad happening, but I'm worried about the case that we import some tests that cause that kind of problem; there isn't a way in the current scheme to represent "this is expected to leak so much that it breaks the leak checker".
"Missing output line for total leaks" simply means that the leak checker did not output anything to the leak log file before the process terminated. The most common reason for this is that the process crashed. Intentional crashes in Mochitests are detected, and missing logs for those processes are not reported. I would not imagine that web platform tests involve crashes, but who knows.
I looked at one of the logs, and there's a shutdown hang:

17:40:53     INFO - PID 3168 | Hit MOZ_CRASH(Shutdown too long, probably frozen, causing a crash.) at z:/build/build/src/toolkit/components/terminator/nsTerminator.cpp:219
17:40:53     INFO - PID 3168 | #01: PRP_TryLock[Z:\task_1542211939\build\application\firefox\nss3.dll +0x12f285]
17:40:53     INFO - PID 3168 | #02: PR_MD_UNLOCK[Z:\task_1542211939\build\application\firefox\nss3.dll +0x11e34d]
17:40:53     INFO - PID 3168 | #03: o____lc_collate_cp_func[Z:\task_1542211939\build\application\firefox\ucrtbase.DLL +0x3e16f]
17:40:53     INFO - PID 3168 | #04: BaseThreadInitThunk[C:\windows\system32\kernel32.dll +0x53c45]
17:40:53     INFO - PID 3168 | #05: DllBlocklist_Initialize[Z:\task_1542211939\build\application\firefox\mozglue.dll +0x6564]
17:40:53     INFO - PID 3168 | #06: RtlInitializeExceptionChain[C:\windows\SYSTEM32\ntdll.dll +0x637f5]
17:40:53     INFO - PID 3168 | #07: RtlInitializeExceptionChain[C:\windows\SYSTEM32\ntdll.dll +0x637c8]

I have no idea what that stack means.
That usually means that some other thread is stuck doing something, which in turn makes the main thread wait, and the main thread failed to notify nsTerminator that it's still alive.  Can you see what all the other threads are doing?
Unfortunately that seems to be the only crash information. I know the full crash dump thing does include information about other threads, but I don't see the output for it.
Err, minidump-stackwalk is what I mean by "full crash dump thing" in case that was no clear. looks more promising; I think we are only getting the "missing the output line" error when a test unexpectedly fails; I will check the harness is doing the right thing in those cases.
Moves mozleak to use structured logging. The logger gets two new
actions, mozleak_object to indicate the name of an object that leaked
in a specific process and mozleak_total to indicate the total number
of bytes leaked in a process.

The output from the TBPL formatter is
expected to remain near-identical to the previous output from the
logger, so there shouldn't be any effect on the ability to fail jobs
if there are leaks.

Additional features required by web-platform-tests are also added
here; the leak thresholds are passed to the logger for mozleak_total
and a list of any objects allowed to leak are passed for
mozleak_object, so that a log consumer may decide whether a leak is
unexpected. In addition, the scope attribute is used to specify the
set of tests (or other tasks) running at the time of the leak, which
may be used to associate a leak with a specific set of files.

MozReview-Commit-ID: 19FsMxVQExH

Depends on D12408
This adds two new properties to wpt metadata files:

mozleak-allowed - This is a list of the form [process-name:object
name], which indicates objects that may be leaked in that specific
process. Automatic updates that find a leak may add that object to
this list.

mozleak-threshold - This is a list (but conceptually a map) of
[process-name: threshold bytes], indicating a threshold below which
leaks will not cause a test failure. This number is updated by setting
it to the observed value for a process.

MozReview-Commit-ID: KA1oPl837a8

Depends on D12410
Depends on D12411
MozReview-Commit-ID: 7JqVXVibiwy

Depends on D12412
Depends on D12413
The settngs() method is called to get a list of session-level settings
that should be applied when running a test. If those settings differ
from the ones applied to the previous test then we have to restart the
browser. But if we set the session on the class in the settings method
then at shutdown time we'll have incorrect settings. This is important
for things like leak checking where the values affect the shutdown
process. So instead ensure that we only set the settings on the class
during startup.

Depends on D12414
Attachment #9026338 - Attachment description: Bug 1352355 - Convert mozleak to structured logging, , mccr8 → Bug 1352355 - Convert mozleak to structured logging, ,
Pushed by
Convert mozleak to structured logging, , r=ahal,mccr8
Allow lists as default values in wpt expectation metadata files, r=ato
Enable storing and updating mozleak metadata in wpt ini, r=ato
Ensure leaks cause mozharness to fail, r=ahal
Enable leak checks by default in wpt debug builds, r=ato
Add metadata to allow existing leaks, r=mccr8
Fix use of wpt settings in firefox Browser, r=ato
Whiteboard: [MemShrink:P1] → [MemShrink:P1][wptsync upstream error]
Created web-platform-tests PR for changes under testing/web-platform/tests
Whiteboard: [MemShrink:P1][wptsync upstream error] → [MemShrink:P1][wptsync upstream]
Upstream PR was closed without merging
Depends on: 1518115
Depends on: 1521191
Whiteboard: [MemShrink:P1][wptsync upstream] → [MemShrink:P1][wptsync upstream error]
Upstream PR was closed without merging
Whiteboard: [MemShrink:P1][wptsync upstream error] → [MemShrink:P1][wptsync upstream]
Pushed by
[wpt PR 14464] - [Gecko Bug 1352355] Enable storing and updating mozleak metadata in wpt ini, a=testonly
Pushed by
[wpt PR 14464] - [Gecko Bug 1352355] Enable storing and updating mozleak metadata in wpt ini, a=testonly
You need to log in before you can comment on or make changes to this bug.