Closed Bug 1038943 Opened 5 years ago Closed 5 years ago

Turn on leak checking for b2g mochitests

Categories

(Testing :: Mochitest, defect)

defect
Not set

Tracking

(Not tracked)

RESOLVED FIXED
mozilla34

People

(Reporter: khuey, Assigned: khuey)

References

(Depends on 1 open bug)

Details

(Whiteboard: [MemShrink:P1])

Attachments

(1 file, 6 obsolete files)

Attached patch Patch (obsolete) — Splinter Review
Which means adding support to the test harness.
Attachment #8456460 - Flags: review?(ahalberstadt)
Attached patch Patch (obsolete) — Splinter Review
And now that I'm passing the correct environment in stuff breaks.  \o/
Assignee: nobody → khuey
Attachment #8456460 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #8456460 - Flags: review?(ahalberstadt)
Attachment #8456564 - Flags: review?(ahalberstadt)
Thanks, this looks good.. but I would put this in mozrunner instead of the mochitest harness. This would give us leak checking on reftest/marionette/xpcshell as well.

I don't mind taking this over if you don't mind waiting until next week sometime.
Flags: needinfo?(khuey)
I actually had it on my TODO list to move the leak checking code to somewhere in mozbase, since we currently have it all in automationutils.py (and maybe duplicated in mochitest). mozrunner doesn't work because the xpcshell harness doesn't use it.
Oh right, I think it's only B2G that uses mozrunner xpcshell, and I guess reftest doesn't use it yet at all.

Maybe a new 'mozleakcheck' module in mozbase?
Flags: needinfo?(khuey)
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #3)
> I actually had it on my TODO list to move the leak checking code to
> somewhere in mozbase, since we currently have it all in automationutils.py
> (and maybe duplicated in mochitest). mozrunner doesn't work because the
> xpcshell harness doesn't use it.

Did you mean you were planning to do this soon, or at some vague point in the future? I can take a look too if you want. What do you think of a new module for leak checking?
Flags: needinfo?(ted)
Comment on attachment 8456564 [details] [diff] [review]
Patch

I think ted and I agree that fixing this in mozbase is preferable to fixing it in mochitest. Unflagging.
Attachment #8456564 - Flags: review?(ahalberstadt)
Comment on attachment 8456564 [details] [diff] [review]
Patch

Review of attachment 8456564 [details] [diff] [review]:
-----------------------------------------------------------------

After talking to khuey on irc, it might be better to just land this for now and then move the leak processing over to mozbase along with everything else.

There's some horrible-ness around how mozrunner handles environment at the moment. Also around how runtestsb2g.py doesn't just instantiate a mozrunner object directly and instead relies on marionette to do it. This patch looks good, but I want to look into fixing a few things in mozrunner first so doing this isn't quite as difficult.
(In reply to Andrew Halberstadt [:ahal] from comment #5)
> Did you mean you were planning to do this soon, or at some vague point in
> the future? I can take a look too if you want. What do you think of a new
> module for leak checking?


Near-future when I do the reftest-on-mozbase work. Let's not block khuey's patch on that, it's pretty localized to Mochitest and provides clear value.
Flags: needinfo?(ted)
Whiteboard: [MemShrink] → [MemShrink:P1]
This patch is basically the same as Kyle's except I fixed up DeviceRunner's handling of environment so it's possible to modify it after instantiation.

I'm getting some kind of build error, so haven't been able to test locally. In the meantime, here's a try run:
https://tbpl.mozilla.org/?tree=Try&rev=631bddbfd05b
There was an extra comma in the last one: https://tbpl.mozilla.org/?tree=Try&rev=f400a2cb43d4

I'm not sure why debug mochitest-1 isn't being scheduled there.. I verified the proper environment is being passed in though.
Attachment #8460897 - Attachment is obsolete: true
Attachment #8461700 - Flags: review?(ted)
Probably because debug has a different job name than opt. TryChooser special-cases them.
Comment on attachment 8461700 [details] [diff] [review]
Do leak checking in b2g mochitests

Debug tests definitely don't like that patch. Must be something in the environment.
Attachment #8461700 - Flags: review?(ted)
I noticed this in the logs:
INFO ### XPCOM_MEM_BLOAT_LOG defined -- unable to log bloat/leaks to /data/local/tests/tmp/runtests_leaks.log

Maybe because /data/local/tests/tmp doesn't exist? Here's a new push that fixes that, though I'm not optimistic that it will solve all the orange and red:
https://tbpl.mozilla.org/?tree=Try&rev=4cbb6fa82754
MOZ_DISABLE_NONLOCAL_CONNECTIONS is still set in the environment somehow.
The latest try run dumps the env just before starting the b2g process:
08:06:53     INFO -    "MOZ_CRASHREPORTER": "1",
08:06:53     INFO -    "XPCOM_DEBUG_BREAK": "stack",
08:06:53     INFO -    "R_LOG_VERBOSE": "1",
08:06:53     INFO -    "NSPR_LOG_MODULES": "signaling:5,mtransport:5,datachannel:5",
08:06:53     INFO -    "MOZ_DISABLE_NONLOCAL_CONNECTIONS": null,
08:06:53     INFO -    "XRE_NO_WINDOWS_CRASH_DIALOG": "1",
08:06:53     INFO -    "GNOME_DISABLE_CRASH_DIALOG": "1",
08:06:53     INFO -    "R_LOG_DESTINATION": "stderr",
08:06:53     INFO -    "MOZ_CRASHREPORTER_NO_REPORT": "1",
08:06:53     INFO -    "NS_TRACE_MALLOC_DISABLE_STACKS": "1",
08:06:53     INFO -    "R_LOG_LEVEL": "6",
08:06:53     INFO -    "XPCOM_MEM_BLOAT_LOG": "/data/local/tests/log/runtests_leaks.log",
08:06:53     INFO -    "MOZ_GMP_PATH": "/builds/slave/test/build/xre/bin/gmp-fake"

Is it possible we need to set it to "0" instead of null? I don't understand how null could evaluate to true though:
http://mxr.mozilla.org/mozilla-central/source/netwerk/base/src/nsSocketTransport2.cpp#1179
No, we need to delete it entirely.  Try changing the 'update' bit to 'del'.

C's getenv returns a non-null pointer if the variable is set at all.
Oh wait, this is just getting passed into 'sh', so it's probably being coerced to a string. Though that means that !!'0' would also be true. So we have to make sure not to set it at all. This might work:
https://tbpl.mozilla.org/?tree=Try&rev=a6a57775bdfd
Heh, yeah, what khuey said ;)
So that ran tests successfully but it doesn't look like it processed the leak log.  The runs should be bright orange with leaks.
This one seems to work.
Attachment #8456564 - Attachment is obsolete: true
Attachment #8461700 - Attachment is obsolete: true
Attachment #8462698 - Flags: review?(ted)
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #21)
> So that ran tests successfully but it doesn't look like it processed the
> leak log.  The runs should be bright orange with leaks.

I see:

10:26:35     INFO -  2809 INFO Leaked URLs:
10:26:35     INFO -  2810 INFO   file:///system/b2g/omni.ja
10:26:35     INFO -  2811 INFO   chrome://b2g/content/shell.html
10:26:35     INFO -  2812 INFO   resource://gre-resources/ua.css
10:26:35     INFO -  2813 INFO   resource://gre-resources/html.css
10:26:35     INFO -  2814 INFO   chrome://global/content/minimal-xul.css
10:26:35     INFO -  2815 INFO   chrome://global/content/xul.css
10:26:35     INFO -  2816 INFO   resource://gre-resources/quirk.css
10:26:35     INFO -  2817 INFO   resource://gre-resources/full-screen-override.css
10:26:35     INFO -  2818 INFO   resource://gre/res/svg.css
10:26:35     INFO -  2819 INFO   resource://gre-resources/counterstyles.css
10:26:35     INFO -  2820 INFO   chrome://global/skin/scrollbars.css
10:26:35     INFO -  2821 INFO   resource://gre-resources/number-control.css
10:26:35     INFO -  2822 INFO   resource://gre-resources/forms.css
10:26:35     INFO -  2823 INFO   x:///chrome/chrome/content/shell.html
10:26:35     INFO -  2824 INFO   chrome://b2g/content/shell.css
10:26:35     INFO -  2825 INFO   chrome://b2g/content/shell.css
10:26:35     INFO -  2826 INFO   chrome://b2g/content/settings.js
10:26:35     INFO -  2827 INFO   chrome://b2g/content/shell.js
10:26:35     INFO -  2828 INFO   chrome://global/content/bindings/scrollbar.xml#scrollbar
10:26:35     INFO -  2829 INFO   chrome://global/content/bindings/scrollbar.xml
10:26:35     INFO -  2830 INFO   x:///chrome/toolkit/content/global/bindings/scrollbar.xml
10:26:35     INFO -  2831 INFO   chrome://global/content/bindings/scrollbar.xml
10:26:35     INFO -  2832 INFO   chrome://global/content/bindings/scrollbar.xml#thumb
10:26:35     INFO -  2833 INFO   chrome://global/content/bindings/scrollbar.xml
10:26:35     INFO -  2834 INFO   chrome://global/content/bindings/scrollbar.xml#scrollbar-base
10:26:35     INFO -  2835 INFO   chrome://global/content/bindings/scrollbar.xml#scrollbar
10:26:35     INFO -  2836 INFO   chrome://global/content/bindings/scrollbar.xml#scrollbar-base
10:26:35     INFO -  2837 INFO   chrome://global/content/bindings/scrollbar.xml#thumb
10:26:35     INFO -  2838 INFO   chrome://global/content/bindings/scrollbar.xml#scrollbar-base
10:26:35     INFO -  2839 INFO [Parent 669] WARNING: XPCOM objects created/destroyed from static ctor/dtor: file ../../../gecko/xpcom/base/nsTraceRefcnt.cpp, line 145
10:26:35     INFO -  2840 INFO [Parent 669] WARNING: XPCOM objects created/destroyed from static ctor/dtor: file ../../../gecko/xpcom/base/nsTraceRefcnt.cpp, line 145
10:26:36     INFO -  2841 INFO [Parent 669] WARNING: XPCOM objects created/destroyed from static ctor/dtor: file ../../../gecko/xpcom/base/nsTraceRefcnt.cpp, line 145
10:26:36     INFO -  2842 INFO [Parent 669] WARNING: XPCOM objects created/destroyed from static ctor/dtor: file ../../../gecko/xpcom/base/nsTraceRefcnt.cpp, line 145
10:26:36     INFO -  2843 INFO [Parent 669] WARNING: XPCOM objects created/destroyed from static ctor/dtor: file ../../../gecko/xpcom/base/nsTraceRefcnt.cpp, line 145
10:26:36     INFO -  2844 INFO [Parent 669] WARNING: XPCOM objects created/destroyed from static ctor/dtor: file ../../../gecko/xpcom/base/nsTraceRefcnt.cpp, line 145
10:26:36     INFO -  2845 INFO [Parent 669] WARNING: XPCOM objects created/destroyed from static ctor/dtor: file ../../../gecko/xpcom/base/nsTraceRefcnt.cpp, line 145
10:26:36     INFO -  2846 INFO [Parent 669] WARNING: XPCOM objects created/destroyed from static ctor/dtor: file ../../../gecko/xpcom/base/nsTraceRefcnt.cpp, line 145
10:26:36     INFO -  2847 INFO nsStringStats
10:26:36     INFO -  2848 INFO  => mAllocCount:         134469
10:26:36     INFO -  2849 INFO  => mReallocCount:         6055
10:26:36     INFO -  2850 INFO  => mFreeCount:          131544  --  LEAKED 2925 !!!
10:26:36     INFO -  2851 INFO  => mShareCount:         109945
10:26:36     INFO -  2852 INFO  => mAdoptCount:           5434
10:26:36     INFO -  2853 INFO  => mAdoptFreeCount:       5432  --  LEAKED 2 !!!
10:26:36     INFO -  2854 INFO  => Process ID: 669, Thread ID: 1074476168

I think maybe mozharness just needs to highlight the leaks. Which would be a separate patch. Also if it turns all the jobs bright orange, that means we can't land it right away.
The output of processLeakLog looks something like

TEST-UNEXPECTED-FAIL: ...

This leaked URL's stuff and the nsStringsStats stuff is there today (go grab a log off of mozilla-central and compare).

The harness has the ability to set a threshold value in bytes to not turn things orange, but I would like to prove that we can turn it orange first.
Comment on attachment 8462698 [details] [diff] [review]
Do leak checking in b2g mochitests

Ok, will unflag again for now (sorry for the bugspam ted)
Attachment #8462698 - Flags: review?(ted)
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #24)
> The output of processLeakLog looks something like
> 
> TEST-UNEXPECTED-FAIL: ...
> 
> This leaked URL's stuff and the nsStringsStats stuff is there today (go grab
> a log off of mozilla-central and compare).
> 
> The harness has the ability to set a threshold value in bytes to not turn
> things orange, but I would like to prove that we can turn it orange first.

I'll have less time to look into this this week. Here's a try run with some extra debugging info, but barring something obvious coming out of there, I don't have any other ideas:
https://tbpl.mozilla.org/?tree=Try&rev=41db1caa6657
Attached patch Patch (obsolete) — Splinter Review
Based on https://tbpl.mozilla.org/?tree=Try&rev=4e53a54dd68b I set the threshold at 400000 bytes.
Attachment #8464116 - Flags: review?(dbaron)
Attachment #8464116 - Flags: review?(ahalberstadt)
Comment on attachment 8464116 [details] [diff] [review]
Patch

Flagging ted since I wrote part of this patch and he's planning on refactoring leak checking later.
Attachment #8464116 - Flags: review?(ahalberstadt) → review?(ted)
Comment on attachment 8464116 [details] [diff] [review]
Patch

>-          "default": 0,
>+          "default": 400000,

What makes this specific to b2g?

(I think we used to have test-specific and platform-specific defaults somehow, back before we got all the leaks down to 0.  It might be worth seeing how we did that.)
Attached patch PatchSplinter Review
Attachment #8464116 - Attachment is obsolete: true
Attachment #8464116 - Flags: review?(ted)
Attachment #8464116 - Flags: review?(dbaron)
Attachment #8464138 - Flags: review?(ted)
Attachment #8464138 - Flags: review?(dbaron)
Comment on attachment 8464138 [details] [diff] [review]
Patch

>diff --git a/testing/mochitest/mochitest_options.py b/testing/mochitest/mochitest_options.py
>--- a/testing/mochitest/mochitest_options.py
>+++ b/testing/mochitest/mochitest_options.py
>@@ -750,16 +750,17 @@ class B2GOptions(MochitestOptions):
>         defaults = {}
>         defaults["httpPort"] = DEFAULT_PORTS['http']
>         defaults["sslPort"] = DEFAULT_PORTS['https']
>         defaults["logFile"] = "mochitest.log"
>         defaults["autorun"] = True
>         defaults["closeWhenDone"] = True
>         defaults["testPath"] = ""
>         defaults["extensionsToExclude"] = ["specialpowers"]
>+        defaults["leakThreshold"] = 400000
>         self.set_defaults(**defaults)

My understanding is that what you're asking me to review is the leak threshold.  This is fine with me, so r=dbaron.  At some point you're likely to want it to be a little more granular (per suite rather than for all suites), though.  (I sure hope B2GOptions is B2G-only, though somebody should probably double-check that.)

https://hg.mozilla.org/mozilla-central/rev/126af18be989 looks like the last time we had a nonzero leak threshold in mochitest, for what it's worth.
Attachment #8464138 - Flags: review?(dbaron) → review+
With both of the dependent bugs fixed the threshold can probably be 5000 bytes instead of 400000.
review ping?
Flags: needinfo?(ted)
Comment on attachment 8464138 [details] [diff] [review]
Patch

Review of attachment 8464138 [details] [diff] [review]:
-----------------------------------------------------------------

::: testing/mochitest/mochitest_options.py
@@ +754,5 @@
>          defaults["autorun"] = True
>          defaults["closeWhenDone"] = True
>          defaults["testPath"] = ""
>          defaults["extensionsToExclude"] = ["specialpowers"]
> +        defaults["leakThreshold"] = 400000

Do you have a bug on file on reducing this to zero? Would be great to mention the bug number here.

::: testing/mozbase/mozrunner/mozrunner/base/device.py
@@ +32,4 @@
>      def __init__(self, device_class, device_args=None, **kwargs):
> +        process_log = tempfile.NamedTemporaryFile(suffix='pidlog')
> +        self.env['MOZ_PROCESS_LOG'] = process_log.name
> +        self.env.update(kwargs.pop('env', None) or {})

kwargs.pop('env', {})

@@ +32,5 @@
>      def __init__(self, device_class, device_args=None, **kwargs):
> +        process_log = tempfile.NamedTemporaryFile(suffix='pidlog')
> +        self.env['MOZ_PROCESS_LOG'] = process_log.name
> +        self.env.update(kwargs.pop('env', None) or {})
> +        self._env = self.env

Feels a little weird to mix the class variable with an instance variable. Maybe just do self._env = dict(self.env) and then modify self._env?
Attachment #8464138 - Flags: review?(ted) → review+
Flags: needinfo?(ted)
Blocks: 1049202
And I went ahead and updated to the exact value so that any regression will be noticed.
Depends on: 962871
Depends on: 1068268
No longer depends on: 962871
Depends on: 962871
You need to log in before you can comment on or make changes to this bug.