Closed
Bug 826044
Opened 12 years ago
Closed 12 years ago
Debug builds leak for WebRTC crashtests on mozilla-inbound
Categories
(Core :: WebRTC, defect, P1)
Core
WebRTC
Tracking
()
RESOLVED
FIXED
mozilla21
People
(Reporter: whimboo, Assigned: jesup)
References
(Blocks 1 open bug)
Details
(Whiteboard: [WebRTC][blocking-webrtc+][qa-])
Attachments
(3 files, 7 obsolete files)
5.83 KB,
patch
|
Details | Diff | Splinter Review | |
1.30 KB,
patch
|
jsmith
:
review+
|
Details | Diff | Splinter Review |
7.25 KB,
patch
|
jsmith
:
review+
|
Details | Diff | Splinter Review |
Today I pushed a patch which should enable the WebRTC crashtests on mozilla-central. Not sure why but on mozilla-inbound we got various leaks reported which we haven't seen on Alder before. Reason (as figured out at this time) is that we have leak checking disabled for crash tests. Beh! :( I will enable leak checks on alder first so we can better diagnose the problem.
Leaks we currently see:
TEST-UNEXPECTED-FAIL | automationutils.processLeakLog() | leaked 465275 bytes during test execution
TEST-UNEXPECTED-FAIL | automationutils.processLeakLog() | leaked 132 instances of AtomImpl with size 24 bytes each (3168 bytes total)
TEST-UNEXPECTED-FAIL | automationutils.processLeakLog() | leaked 1 instance of BackstagePass with size 24 bytes
TEST-UNEXPECTED-FAIL | automationutils.processLeakLog() | leaked 2 instances of BodyRule with size 16 bytes each (32 bytes total)
TEST-UNEXPECTED-FAIL | automationutils.processLeakLog() | leaked 4 instances of Comment with size 68 bytes each (272 bytes total)
TEST-UNEXPECTED-FAIL | automationutils.processLeakLog() | leaked 2 instances of DOMStorageImpl with size 100 bytes each (200 bytes total)
Reporter | ||
Comment 1•12 years ago
|
||
Attachment #697187 -
Flags: review?(rjesup)
Assignee | ||
Updated•12 years ago
|
Attachment #697187 -
Flags: review?(rjesup) → review+
Reporter | ||
Updated•12 years ago
|
Updated•12 years ago
|
Whiteboard: [MemShrink]
Reporter | ||
Comment 2•12 years ago
|
||
Not sure what's going on here exactly. But I can constantly reproduce it by running the following crash tests in succession: 780790.html and 791330.html. It doesn't matter which code we exercise in the first test, but when I remove the onload handler in 791330.html the leaks don't appear anymore.
Randell told me on IRC that he can see the leaks when running the automation but not when manually doing those checks and quitting the browser afterward. Sadly I can't test this given that my debug build is totally slow (up to 6 minutes until I see the first browser window), so I'm doing a clobber atm.
Assignee | ||
Comment 3•12 years ago
|
||
Running one test at a time, from ./mach crashtest dom/media/tests/crashtests
With just the first one or the last one, no leaks. All others (run alone) leak for me. Fedora 17.
If I load these crashtests into a real browser, I see NO leaks in any of the tests. (Start browser, load file, wait to completion, quit browser.)
Something is interacting with the automation rig. Timing issue perhaps.
Reporter | ||
Comment 4•12 years ago
|
||
So even with enabled leak checks I don't see any leaks with the latest Alder build. Everything is green for crashtests. I have triggered a clobber build in case it will change something. If not, we have to trust our local testing and have to try on inbound again when we think we are good. :/
Comment 5•12 years ago
|
||
This isn't a memshrink issue. This appears to be an automation issue.
Assignee: nobody → hskupin
Priority: -- → P1
Whiteboard: [MemShrink] → [WebRTC], [blocking-webrtc+]
Reporter | ||
Comment 6•12 years ago
|
||
Ted, would you have some pointers how to investigate and fix those leaks? I would like to continue here tomorrow while I'm on the train. Thanks.
Comment 7•12 years ago
|
||
If you can't reproduce them locally I don't know that I can offer you any advice. I'm also not really a leak-solving expert, you might want to ask khuey or dbaron.
Is your local debug build built with --disable-optimize? The tinderbox builds are optimized nowadays.
Assignee | ||
Comment 8•12 years ago
|
||
Mine were --disable-optimize; I see leaks when run locally (all but first and last tests). Same build, loading the tests into the real browser (fresh profile) and then quitting: no leaks.
Reporter | ||
Comment 9•12 years ago
|
||
So I did some additional tests earlier today and as what I can say now is that those leaks are not related to the automation framework. They are tightly connected to peer connection instances and when the browser gets closed. As an example have the following crashtest:
<!DOCTYPE HTML>
<html class="reftest-wait">
<!--
https://bugzilla.mozilla.org/show_bug.cgi?id=791270
-->
<head>
<meta charset="utf-8">
<title>Simple PeerConnection Video Test</title>
<script type="application/javascript">
function finish() {
document.documentElement.removeAttribute("class");
}
var pc = new mozRTCPeerConnection();
pc.close();
setTimeout(finish, 3500);
</script>
</head>
<body>
</body>
</html>
This will not leak as long as the setTimeout value is large enough. For me it's about 3.5s to let the test pass. Once you decrease the value you will see the leak. You can also make the test pass when you comment out both pc lines. So something inside the PC code has to be the cause.
Assignee: hskupin → nobody
Whiteboard: [WebRTC], [blocking-webrtc+] → [WebRTC][blocking-webrtc+][automation-blocked]
Assignee | ||
Updated•12 years ago
|
Assignee: nobody → rjesup
Reporter | ||
Comment 10•12 years ago
|
||
Randell, were you able to make some progress here in the last 6 days? Anything you want me to check?
Comment 11•12 years ago
|
||
Note: What's shown in comment 9 doesn't match 791270.html, so I'll refer to it as 826044.html.
What I'm seeing on my mac (10.8.2) from running tests individually w/m-c:
- using 3 runs or more each
- with the following set (the latter may alter timing slightly):
MOZ_WEBRTC_LEAKING_TESTS=1
MOZ_WEBRTC_TESTS=1
NSPR_LOG_MODULES=signaling:3
default-preferences pref(media.peerconnection.enabled,true)
load 780790.html - no leaks
load 791270.html - no leaks (but error in log each time *)
load 791278.html - no leaks
load 791330.html - no leaks (but each run takes 5 minutes!)
load 799419.html - no leaks
load 801227.html - LEAKS (110 AtomImpl,1 BsPass,1 BodyRule,3 Comment,2 DOMStrgImpl)
load 802982.html - LEAKS (107 AtomImpl,1 BsPass,2 BodyRule,2 Comment,1 DR_State)
load 812785.html - no leaks
load 826044.html - no leaks (but crashed once **)
The number of leaked objects stays the same from run to run.
*)
0:03.89 ************************************************************
0:03.89 * Call to xpconnect wrapped JSObject produced this error: *
0:03.89 [Exception... "Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [IPeerConnection.addStream]" nsresult: "0x80004005 (NS_ERROR_FAILURE)" location: "JS frame :: file:///Users/Jan/moz/mozilla-central/obj-x86_64-apple-darwin12.2.1-debug/dist/NightlyDebug.app/Contents/MacOS/components/PeerConnection.js :: PeerConnection.prototype._executeNext :: line 321" data: no]
0:03.89 ************************************************************
**) TEST-UNEXPECTED-FAIL| Shutdown | Exited with code 1 during test run
Interestingly, 802982.html never calls mozRTCPeerConnection, just mozGetUserMedia.
801227.html calls both.
Assignee: rjesup → jib
Comment 12•12 years ago
|
||
Forgot to say that for 826044.html I tried setTimeout values of 3500m, 10 ms, 1 ms and directly calling the function without setTimeout(). No leaks.
Comment 13•12 years ago
|
||
> 0:03.89 * Call to xpconnect wrapped JSObject produced this error: *
> 0:03.89 [Exception... "Component returned failure code: 0x80004005
> (NS_ERROR_FAILURE) [IPeerConnection.addStream]" nsresult: "0x80004005
This is from pc.addStream(undefined). We could probably validate this input synchronously and throw a nicer error.
> Interestingly, 802982.html never calls mozRTCPeerConnection, just mozGetUserMedia.
> 801227.html calls both.
And if I comment out peerconnection from 801227.html then it leaks the same.
To summarize, all I need to make it leak every time is this:
<!DOCTYPE HTML>
<html class="reftest-wait">
<!-- https://bugzilla.mozilla.org/show_bug.cgi?id=801227 -->
<head>
<meta charset="utf-8">
<title>Abort due to page reload</title>
<script type="application/javascript">
<!-- If I remove the localStorage code then it doesn't leak -->
var index = localStorage.index || 0;
if (index < 3) {
localStorage.index = index + 1;
window.location.reload();
}
function finish() {
delete localStorage["index"];
document.documentElement.removeAttribute("class");
}
navigator.mozGetUserMedia({ audio: true, fake: true }, function (aStream) {
finish();
}, finish);
</script>
</head>
<body>
</body>
</html>
or this:
<!DOCTYPE html>
<html>
<!-- https://bugzilla.mozilla.org/show_bug.cgi?id=802982 -->
<head>
<meta charset="utf-8">
<title>Excessive getUserMedia calls</title>
<script type="application/javascript">
function boom() {
navigator.mozGetUserMedia({}, {}, {});
}
</script>
</head>
<body onload="boom();"></body>
</html>
Ergo, I'm seeing leaks from mozGetUserMedia, not mozRTCPeerConnection.
Assignee | ||
Comment 14•12 years ago
|
||
Comment 15•12 years ago
|
||
Making this a blocker to pref'ing on -- at least until we understand what's going on here: are they true leaks in the core code or are they artifacts of the tests themselves?
Comment 16•12 years ago
|
||
I can make them happen in a regular browser debug session.
Assignee | ||
Comment 17•12 years ago
|
||
We'll file 3 bugs: one for leaks from bug 802982 if the permission UI is enabled; one for leaks in 801227, and one for Javascript: Security error, line 14 in 822197
Assignee | ||
Updated•12 years ago
|
Attachment #714615 -
Flags: review?(jsmith)
Assignee | ||
Comment 18•12 years ago
|
||
the change to 802982.html is because there was a funky 16-bit char at the start
Comment 19•12 years ago
|
||
Comment on attachment 714615 [details] [diff] [review]
enable WebRTC crashtests
Review of attachment 714615 [details] [diff] [review]:
-----------------------------------------------------------------
review+ with one small nit above comments.
::: dom/media/tests/crashtests/crashtests.list
@@ +3,5 @@
> load 780790.html
> load 791270.html
> load 791278.html
> load 791330.html
> load 799419.html
I'd actually leave both of these tests commented out with a comment above explaining why we've prefed them off. Possibly marking a TODO pointing to the bug that's being fixed here.
Attachment #714615 -
Flags: review?(jsmith) → review+
Comment 20•12 years ago
|
||
Comment on attachment 714615 [details] [diff] [review]
enable WebRTC crashtests
Review of attachment 714615 [details] [diff] [review]:
-----------------------------------------------------------------
Actually, wait a sec. Change that - you need to pref this on in the global crash test list as well. review-
Attachment #714615 -
Flags: review+ → review-
Comment 21•12 years ago
|
||
(In reply to Jason Smith [:jsmith] from comment #20)
> Comment on attachment 714615 [details] [diff] [review]
> enable WebRTC crashtests
>
> Review of attachment 714615 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> Actually, wait a sec. Change that - you need to pref this on in the global
> crash test list as well. review-
Specifically you need to modify "testing\crsahtest\crashtests.list" to have our tests prefed on by default:
# Bug 801682 - Because of leaks we can't enable media crashtests yet
# Bug 811873 - mozRTCPeerConnection doesn't support remote browser yet
# skip-if(browserIsRemote) include ../../dom/media/tests/crashtests/crashtests.list
^^^ Change that line
Assignee | ||
Comment 22•12 years ago
|
||
Assignee | ||
Updated•12 years ago
|
Attachment #714615 -
Attachment is obsolete: true
Assignee | ||
Comment 23•12 years ago
|
||
Assignee | ||
Updated•12 years ago
|
Attachment #714630 -
Attachment is obsolete: true
Assignee | ||
Updated•12 years ago
|
Attachment #714631 -
Flags: review?(jsmith)
Comment 24•12 years ago
|
||
Comment on attachment 714631 [details] [diff] [review]
enable WebRTC crashtests
Review of attachment 714631 [details] [diff] [review]:
-----------------------------------------------------------------
review+ - small nit
::: testing/crashtest/crashtests.list
@@ -26,5 @@
> include ../../dom/bindings/crashtests/crashtests.list
> include ../../dom/indexedDB/crashtests/crashtests.list
>
> -# Bug 801682 - Because of leaks we can't enable media crashtests yet
> -# Bug 811873 - mozRTCPeerConnection doesn't support remote browser yet
I'd still keep the comment about "remote browser" around here. The other is fine to get rid of
Attachment #714631 -
Flags: review?(jsmith) → review+
Assignee | ||
Comment 25•12 years ago
|
||
Target Milestone: --- → mozilla21
Comment 27•12 years ago
|
||
Backed out for a variety of failures.
https://hg.mozilla.org/integration/mozilla-inbound/rev/44567892e06f
Comment 28•12 years ago
|
||
Added the two dependencies that are causing failures. The third failure though on that list I'm puzzled about - 802982.html appears to be failing to load. Don't know why...
Assignee | ||
Updated•12 years ago
|
Attachment #697187 -
Attachment is obsolete: true
Assignee | ||
Comment 29•12 years ago
|
||
(In reply to Jason Smith [:jsmith] from comment #28)
> Added the two dependencies that are causing failures. The third failure
> though on that list I'm puzzled about - 802982.html appears to be failing to
> load. Don't know why...
I have a suspicion that there's a time limit to load, and that test does a ton of getUserMedias in the on-load. Switching to reftest-wait and re-Trying
Assignee | ||
Comment 30•12 years ago
|
||
Still retriggering, but looking good with reftest-wait: compare
https://tbpl.mozilla.org/?tree=Try&rev=d6e178388988
to
https://tbpl.mozilla.org/?tree=Try&rev=f4a2fe79e180
(the one that fails to load it appears always to be 10.8 opt)
(the other oranges are all known; I haven't bothered starring)
Assignee | ||
Comment 31•12 years ago
|
||
Ok, we now have a fail-to-load, but on a different test - bu again one that does a ton of stuff all inside onload. This should be a indicator to us that's a bad pattern. Will retry with that fixed too (this one is reftest-wait, but does everything in onload());
Assignee | ||
Comment 32•12 years ago
|
||
Ok, we're looking quite green on this:
https://tbpl.mozilla.org/?tree=Try&rev=a0bf58900913
Assignee | ||
Comment 33•12 years ago
|
||
whole bunch of fixes for the crashtests. A bunch needed to be reftest-wait - plain reeftest with tests that take noticable time (at least under load) breaks the onload timeout in crashtests. a couple more broke due to bug 842075; adding temporary try {}'s fixes those. 799419 wasn't in the last try push (converted to reftest-wait), so I pushed another to be sure, and if it fails I'll re-remove it.
Assignee | ||
Comment 34•12 years ago
|
||
this one actually will enable them (and not on android/b2g)
Assignee | ||
Updated•12 years ago
|
Attachment #714845 -
Attachment description: enable WebRTC crashtests → Fix Webrtc crashtests, disable known permaoranges
Attachment #714845 -
Flags: review?(jsmith)
Assignee | ||
Updated•12 years ago
|
Attachment #714847 -
Flags: review?(jsmith)
Assignee | ||
Updated•12 years ago
|
Attachment #714631 -
Attachment is obsolete: true
Updated•12 years ago
|
Assignee: jib → rjesup
Assignee | ||
Comment 35•12 years ago
|
||
799419 needed finish() of course...
Assignee | ||
Updated•12 years ago
|
Attachment #714845 -
Attachment is obsolete: true
Attachment #714845 -
Flags: review?(jsmith)
Assignee | ||
Updated•12 years ago
|
Attachment #714847 -
Attachment is obsolete: true
Attachment #714847 -
Flags: review?(jsmith)
Assignee | ||
Updated•12 years ago
|
Attachment #714859 -
Flags: review?(jsmith)
Assignee | ||
Comment 36•12 years ago
|
||
Comment on attachment 714847 [details] [diff] [review]
enable WebRTC crashtests
Oops, bzexport wasn't helpful here
Attachment #714847 -
Attachment is obsolete: false
Attachment #714847 -
Flags: review?(jsmith)
Assignee | ||
Updated•12 years ago
|
Attachment #714859 -
Attachment description: enable WebRTC crashtests → Fix Webrtc crashtests, disable known permaoranges
Assignee | ||
Comment 37•12 years ago
|
||
and one more test that needed the same treatment. When too much is done in onload, the results are random and depend on what other tests you've run in the list. Also, this needed the try{} around addIceCandidate. New try pushed
Assignee | ||
Updated•12 years ago
|
Attachment #714859 -
Attachment is obsolete: true
Attachment #714859 -
Flags: review?(jsmith)
Assignee | ||
Updated•12 years ago
|
Attachment #714870 -
Flags: review?(jsmith)
Assignee | ||
Comment 38•12 years ago
|
||
debug/opt, with 799419 (and 834100 updated):
https://tbpl.mozilla.org/?tree=Try&rev=7c69f056268c
older debug-only pushes without 799419:
https://tbpl.mozilla.org/?tree=Try&rev=1b355f94aad6
https://tbpl.mozilla.org/?tree=Try&rev=a0bf58900913
Updated•12 years ago
|
Attachment #714847 -
Flags: review?(jsmith) → review+
Comment 39•12 years ago
|
||
Comment on attachment 714870 [details] [diff] [review]
Fix Webrtc crashtests, disable known permaoranges
Review of attachment 714870 [details] [diff] [review]:
-----------------------------------------------------------------
review- only for 802982.html
::: dom/media/tests/crashtests/802982.html
@@ +17,5 @@
> for (var j = 0; j < 100; ++j) {
> navigator.mozGetUserMedia({}, {}, {});
> }
> }
> + finish(); // we're not waiting for success/error callbacks here
Why is this outside of the boom function? Shouldn't this be at the end of the boom function?
Attachment #714870 -
Flags: review?(jsmith) → review-
Assignee | ||
Comment 40•12 years ago
|
||
Assignee | ||
Updated•12 years ago
|
Attachment #714870 -
Attachment is obsolete: true
Assignee | ||
Comment 41•12 years ago
|
||
Comment on attachment 714924 [details] [diff] [review]
Fix Webrtc crashtests, disable known permaoranges
Yup - oversight (that turns out not to matter on Try...)
Attachment #714924 -
Flags: review?(jsmith)
Assignee | ||
Comment 42•12 years ago
|
||
I've repushed the try to verify it's ok with this change since the test wasn't getting real coverage before (though it was for most of my tries; this was a later change).
Updated•12 years ago
|
Attachment #714924 -
Flags: review?(jsmith) → review+
Assignee | ||
Comment 43•12 years ago
|
||
https://tbpl.mozilla.org/?tree=Try&rev=8104e8c3f061
We have green on all platforms
Assignee | ||
Comment 44•12 years ago
|
||
Comment 45•12 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/31fa1546a849
https://hg.mozilla.org/mozilla-central/rev/8e9f7c541947
Status: NEW → RESOLVED
Closed: 12 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Updated•12 years ago
|
Whiteboard: [WebRTC][blocking-webrtc+][automation-blocked] → [WebRTC][blocking-webrtc+][automation-blocked][qa-]
Reporter | ||
Updated•12 years ago
|
Whiteboard: [WebRTC][blocking-webrtc+][automation-blocked][qa-] → [WebRTC][blocking-webrtc+][qa-]
You need to log in
before you can comment on or make changes to this bug.
Description
•