Closed Bug 698827 Opened 12 years ago Closed 11 years ago

Run 10.5 leak builds on 10.6 machines for certain branches

Categories

(Release Engineering :: General, defect, P1)

x86
macOS
defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: armenzg, Assigned: armenzg)

Attachments

(3 files)

Bug 674647 got a little noisy, so I would like to focus this bug on allowing platforms to be selected per branch.

For mozilla-central and the branches based on it, we will switch to running 10.5 leak test builds on 10.6 machines.
I will be busy next week with buildduty, so I am lowering the priority, as I don't know if I will be able to tackle it during that week.
Priority: P2 → P3
I will pick this up once the patch in bug 674647 lands.
Assignee: armenzg → nobody
Nope. That is for enabling leaktests on 10.6.

Now, if I run 10.5 jobs (which have leaktests on by default) on 10.6 machines we won't have the intermittent oranges.
Assignee: nobody → armenzg
Priority: P3 → P2
Depends on: 707152
Anything I can do to help in this bug?

We currently run the leak test on the same machine that does the build. Wouldn't it be easier to keep that and just move the build to 10.6 (i.e., bug 674647)?
I am going to write a patch to do 10.5 leak builds on 10.6, which requires selection per branch. This is a considerable refactoring and I will be tackling it today.
Thanks Rafael for the offer.
This refactoring will allow us to switch later on to do 10.5 and 10.6 builds on 10.7 (if that ends up being possible).
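
As a rough illustration of what "selection per branch" means here, the sketch below maps a branch name to the pool of machines its 10.5 leak test build runs on. It is not the buildbot-configs code (which is Python), and every name in it (`DEFAULT_LEAK_BUILD_POOL`, `BRANCH_OVERRIDES`, `leakBuildPool`, the pool labels, the example branch `some-older-branch`) is hypothetical. Only the idea of sending mozilla-central's 10.5 leak builds to 10.6 machines comes from this bug.

```javascript
// Hypothetical sketch: none of these names exist in buildbot-configs.
// The pool labels are stand-ins for "10.5 machines" and "10.6 machines".
const DEFAULT_LEAK_BUILD_POOL = "10.5-machines";

// Branches that opt in to running 10.5 leak test builds on 10.6 machines.
const BRANCH_OVERRIDES = {
  "mozilla-central": "10.6-machines",
};

// Return the pool that a branch's 10.5 leak test build should run on.
// Branches without an override keep the default pool.
function leakBuildPool(branch, overrides = BRANCH_OVERRIDES) {
  return Object.prototype.hasOwnProperty.call(overrides, branch)
    ? overrides[branch]
    : DEFAULT_LEAK_BUILD_POOL;
}

console.log(leakBuildPool("mozilla-central"));
console.log(leakBuildPool("some-older-branch"));
```

Keeping the overrides in one table means a later move (for example to 10.7, if that ends up being possible) would only need new entries rather than new builder logic.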
Priority: P2 → P1
Rafael I got this:
http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTest/1326303782.1326311730.976.gz&fulltext=1#err0
TEST-UNEXPECTED-FAIL | jit_test.py -a -m -d       | /builds/slave/m-cen-osx-dbg/build/js/src/jit-test/tests/basic/testBug597736.js: /builds/slave/m-cen-osx-dbg/build/js/src/jit-test/tests/basic/testBug597736.js:29: Error: Assertion failed: got false, expected true: Some finalizations must happen

I'm running another build.
This patch was easier than I first thought.

coop, I tried to test this patch with test-masters.sh and dump_masters.sh, but the diff from the latter did not help me with this.
From looking at:
http://dev-master01.build.scl1.mozilla.com:8040/builders/OS%20X%2010.5.2%20mozilla-central%20leak%20test%20build
you can see that I don't have any "darwin9" or "mac" slaves listed.

Should we make the change only for try first and for the remaining branches later (depending on the orange that I mentioned in the previous comment)?
Attachment #587813 - Flags: review?(coop)
(In reply to Armen Zambrano G. [:armenzg] - Release Engineer from comment #7)
> Rafael I got this:
> http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTest/1326303782.
> 1326311730.976.gz&fulltext=1#err0
> TEST-UNEXPECTED-FAIL | jit_test.py -a -m -d       |
> /builds/slave/m-cen-osx-dbg/build/js/src/jit-test/tests/basic/testBug597736.
> js:
> /builds/slave/m-cen-osx-dbg/build/js/src/jit-test/tests/basic/testBug597736.
> js:29: Error: Assertion failed: got false, expected true: Some finalizations
> must happen
> 
> I'm running another build.

This is still with the 10.5 SDK, right? Just running a 10.5 debug build on a 10.6 machine and running "make check", correct?

I will try to reproduce it when I get home.
(In reply to Rafael Ávila de Espíndola (:espindola) from comment #9)
> 
> This is still with the 10.5 SDK, right? Just running a 10.5 debug build on a
> 10.6 machine and running "make check", correct?
> 
> I will try to reproduce it when I get home.

I assume so (I don't know where in the log to verify which SDK is being used). I ran another 2 builds and both finished orange.
(In reply to Armen Zambrano G. [:armenzg] - Release Engineer from comment #10)
> (In reply to Rafael Ávila de Espíndola (:espindola) from comment #9)
> > 
> > This is still with the 10.5 SDK, right? Just running a 10.5 debug build on a
> > 10.6 machine and running "make check", correct?
> > 
> > I will try to reproduce it when I get home.
> 
> I assume so (I don't know where in the log to verify which SDK is being
> used). I ran another 2 builds and both finished orange.

Sorry, I was unable to get to it yesterday. Just rebooted into 10.6 and will start a build.

You can find the SDK by searching for --with-macos-sdk in the build log.
That's right. 10.5 sdk on moz2-darwin10-slave03:
  --with-macos-sdk=/Developer/SDKs/MacOSX10.5.sdk

[1] http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTest/1326303782.1326311730.976.gz&fulltext=1
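
The search described above can also be scripted. This is a minimal sketch, assuming the build log is already available as a string; `findMacSdk` and the sample log are made up for illustration, and only the `--with-macos-sdk=/Developer/SDKs/MacOSX10.5.sdk` line is taken from the real log.

```javascript
// Return the value passed to --with-macos-sdk in the given build log text,
// or null if the flag does not appear.
function findMacSdk(logText) {
  const match = /--with-macos-sdk=(\S+)/.exec(logText);
  return match ? match[1] : null;
}

// Only the flag line below is from the real log; the other lines are made up.
const sampleLog = [
  "configure: starting",
  "  --with-macos-sdk=/Developer/SDKs/MacOSX10.5.sdk",
  "configure: done",
].join("\n");

console.log(findMacSdk(sampleLog)); // /Developer/SDKs/MacOSX10.5.sdk
```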
I was able to reproduce this with

/Users/espindola/mozilla-central/obj-x86_64-apple-darwin10.4.1/dist/bin/js -m -d -a -f ~/test.js

With test.js being:

-----------------------------------------------------------------
function leak_test() {
    // Create a reference loop function->script->traceFragment->object->function
    // that GC must be able to break. To embed the object into the fragment, the
    // code uses a prototype chain of depth 2, which caches obj.__proto__.__proto__
    // in the fragment.

    // To make sure that no references to the function f remain visible to the
    // conservative scan of the native stack after this function returns, we
    // loop here multiple times, overwriting the stack and registers with new garbage.
    for (var j = 0; j != 8; ++j) {
        var f = Function("a", "var s = 0; for (var i = 0; i != 100; ++i) s += a.b; return s;");
        var c = {b: 1, f: f, leakDetection: makeFinalizeObserver()};
        f({ __proto__: { __proto__: c}});
        f = c = a = null;
	gc();
    }
}

function test()
{
    var base = finalizeCount();
    print("ESPINDOLA base=" + base + "\n");
    leak_test();
    gc();
    gc();
    var n = finalizeCount();
    var x = finalizeCount();
    print("ESPINDOLA n=" + n + "\n");
    print("ESPINDOLA x=" + x + "\n");
    assertEq(base + 4 < x, true, "Some finalizations must happen");
}

test();
----------------------------------------------------------

The output is


ESPINDOLA base=0

ESPINDOLA n=4

ESPINDOLA x=4

I am debugging it.
Comment on attachment 587813 [details] [diff] [review]
run 10.5 leak builds on 10.6

Is there anything preventing us from doing *all* these builds on macosx64 right now? We don't release these builds anyway.
Attachment #587813 - Flags: review?(coop) → review+
(In reply to Chris Cooper [:coop] from comment #14)
> Comment on attachment 587813 [details] [diff] [review]
> run 10.5 leak builds on 10.6
> 
> Is there anything preventing us from doing *all* these builds on macosx64
> right now? We don't release these builds anyway.

We would get test failures because some fixes are missing on those branches.
I can change the output of finalizeCount from 4 to 6 (and therefore make the test pass) by changing the optimization level of just jsapi.o from -O3 to -O0.

Igor, you added this test and from the comment it looks like it is sensitive to stack layout because of our use of a conservative gc. Do you think we might just be getting unlucky with the layout we get with -O3 and running on 10.6?
Btw, compiling just JS_GetGlobalForScopeChain with -O0 also "fixes" the problem.
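
For reference, the pass/fail flip follows directly from the comparison in the test's assertEq call. The sketch below restates that comparison in a standalone helper (`someFinalizationsHappened` is a name made up here). The values 0 and 4 are from the output above; using base=0 for the -O0 run is an assumption, since only the final count of 6 was reported.

```javascript
// Same comparison as the assertEq call in test.js: base + 4 < x.
function someFinalizationsHappened(base, count) {
  return base + 4 < count;
}

console.log(someFinalizationsHappened(0, 4)); // false: 4 < 4 does not hold
console.log(someFinalizationsHappened(0, 6)); // true: 4 < 6 holds
```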
(In reply to Rafael Ávila de Espíndola (:espindola) from comment #16)
> Igor, you added this test and from the comment it looks like it is sensitive
> to stack layout because of our use of a conservative gc. Do you think we
> might just be getting unlucky with the layout we get with -O3 and running on
> 10.6?

Your experiment with changing the optimization level clearly supports this. For now I think we should just disable the test, but file a bug about better infrastructure for reliable leak testing in the presence of a conservative GC.
Cool. The unreliable test has been removed. Armen, can you try your patch again? When do you think we can enable it?
Perfect. I grabbed another slave and I am now testing a new build.
I forgot this.
Attachment #588946 - Flags: review?(coop)
Attachment #588946 - Flags: review?(coop) → review+
Comment on attachment 588946 [details] [diff] [review]
run 10.5 leak builds on 10.6 for try

http://hg.mozilla.org/build/buildbot-configs/rev/9e063725e873

I'm only landing for try right now. I would like to see it run well for a couple of days.

I had unit tests running on staging without any trouble and all running green.
Attachment #588946 - Flags: checked-in+
> I'm only landing for try right now. I would like to see it run well for a
> couple of days.

Why? This is just a change to debug builds. The worst that can happen is that we get more oranges and revert.

In any case, let me know when it is live on try.
(In reply to Rafael Ávila de Espíndola (:espindola) from comment #23)
> > I'm only landing for try right now. I would like to see it run well for a
> > couple of days.
> 
> Why? This is just a change to debug builds. The worst that can happen is that
> we get more oranges and revert.
> 
> In any case, let me know when it is live on try.

It is live on try since this morning.
I'm sorry. We get developers on our backs whenever anything goes wrong, so I prefer to be conservative about anything that changes what devs normally see.
We have slaves try-mac64-slave[01-31] and moz2-darwin10-slave[11-14] to handle the extra load. Let's see if we have wait times or not in the morning.

The push is looking good:
https://tbpl.mozilla.org/?tree=Try&rev=f3c8c19da938
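
To put a number on the extra capacity mentioned above, the bracket ranges can be expanded into host names. This is a small sketch with a made-up helper (`expandSlaveRange`); it assumes the suffixes are zero-padded to two digits, as the `[01-31]` notation suggests.

```javascript
// Expand a "prefix[first-last]" range into zero-padded host names.
function expandSlaveRange(prefix, first, last, width = 2) {
  const names = [];
  for (let n = first; n <= last; n++) {
    names.push(prefix + String(n).padStart(width, "0"));
  }
  return names;
}

const extraSlaves = [
  ...expandSlaveRange("try-mac64-slave", 1, 31),
  ...expandSlaveRange("moz2-darwin10-slave", 11, 14),
];

console.log(extraSlaves.length); // 35
console.log(extraSlaves[0]);     // try-mac64-slave01
```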
Comment on attachment 587813 [details] [diff] [review]
run 10.5 leak builds on 10.6

Landed on "default":
http://hg.mozilla.org/build/buildbot-configs/rev/3467b1f70734
Attachment #587813 - Flags: checked-in+
This is live now.
Unless something goes wrong, a second build should show up here:
https://tbpl.mozilla.org/?jobname=OS%20X%2010.5.2&rev=79e5d0b77d10

I triggered a second build which grabbed a darwin10 slave.
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Depends on: 725214
From IRC and after bug 721575 landed: f202988dec86

We also landed a fix for bug 725214 some time ago:
http://hg.mozilla.org/build/buildbot-configs/rev/f9fe6cb5023f
Attachment #599177 - Flags: review+
Attachment #599177 - Flags: checked-in+
Depends on: 730195
Product: mozilla.org → Release Engineering