test_finalizer.js was green when it landed in bug 720771 on Wednesday, but became perma-orange on Linux64 PGO builds about a day later, on Thursday. On inbound it started with this push, a merge between m-c and inbound, both of which were green beforehand:
https://tbpl.mozilla.org/?tree=Mozilla-Inbound&rev=21106c79a43d

On mozilla-central it started with this push, which happened 14 hours later and does not contain any of the same patches:
https://tbpl.mozilla.org/?rev=10622eaff4fc

Since this happens only on PGO builds, it may be related to a bug in gcc's PGO. Since it started on unrelated changesets on two different branches, perhaps it is triggered by a code-size threshold, or something similar that does not depend on the specific code that changed.

Log from one of the crashes:
https://tbpl.mozilla.org/php/getParsedLog.php?id=10870167&tree=Firefox

Rev3 Fedora 12x64 mozilla-central pgo test xpcshell on 2012-04-13 03:50:17 PDT for push 10622eaff4fc

TEST-PASS | /home/cltbld/talos-slave/test/build/xpcshell/tests/toolkit/components/ctypes/tests/unit/test_finalizer.js | [test_result_dispose : 322] 0 == 0
TEST-INFO | (xpcshell/head.js) | test 1 finished
TEST-INFO | (xpcshell/head.js) | exiting test
TEST-PASS | (xpcshell/head.js) | 3100 (+ 0) check(s) passed
TEST-INFO | (xpcshell/head.js) | 0 check(s) todo
<<<<<<<
Downloading symbols from: http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-linux64-pgo/1334300401/firefox-14.0a1.en-US.linux-x86_64.crashreporter-symbols.zip
PROCESS-CRASH | /home/cltbld/talos-slave/test/build/xpcshell/tests/toolkit/components/ctypes/tests/unit/test_finalizer.js | application crashed (minidump found)
Crash dump filename: /home/cltbld/talos-slave/test/build/xpcshell/tests/toolkit/components/ctypes/tests/unit/594a7af1-47c9-6492-61c53786-7b27a475.dmp
Operating system: Linux
                  0.0.0 Linux 184.108.40.206-127.fc12.x86_64 #1 SMP Sat Nov 7 21:11:14 EST 2009 x86_64
CPU: amd64
     family 6 model 23 stepping 10
     2 CPUs

Crash reason:  SIGSEGV
Crash address: 0x7fa4fdafc990

Thread 0 (crashed)
 0  0x7fa4fdafc990
    rbx = 0x019b8150   r12 = 0x00000008   r13 = 0x00000001   r14 = 0x1a2d4170
    r15 = 0x00000001   rip = 0xfdafc990   rsp = 0x1a2d4168   rbp = 0x1a2d4170
    Found by: given as instruction pointer in context
 1  libxul.so!ffi_call [ffi64.c:10622eaff4fc : 485 + 0x24]
    rip = 0x0c0e6231   rsp = 0x1a2d4190
    Found by: stack scanning
 2  libxul.so + 0x15445ff
    rip = 0x0bf5c600   rsp = 0x1a2d41c8
    Found by: stack scanning
 3  libxul.so!js::ctypes::CDataFinalizer::CallFinalizer [CTypes.cpp:10622eaff4fc : 6673 + 0x4]
    rip = 0x0c0d148c   rsp = 0x1a2d4280
    Found by: stack scanning
 4  libxul.so!js::ctypes::CDataFinalizer::Finalize [CTypes.cpp:10622eaff4fc : 6819 + 0x9]
    rbx = 0x019b8140   r12 = 0x00000040   r13 = 0xfdf5d0c0   rip = 0x0c0d14c0
    rsp = 0x1a2d42b0   rbp = 0xfdf5d040
    Found by: call frame info
 5  libxul.so!js::gc::FinalizeTypedArenas<JSObject> [jsobjinlines.h:10622eaff4fc : 256 + 0x25]
    rbx = 0xfdf5d080   r12 = 0x00000040   r13 = 0xfdf5d0c0   rip = 0x0bf6e03c
    rsp = 0x1a2d42c0   rbp = 0xfdf5d040
    Found by: call frame info
 6  libxul.so!js::gc::ArenaLists::finalizeObjects [jsgc.cpp:10622eaff4fc : 1499 + 0x2c]
    rbx = 0x1a2d4410   r12 = 0x1a2d4410   r13 = 0x01a02250   r14 = 0x00000000
    r15 = 0x00000000   rip = 0x0bf6f8e5   rsp = 0x1a2d4390   rbp = 0x01aab010
    Found by: call frame info
 7  libxul.so!GCCycle [jsgc.cpp:10622eaff4fc : 3171 + 0xe]
    rbx = 0x01a02000   r12 = 0x1a2d4410   r13 = 0x01a02250   r14 = 0x00000000
    r15 = 0x00000000   rip = 0x0bf6fcad   rsp = 0x1a2d43b0   rbp = 0x1a2d4400
    Found by: call frame info
 8  libxul.so!Collect [jsgc.cpp:10622eaff4fc : 3685 + 0x10]
    rbx = 0x01a02000   r12 = 0x01a028a0   r13 = 0x00000000   r14 = 0x00000000
    r15 = 0x00000000   rip = 0x0bf70445   rsp = 0x1a2d44c0   rbp = 0x01a02250
    Found by: call frame info
 9  xpcshell!main [xpcshell.cpp:10622eaff4fc : 2017 + 0xc]
    rbx = 0x0952e6f0   r12 = 0x00000000   r13 = 0x01aa1f00   r14 = 0x00000000
    r15 = 0x00000000   rip = 0x00407c8e   rsp = 0x1a2d4500   rbp = 0x00000000
    Found by: call frame info
10  libc-2.11.so + 0x1eb1c
    rbx = 0x00000000   r12 = 0x004044f0   r13 = 0x1a2d4850   r14 = 0x00000000
Created attachment 614846: disable test on Linux64 opt/pgo

This patch disables the test for now on Linux64 opt/pgo builds. (There's no way to disable it for PGO only.)
Comment on attachment 614846: disable test on Linux64 opt/pgo

We talked it over on IRC, and:

<bsmedberg> that doesn't sound like the kind of test-disablement we want
<mbrubeck> jorendorff, bsmedberg: Alternately we could try backing out all of bug 720771.
<jorendorff> mbrubeck: I was thinking that
<jorendorff> mbrubeck: certainly if the bug is reproducible, backing out the whole thing seems better to me
<mbrubeck> okay, I'll see if it backs out cleanly
<bsmedberg> I think that backing out is preferable if this needs to be solved immediately.

I agree. Poor Yoric.
Backed out bug 720771:
https://hg.mozilla.org/mozilla-central/rev/e1f0bb28fbb4

Leaving this bug open since this issue still needs to be fixed before the backed-out patches can re-land. You can close this bug if you'd rather just track the work in bug 720771.
Just for clarification: the problem is *not* in bug 720771 but in dependent bug 742384, which landed immediately afterwards.
My bad, it may actually be in bug 720771. I am currently investigating the issue and I have the impression that there is a strange interaction between PGO and garbage-collection (see bug 745448).
Ok, issue identified:
- our gc is Boehm-style conservative, so _anything_ can cause a reference to be falsely identified as live;
- my test erroneously relied on all dead references being released.

I have posted fixes to the test suite, now waiting for jorendorff's review to land.
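
To illustrate the second point: under a conservative collector, a test cannot safely assert that every unreachable value was finalized. It can only check properties that hold no matter how many values were actually released. The following is a minimal sketch in plain JavaScript, not the fix that was posted; `checkFinalizerCounts` and the simulated data are hypothetical and are not part of test_finalizer.js, js-ctypes, or xpcshell.

```javascript
"use strict";

// Hypothetical helper (an assumption for illustration only).
// `created` is how many values were given a finalizer.
// `finalizeCalls` maps a value id to how many times its finalizer ran.
function checkFinalizerCounts(created, finalizeCalls) {
  let finalized = 0;
  for (const [id, count] of finalizeCalls) {
    // Running a finalizer twice is always a bug, whatever the GC does.
    if (count > 1) {
      throw new Error("value " + id + " was finalized " + count + " times");
    }
    if (count === 1) {
      finalized++;
    }
  }
  // More finalizations than values is always a bug as well.
  if (finalized > created) {
    throw new Error("more finalizations than values created");
  }
  // Deliberately NOT checked: finalized === created. A conservative
  // collector may keep dead values alive, so that check is unreliable.
  return { finalized: finalized, leaked: created - finalized };
}

// Simulated run: 5 values were created, but only 3 were released because
// stale stack words made the other two look live to the collector.
const calls = new Map([[0, 1], [1, 1], [2, 0], [3, 1], [4, 0]]);
const result = checkFinalizerCounts(5, calls);
console.log(result.finalized + " finalized, " + result.leaked + " still held");
```

With this shape of check, a run in which the collector holds on to some dead values still passes, while a double finalization still fails.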