Closed Bug 1480494 Opened 3 years ago Closed 3 years ago

Intermittent Btup build stuck - Task timeout after 3600 seconds. Force killing container.

Categories

(Firefox Build System :: General, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

(firefox63 fixed)

RESOLVED FIXED
mozilla63

People

(Reporter: aryx, Assigned: mshal)

References

(Blocks 1 open bug)

Details

(Keywords: intermittent-failure)

Attachments

(1 file)

This build spends a long time printing nothing to the log and then times out.

https://treeherder.mozilla.org/logviewer.html#?job_id=191290088&repo=autoland

[task 2018-08-01T02:06:18.951Z] 02:06:18     INFO -  In file included from Unified_cpp_xpcom_tests_gtest3.cpp:83:0:
[task 2018-08-01T02:06:18.951Z] 02:06:18     INFO -  /builds/worker/workspace/build/src/xpcom/tests/gtest/TestTokenizer.cpp:678:13: note: 'u8' was declared here
[task 2018-08-01T02:06:18.951Z] 02:06:18     INFO -       uint8_t u8;
[task 2018-08-01T02:06:18.951Z] 02:06:18     INFO -               ^~
[task 2018-08-01T02:06:18.951Z] 02:06:18     INFO -  In file included from /builds/worker/workspace/build/src/obj-firefox/dist/include/gtest/gtest.h:58:0,
[task 2018-08-01T02:06:18.951Z] 02:06:18     INFO -                   from /builds/worker/workspace/build/src/xpcom/tests/gtest/TestTextFormatter.cpp:8,
[task 2018-08-01T02:06:18.951Z] 02:06:18     INFO -                   from Unified_cpp_xpcom_tests_gtest3.cpp:2:
[task 2018-08-01T02:06:18.951Z] 02:06:18     INFO -  /builds/worker/workspace/build/src/obj-firefox/dist/include/gtest/internal/gtest-internal.h: In member function 'virtual void Tokenizer_ReadIntegers_Test::TestBody()':
[task 2018-08-01T02:06:18.951Z] 02:06:18     INFO -  /builds/worker/workspace/build/src/obj-firefox/dist/include/gtest/internal/gtest-internal.h:1188:3: warning: 'signed_value64' may be used uninitialized in this function [-Wmaybe-uninitialized]
[task 2018-08-01T02:06:18.952Z] 02:06:18     INFO -     if (const ::testing::AssertionResult gtest_ar_ = \
[task 2018-08-01T02:06:18.952Z] 02:06:18     INFO -     ^~
[task 2018-08-01T02:06:18.952Z] 02:06:18     INFO -  In file included from Unified_cpp_xpcom_tests_gtest3.cpp:83:0:
[task 2018-08-01T02:06:18.952Z] 02:06:18     INFO -  /builds/worker/workspace/build/src/xpcom/tests/gtest/TestTokenizer.cpp:1281:11: note: 'signed_value64' was declared here
[task 2018-08-01T02:06:18.952Z] 02:06:18     INFO -     int64_t signed_value64;
[task 2018-08-01T02:06:18.952Z] 02:06:18     INFO -             ^~~~~~~~~~~~~~

[taskcluster:error] Task timeout after 3600 seconds. Force killing container.
[taskcluster 2018-08-01 02:43:50.831Z] === Task Finished ===
[taskcluster 2018-08-01 02:43:50.832Z] Unsuccessful task run with exit code: -1 completed in 3621.112 seconds
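
To see how long the task was silent before it was killed, one option is to compare the timestamps on consecutive log lines. The following is only a sketch: the helper names `parse_stamp` and `largest_gap` are hypothetical, and the regular expression assumes the two prefix shapes visible in the excerpt above (`[task <date>T<time>Z]` and `[taskcluster <date> <time>Z]`). Lines without a timestamp, such as the `[taskcluster:error]` line, are skipped.

```python
import re
from datetime import datetime

# Matches the two timestamp prefixes seen in the excerpt (an assumption about
# the log format, based only on the lines quoted in this bug).
STAMP = re.compile(
    r"^\[(?:task|taskcluster) (\d{4}-\d{2}-\d{2})[T ](\d{2}:\d{2}:\d{2}\.\d+)Z\]"
)


def parse_stamp(line):
    """Return the line's timestamp as a datetime, or None if it has none."""
    m = STAMP.match(line)
    if m is None:
        return None
    return datetime.strptime(f"{m.group(1)} {m.group(2)}", "%Y-%m-%d %H:%M:%S.%f")


def largest_gap(lines):
    """Return (gap_seconds, line_before, line_after) for the longest silence.

    Returns None if fewer than two timestamped lines are present.
    """
    best = None
    prev_time = prev_line = None
    for line in lines:
        t = parse_stamp(line)
        if t is None:
            continue  # untimestamped line, e.g. "[taskcluster:error] ..."
        if prev_time is not None:
            gap = (t - prev_time).total_seconds()
            if best is None or gap > best[0]:
                best = (gap, prev_line, line)
        prev_time, prev_line = t, line
    return best
```

Applied to the excerpt above, the longest gap falls between the last compiler note at 02:06:18.952Z and the "Task Finished" line at 02:43:50.831Z, roughly 37.5 minutes with no output.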
mshal figured out that each of the failing logs is missing the line that invokes the build script for the style crate.

We're stuck waiting for bindgen with the following stack:

Thread 2 (Thread 0x7f5ef3dfe700 (LWP 24760)):
#0  0x00007f5ef54cd51d in read () at ../sysdeps/unix/syscall-template.S:84
#1  0x00007f5ef51c53a1 in std::sys::unix::fd::FileDesc::read::h74ef260a7e0ba40c () at libstd/sys/unix/fd.rs:58
#2  std::sys::unix::pipe::AnonPipe::read::h6304a0ae8ccf93bb () at libstd/sys/unix/pipe.rs:71
#3  std::sys::unix::process::process_inner::_$LT$impl$u20$std..sys..unix..process..process_common..Command$GT$::spawn::h125abc074f68bae2 ()
    at libstd/sys/unix/process/process_unix.rs:74
#4  0x00007f5ef51cb98a in std::process::Command::output::h4a76362cb8d1b633 () at libstd/process.rs:737
#5  0x0000561620c89780 in clang_sys::support::run::he9752f1d23e34fbc ()
#6  0x0000561620c89b0d in clang_sys::support::run_clang::h8aa1402902dc9c36 ()
#7  0x0000561620c88016 in clang_sys::support::Clang::new::h3dbb9e936f63647c ()
#8  0x0000561620c8864d in clang_sys::support::Clang::find::hd6b84a99818eba11 ()
#9  0x0000561620c17311 in bindgen::Builder::generate::h22052e4f78681c79 ()
#10 0x0000561620af0511 in build_script_build::build_gecko::bindings::write_binding_file::h080caa46e6f21d22 ()
#11 0x0000561620af7964 in build_script_build::build_gecko::bindings::generate_bindings::h04f1704a7663421c ()
#12 0x00007f5ef520c71a in __rust_maybe_catch_panic () at libpanic_unwind/lib.rs:105
#13 0x0000561620ae409c in _$LT$F$u20$as$u20$alloc..boxed..FnBox$LT$A$GT$$GT$::call_box::hbc6d8b8f1402c86b ()
#14 0x00007f5ef51d204b in _$LT$alloc..boxed..Box$LT$$LP$dyn$u20$alloc..boxed..FnBox$LT$A$C$$u20$Output$u3d$R$GT$$u20$$u2b$$u20$$u27$a$RP$$GT$$u20$as$u20$core..ops..function..FnOnce$LT$A$GT$$GT$::call_once::h3cff4ee56f008a5e () at /checkout/src/liballoc/boxed.rs:652
#15 std::sys_common::thread::start_thread::hf067c3c41437276e () at libstd/sys_common/thread.rs:24
#16 0x00007f5ef51ce306 in std::sys::unix::thread::Thread::new::thread_start::haf8e559911a01c17 () at libstd/sys/unix/thread.rs:90
#17 0x00007f5ef54c46ba in start_thread (arg=0x7f5ef3dfe700) at pthread_create.c:333
#18 0x00007f5ef4a9641d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 1 (Thread 0x7f5ef5ade900 (LWP 24740)):
#0  0x00007f5ef54c598d in pthread_join (threadid=140045795190528, thread_return=0x0) at pthread_join.c:90
#1  0x00007f5ef51ce47c in std::sys::unix::thread::Thread::join::h31b1188eed4a9d5a () at libstd/sys/unix/thread.rs:177
#2  0x0000561620af9a32 in build_script_build::main::h1fa3f9f1bf414d53 ()
#3  0x0000561620adcd63 in std::rt::lang_start::_$u7b$$u7b$closure$u7d$$u7d$::h6fe0d75cae52dd99 ()
#4  0x00007f5ef51cd093 in std::rt::lang_start_internal::_$u7b$$u7b$closure$u7d$$u7d$::hb7949577b059f871 () at libstd/rt.rs:59
#5  std::panicking::try::do_call::h81e05366fcfc078a () at libstd/panicking.rs:310
#6  0x00007f5ef520c71a in __rust_maybe_catch_panic () at libpanic_unwind/lib.rs:105
#7  0x00007f5ef51d2466 in std::panicking::try::h8211b1a0cffc0eb9 () at libstd/panicking.rs:289
#8  std::panic::catch_unwind::h543c92261109e763 () at libstd/panic.rs:392
#9  std::rt::lang_start_internal::hab655c063b9aabab () at libstd/rt.rs:58
#10 0x0000561620afa164 in main ()
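
For long backtraces like the one above, it can help to reduce each thread to its innermost frame (`#0`) to see what it is blocked on. This is a small sketch that assumes gdb's usual `Thread N (...)` headers and `#N  0xADDR in function (...)` frame lines as shown above; `innermost_frames` is a hypothetical helper name.

```python
import re

# "Thread 2 (Thread 0x7f5ef3dfe700 (LWP 24760)):"
THREAD_RE = re.compile(r"^Thread (\d+) \(")
# "#0  0x00007f5ef54cd51d in read () at ..." (the address part is optional,
# since gdb omits it for some frames).
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-fA-F]+ in )?(.+?) \(")


def innermost_frames(backtrace_lines):
    """Map each gdb thread number to the function name of its frame #0."""
    result = {}
    current = None
    for line in backtrace_lines:
        thread = THREAD_RE.match(line)
        if thread:
            current = int(thread.group(1))
            continue
        frame = FRAME_RE.match(line)
        if frame and current is not None and frame.group(1) == "0":
            result[current] = frame.group(2)
    return result
```

For the backtrace above this yields `read` for Thread 2 (the thread with the `clang_sys` and `bindgen` frames) and `pthread_join` for Thread 1 (the main thread waiting for that worker).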

The tup build is one of the only builds using a nightly rustc, so that may be the issue here rather than something tup-specific. I'm going to see if I can reproduce on a make build.
FWIW, Thread 2 in that stack (the first one listed) is just waiting for stdout from a clang process as part of bindgen:
https://github.com/KyleMayes/clang-sys/blob/67f7d8c25eff694d7ba58ff42da6e5b502413b7d/src/support.rs#L142

It actually looks like it's hung running `clang --version`:
https://github.com/KyleMayes/clang-sys/blob/67f7d8c25eff694d7ba58ff42da6e5b502413b7d/src/support.rs#L165
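
A probe like `clang --version` blocks indefinitely if the child never produces output or exits, which is how a hang here turns into a silent one-hour task timeout. As a general illustration only (this is not what clang-sys does, and not the fix that landed in this bug), a caller can bound such a probe with a timeout. The function name `run_with_timeout` and the 30-second default are assumptions made for the sketch.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s=30.0):
    """Run cmd and return its stripped stdout, or None if it times out.

    subprocess.run kills the child and waits for it when the timeout
    expires, so a stuck probe cannot stall the caller forever.
    The exit code is not checked; a missing executable still raises
    FileNotFoundError.
    """
    try:
        done = subprocess.run(
            cmd,
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return None
    return done.stdout.strip()


if __name__ == "__main__":
    # Uses the running Python interpreter so the demo does not need clang.
    print(run_with_timeout([sys.executable, "--version"]))
```

With a guard like this, a stuck version probe would surface as an explicit failure after `timeout_s` seconds instead of leaving the log empty until the task is force-killed.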
At least some of this is getting starred over on bug 1467668.
See Also: → 1467668
Most of it is in bug 1411358 (look at "Test Type" "opt").
See Also: → 1411358
Blocks: 1411358
See Also: 1411358
This includes a fix for the style build script hang where pthreads fork
subprocesses, as well as a fix for ignoring the icecream file lock.

MozReview-Commit-ID: 29eNcbNtwB1
Comment on attachment 9003566
Bug 1480494 - Update tup toolchain to e948a999a38fefa0ac0d92f6357f82aca2f9cb17; r?chmanchester

Chris Manchester (:chmanchester) has approved the revision.
Attachment #9003566 - Flags: review+
Pushed by cmanchester@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/a56e76cb2c02
Update tup toolchain to e948a999a38fefa0ac0d92f6357f82aca2f9cb17; r=chmanchester
https://hg.mozilla.org/mozilla-central/rev/a56e76cb2c02
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla63
Assignee: nobody → mshal