Closed Bug 1407687 Opened 7 years ago Closed 6 years ago

Intermittent timeouts in linux rusttests tasks

Categories

(Firefox Build System :: General, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: gbrown, Unassigned)

In bug 1204281 there are reports of timeouts in linux32-rusttests and linux64-rusttests.

https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-central&job_id=135749181&lineNumber=32774

[task 2017-10-09T18:48:30.672Z] 18:48:30     INFO - note: library: m
[task 2017-10-09T18:48:30.672Z] 18:48:30     INFO - note: library: rt
[task 2017-10-09T18:48:30.672Z] 18:48:30     INFO - note: library: pthread
[task 2017-10-09T18:48:30.672Z] 18:48:30     INFO - note: library: util
[task 2017-10-09T18:48:32.190Z] 18:48:32     INFO -      Running `/builds/worker/workspace/build/src/rustc/bin/rustc --crate-name stylo_tests /builds/worker/workspace/build/src/servo/tests/unit/stylo/lib.rs --emit=dep-info,link -C opt-level=3 --test -C metadata=c9082d76cfaf8a76 -C extra-filename=-c9082d76cfaf8a76 --out-dir /builds/worker/workspace/build/src/obj-firefox/toolkit/library/i686-unknown-linux-gnu/release/deps --target i686-unknown-linux-gnu -C linker=/builds/worker/workspace/build/src/build/cargo-linker -L dependency=/builds/worker/workspace/build/src/obj-firefox/toolkit/library/i686-unknown-linux-gnu/release/deps -L dependency=/builds/worker/workspace/build/src/obj-firefox/toolkit/library/release/deps --extern env_logger=/builds/worker/workspace/build/src/obj-firefox/toolkit/library/i686-unknown-linux-gnu/release/deps/libenv_logger-40b33462646310da.rlib --extern euclid=/builds/worker/workspace/build/src/obj-firefox/toolkit/library/i686-unknown-linux-gnu/release/deps/libeuclid-53520cc4344e88fc.rlib --extern cssparser=/builds/worker/workspace/build/src/obj-firefox/toolkit/library/i686-unknown-linux-gnu/release/deps/libcssparser-049e8fc1533dc8df.rlib --extern size_of_test=/builds/worker/workspace/build/src/obj-firefox/toolkit/library/i686-unknown-linux-gnu/release/deps/libsize_of_test-53dc78ce558d81b4.rlib --extern malloc_size_of=/builds/worker/workspace/build/src/obj-firefox/toolkit/library/i686-unknown-linux-gnu/release/deps/libmalloc_size_of-0ae69d75e21b4bfb.rlib --extern geckoservo=/builds/worker/workspace/build/src/obj-firefox/toolkit/library/i686-unknown-linux-gnu/release/deps/libgeckoservo-f7ee3e90a6761274.rlib --extern style_traits=/builds/worker/workspace/build/src/obj-firefox/toolkit/library/i686-unknown-linux-gnu/release/deps/libstyle_traits-c5400bc0cc2f4476.rlib --extern log=/builds/worker/workspace/build/src/obj-firefox/toolkit/library/i686-unknown-linux-gnu/release/deps/liblog-fd2fff032e50a631.rlib --extern selectors=/builds/worker/workspace/build/src/obj-firefox/toolkit/library/i686-unknown-linux-gnu/release/deps/libselectors-ade130cf9af8dd02.rlib --extern smallvec=/builds/worker/workspace/build/src/obj-firefox/toolkit/library/i686-unknown-linux-gnu/release/deps/libsmallvec-9896d5f32c80d5c2.rlib --extern libc=/builds/worker/workspace/build/src/obj-firefox/toolkit/library/i686-unknown-linux-gnu/release/deps/liblibc-9a1c8ca7327d4e94.rlib --extern atomic_refcell=/builds/worker/workspace/build/src/obj-firefox/toolkit/library/i686-unknown-linux-gnu/release/deps/libatomic_refcell-b0459433c291d7e6.rlib --extern style=/builds/worker/workspace/build/src/obj-firefox/toolkit/library/i686-unknown-linux-gnu/release/deps/libstyle-e7eb6d1272252479.rlib -C debuginfo=2`

[taskcluster:error] Task timeout after 3600 seconds. Force killing container.
[taskcluster 2017-10-09 18:48:44.699Z] === Task Finished ===
[taskcluster 2017-10-09 18:48:44.700Z] Unsuccessful task run with exit code: -1 completed in 3603.589 seconds

If these tasks just need more time to run, that's easy to modify:

https://dxr.mozilla.org/mozilla-central/rev/a0488ecc201c04f2617e7b02f039344e8fbf0d9a/taskcluster/ci/build/linux.yml#317

but I'm not sure if that's appropriate. It looks like these tasks often run in ~30 minutes.
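
A minimal sketch of how the timeout in the log excerpt above could be detected mechanically, for example when triaging many failed runs. The two log lines are copied from this bug; the function name `summarize` and the regular expressions are illustrative assumptions, not part of any Taskcluster or Treeherder tooling.

```python
import re

# Two lines copied verbatim from the log excerpt in this bug.
LOG = """\
[taskcluster:error] Task timeout after 3600 seconds. Force killing container.
[taskcluster 2017-10-09 18:48:44.700Z] Unsuccessful task run with exit code: -1 completed in 3603.589 seconds
"""

# Patterns written for this excerpt only (an assumption about the log format).
TIMEOUT_RE = re.compile(r"Task timeout after (\d+) seconds")
ELAPSED_RE = re.compile(r"completed in ([\d.]+) seconds")


def summarize(log_text):
    """Return (timeout_limit_seconds, elapsed_seconds); either may be None."""
    timeout = TIMEOUT_RE.search(log_text)
    elapsed = ELAPSED_RE.search(log_text)
    limit = int(timeout.group(1)) if timeout else None
    took = float(elapsed.group(1)) if elapsed else None
    return limit, took


limit, took = summarize(LOG)
print(limit, took)  # 3600 3603.589
```

The run was killed at the 3600-second limit, which is about twice the ~30 minutes (roughly 1800 seconds) these tasks often take.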

Nathan -- Can you have a look?
Flags: needinfo?(nfroyd)
We could bump the time limit for these tasks, but I'm unsure what would cause the execution time to double.  Oh, maybe sccache is missing the cache and we wind up having to compile *all* the C++ code (or at least a significant fraction of it), plus the Rust code for libxul, plus the Rust code for the tests. That would definitely make things run more slowly.

This also indicates that it'd be really nice if we were able to run the normal builds, and then package up the objdirs and ship them somewhere so we could just run the Rust compilation for the tests as a separate job.

Bumping the timeout here to 5400 seconds or even 7200 should be sufficient.  Do you want to do that or shall I?
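
As a rough sketch of what such a bump amounts to, the snippet below models a trimmed-down task definition as a Python dict and raises its limit. The `worker` / `max-run-time` key names and the dict layout are assumptions for illustration, not a copy of the real linux.yml, and `with_max_run_time` is a hypothetical helper.

```python
import copy

# Hypothetical, trimmed-down task definition. The "worker" / "max-run-time"
# key names are an assumption about the config layout, not taken from linux.yml.
task = {
    "label": "linux64-rusttests",
    "worker": {"max-run-time": 3600},
}


def with_max_run_time(task_def, seconds):
    """Return a copy of task_def with the run-time limit set to `seconds`."""
    updated = copy.deepcopy(task_def)
    updated.setdefault("worker", {})["max-run-time"] = seconds
    return updated


bumped = with_max_run_time(task, 5400)
print(bumped["worker"]["max-run-time"])  # 5400
print(task["worker"]["max-run-time"])    # 3600, the original is untouched
```

For scale, 5400 seconds is 90 minutes and 7200 seconds is 120 minutes, against a typical run of about 30 minutes.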
Flags: needinfo?(nfroyd)
If we made the rusttest builds dependent on the normal builds, would that help us get sccache cache hits? I assume most of the crates we wind up building will be the same between both builds, and it's just the actual crates we're testing that will get built in the test configuration.
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #4)
> If we made the rusttest builds dependent on the normal builds, would that
> help us get sccache cache hits? I assume most of the crates we wind up
> building will be the same between both builds, and it's just the actual
> crates we're testing that will get built in the test configuration.

Yes, that would help a lot.  How would we accomplish that?  Would that even ensure that the objdir from the normal build gets reused for the rusttests build?
Flags: needinfo?(ted)
I'll bump the max run time to 5400 seconds, at least as a short-term fix. Thanks for looking at the bigger issue.
Keywords: leave-open
Pushed by gbrown@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/8d668d63bfe0
Increase max run time for linux rusttests; r=me,test-only
(In reply to Nathan Froyd [:froydnj] from comment #5)
> Yes, that would help a lot.  How would we accomplish that?  Would that even
> ensure that the objdir from the normal build gets reused for the rusttests
> build?

You'd have to fiddle with the task definitions so that the rusttest tasks depend on the build tasks. It would not allow us to reuse objdirs AFAIK, but it should allow you to get sccache hits for everything that was built during the build.
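
A small sketch of the ordering idea, assuming a toy task graph expressed as a dict from task label to the labels it depends on. The label `build-linux64`, the dict layout, and the helper `add_dependency` are hypothetical and do not reflect the real task definition schema.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical graph: task label -> labels it depends on.
# Initially the rusttests task is independent of the build task.
tasks = {
    "build-linux64": [],
    "linux64-rusttests": [],
}


def add_dependency(graph, task_label, depends_on):
    """Return a copy of graph in which task_label also depends on depends_on."""
    if task_label not in graph or depends_on not in graph:
        raise KeyError("unknown task label")
    updated = {label: list(deps) for label, deps in graph.items()}
    if depends_on not in updated[task_label]:
        updated[task_label].append(depends_on)
    return updated


graph = add_dependency(tasks, "linux64-rusttests", "build-linux64")
order = list(TopologicalSorter(graph).static_order())
print(order)  # ['build-linux64', 'linux64-rusttests']
```

The dependency only guarantees that the build finishes before the rusttests task starts, so the cache is populated by then. As noted above, it does not make the objdir itself available to the later task.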
Flags: needinfo?(ted)
Blocks: 1411358
No longer blocks: 1411358
Product: Core → Firefox Build System
The leave-open keyword is set and there has been no activity for 6 months.
:kmoir, maybe it's time to close this bug?
Flags: needinfo?(kmoir)
:gbrown, is this bug still an issue or can it be closed?  It currently has the leave-open keyword, so it isn't being closed automatically.
Flags: needinfo?(kmoir) → needinfo?(gbrown)
I think this is okay now. I don't see recent failures on mozilla-central.
Status: NEW → RESOLVED
Closed: 6 years ago
Flags: needinfo?(gbrown)
Keywords: leave-open
Resolution: --- → WORKSFORME