Closed Bug 1480631 Opened 5 years ago Closed 5 years ago

Switch Linux builds to clang

Categories

(Firefox Build System :: General, enhancement)

enhancement
Not set
normal

Tracking

(firefox63 fixed)

RESOLVED FIXED
mozilla63
Tracking Status
firefox63 --- fixed

People

(Reporter: glandium, Assigned: glandium)

References

Details

Attachments

(1 file, 5 obsolete files)

Build with clang 6, without LTO, turn out to be faster than those produced by GCC 6, while also being smaller and faster in build time.

But this is only talking about normal builds, not PGO, but we should already start by switching the normal builds. We should however keep some job with GCC.

PGO will be dealt with in a followup bug.
Depends on: 1480706
Depends on: 1481373
Depends on: 1481378
Attached patch Switch Linux builds to clang. (obsolete) — Splinter Review
This change switches most CI builds to clang, with a few exceptions:
- valgrind builds, until bug 1481670 is figured out.
- PGO and nightly builds, until that's fully tested.
- any build that doesn't use build/unix/mozconfig.linux (e.g. probably 
  all those driven by autospider.py, maybe others).
Attachment #8998377 - Flags: review?(core-build-config-reviews)
Attached patch Switch Linux builds to clang. (obsolete) — Splinter Review
+ exception for coverage builds, per bug 1471339 comment 17.
Attachment #8998377 - Attachment is obsolete: true
Attachment #8998377 - Flags: review?(core-build-config-reviews)
Attachment #8998380 - Flags: review?(core-build-config-reviews)
We already copy the 64-bits libraries, but don't copy the 32-bits
libraries, which prevents building for linux32 by default.

Incidentally, this also makes the clang build system build the 32-bits
compiler-rt libraries, allowing e.g. 32-bits PGO.
Attachment #8998417 - Flags: review?(core-build-config-reviews)
Attached patch Switch Linux builds to clang. (obsolete) — Splinter Review
Attachment #8998380 - Attachment is obsolete: true
Attachment #8998380 - Flags: review?(core-build-config-reviews)
Attachment #8998420 - Flags: review?(core-build-config-reviews)
This change switches most CI builds to clang, with a few exceptions:
- valgrind builds, until bug 1481670 is figured out.
- PGO and nightly builds, until that's fully tested.
- coverage builds, per bug 1471339 comment 17.
- base toolchain builds, to keep some builds on GCC even when we're fully switched to clang.
- any build that doesn't use build/unix/mozconfig.linux (e.g. probably all those driven by autospider.py, maybe others).
Attachment #8998420 - Attachment is obsolete: true
Attachment #8998420 - Flags: review?(core-build-config-reviews)
Blocks: 1481721
Attachment #8998428 - Flags: review?(core-build-config-reviews)
Attachment #8998417 - Flags: review?(core-build-config-reviews) → review+
Attachment #8998428 - Flags: review?(core-build-config-reviews) → review+
Pushed by mh@glandium.org:
https://hg.mozilla.org/integration/mozilla-inbound/rev/910a805c960d
Copy 32-bits libraries from gcc to clang. r=froydnj
https://hg.mozilla.org/integration/mozilla-inbound/rev/dd64a5e6d4df
Switch Linux builds to clang. r=froydnj
Backed out 3 changesets (Bug 1480631) for failure at /home/cltbld/workspace/build/tests/talos/talos/run_tests.py

Backout:
https://hg.mozilla.org/integration/mozilla-inbound/rev/8489f76a1b4ddbef11e853fe7355cdfbf7e03689

Push link:
https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=dd64a5e6d4dfa07ba7a819878852cb458a87bf98&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-classifiedState=unclassified

Failure log:
https://treeherder.mozilla.org/logviewer.html#?job_id=192876141&repo=mozilla-inbound&lineNumber=1847

15:30:13     INFO -  TEST-INFO | started process 14026 (/home/cltbld/workspace/build/application/firefox/firefox -profile /tmp/tmpXVgNnt/profile http://localhost:45891/startup_test/tresize/addon/content/tresize-test.html)
15:32:43     INFO -  Timeout waiting for test completion; killing browser...
15:32:43     INFO -  Terminating psutil.Process(pid=14026, name='firefox', started='15:30:12')
15:32:43     INFO -  TEST-UNEXPECTED-ERROR | tresize | timeout
15:32:43    ERROR -  Traceback (most recent call last):
15:32:43     INFO -    File "/home/cltbld/workspace/build/tests/talos/talos/run_tests.py", line 299, in run_tests
15:32:43     INFO -      talos_results.add(mytest.runTest(browser_config, test))
15:32:43     INFO -    File "/home/cltbld/workspace/build/tests/talos/talos/ttest.py", line 63, in runTest
15:32:43     INFO -      return self._runTest(browser_config, test_config, setup)
15:32:43     INFO -    File "/home/cltbld/workspace/build/tests/talos/talos/ttest.py", line 207, in _runTest
15:32:43     INFO -      debugger_args=browser_config['debugger_args']
15:32:43     INFO -    File "/home/cltbld/workspace/build/tests/talos/talos/talos_process.py", line 143, in run_browser
15:32:43     INFO -      raise TalosError("timeout")
15:32:43     INFO -  TalosError: timeout
15:32:43     INFO -  TEST-INFO took 196336ms
15:32:43     INFO -  SUITE-END | took 307s
15:32:43    ERROR - Return code: 2
15:32:43  WARNING - setting return code to 2
15:32:43    ERROR - # TBPL FAILURE #
15:32:43     INFO - Running post-action listener: _package_coverage_data
15:32:43     INFO - Running post-action listener: _resource_record_post_action
15:32:43     INFO - Running post-action listener: process_java_coverage_data
Flags: needinfo?(mh+mozilla)
And android x86 bustage too:
/builds/worker/workspace/build/src/build/unix/stdc++compat/stdc++compat.cpp:42:10: fatal error: 'tr1/unordered_map' file not found

Because adding a lib directory busts includes, right?
Flags: needinfo?(mh+mozilla)
Only differences in the clang.tar.xz:
2985a2986,2988
> clang/lib/clang/6.0.1/lib/linux/libclang_rt.asan-i386.a
> clang/lib/clang/6.0.1/lib/linux/libclang_rt.asan-i386.so
> clang/lib/clang/6.0.1/lib/linux/libclang_rt.asan-preinit-i386.a
2989a2993
> clang/lib/clang/6.0.1/lib/linux/libclang_rt.asan_cxx-i386.a
2991a2996
> clang/lib/clang/6.0.1/lib/linux/libclang_rt.builtins-i386.a
2992a2998
> clang/lib/clang/6.0.1/lib/linux/libclang_rt.cfi-i386.a
2993a3000
> clang/lib/clang/6.0.1/lib/linux/libclang_rt.cfi_diag-i386.a
3002a3010
> clang/lib/clang/6.0.1/lib/linux/libclang_rt.lsan-i386.a
3007a3016
> clang/lib/clang/6.0.1/lib/linux/libclang_rt.profile-i386.a
3008a3018
> clang/lib/clang/6.0.1/lib/linux/libclang_rt.safestack-i386.a
3009a3020,3021
> clang/lib/clang/6.0.1/lib/linux/libclang_rt.scudo-i386.a
> clang/lib/clang/6.0.1/lib/linux/libclang_rt.scudo-i386.so
3011a3024
> clang/lib/clang/6.0.1/lib/linux/libclang_rt.scudo_cxx-i386.a
3012a3026
> clang/lib/clang/6.0.1/lib/linux/libclang_rt.stats-i386.a
3013a3028
> clang/lib/clang/6.0.1/lib/linux/libclang_rt.stats_client-i386.a
3018a3034,3035
> clang/lib/clang/6.0.1/lib/linux/libclang_rt.ubsan_minimal-i386.a
> clang/lib/clang/6.0.1/lib/linux/libclang_rt.ubsan_minimal-i386.so
3021a3039,3040
> clang/lib/clang/6.0.1/lib/linux/libclang_rt.ubsan_standalone-i386.a
> clang/lib/clang/6.0.1/lib/linux/libclang_rt.ubsan_standalone-i386.so
3024a3044
> clang/lib/clang/6.0.1/lib/linux/libclang_rt.ubsan_standalone_cxx-i386.a
3616a3637,3699
> clang/lib32/
> clang/lib32/libasan.a
> clang/lib32/libasan.la
> clang/lib32/libasan.so
> clang/lib32/libasan.so.1
> clang/lib32/libasan.so.1.0.0
> clang/lib32/libasan_preinit.o
> clang/lib32/libatomic.a
> clang/lib32/libatomic.la
> clang/lib32/libatomic.so
> clang/lib32/libatomic.so.1
> clang/lib32/libatomic.so.1.1.0
> clang/lib32/libcilkrts.a
> clang/lib32/libcilkrts.la
> clang/lib32/libcilkrts.so
> clang/lib32/libcilkrts.so.5
> clang/lib32/libcilkrts.so.5.0.0
> clang/lib32/libcilkrts.spec
> clang/lib32/libgcc_s.so
> clang/lib32/libgcc_s.so.1
> clang/lib32/libgomp.a
> clang/lib32/libgomp.la
> clang/lib32/libgomp.so
> clang/lib32/libgomp.so.1
> clang/lib32/libgomp.so.1.0.0
> clang/lib32/libgomp.spec
> clang/lib32/libitm.a
> clang/lib32/libitm.la
> clang/lib32/libitm.so
> clang/lib32/libitm.so.1
> clang/lib32/libitm.so.1.0.0
> clang/lib32/libitm.spec
> clang/lib32/libquadmath.a
> clang/lib32/libquadmath.la
> clang/lib32/libquadmath.so
> clang/lib32/libquadmath.so.0
> clang/lib32/libquadmath.so.0.0.0
> clang/lib32/libsanitizer.spec
> clang/lib32/libssp.a
> clang/lib32/libssp.la
> clang/lib32/libssp.so
> clang/lib32/libssp.so.0
> clang/lib32/libssp.so.0.0.0
> clang/lib32/libssp_nonshared.a
> clang/lib32/libssp_nonshared.la
> clang/lib32/libstdc++.a
> clang/lib32/libstdc++.la
> clang/lib32/libstdc++.so
> clang/lib32/libstdc++.so.6
> clang/lib32/libstdc++.so.6.0.20
> clang/lib32/libstdc++.so.6.0.20-gdb.py
> clang/lib32/libsupc++.a
> clang/lib32/libsupc++.la
> clang/lib32/libubsan.a
> clang/lib32/libubsan.la
> clang/lib32/libubsan.so
> clang/lib32/libubsan.so.0
> clang/lib32/libubsan.so.0.0.0
> clang/lib32/libvtv.a
> clang/lib32/libvtv.la
> clang/lib32/libvtv.so
> clang/lib32/libvtv.so.0
> clang/lib32/libvtv.so.0.0.0

All in lib...
and no obvious difference in configure output :(

As for talos, it was running fine as of dcea9f4775fa on try :(
(In reply to Daniel Varga [:dvarga] from comment #8)
> Failure log:
> https://treeherder.mozilla.org/logviewer.html#?job_id=192876141&repo=mozilla-
> inbound&lineNumber=1847

It looks like a lot of failures on that commit are content processes locking up at Sandbox.cpp:633, which would be here: https://searchfox.org/mozilla-central/rev/0f4d50ff5211e8dd85e119ef683d04b63062ed90/security/sandbox/linux/Sandbox.cpp#633

This is in the post-exec content process.  Breakpad isn't being given symbols for libpthread, and bug 1310314 has eaten its immediate caller, but I can guess a few things from the registers.  rDI is 6, which suggests kSandboxChrootClientFd (one of the few advantages for hard-coded fd numbers; cf. bug 1440207) rSI is a stack address and rDX is 1.  So this is probably waiting for the chroot helper to respond: https://searchfox.org/mozilla-central/rev/0f4d50ff5211e8dd85e119ef683d04b63062ed90/security/sandbox/linux/Sandbox.cpp#311

And I bet I know what the problem is: there's another instance of bug 1480401 at https://searchfox.org/mozilla-central/source/security/sandbox/linux/launch/SandboxLaunch.cpp#593

I'll file a bug.
(In reply to Mike Hommey [:glandium] from comment #9)
> And android x86 bustage too:
> /builds/worker/workspace/build/src/build/unix/stdc++compat/stdc++compat.cpp:
> 42:10: fatal error: 'tr1/unordered_map' file not found
> 
> Because adding a lib directory busts includes, right?

So this one is happening because now that the lib32 directory exists, the libstdc++ compat check now can find a version for libstdc++, and that enables building stdc++compat, which we only really want for host, not target.
Depends on: 1481989
Comment on attachment 8998417 [details] [diff] [review]
Copy 32-bits libraries from gcc to clang

Moving this part to bug 1481989
Attachment #8998417 - Attachment is obsolete: true
Please note that until bug 1397493 is resolved, all fuzzing coverage builds need to stay on GCC. Otherwise we have no way to dump coverage while the process is still running. Thanks!
Attached patch Switch Linux builds to clang. (obsolete) — Splinter Review
Added exclusion for fuzzing ccov build. Carrying over r+.
Attachment #8998428 - Attachment is obsolete: true
Attachment #8999045 - Flags: review+
Attachment #8998428 - Attachment is obsolete: false
Comment on attachment 8999045 [details] [diff] [review]
Switch Linux builds to clang.

As per irc discussion, the fuzzing-asan-ccov build actually doesn't want gcc.
Attachment #8999045 - Attachment is obsolete: true
Pushed by mh@glandium.org:
https://hg.mozilla.org/integration/mozilla-inbound/rev/48b8084995b1
Switch Linux builds to clang. r=froydnj
https://hg.mozilla.org/mozilla-central/rev/48b8084995b1
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla63
some perf improvements:
== Change summary for alert #15061 (as of Thu, 16 Aug 2018 05:34:49 GMT) ==

Improvements:

  7%  perf_reftest_singletons linux64 opt e10s stylo     49.52 -> 45.91
  7%  tps linux64 opt e10s stylo                         11.72 -> 10.92
  6%  a11yr linux64 opt e10s stylo                       224.83 -> 210.32
  6%  perf_reftest_singletons linux64-qr opt e10s stylo  50.67 -> 47.58
  6%  a11yr linux64-qr opt e10s stylo                    223.84 -> 210.25
  6%  tart linux64 opt e10s stylo                        2.23 -> 2.10
  5%  speedometer linux64-qr opt e10s stylo              70.27 -> 73.88
  5%  speedometer linux64 opt e10s stylo                 70.90 -> 74.51
  5%  about_preferences_basic linux64 opt e10s stylo     165.65 -> 157.78
  5%  ts_paint_heavy linux64 opt e10s stylo              203.33 -> 193.67
  5%  tp5o linux64 opt e10s stylo                        146.06 -> 139.24
  5%  ts_paint_webext linux64 opt e10s stylo             205.50 -> 196.08
  5%  tp5o_webext linux64-qr opt e10s stylo              226.05 -> 215.84
  4%  tp5o linux64-qr opt e10s stylo                     151.62 -> 144.94
  4%  ts_paint linux64 opt e10s stylo                    203.96 -> 195.17
  4%  tp5o_webext linux64 opt e10s stylo                 218.53 -> 209.12
  4%  cpstartup content-process-startup linux64 opt e10s stylo169.77 -> 162.67
  4%  tscrollx linux64 opt e10s stylo                    0.71 -> 0.69
  4%  ts_paint_heavy linux64-qr opt e10s stylo           271.25 -> 261.58
  3%  tsvgx linux64 opt e10s stylo                       184.36 -> 177.96
  3%  about_preferences_basic linux64-qr opt e10s stylo  165.84 -> 160.11
  3%  ts_paint_webext linux64-qr opt e10s stylo          273.04 -> 263.83
  3%  ts_paint linux64-qr opt e10s stylo                 271.08 -> 262.92
  3%  tsvgx linux64-qr opt e10s stylo                    337.25 -> 327.37
  3%  sessionrestore linux64 opt e10s stylo              278.63 -> 271.08
  2%  stylebench linux64 opt e10s stylo                  38.89 -> 39.81
  2%  tscrollx linux64-qr opt e10s stylo                 0.42 -> 0.41
  2%  tp5o_scroll linux64 opt e10s stylo                 0.60 -> 0.59
  2%  stylebench linux64-qr opt e10s stylo               39.13 -> 39.94

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=15061
some build time improvements:
== Change summary for alert #15096 (as of Thu, 16 Aug 2018 05:34:49 GMT) ==

Improvements:

 27%  build times linux64 opt taskcluster-c5.4xlarge tup     1,165.97 -> 849.46
 25%  build times linux64 opt taskcluster-m5d.4xlarge tup    1,139.04 -> 849.45
 23%  build times linux64 opt taskcluster-c4.4xlarge tup     1,350.73 -> 1,045.54
 21%  build times linux64 opt taskcluster-m4.4xlarge tup     1,451.89 -> 1,152.88

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=15096
Interestingly, I'm not seeing a build time speedup on Linux64 opt or plain builds. Why is Tup seeing such a drastic speedup? Perhaps the improvement is lost in the noise?
sccache?
Neither plain builds nor tup builds use sccache.
Plain builds are still using gcc because they don't use mozconfig.linux.
Blocks: 1484862
Blocks: 1485219
"perf" key word?
See Also: → 1608849
You need to log in before you can comment on or make changes to this bug.