Switch Linux builds to clang

RESOLVED FIXED in Firefox 63

Status

RESOLVED FIXED
3 months ago
2 months ago

People

(Reporter: glandium, Assigned: glandium)

Tracking

unspecified
mozilla63
Dependency tree / graph

Firefox Tracking Flags

(firefox63 fixed)

Details

Attachments

(1 attachment, 5 obsolete attachments)

(Assignee)

Description

3 months ago
Build with clang 6, without LTO, turn out to be faster than those produced by GCC 6, while also being smaller and faster in build time.

But this is only talking about normal builds, not PGO, but we should already start by switching the normal builds. We should however keep some job with GCC.

PGO will be dealt with in a followup bug.
(Assignee)

Updated

3 months ago
Depends on: 1480706
(Assignee)

Updated

3 months ago
Depends on: 1481373
(Assignee)

Updated

3 months ago
Depends on: 1481378
(Assignee)

Comment 1

3 months ago
Created attachment 8998377 [details] [diff] [review]
Switch Linux builds to clang.

This change switches most CI builds to clang, with a few exceptions:
- valgrind builds, until bug 1481670 is figured out.
- PGO and nightly builds, until that's fully tested.
- any build that doesn't use build/unix/mozconfig.linux (e.g. probably 
  all those driven by autospider.py, maybe others).
Attachment #8998377 - Flags: review?(core-build-config-reviews)
(Assignee)

Comment 2

3 months ago
Created attachment 8998380 [details] [diff] [review]
Switch Linux builds to clang.

+ exception for coverage builds, per bug 1471339 comment 17.
Attachment #8998377 - Attachment is obsolete: true
Attachment #8998377 - Flags: review?(core-build-config-reviews)
(Assignee)

Updated

3 months ago
Attachment #8998380 - Flags: review?(core-build-config-reviews)
(Assignee)

Comment 3

3 months ago
Created attachment 8998417 [details] [diff] [review]
Copy 32-bits libraries from gcc to clang

We already copy the 64-bits libraries, but don't copy the 32-bits
libraries, which prevents building for linux32 by default.

Incidentally, this also makes the clang build system build the 32-bits
compiler-rt libraries, allowing e.g. 32-bits PGO.
Attachment #8998417 - Flags: review?(core-build-config-reviews)
(Assignee)

Comment 4

3 months ago
Created attachment 8998420 [details] [diff] [review]
Switch Linux builds to clang.
Attachment #8998380 - Attachment is obsolete: true
Attachment #8998380 - Flags: review?(core-build-config-reviews)
Attachment #8998420 - Flags: review?(core-build-config-reviews)
(Assignee)

Comment 5

3 months ago
Created attachment 8998428 [details] [diff] [review]
Switch Linux builds to clang.

This change switches most CI builds to clang, with a few exceptions:
- valgrind builds, until bug 1481670 is figured out.
- PGO and nightly builds, until that's fully tested.
- coverage builds, per bug 1471339 comment 17.
- base toolchain builds, to keep some builds on GCC even when we're fully switched to clang.
- any build that doesn't use build/unix/mozconfig.linux (e.g. probably all those driven by autospider.py, maybe others).
Attachment #8998420 - Attachment is obsolete: true
Attachment #8998420 - Flags: review?(core-build-config-reviews)
(Assignee)

Updated

3 months ago
Blocks: 1481721
(Assignee)

Updated

3 months ago
Attachment #8998428 - Flags: review?(core-build-config-reviews)

Updated

3 months ago
Attachment #8998417 - Flags: review?(core-build-config-reviews) → review+

Updated

3 months ago
Attachment #8998428 - Flags: review?(core-build-config-reviews) → review+

Comment 6

3 months ago
Pushed by mh@glandium.org:
https://hg.mozilla.org/integration/mozilla-inbound/rev/910a805c960d
Copy 32-bits libraries from gcc to clang. r=froydnj
https://hg.mozilla.org/integration/mozilla-inbound/rev/dd64a5e6d4df
Switch Linux builds to clang. r=froydnj
Backed out 3 changesets (Bug 1480631) for failure at /home/cltbld/workspace/build/tests/talos/talos/run_tests.py

Backout:
https://hg.mozilla.org/integration/mozilla-inbound/rev/8489f76a1b4ddbef11e853fe7355cdfbf7e03689

Push link:
https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=dd64a5e6d4dfa07ba7a819878852cb458a87bf98&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-classifiedState=unclassified

Failure log:
https://treeherder.mozilla.org/logviewer.html#?job_id=192876141&repo=mozilla-inbound&lineNumber=1847

15:30:13     INFO -  TEST-INFO | started process 14026 (/home/cltbld/workspace/build/application/firefox/firefox -profile /tmp/tmpXVgNnt/profile http://localhost:45891/startup_test/tresize/addon/content/tresize-test.html)
15:32:43     INFO -  Timeout waiting for test completion; killing browser...
15:32:43     INFO -  Terminating psutil.Process(pid=14026, name='firefox', started='15:30:12')
15:32:43     INFO -  TEST-UNEXPECTED-ERROR | tresize | timeout
15:32:43    ERROR -  Traceback (most recent call last):
15:32:43     INFO -    File "/home/cltbld/workspace/build/tests/talos/talos/run_tests.py", line 299, in run_tests
15:32:43     INFO -      talos_results.add(mytest.runTest(browser_config, test))
15:32:43     INFO -    File "/home/cltbld/workspace/build/tests/talos/talos/ttest.py", line 63, in runTest
15:32:43     INFO -      return self._runTest(browser_config, test_config, setup)
15:32:43     INFO -    File "/home/cltbld/workspace/build/tests/talos/talos/ttest.py", line 207, in _runTest
15:32:43     INFO -      debugger_args=browser_config['debugger_args']
15:32:43     INFO -    File "/home/cltbld/workspace/build/tests/talos/talos/talos_process.py", line 143, in run_browser
15:32:43     INFO -      raise TalosError("timeout")
15:32:43     INFO -  TalosError: timeout
15:32:43     INFO -  TEST-INFO took 196336ms
15:32:43     INFO -  SUITE-END | took 307s
15:32:43    ERROR - Return code: 2
15:32:43  WARNING - setting return code to 2
15:32:43    ERROR - # TBPL FAILURE #
15:32:43     INFO - Running post-action listener: _package_coverage_data
15:32:43     INFO - Running post-action listener: _resource_record_post_action
15:32:43     INFO - Running post-action listener: process_java_coverage_data
Flags: needinfo?(mh+mozilla)
(Assignee)

Comment 9

3 months ago
And android x86 bustage too:
/builds/worker/workspace/build/src/build/unix/stdc++compat/stdc++compat.cpp:42:10: fatal error: 'tr1/unordered_map' file not found

Because adding a lib directory busts includes, right?
Flags: needinfo?(mh+mozilla)
(Assignee)

Comment 10

3 months ago
Only differences in the clang.tar.xz:
2985a2986,2988
> clang/lib/clang/6.0.1/lib/linux/libclang_rt.asan-i386.a
> clang/lib/clang/6.0.1/lib/linux/libclang_rt.asan-i386.so
> clang/lib/clang/6.0.1/lib/linux/libclang_rt.asan-preinit-i386.a
2989a2993
> clang/lib/clang/6.0.1/lib/linux/libclang_rt.asan_cxx-i386.a
2991a2996
> clang/lib/clang/6.0.1/lib/linux/libclang_rt.builtins-i386.a
2992a2998
> clang/lib/clang/6.0.1/lib/linux/libclang_rt.cfi-i386.a
2993a3000
> clang/lib/clang/6.0.1/lib/linux/libclang_rt.cfi_diag-i386.a
3002a3010
> clang/lib/clang/6.0.1/lib/linux/libclang_rt.lsan-i386.a
3007a3016
> clang/lib/clang/6.0.1/lib/linux/libclang_rt.profile-i386.a
3008a3018
> clang/lib/clang/6.0.1/lib/linux/libclang_rt.safestack-i386.a
3009a3020,3021
> clang/lib/clang/6.0.1/lib/linux/libclang_rt.scudo-i386.a
> clang/lib/clang/6.0.1/lib/linux/libclang_rt.scudo-i386.so
3011a3024
> clang/lib/clang/6.0.1/lib/linux/libclang_rt.scudo_cxx-i386.a
3012a3026
> clang/lib/clang/6.0.1/lib/linux/libclang_rt.stats-i386.a
3013a3028
> clang/lib/clang/6.0.1/lib/linux/libclang_rt.stats_client-i386.a
3018a3034,3035
> clang/lib/clang/6.0.1/lib/linux/libclang_rt.ubsan_minimal-i386.a
> clang/lib/clang/6.0.1/lib/linux/libclang_rt.ubsan_minimal-i386.so
3021a3039,3040
> clang/lib/clang/6.0.1/lib/linux/libclang_rt.ubsan_standalone-i386.a
> clang/lib/clang/6.0.1/lib/linux/libclang_rt.ubsan_standalone-i386.so
3024a3044
> clang/lib/clang/6.0.1/lib/linux/libclang_rt.ubsan_standalone_cxx-i386.a
3616a3637,3699
> clang/lib32/
> clang/lib32/libasan.a
> clang/lib32/libasan.la
> clang/lib32/libasan.so
> clang/lib32/libasan.so.1
> clang/lib32/libasan.so.1.0.0
> clang/lib32/libasan_preinit.o
> clang/lib32/libatomic.a
> clang/lib32/libatomic.la
> clang/lib32/libatomic.so
> clang/lib32/libatomic.so.1
> clang/lib32/libatomic.so.1.1.0
> clang/lib32/libcilkrts.a
> clang/lib32/libcilkrts.la
> clang/lib32/libcilkrts.so
> clang/lib32/libcilkrts.so.5
> clang/lib32/libcilkrts.so.5.0.0
> clang/lib32/libcilkrts.spec
> clang/lib32/libgcc_s.so
> clang/lib32/libgcc_s.so.1
> clang/lib32/libgomp.a
> clang/lib32/libgomp.la
> clang/lib32/libgomp.so
> clang/lib32/libgomp.so.1
> clang/lib32/libgomp.so.1.0.0
> clang/lib32/libgomp.spec
> clang/lib32/libitm.a
> clang/lib32/libitm.la
> clang/lib32/libitm.so
> clang/lib32/libitm.so.1
> clang/lib32/libitm.so.1.0.0
> clang/lib32/libitm.spec
> clang/lib32/libquadmath.a
> clang/lib32/libquadmath.la
> clang/lib32/libquadmath.so
> clang/lib32/libquadmath.so.0
> clang/lib32/libquadmath.so.0.0.0
> clang/lib32/libsanitizer.spec
> clang/lib32/libssp.a
> clang/lib32/libssp.la
> clang/lib32/libssp.so
> clang/lib32/libssp.so.0
> clang/lib32/libssp.so.0.0.0
> clang/lib32/libssp_nonshared.a
> clang/lib32/libssp_nonshared.la
> clang/lib32/libstdc++.a
> clang/lib32/libstdc++.la
> clang/lib32/libstdc++.so
> clang/lib32/libstdc++.so.6
> clang/lib32/libstdc++.so.6.0.20
> clang/lib32/libstdc++.so.6.0.20-gdb.py
> clang/lib32/libsupc++.a
> clang/lib32/libsupc++.la
> clang/lib32/libubsan.a
> clang/lib32/libubsan.la
> clang/lib32/libubsan.so
> clang/lib32/libubsan.so.0
> clang/lib32/libubsan.so.0.0.0
> clang/lib32/libvtv.a
> clang/lib32/libvtv.la
> clang/lib32/libvtv.so
> clang/lib32/libvtv.so.0
> clang/lib32/libvtv.so.0.0.0

All in lib...
(Assignee)

Comment 11

3 months ago
and no obvious difference in configure output :(

As for talos, it was running fine as of dcea9f4775fa on try :(
(In reply to Daniel Varga [:dvarga] from comment #8)
> Failure log:
> https://treeherder.mozilla.org/logviewer.html#?job_id=192876141&repo=mozilla-
> inbound&lineNumber=1847

It looks like a lot of failures on that commit are content processes locking up at Sandbox.cpp:633, which would be here: https://searchfox.org/mozilla-central/rev/0f4d50ff5211e8dd85e119ef683d04b63062ed90/security/sandbox/linux/Sandbox.cpp#633

This is in the post-exec content process.  Breakpad isn't being given symbols for libpthread, and bug 1310314 has eaten its immediate caller, but I can guess a few things from the registers.  rDI is 6, which suggests kSandboxChrootClientFd (one of the few advantages for hard-coded fd numbers; cf. bug 1440207) rSI is a stack address and rDX is 1.  So this is probably waiting for the chroot helper to respond: https://searchfox.org/mozilla-central/rev/0f4d50ff5211e8dd85e119ef683d04b63062ed90/security/sandbox/linux/Sandbox.cpp#311

And I bet I know what the problem is: there's another instance of bug 1480401 at https://searchfox.org/mozilla-central/source/security/sandbox/linux/launch/SandboxLaunch.cpp#593

I'll file a bug.
(Assignee)

Comment 13

3 months ago
(In reply to Mike Hommey [:glandium] from comment #9)
> And android x86 bustage too:
> /builds/worker/workspace/build/src/build/unix/stdc++compat/stdc++compat.cpp:
> 42:10: fatal error: 'tr1/unordered_map' file not found
> 
> Because adding a lib directory busts includes, right?

So this one is happening because now that the lib32 directory exists, the libstdc++ compat check now can find a version for libstdc++, and that enables building stdc++compat, which we only really want for host, not target.
(Assignee)

Updated

3 months ago
Depends on: 1481989
(Assignee)

Comment 14

3 months ago
Comment on attachment 8998417 [details] [diff] [review]
Copy 32-bits libraries from gcc to clang

Moving this part to bug 1481989
Attachment #8998417 - Attachment is obsolete: true
Duplicate of this bug: 800471
Please note that until bug 1397493 is resolved, all fuzzing coverage builds need to stay on GCC. Otherwise we have no way to dump coverage while the process is still running. Thanks!
(Assignee)

Comment 17

3 months ago
Created attachment 8999045 [details] [diff] [review]
Switch Linux builds to clang.

Added exclusion for fuzzing ccov build. Carrying over r+.
Attachment #8998428 - Attachment is obsolete: true
Attachment #8999045 - Flags: review+
(Assignee)

Updated

3 months ago
Attachment #8998428 - Attachment is obsolete: false
(Assignee)

Comment 18

3 months ago
Comment on attachment 8999045 [details] [diff] [review]
Switch Linux builds to clang.

As per irc discussion, the fuzzing-asan-ccov build actually doesn't want gcc.
Attachment #8999045 - Attachment is obsolete: true

Comment 19

3 months ago
Pushed by mh@glandium.org:
https://hg.mozilla.org/integration/mozilla-inbound/rev/48b8084995b1
Switch Linux builds to clang. r=froydnj

Comment 20

3 months ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/48b8084995b1
Status: NEW → RESOLVED
Last Resolved: 3 months ago
status-firefox63: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla63
some perf improvements:
== Change summary for alert #15061 (as of Thu, 16 Aug 2018 05:34:49 GMT) ==

Improvements:

  7%  perf_reftest_singletons linux64 opt e10s stylo     49.52 -> 45.91
  7%  tps linux64 opt e10s stylo                         11.72 -> 10.92
  6%  a11yr linux64 opt e10s stylo                       224.83 -> 210.32
  6%  perf_reftest_singletons linux64-qr opt e10s stylo  50.67 -> 47.58
  6%  a11yr linux64-qr opt e10s stylo                    223.84 -> 210.25
  6%  tart linux64 opt e10s stylo                        2.23 -> 2.10
  5%  speedometer linux64-qr opt e10s stylo              70.27 -> 73.88
  5%  speedometer linux64 opt e10s stylo                 70.90 -> 74.51
  5%  about_preferences_basic linux64 opt e10s stylo     165.65 -> 157.78
  5%  ts_paint_heavy linux64 opt e10s stylo              203.33 -> 193.67
  5%  tp5o linux64 opt e10s stylo                        146.06 -> 139.24
  5%  ts_paint_webext linux64 opt e10s stylo             205.50 -> 196.08
  5%  tp5o_webext linux64-qr opt e10s stylo              226.05 -> 215.84
  4%  tp5o linux64-qr opt e10s stylo                     151.62 -> 144.94
  4%  ts_paint linux64 opt e10s stylo                    203.96 -> 195.17
  4%  tp5o_webext linux64 opt e10s stylo                 218.53 -> 209.12
  4%  cpstartup content-process-startup linux64 opt e10s stylo169.77 -> 162.67
  4%  tscrollx linux64 opt e10s stylo                    0.71 -> 0.69
  4%  ts_paint_heavy linux64-qr opt e10s stylo           271.25 -> 261.58
  3%  tsvgx linux64 opt e10s stylo                       184.36 -> 177.96
  3%  about_preferences_basic linux64-qr opt e10s stylo  165.84 -> 160.11
  3%  ts_paint_webext linux64-qr opt e10s stylo          273.04 -> 263.83
  3%  ts_paint linux64-qr opt e10s stylo                 271.08 -> 262.92
  3%  tsvgx linux64-qr opt e10s stylo                    337.25 -> 327.37
  3%  sessionrestore linux64 opt e10s stylo              278.63 -> 271.08
  2%  stylebench linux64 opt e10s stylo                  38.89 -> 39.81
  2%  tscrollx linux64-qr opt e10s stylo                 0.42 -> 0.41
  2%  tp5o_scroll linux64 opt e10s stylo                 0.60 -> 0.59
  2%  stylebench linux64-qr opt e10s stylo               39.13 -> 39.94

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=15061
some build time improvements:
== Change summary for alert #15096 (as of Thu, 16 Aug 2018 05:34:49 GMT) ==

Improvements:

 27%  build times linux64 opt taskcluster-c5.4xlarge tup     1,165.97 -> 849.46
 25%  build times linux64 opt taskcluster-m5d.4xlarge tup    1,139.04 -> 849.45
 23%  build times linux64 opt taskcluster-c4.4xlarge tup     1,350.73 -> 1,045.54
 21%  build times linux64 opt taskcluster-m4.4xlarge tup     1,451.89 -> 1,152.88

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=15096

Comment 23

3 months ago
Interestingly, I'm not seeing a build time speedup on Linux64 opt or plain builds. Why is Tup seeing such a drastic speedup? Perhaps the improvement is lost in the noise?
(Assignee)

Comment 24

3 months ago
sccache?

Comment 25

3 months ago
Neither plain builds nor tup builds use sccache.
(Assignee)

Comment 26

3 months ago
Plain builds are still using gcc because they don't use mozconfig.linux.
(Assignee)

Updated

3 months ago
Blocks: 1484862
(Assignee)

Updated

3 months ago
Blocks: 1485219

Comment 27

2 months ago
"perf" key word?
You need to log in before you can comment on or make changes to this bug.