Closed
Bug 1409267
Opened 7 years ago
Closed 7 years ago
ASAN builds fail with clang 5.0
Categories
(Firefox Build System :: General, defect, P3)
Firefox Build System
General
Tracking
(firefox58 fixed)
RESOLVED
FIXED
mozilla58
Tracking | Status | |
---|---|---|
firefox58 | --- | fixed |
People
(Reporter: glandium, Assigned: glandium)
References
Details
Attachments
(1 file)
https://treeherder.mozilla.org/logviewer.html#?job_id=136732246&repo=try&lineNumber=27705
[task 2017-10-13T07:09:00.452Z] 07:09:00 INFO - Executing /builds/worker/workspace/build/src/obj-firefox/dist/bin/shlibsign -v -o ../../dist/firefox/libsoftokn3.chk -i ../../dist/firefox/libsoftokn3.so
[task 2017-10-13T07:09:00.453Z] 07:09:00 INFO - Library File: ../../dist/firefox/libsoftokn3.so 934664 bytes
[task 2017-10-13T07:09:00.453Z] 07:09:00 INFO - Check File: ../../dist/firefox/libsoftokn3.chk
[task 2017-10-13T07:09:00.453Z] 07:09:00 INFO - hash: 32 bytes
[task 2017-10-13T07:09:00.453Z] 07:09:00 INFO - 48 ee d5 a0 35 01 7f 44 bf a1
[task 2017-10-13T07:09:00.454Z] 07:09:00 INFO - a0 b7 3a ce f3 86 7b 1f 70 04
[task 2017-10-13T07:09:00.454Z] 07:09:00 INFO - 38 95 0c 2c f5 76 10 b7 d5 8d
[task 2017-10-13T07:09:00.454Z] 07:09:00 INFO - d9 a1
[task 2017-10-13T07:09:00.454Z] 07:09:00 INFO - signature: 64 bytes
[task 2017-10-13T07:09:00.454Z] 07:09:00 INFO - 3a 65 c3 ab 34 74 ba 8e d1 d8
[task 2017-10-13T07:09:00.455Z] 07:09:00 INFO - 75 db 1b 38 95 9b ad 16 79 fa
[task 2017-10-13T07:09:00.455Z] 07:09:00 INFO - ab e9 f4 77 1e ed 90 e4 00 24
[task 2017-10-13T07:09:00.455Z] 07:09:00 INFO - 61 de 0b e9 8a fc 24 52 98 11
[task 2017-10-13T07:09:00.455Z] 07:09:00 INFO - a2 7e 4e 70 58 0a f5 25 6e 24
[task 2017-10-13T07:09:00.456Z] 07:09:00 INFO - d7 1e 58 eb 05 08 83 a7 13 8c
[task 2017-10-13T07:09:00.456Z] 07:09:00 INFO - f3 c1 bf 0c
[task 2017-10-13T07:09:00.456Z] 07:09:00 INFO - ==36118==LeakSanitizer has encountered a fatal error.
[task 2017-10-13T07:09:00.456Z] 07:09:00 INFO - ==36118==HINT: For debugging, try setting environment variable LSAN_OPTIONS=verbosity=1:log_threads=1
[task 2017-10-13T07:09:00.456Z] 07:09:00 INFO - ==36118==HINT: LeakSanitizer does not work under ptrace (strace, gdb, etc)
[task 2017-10-13T07:09:00.456Z] 07:09:00 INFO - Traceback (most recent call last):
[task 2017-10-13T07:09:00.457Z] 07:09:00 INFO - File "/builds/worker/workspace/build/src/toolkit/mozapps/installer/packager.py", line 333, in <module>
[task 2017-10-13T07:09:00.457Z] 07:09:00 INFO - main()
[task 2017-10-13T07:09:00.457Z] 07:09:00 INFO - File "/builds/worker/workspace/build/src/toolkit/mozapps/installer/packager.py", line 327, in main
[task 2017-10-13T07:09:00.457Z] 07:09:00 INFO - copier.copy(args.destination)
[task 2017-10-13T07:09:00.457Z] 07:09:00 INFO - File "/builds/worker/workspace/build/src/python/mozbuild/mozpack/copier.py", line 431, in copy
[task 2017-10-13T07:09:00.458Z] 07:09:00 INFO - copy_results.append((destfile, f.copy(destfile, skip_if_older)))
[task 2017-10-13T07:09:00.458Z] 07:09:00 INFO - File "/builds/worker/workspace/build/src/toolkit/mozapps/installer/packager.py", line 125, in copy
[task 2017-10-13T07:09:00.458Z] 07:09:00 INFO - errors.fatal('Error while signing %s' % self.path)
[task 2017-10-13T07:09:00.458Z] 07:09:00 INFO - File "/builds/worker/workspace/build/src/python/mozbuild/mozpack/errors.py", line 103, in fatal
[task 2017-10-13T07:09:00.458Z] 07:09:00 INFO - self._handle(self.FATAL, msg)
[task 2017-10-13T07:09:00.458Z] 07:09:00 INFO - File "/builds/worker/workspace/build/src/python/mozbuild/mozpack/errors.py", line 98, in _handle
[task 2017-10-13T07:09:00.459Z] 07:09:00 INFO - raise ErrorMessage(msg)
[task 2017-10-13T07:09:00.459Z] 07:09:00 INFO - mozpack.errors.ErrorMessage: Error: Error while signing ../../dist/firefox/libsoftokn3.so
Assignee | ||
Comment 1•7 years ago
|
||
Setting the suggested LSAN_OPTIONS only yields:
[task 2017-10-17T06:54:14.055Z] 06:54:14 INFO - Executing /builds/worker/workspace/build/src/obj-firefox/dist/bin/shlibsign -v -o ../../dist/firefox/libsoftokn3.chk -i ../../dist/firefox/libsoftokn3.so
[task 2017-10-17T06:54:14.055Z] 06:54:14 INFO - ==36334==AddressSanitizer: failed to intercept '__isoc99_printf'
[task 2017-10-17T06:54:14.056Z] 06:54:14 INFO - ==36334==AddressSanitizer: failed to intercept '__isoc99_sprintf'
[task 2017-10-17T06:54:14.056Z] 06:54:14 INFO - ==36334==AddressSanitizer: failed to intercept '__isoc99_snprintf'
[task 2017-10-17T06:54:14.056Z] 06:54:14 INFO - ==36334==AddressSanitizer: failed to intercept '__isoc99_fprintf'
[task 2017-10-17T06:54:14.056Z] 06:54:14 INFO - ==36334==AddressSanitizer: failed to intercept '__isoc99_vprintf'
[task 2017-10-17T06:54:14.057Z] 06:54:14 INFO - ==36334==AddressSanitizer: failed to intercept '__isoc99_vsprintf'
[task 2017-10-17T06:54:14.057Z] 06:54:14 INFO - ==36334==AddressSanitizer: failed to intercept '__isoc99_vsnprintf'
[task 2017-10-17T06:54:14.057Z] 06:54:14 INFO - ==36334==AddressSanitizer: failed to intercept '__isoc99_vfprintf'
[task 2017-10-17T06:54:14.057Z] 06:54:14 INFO - ==36334==AddressSanitizer: failed to intercept 'process_vm_readv'
[task 2017-10-17T06:54:14.058Z] 06:54:14 INFO - ==36334==AddressSanitizer: failed to intercept 'process_vm_writev'
[task 2017-10-17T06:54:14.058Z] 06:54:14 INFO - ==36334==AddressSanitizer: libc interceptors initialized
[task 2017-10-17T06:54:14.058Z] 06:54:14 INFO - || `[0x10007fff8000, 0x7fffffffffff]` || HighMem ||
[task 2017-10-17T06:54:14.058Z] 06:54:14 INFO - || `[0x02008fff7000, 0x10007fff7fff]` || HighShadow ||
[task 2017-10-17T06:54:14.059Z] 06:54:14 INFO - || `[0x00008fff7000, 0x02008fff6fff]` || ShadowGap ||
[task 2017-10-17T06:54:14.059Z] 06:54:14 INFO - || `[0x00007fff8000, 0x00008fff6fff]` || LowShadow ||
[task 2017-10-17T06:54:14.059Z] 06:54:14 INFO - || `[0x000000000000, 0x00007fff7fff]` || LowMem ||
[task 2017-10-17T06:54:14.059Z] 06:54:14 INFO - MemToShadow(shadow): 0x00008fff7000 0x000091ff6dff 0x004091ff6e00 0x02008fff6fff
[task 2017-10-17T06:54:14.059Z] 06:54:14 INFO - redzone=16
[task 2017-10-17T06:54:14.060Z] 06:54:14 INFO - max_redzone=2048
[task 2017-10-17T06:54:14.060Z] 06:54:14 INFO - quarantine_size_mb=256M
[task 2017-10-17T06:54:14.060Z] 06:54:14 INFO - thread_local_quarantine_size_kb=1024K
[task 2017-10-17T06:54:14.060Z] 06:54:14 INFO - malloc_context_size=30
[task 2017-10-17T06:54:14.060Z] 06:54:14 INFO - SHADOW_SCALE: 3
[task 2017-10-17T06:54:14.060Z] 06:54:14 INFO - SHADOW_GRANULARITY: 8
[task 2017-10-17T06:54:14.061Z] 06:54:14 INFO - SHADOW_OFFSET: 0x7fff8000
[task 2017-10-17T06:54:14.061Z] 06:54:14 INFO - ==36334==Installed the sigaction for signal 11
[task 2017-10-17T06:54:14.061Z] 06:54:14 INFO - ==36334==Installed the sigaction for signal 7
[task 2017-10-17T06:54:14.061Z] 06:54:14 INFO - ==36334==Installed the sigaction for signal 8
[task 2017-10-17T06:54:14.061Z] 06:54:14 INFO - ==36334==T0: stack [0x7ffd91a76000,0x7ffd92276000) size 0x800000; local=0x7ffd92273f9c
[task 2017-10-17T06:54:14.062Z] 06:54:14 INFO - ==36334==AddressSanitizer Init done
[task 2017-10-17T06:54:14.062Z] 06:54:14 INFO - Library File: ../../dist/firefox/libsoftokn3.so 1579360 bytes
[task 2017-10-17T06:54:14.062Z] 06:54:14 INFO - Check File: ../../dist/firefox/libsoftokn3.chk
[task 2017-10-17T06:54:14.062Z] 06:54:14 INFO - hash: 32 bytes
[task 2017-10-17T06:54:14.062Z] 06:54:14 INFO - 0b 65 16 cc 1d 3f 5e 0b 3d ba
[task 2017-10-17T06:54:14.063Z] 06:54:14 INFO - 6b eb 67 e3 28 f6 10 a2 98 e0
[task 2017-10-17T06:54:14.063Z] 06:54:14 INFO - 9b 04 b0 ae 3c 62 8b 1d 6b 5f
[task 2017-10-17T06:54:14.063Z] 06:54:14 INFO - 89 54
[task 2017-10-17T06:54:14.063Z] 06:54:14 INFO - signature: 64 bytes
[task 2017-10-17T06:54:14.063Z] 06:54:14 INFO - 17 ac 2b 35 63 00 c4 bc 53 e6
[task 2017-10-17T06:54:14.063Z] 06:54:14 INFO - 91 33 92 3c c0 78 35 e7 1b c6
[task 2017-10-17T06:54:14.064Z] 06:54:14 INFO - 71 df 5e 9c e5 c6 e4 04 d4 d0
[task 2017-10-17T06:54:14.064Z] 06:54:14 INFO - ca fc 77 c7 14 a8 4c bb 08 3d
[task 2017-10-17T06:54:14.064Z] 06:54:14 INFO - 38 f1 1b da a1 d7 c3 ba cb e2
[task 2017-10-17T06:54:14.064Z] 06:54:14 INFO - 62 99 fb 7d a0 60 a6 29 db a6
[task 2017-10-17T06:54:14.064Z] 06:54:14 INFO - 17 b5 31 64
[task 2017-10-17T06:54:14.065Z] 06:54:14 INFO - ==36335==Could not attach to thread 36334 (errno 1).
[task 2017-10-17T06:54:14.065Z] 06:54:14 INFO - ==36335==Failed suspending threads.
(most likely because of the limitations on ptrace on taskcluster) :(
Assignee | ||
Updated•7 years ago
|
Assignee: nobody → nobody
Component: Build Config → Build
Product: Core → NSS
Version: unspecified → other
Comment 2•7 years ago
|
||
We're using clang 5.0 on OSS-Fuzz, as well as on TC with ASan+UBSan for fuzzing [1]. Both are Docker-based. On TC we had to enable the 'allowPtrace' feature [2] to make this work. Did you try that?
[1] https://searchfox.org/nss/source/automation/taskcluster/docker-fuzz/setup.sh#45
(the comment is old, we're just using the latest version there)
[2] https://searchfox.org/nss/source/automation/taskcluster/graph/src/extend.js#495
Assignee | ||
Comment 3•7 years ago
|
||
Tried again with allowPtrace:
https://tools.taskcluster.net/groups/LLtRPq0vSByMsFvxZat2ig/tasks/Px3fUGMzQY2VLrO0bs76Kw/details
Still failed similarly, with the same "Could not attach to thread x" message.
https://queue.taskcluster.net/v1/task/Px3fUGMzQY2VLrO0bs76Kw/runs/0/artifacts/public/logs/live_backing.log
Note that one was with clang 4, which fails too.
Another with clang 5 and allowPtrace here:
https://queue.taskcluster.net/v1/task/UwzBCvnSQfWSbZ6bYZuxjQ/runs/0/artifacts/public/logs/live_backing.log
Comment 4•7 years ago
|
||
We fail when running shlibsign, I guess we don't need that for ASan builds anyway? We have -Dsign_libs=0 set for our fuzzing builds. IIRC we ran into similar issues, or TC limitations, and decided to just turn that off.
Updated•7 years ago
|
Priority: -- → P3
Comment 5•7 years ago
|
||
This is the same failure as bug 1377561 and it is not specific to Clang 5.0. Our version 0f 4.0.1 on TC also suffers from the same problem. For the fuzzing build (which uses 4.0.1) we circumvented this by disabling LSan (which is not a great option for the general ASan builds).
As mentioned in the other bug, I could reproduce this locally, but only with the version we have on TC. The Ubuntu 4.0 and 5.0 versions from the LLVM repos do not reproduce this.
Comment 6•7 years ago
|
||
Mike, can you tell me where/how to download the Clang 5 toolchain build you used for this? I couldn't find it on TC. Thanks!
Flags: needinfo?(mh+mozilla)
Comment 7•7 years ago
|
||
Nevermind the needinfo, I found it already, but I also managed to reproduce it locally with clang 4 from TC.
(In reply to Mike Hommey [:glandium] from comment #1)
> [task 2017-10-17T06:54:14.065Z] 06:54:14 INFO - ==36335==Could not
> attach to thread 36334 (errno 1).
> [task 2017-10-17T06:54:14.065Z] 06:54:14 INFO - ==36335==Failed
> suspending threads.
The exact same thing also happens locally on my normal machine (Ubuntu 16.04 LTS). If set /proc/sys/kernel/yama/ptrace_scope to 0, the error goes away. That option should normally only affect ptrace on non-child processes, right?
dmesg also shows the error:
> [45196470.781634] ptrace of pid 16494 was attempted by: shlibsign (pid 16495)
I'm still trying to find a minimal test for this now. If anyone has an idea, please let me know :)
Flags: needinfo?(mh+mozilla)
Assignee | ||
Comment 8•7 years ago
|
||
Build artifacts expire quickly on try, you'd need to do another push to get one.
Comment 9•7 years ago
|
||
I think I found something that might explain this. I confirmed two straces (I know LSan doesn't work under strace) and saw that the binary compiled with my local Clangs perform an additional syscall that the binary produced with Clang from TC does not:
> prctl(PR_SET_PTRACER, 9434) = 0
I traced this prctl call to this code in LSan's runtime:
> // On some systems we have to explicitly declare that we want to be traced
> // by the tracer thread.
>#ifdef PR_SET_PTRACER
> internal_prctl(PR_SET_PTRACER, tracer_pid, 0, 0, 0);
>#endif
Could it be that this code is not included in the Clang build on TC? The Clang I downloaded from RelEng is 4.0.1 (which also had exactly the same problem). It would explain the symptoms we see here.
Mike, any idea about this?
Flags: needinfo?(mh+mozilla)
Assignee | ||
Comment 10•7 years ago
|
||
PR_SET_PTRACER is new to Linux kernel 3.4, and the headers we have on the build machines are those of kernel 2.6.32. Even after bug 1399679, they would still be those of kernel 3.2.
But the builds *do* run on kernel 3.13, meaning it's available at runtime. So if we could define PR_SET_PTRACER to the right value (0x59616d61) somehow, that code would be enabled. It seems like patching that file to have
#ifndef PR_SET_PTRACER
#define PR_SET_PTRACER 0x59616d61
#endif
would be the right thing to do and upstream.
Flags: needinfo?(mh+mozilla)
Assignee | ||
Comment 11•7 years ago
|
||
Assignee | ||
Comment 12•7 years ago
|
||
Same player shoot again: https://treeherder.mozilla.org/#/jobs?repo=try&revision=94f09e96c4a907f643452105de5ad64b8a8b9b00
Assignee | ||
Updated•7 years ago
|
Assignee: nobody → mh+mozilla
Component: Build → Build Config
Product: NSS → Core
Version: other → unspecified
Comment hidden (mozreview-request) |
Updated•7 years ago
|
Comment 14•7 years ago
|
||
mozreview-review |
Comment on attachment 8925833 [details]
Bug 1409267 - Patch compiler-rt to always use prctl(PR_SET_PTRACER).
https://reviewboard.mozilla.org/r/197026/#review202258
r=me with nit addressed; feel free to rephrase the comment or not as you see fit.
::: build/build-clang/pr_set_ptracer.patch:7
(Diff revision 1)
> ++// Sufficiently old kernel header don't provide this value, but we can still
> ++// call prctl with it if the runtime kernel is newer. Even if it's not, prctl
> ++// will only return EINVAL.
Nit: "old kernel headers"
WDYT about rephrasing this to:
"Sufficiently old kernel headers don't provide this value, but we can still call prctl with it. If the runtime kernel is new enough, the prctl call will have the desired effect; if the kernel's too old, the call will error and we can ignore said error."
The "call prctl with it if the runtime kernel is newer. Even if it's not..." phrasing is a little strange, since we're saying we can only call prctl with the value on a new enough kernel, but then we turn around and contradict ourselves.
Attachment #8925833 -
Flags: review?(nfroyd) → review+
Comment hidden (mozreview-request) |
Comment 16•7 years ago
|
||
Pushed by mh@glandium.org:
https://hg.mozilla.org/integration/autoland/rev/b2dd86a66344
Patch compiler-rt to always use prctl(PR_SET_PTRACER). r=froydnj
Assignee | ||
Comment 17•7 years ago
|
||
Patch landed upstream: https://reviews.llvm.org/rL317668
Comment 18•7 years ago
|
||
bugherder |
Status: NEW → RESOLVED
Closed: 7 years ago
status-firefox58:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla58
Updated•7 years ago
|
Product: Core → Firefox Build System
You need to log in
before you can comment on or make changes to this bug.
Description
•