Closed Bug 1700475 Opened 3 years ago Closed 3 years ago

Asan build failing - "failed to run custom build command for `webrender v0.61.0...`"

Categories

(Core :: Graphics: WebRender, defect)

Tracking

RESOLVED FIXED
89 Branch
Tracking Status
firefox89 --- fixed

People

(Reporter: benc, Assigned: emilio)

Attachments

(4 files)

Attached file asan_build_fail.txt

(this is on Linux, building Thunderbird rather than Firefox).

My ASan mozconfig no longer builds. It bails out with:

48:20.04 error: failed to run custom build command for `webrender v0.61.0 (/home/ben/tb/mozilla/gfx/wr/webrender)`
48:20.04 Caused by:
48:20.05   process didn't exit successfully: `/home/ben/tb/mozilla/objdir-tb-asan/release/build/webrender-6b1d7a36ddbaeea3/build-script-build` (exit code: 1)

... followed by a long stdout dump, which I've attached.

The offending mozconfig contains:

ac_add_options --enable-application=comm/mail
ac_add_options --enable-calendar
ac_add_options --with-ccache=/usr/bin/ccache
#ac_add_options --enable-debug
#ac_add_options --disable-optimize
ac_add_options --enable-clang-plugin
#ac_add_options --enable-dmd

mk_add_options MOZ_OBJDIR=@TOPSRCDIR@/objdir-tb-asan

# Enable ASan specific code and build workarounds
ac_add_options --enable-address-sanitizer

# Add ASan to our compiler flags
export CFLAGS="-fsanitize=address -Dxmalloc=myxmalloc -fPIC"
export CXXFLAGS="-fsanitize=address -Dxmalloc=myxmalloc -fPIC"

# Additionally, we need the ASan flag during linking. Normally, our C/CXXFLAGS would
# be used during linking as well but there is at least one place in our build where
# our CFLAGS are not added during linking.
# Note: The use of this flag causes Clang to automatically link the ASan runtime :)
export LDFLAGS="-fsanitize=address"

# These three are required by ASan
ac_add_options --disable-jemalloc
ac_add_options --disable-crashreporter
ac_add_options --disable-elf-hack

# Keep symbols to symbolize ASan traces later
export MOZ_DEBUG_SYMBOLS=1
ac_add_options --enable-debug-symbols
ac_add_options --disable-install-strip

# Settings for an opt build (preferred)
# The -gline-tables-only ensures that all the necessary debug information for ASan
# is present, but the rest is stripped so the resulting binaries are smaller.
ac_add_options --enable-optimize="-O2 -gline-tables-only"
ac_add_options --disable-debug

# Settings for a debug+opt build
#ac_add_options --enable-optimize
#ac_add_options --enable-debug

# ASan specific options on Linux
ac_add_options --enable-valgrind


# enable logrefcnt (implicit for debug, not release)
ac_add_options --enable-logrefcnt
#ac_add_options --enable-trace-malloc

My standard debug mozconfig still builds fine.

mach environment gives:

$ ./mach environment
platform:
	Linux-5.10.0-1-amd64-x86_64-with-glibc2.31
python version:
	3.9.1+ (default, Jan 10 2021, 15:42:50) 
[GCC 10.2.1 20201224]
python prefix:
	/home/ben/.mozbuild/_virtualenvs/mach
mach cwd:
	/home/ben/tb/mozilla
os cwd:
	/home/ben/tb/mozilla
mach directory:
	/home/ben/tb/mozilla
state directory:
	/home/ben/.mozbuild
object directory:
	/home/ben/tb/mozilla/objdir-tb-asan
mozconfig path:
	/home/ben/tb/mozilla/mozconfig
mozconfig configure args:
	--enable-application=comm/mail
	--enable-calendar
	--with-ccache=/usr/bin/ccache
	--enable-clang-plugin
	--enable-address-sanitizer
	--disable-jemalloc
	--disable-crashreporter
	--disable-elf-hack
	--enable-debug-symbols
	--disable-install-strip
	--enable-optimize=-O2 -gline-tables-only
	--disable-debug
	--enable-valgrind
	--enable-logrefcnt
config topsrcdir:
	/home/ben/tb/mozilla
config topobjdir:
	/home/ben/tb/mozilla/objdir-tb-asan

Anyone have any ideas on what is going wrong? Any more information I can provide which might help?

Since this leakage is happening in webrender, I'm going to send this that way, but feel free to send it back if this is a more core build issue :)

Component: General → Graphics: WebRender
Product: Firefox Build System → Core

Can you provide the verbose log? ./mach build -vv or what not. Note that you're enabling valgrind as well which seems a bit odd but...

(In reply to Emilio Cobos Álvarez (:emilio) from comment #2)

> Can you provide the verbose log? ./mach build -vv or what not. Note that you're enabling valgrind as well which seems a bit odd but...

That was just me cargo-culting :-) It's mentioned as an extra bit of config at the bottom of:
https://firefox-source-docs.mozilla.org/tools/sanitizer/asan.html#adjusting-the-build-configuration

(I'll try it without valgrind, and do a ./mach build -vv now)

Attached file asan_build_v.log.gz

I gzipped it (it seemed awfully impolite to add 12MB of my uncompressed build rubbish to the bugzilla historical record :-)

I removed the "--enable-valgrind" line, but still same issue.

Attached file Relevant bits.

So it seems GLSLopt or something is leaking and that is somehow causing the build to fail? Probably we don't care about it much... Why are build scripts using ASAN anyways? I guess we build glslopt with the regular CFLAGS / CXXFLAGS?

(In reply to Emilio Cobos Álvarez (:emilio) from comment #5)

> So it seems GLSLopt or something is leaking and that is somehow causing the build to fail? Probably we don't care about it much... Why are build scripts using ASAN anyways? I guess we build glslopt with the regular CFLAGS / CXXFLAGS?

Yup - I could imagine a bunch of utilities like that would leak by design anyway, letting the OS clean up when they exit. Especially for anything that looks at all like a compiler (or optimiser).

I'm stumbling about in the dark, but if there are any flag/config hacks you'd like me to try out to save you some pointless build time, just let me know!

I can reproduce this trivially (was trying to do an asan build today).

This reproduces just with:

mk_add_options AUTOCLOBBER=1
ac_add_options --enable-bootstrap
ac_add_options --enable-address-sanitizer
ac_add_options --disable-jemalloc
ac_add_options --disable-crashreporter
ac_add_options --disable-elf-hack
ac_add_options --enable-debug
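
For anyone wanting to try the reduced configuration above, here is a minimal sketch of putting it in a scratch mozconfig and selecting it through the `MOZCONFIG` environment variable. The temporary file location and the `repro_mozconfig` variable name are arbitrary choices for illustration, not something from this bug.

```shell
# Write the reduced reproducer to a scratch file (location is arbitrary).
repro_mozconfig="$(mktemp)"
cat > "$repro_mozconfig" <<'EOF'
mk_add_options AUTOCLOBBER=1
ac_add_options --enable-bootstrap
ac_add_options --enable-address-sanitizer
ac_add_options --disable-jemalloc
ac_add_options --disable-crashreporter
ac_add_options --disable-elf-hack
ac_add_options --enable-debug
EOF

# Point the build at the scratch mozconfig instead of the default one.
export MOZCONFIG="$repro_mozconfig"

# From the top of the source tree, the build would then be started with:
#   ./mach build -vv
```

Keeping the reproducer in its own file leaves the regular mozconfig untouched, so switching back is just a matter of unsetting `MOZCONFIG`.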

We surely have had to deal with this in other places... Mike, Jamie, do you know how automation deals with this?

Flags: needinfo?(mh+mozilla)
Flags: needinfo?(jnicol)

Ah, I bet it's this.

Also update the build logs, manually passing those flags shouldn't be
needed anymore.

Assignee: nobody → emilio
Status: NEW → ASSIGNED
Flags: needinfo?(mh+mozilla)
Flags: needinfo?(jnicol)

(In reply to Emilio Cobos Álvarez (:emilio) from comment #8)

> Ah, I bet it's this.

I think I also hit this recently in oss-fuzz. One of the host tools is failing because we now build it with ASan and a memleak is detected.

This is caused by recent changes that :truber made in an effort to ensure that this code is built with ASan (we previously missed at least one bug because ASan was missing here). The problem, though, is that this binary does not link mozglue and therefore does not pick up our standard ASan settings (which include detect_leaks=0).

:truber, can you ensure that everything is covered here? Emilio already seems to have updated some things, we should ensure that nothing else breaks because of this. Thanks!
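
As a rough illustration of the missing default described above: the ASan runtime also reads its settings from the `ASAN_OPTIONS` environment variable, so leak detection can be switched off there for a binary that lacks the built-in defaults. The sketch below is not the patch that landed for this bug. The helper name `asan_options_without_leak_check` is hypothetical, and it only helps if the build passes the environment through to the failing build script, which this thread does not confirm.

```shell
# Hypothetical helper: return an ASAN_OPTIONS value with detect_leaks=0 added,
# unless the caller has already chosen a detect_leaks setting.
asan_options_without_leak_check() {
  current="${1:-}"
  case ":$current:" in
    *:detect_leaks=*) printf '%s\n' "$current" ;;           # respect existing choice
    ::)               printf 'detect_leaks=0\n' ;;          # nothing set yet
    *)                printf '%s:detect_leaks=0\n' "$current" ;;  # append to others
  esac
}

# Apply it to the current environment before starting the build.
ASAN_OPTIONS="$(asan_options_without_leak_check "${ASAN_OPTIONS:-}")"
export ASAN_OPTIONS
```

The helper appends to any existing colon-separated options rather than overwriting them, so settings such as a symbolizer path survive.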

Flags: needinfo?(jschwartzentruber)
Attachment #9212607 - Attachment description: Bug 1700475 - Allow passing ASAN_OPTIONS in mozconfig, and pass detect_leaks=0 by default if enabling ASAN. r=#build → Bug 1700475 - Allow the WebRender build script to leak under ASAN. r=#build
Pushed by ealvarez@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/633535d970af
Allow the WebRender build script to leak under ASAN. r=firefox-build-system-reviewers,glandium

I've tried fuzzing-ccov and fuzzing-asan-opt locally and both work with his patch. We'll have to deal with breaks as they come until this is fixed properly in cargo. Thanks Emilio!

Flags: needinfo?(jschwartzentruber)
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 89 Branch
Regressions: 1635327