Packaging libgraphitewasm.so fails in stage-package step during elfhack invocation
Categories
(Firefox Build System :: General, defect)
Tracking
(firefox74 fixed)
| Tracking | Status | |
|---|---|---|
| firefox74 | --- | fixed |
People
(Reporter: gk, Assigned: glandium)
Details
Attachments
(2 files)
We are trying to test RLBox with Graphite2 in Tor Browser which is based on ESR68. The currently backported set of patches[1] fails with
{{{
0:01.84 /usr/bin/make -j4 -s stage-package
0:11.36 ../../dist/firefox/libnspr4.so: Reduced by 7992 bytes
0:11.36 ../../dist/firefox/libplc4.so: No gain. Skipping
0:11.36 ../../dist/firefox/libplds4.so: Couldn't find .bss. Skipping
0:18.92 ../../dist/firefox/libxul.so: Reduced by 7147304 bytes
0:19.23 ../../dist/firefox/libmozgtk.so: Couldn't find .bss. Skipping
0:19.24 ../../dist/firefox/gtk2/libmozgtk.so: Couldn't find .bss. Skipping
0:19.25 ../../dist/firefox/libgraphitewasm.so: Traceback (most recent call last):
0:19.25 File "/var/tmp/build/firefox-f557e0b636c3/toolkit/mozapps/installer/packager.py", line 347, in <module>
0:19.25 main()
0:19.25 File "/var/tmp/build/firefox-f557e0b636c3/toolkit/mozapps/installer/packager.py", line 341, in main
0:19.25 copier.copy(args.destination)
0:19.25 File "/var/tmp/build/firefox-f557e0b636c3/python/mozbuild/mozpack/copier.py", line 431, in copy
0:19.25 copy_results.append((destfile, f.copy(destfile, skip_if_older)))
0:19.25 File "/var/tmp/build/firefox-f557e0b636c3/python/mozbuild/mozpack/files.py", line 310, in copy
0:19.25 elfhack(dest)
0:19.25 File "/var/tmp/build/firefox-f557e0b636c3/python/mozbuild/mozpack/executables.py", line 125, in elfhack
0:19.25 errors.fatal('Error executing ' + ' '.join(cmd))
0:19.25 File "/var/tmp/build/firefox-f557e0b636c3/python/mozbuild/mozpack/errors.py", line 103, in fatal
0:19.25 self._handle(self.FATAL, msg)
0:19.25 File "/var/tmp/build/firefox-f557e0b636c3/python/mozbuild/mozpack/errors.py", line 98, in _handle
0:19.25 raise ErrorMessage(msg)
0:19.25 mozpack.errors.ErrorMessage: Error: Error executing /var/tmp/build/firefox-f557e0b636c3/obj-x86_64-pc-linux-gnu/build/unix/elfhack/elfhack ../../dist/firefox/libgraphitewasm.so
0:19.27 make[1]: *** [stage-package] Error 1
0:19.27 make: *** [stage-package] Error 2
}}}
Since we are on the esr68 branch, we use a clang-based toolchain similar to Mozilla's, based on clang 8.0.1. GCC is 8.3.0, binutils is 2.31.1, and configure thinks the linker is gold. Our .mozconfig file is:
{{{
. $topsrcdir/browser/config/mozconfig
mk_add_options MOZ_OBJDIR=@TOPSRCDIR@/obj-@CONFIG_GUESS@
mk_add_options MOZ_APP_DISPLAYNAME="Tor Browser"
export MOZILLA_OFFICIAL=1
CC="clang --gcc-toolchain=/var/tmp/dist/gcc"
CXX="clang++ --gcc-toolchain=/var/tmp/dist/gcc"
HOST_CC=$CC
HOST_CXX=$CXX
export BINDGEN_CFLAGS='--gcc-toolchain=/var/tmp/dist/gcc'
ac_add_options --enable-optimize
ac_add_options --enable-official-branding
ac_add_options --enable-default-toolkit=cairo-gtk3
ac_add_options --enable-tor-browser-update
ac_add_options --enable-signmar
ac_add_options --enable-verify-mar
ac_add_options --disable-strip
ac_add_options --disable-install-strip
ac_add_options --disable-tests
ac_add_options --disable-debug
ac_add_options --disable-crashreporter
ac_add_options --disable-webrtc
ac_add_options --disable-eme
ac_add_options --enable-proxy-bypass-protection
ac_add_options MOZ_TELEMETRY_REPORTING=
}}}
Note that I currently use wasi-sdk commit 87b7a019472770f08d49cf3b558867dc76ea74eb because of the clang 8.0.1 requirement, in case that matters.
I've uploaded the resulting library for further inspection.[2]
[1] https://gitweb.torproject.org/user/gk/tor-browser.git/log/?h=bug_32380_v6
[2] https://people.torproject.org/~gk/testbuilds/libgraphitewasm.so{.asc}
Comment 1•5 years ago
elfhack fails on the libgraphitewasm.so file but doesn't dump any useful info when it does so (Segmentation fault (core dumped)). I'm not familiar enough with elfhack to understand why that might be happening. According to the README the library isn't awfully robust especially on non-libxul targets, so I guess that shouldn't be terribly shocking.
I assume elfhack is making some assumptions about the library that aren't true of libgraphitewasm.so. We can either fix elfhack so it won't freak out or always skip it for wasm targets.
[Note: the previous version of this comment had incorrect info which I edited out.]
Comment 2•5 years ago
Valgrind dump attached. Looks like some sort of bad pointer math somewhere.
Comment 3•5 years ago
What appears to be happening is that we're doing relocations on the dynamic section (.dynamic), which we get here:
https://searchfox.org/mozilla-central/source/build/unix/elfhack/elfhack.cpp#764
We look for the .rela.dyn section for the relocations:
https://searchfox.org/mozilla-central/source/build/unix/elfhack/elfhack.cpp#770
And then, eventually, we want to look at the symtab associated with that section:
https://searchfox.org/mozilla-central/source/build/unix/elfhack/elfhack.cpp#830
And that symtab is nullptr, which crashes in predictable ways. We try to get the symtab here:
https://searchfox.org/mozilla-central/source/build/unix/elfhack/elf.cpp#491
The shdr.sh_link field is set correctly in the binary, pointing at .dynsym, so I guess the next question is why the getSection call there fails. Or maybe we're not using the correct parent for when we construct the ElfSection for .rela.dyn?
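For anyone who wants to double-check the `sh_link` value outside elfhack, here is a minimal sketch that decodes a single ELF64 section header. It assumes a little-endian ELF64 file (as on x86_64); the helper name `parse_shdr` and the synthetic header are made up for illustration and are not part of elfhack.

```python
import struct
from collections import namedtuple

# Field order of Elf64_Shdr as laid out in the ELF64 format.
Shdr = namedtuple(
    "Shdr",
    "sh_name sh_type sh_flags sh_addr sh_offset sh_size "
    "sh_link sh_info sh_addralign sh_entsize",
)
SHDR_FORMAT = "<IIQQQQIIQQ"  # little-endian, 64 bytes in total
SHT_RELA = 4                 # section type of .rela.dyn


def parse_shdr(data, offset=0):
    """Decode one Elf64_Shdr from `data` starting at `offset` (hypothetical helper)."""
    return Shdr(*struct.unpack_from(SHDR_FORMAT, data, offset))


# Synthetic header for illustration only: a SHT_RELA section whose sh_link
# is 1, i.e. it names section 1 as its symbol table.
raw = struct.pack(SHDR_FORMAT, 0, SHT_RELA, 0, 0, 0, 0, 1, 0, 8, 24)
print(parse_shdr(raw).sh_link)  # prints 1
```

On a real file the header bytes would come from the section header table (located via `e_shoff` in the ELF header); `readelf -S` shows the same value in its `Lk` column.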
Comment 4•5 years ago
(In reply to Nathan Froyd [:froydnj] from comment #3)
> The shdr.sh_link field is set correctly in the binary, pointing at .dynsym, so I guess the next question is why the getSection call there fails. Or maybe we're not using the correct parent for when we construct the ElfSection for .rela.dyn?
Long story short, this all happens while we're initializing the section array for the Elf object:
https://searchfox.org/mozilla-central/source/build/unix/elfhack/elf.cpp#179-185
we're initializing section 1, the .dynsym section, which is a symtab-y section, so we wind up in ElfSymtab_Section where we want to ask for a section for some symbol:
https://searchfox.org/mozilla-central/source/build/unix/elfhack/elf.cpp#827-828
and so we start to create ElfSections on demand, until we get to the point where we hit an infinite recursion guard (!):
https://searchfox.org/mozilla-central/source/build/unix/elfhack/elf.cpp#310
and we return nullptr up the chain, with predictably bad results later.
I think we could work around this by changing lucetc to lay out the binary differently (?). The other option is some surgery on elfhack to make it less dependent on the particular ordering of the sections.
Comment 5•5 years ago (Assignee)
> until we get to the point where we hit an infinite recursion guard (!):
The path that leads there: while creating the ElfSymtab_Section, one symbol (_DYNAMIC) refers to the .dynamic section, which makes us initialize section 11 as an ElfDynamic_Section. While doing that, we initialize an ElfLocation for DT_RELA, which points to section 5, for which we initialize an ElfRel_Section. That section has an sh_link of 1, so ElfSection initializes its link by trying to get section 1...
> I think we could work around this by changing lucetc to lay out the binary differently (?). The other option is some surgery on elfhack to make it less dependent on the particular ordering of the sections.
The problem is not the order of the sections. It's that _DYNAMIC symbol. An elfhack workaround would be to start by initializing the .dynamic section.
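To make the cycle easier to follow, here is a toy model of on-demand section creation with a recursion guard. Only the section indices (1, 5, 11) and the references between them come from the description above; the `Section` class, `get_section`, `init_all`, and the `IN_PROGRESS` sentinel are hypothetical and do not mirror elfhack's real code or the eventual patch.

```python
# Toy model of lazy, recursive section initialization with a recursion guard.
IN_PROGRESS = object()  # sentinel standing in for the recursion-guard marker

# index -> (name, indices of sections referenced while initializing)
LAYOUT = {
    1: (".dynsym", [11]),    # the _DYNAMIC symbol refers to .dynamic
    5: (".rela.dyn", [1]),   # sh_link = 1 (.dynsym)
    11: (".dynamic", [5]),   # DT_RELA points into .rela.dyn
}


class Section:
    def __init__(self, name):
        self.name = name
        self.refs = {}  # target index -> Section, or None if the guard fired


def get_section(index, sections):
    """Return the section at `index`, creating it on demand."""
    existing = sections.get(index)
    if existing is IN_PROGRESS:
        return None  # recursion guard: this section is still being built
    if existing is not None:
        return existing
    sections[index] = IN_PROGRESS
    name, targets = LAYOUT[index]
    section = Section(name)
    for target in targets:
        section.refs[target] = get_section(target, sections)
    sections[index] = section
    return section


def init_all(order):
    """Initialize sections in the given index order and return the table."""
    sections = {}
    for index in order:
        get_section(index, sections)
    return sections


in_order = init_all([1, 5, 11])   # section-index order, as in the failing case
dyn_first = init_all([11, 1, 5])  # .dynamic initialized first
print(in_order[5].refs[1])        # None: .rela.dyn ends up without its symtab
print(dyn_first[5].refs[1].name)  # .dynsym
```

In this model, starting with .dynamic gives .rela.dyn a valid link to .dynsym, and the reference that hits the guard becomes the _DYNAMIC symbol's section instead. Whether that is harmless in the real elfhack is not something this sketch can show.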
Comment 6•5 years ago (Assignee)
Comment 8•5 years ago (bugherder)