Closed Bug 1495733 Opened 6 years ago Closed 3 years ago

elfhack crashes when library is larger than 4G

Categories

(Firefox Build System :: General, defect)

62 Branch
defect
Not set
normal

Tracking

(firefox97 fixed)

RESOLVED FIXED
97 Branch
Tracking Status
firefox97 --- fixed

People

(Reporter: stransky, Assigned: glandium)

References

Details

Attachments

(2 files)

Elfhack crashes on Fedora 29 when building optimized Firefox. Program received signal SIGABRT, Aborted. 0x00007ffff7ad453f in raise () from /lib64/libc.so.6 Missing separate debuginfos, use: dnf debuginfo-install libgcc-8.2.1-2.fc29.x86_64 libstdc++-8.2.1-2.fc29.x86_64 (gdb) bt #0 0x00007ffff7ad453f in raise () from /lib64/libc.so.6 #1 0x00007ffff7abe895 in abort () from /lib64/libc.so.6 #2 0x00007ffff7e91e9b in ?? () from /lib64/libstdc++.so.6 #3 0x00007ffff7e982fc in ?? () from /lib64/libstdc++.so.6 #4 0x00007ffff7e98357 in std::terminate() () from /lib64/libstdc++.so.6 #5 0x00007ffff7e985b8 in __cxa_throw () from /lib64/libstdc++.so.6 #6 0x00007ffff7ecb77c in std::__throw_ios_failure(char const*) () from /lib64/libstdc++.so.6 #7 0x00007ffff7f01442 in std::basic_ios<char, std::char_traits<char> >::clear(std::_Ios_Iostate) () from /lib64/libstdc++.so.6 #8 0x00007ffff7f05ae5 in std::istream::seekg(std::fpos<__mbstate_t>) () from /lib64/libstdc++.so.6 #9 0x0000000000405090 in ElfSection::ElfSection(serializable<Elf_Shdr_Traits>&, std::basic_ifstream<char, std::char_traits<char> >*, Elf*) () #10 0x0000000000404d96 in Elf::getSection(int) () #11 0x00000000004058ab in Elf::Elf(std::basic_ifstream<char, std::char_traits<char> >&) () #12 0x000000000040a3a2 in do_file(char const*, bool, bool, bool) () #13 0x0000000000402d55 in main () [komat@localhost ~]$ rpm -q gcc gcc-8.2.1-2.fc29.x86_64 [komat@localhost ~]$ rpm -q binutils binutils-2.31.1-13.fc29.x86_64
The affected library is available here: https://stransky.fedorapeople.org/libxul.so.tar.bz
Summary: elfhack crash at ElfSection::ElfSection() → elfhack crashes at ElfSection::ElfSection()
Component: Build Config → General
Product: Toolkit → Firefox Build System
Summary: elfhack crashes at ElfSection::ElfSection() → elfhack crashes when library is larger than 2G

Hello.

I was working on a fix for this, but when gdb blew up while attempting to step into Elf::Elf() I gave up. So to reiterate a few things I posted in bug #1554428, this problem is due to using a 32-bit int for both ELF32 and ELF64 offsets, although ELF64 offsets are 64-bit. build/unix/elfhack/elfxx.h:284:

  ElfSection *getSectionAt(unsigned int offset);

There are more places aside from this one, e.g., ElfLocation. Perhaps something else later treats this as signed, thus leading to the problem? Either way, I'm confident that building with -g3 can even get us over the 4GiB mark.

Additionally, the error handing and reporting is extremely poor in this code and should be improved. Aside from that, I recommend something like:

struct ELF64 {
    typedef uint64_t offset_type;
    typedef int64_t reloffset_type;
    typedef uint32_t size_type;
    ... etc.
};

struct ELF32 {
    typedef uint32_t offset_type;
    typedef int32_t reloffset_type;
    typedef uint32_t size_type;
    ... etc.
};

template<typename elfver>
class Elf {
   ... etc.
};

And then modify these types to something like:

template<typename elfver, typename T>
class ElfValue {
    ElfSection<elfver> *section;
    // Moving section into this class cleans up subclasses, at the expense of
    // making a few objects slightly larger, but also shrinking the vtable.
public:
    ElfValue::ElfValue(ElfSection<elfver> *section_ = NULL) : section(section_) {}
    virtual T getValue() { return 0; }
    virtual ElfSection<elfver> *getSection() { return section; }
};

...
template<typename elfver>
class ElfLocation: public ElfValue<elfver, elfver::offset_type> {
    typename elfver::reloffset_type offset;
public:
    enum position { ABSOLUTE, RELATIVE };
    ElfLocation(): ElfValue(NULL), offset(0) {};
    ElfLocation(ElfSection *section, typename elfver::reloffset_type off, enum position pos = RELATIVE);
    ElfLocation(typename elfver::offset_type location, Elf *elf);
    typename elfver::offset_type getValue();
};

template<typename elfver>
class ElfSize: public ElfValue<elfver, elfver::size_type> {
public:
    ElfSize(ElfSection *s)<elfver>: ElfValue(s) {};
    typename elfver::size_type getValue();
};

Etcetera. This will add code bloat as code for Elf<ELF32> and Elf<ELF64> will generate separate copies (unless elfhack is built for only one target), but icache efficiency shouldn't be a factor. This design will also better facilitate support for future versions of ELF files.

Correction, I forgot to remove the virtual from ElfValue::ElfSection().

This only solves the easy half of the problem outlined in the bug,
leaving the other half for later.

iostream::tellg() actually returns streampos, which is able to support
files larger than 4GiB with libstdc++, but converting to an int
obviously truncated that, as well as transformed values between 2GiB and
4GiB into invalid negative numbers.

iostream::seekg() also takes a streampos, so storing the streampos as-is
is enough to address the problem with tellg()/seekg() sequences.

The other half of the problem involves elfhack converting 64-bits ELF
headers to 32-bits headers internally, which requires deeper changes.

This change however, is enough to support files up to 4GiB, which is
already a good first step.

Keywords: leave-open
Assignee: nobody → mh+mozilla
Status: NEW → ASSIGNED
Pushed by mh@glandium.org: https://hg.mozilla.org/integration/autoland/rev/f165b048285c Fix elfhack for files > 2GiB and < 4GiB. r=gsvelto

The leave-open keyword is there and there is no activity for 6 months.
:glandium, maybe it's time to close this bug?

Flags: needinfo?(mh+mozilla)
Flags: needinfo?(mh+mozilla)
Keywords: leave-open
Summary: elfhack crashes when library is larger than 2G → elfhack crashes when library is larger than 4G

Elfhack currently can't deal with files larger than 4GiB because it
translates all ELF data structures to the 32-bits variant, even for
64-bits ELF files. So if the original file has e.g. sections that start
after the 4GiB boundary, they can't be represented in memory.

Practically speaking, this is not causing problems, but has prevented a
working elfhack for aarch64 because e.g. some relocation types don't fit
in the 32-bits ELF representation.

Blocks: 1747782
Blocks: 1747783
Pushed by mh@glandium.org: https://hg.mozilla.org/integration/autoland/rev/d9637574c7e4 Use a 64-bits internal representation in elfhack. r=gsvelto
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 97 Branch
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: