Closed Bug 1840931 Opened 1 year ago Closed 1 year ago

Elfhack doesn't work with binaries larger than 4GB

Categories

(Firefox Build System :: General, defect)

Tracking

(firefox-esr115 fixed, firefox116 fixed)

RESOLVED FIXED
116 Branch
Tracking Status
firefox-esr115 --- fixed
firefox116 --- fixed

People

(Reporter: stransky, Assigned: glandium)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

When a PGO build is produced with GCC 13.1.1, elfhack breaks the instrumented libxul.so: Firefox fails to start for the profile-training run and crashes with SIGILL.

Backtrace of the crash:

Thread 1 "firefox" received signal SIGILL, Illegal instruction.
std::sys::unix::args::imp::ARGV_INIT_ARRAY::init_wrapper (argc=1, argv=0x7fffffffd9d8, _envp=<optimized out>) at library/std/src/sys/unix/args.rs:125
125	library/std/src/sys/unix/args.rs: No such file or directory.
Missing separate debuginfos, use: dnf debuginfo-install alsa-lib-1.2.9-1.fc38.x86_64 bzip2-libs-1.0.8-13.fc38.x86_64 cairo-1.17.8-4.fc38.x86_64 cairo-gobject-1.17.8-4.fc38.x86_64 dbus-glib-0.112-5.fc38.x86_64 dbus-libs-1.14.8-1.fc38.x86_64 fontconfig-2.14.2-1.fc38.x86_64 freetype-2.13.0-2.fc38.x86_64 fribidi-1.0.12-3.fc38.x86_64 gdk-pixbuf2-2.42.10-2.fc38.x86_64 graphite2-1.3.14-11.fc38.x86_64 harfbuzz-7.1.0-1.fc38.x86_64 json-glib-1.6.6-4.fc38.x86_64 libXau-1.0.11-2.fc38.x86_64 libXcomposite-0.4.5-9.fc38.x86_64 libXcursor-1.2.1-3.fc38.x86_64 libXdamage-1.1.5-9.fc38.x86_64 libXext-1.3.5-2.fc38.x86_64 libXfixes-6.0.0-5.fc38.x86_64 libXi-1.8.1-1.fc38.x86_64 libXinerama-1.1.5-2.fc38.x86_64 libXrandr-1.5.2-10.fc38.x86_64 libXrender-0.9.11-2.fc38.x86_64 libXtst-1.2.4-2.fc38.x86_64 libblkid-2.38.1-4.fc38.x86_64 libcap-2.48-6.fc38.x86_64 libcloudproviders-0.3.1-7.fc38.x86_64 libdatrie-0.2.13-5.fc38.x86_64 libepoxy-1.5.10-3.fc38.x86_64 libffi-3.4.4-2.fc38.x86_64 libgcc-13.1.1-4.fc38.x86_64 libjpeg-turbo-2.1.4-2.fc38.x86_64 libmount-2.38.1-4.fc38.x86_64 libpng-1.6.37-14.fc38.x86_64 libstdc++-13.1.1-4.fc38.x86_64 libthai-0.1.29-4.fc38.x86_64 libtracker-sparql-3.5.3-1.fc38.x86_64 libxcb-1.13.1-11.fc38.x86_64 libxkbcommon-1.5.0-2.fc38.x86_64 libzstd-1.5.5-1.fc38.x86_64 lz4-libs-1.9.4-2.fc38.x86_64 nspr-4.35.0-7.fc38.x86_64 pango-1.50.14-1.fc38.x86_64 pcre2-10.42-1.fc38.1.x86_64 pixman-0.42.2-1.fc38.x86_64 sqlite-libs-3.40.1-2.fc38.x86_64 xz-libs-5.4.1-1.fc38.x86_64 zlib-1.2.13-3.fc38.x86_64
(gdb) bt
#0  std::sys::unix::args::imp::ARGV_INIT_ARRAY::init_wrapper (argc=1, argv=0x7fffffffd9d8, _envp=<optimized out>) at library/std/src/sys/unix/args.rs:125
#1  0x00007ffff7fcf17f in call_init (env=0x7ffff781f600, argv=0x7fffffffd9d8, argc=1, l=<optimized out>) at dl-init.c:70
#2  call_init (l=<optimized out>, argc=1, argv=0x7fffffffd9d8, env=0x7ffff781f600) at dl-init.c:26
#3  0x00007ffff7fcf27d in _dl_init (main_map=0x7ffff785b200, argc=1, argv=0x7fffffffd9d8, env=0x7ffff781f600) at dl-init.c:117
#4  0x00007ffff7fcb5c2 in __GI__dl_catch_exception (exception=exception@entry=0x0, operate=operate@entry=0x7ffff7fd5ea0 <call_dl_init>, args=args@entry=0x7fffffffb240) at dl-catch.c:211
#5  0x00007ffff7fd5e3c in dl_open_worker (a=a@entry=0x7fffffffb3f0) at dl-open.c:808
#6  0x00007ffff7fcb523 in __GI__dl_catch_exception (exception=exception@entry=0x7fffffffb3d0, operate=operate@entry=0x7ffff7fd5da0 <dl_open_worker>, args=args@entry=0x7fffffffb3f0)
    at dl-catch.c:237
#7  0x00007ffff7fd61b4 in _dl_open
    (file=0x7fffffffb700 "/home/komat/tmp12/crashes/libxul.so", mode=<optimized out>, caller_dlopen=0x5555555ae58a <XPCOMGlueLoad(char const*, mozilla::LibLoadingStrategy)+1642>, nsid=<optimized out>, argc=1, argv=0x7fffffffd9d8, env=0x7ffff781f600) at dl-open.c:884
#8  0x00007ffff7aaa6d4 in dlopen_doit (a=a@entry=0x7fffffffb6a0) at dlopen.c:56
#9  0x00007ffff7fcb523 in __GI__dl_catch_exception (exception=exception@entry=0x7fffffffb5e0, operate=0x7ffff7aaa670 <dlopen_doit>, args=0x7fffffffb6a0) at dl-catch.c:237
#10 0x00007ffff7fcb679 in _dl_catch_error (objname=0x7fffffffb648, errstring=0x7fffffffb650, mallocedp=0x7fffffffb647, operate=<optimized out>, args=<optimized out>) at dl-catch.c:256
#11 0x00007ffff7aaa1b3 in _dlerror_run (operate=operate@entry=0x7ffff7aaa670 <dlopen_doit>, args=args@entry=0x7fffffffb6a0) at dlerror.c:138
#12 0x00007ffff7aaa78f in dlopen_implementation (dl_caller=<optimized out>, mode=<optimized out>, file=<optimized out>) at dlopen.c:71
#13 ___dlopen (file=<optimized out>, mode=<optimized out>) at dlopen.c:81
#14 0x00005555555ae58a in GetLibHandle (aDependentLib=0x7fffffffb700 "/home/komat/tmp12/crashes/libxul.so") at /builddir/build/BUILD/firefox-114.0.2/xpcom/glue/standalone/nsXPCOMGlue.cpp:89
#15 ReadDependentCB (aLibLoadingStrategy=mozilla::LibLoadingStrategy::ReadAhead, aDependentLib=0x7fffffffb700 "/home/komat/tmp12/crashes/libxul.so")
    at /builddir/build/BUILD/firefox-114.0.2/xpcom/glue/standalone/nsXPCOMGlue.cpp:144
#16 XPCOMGlueLoad(char const*, mozilla::LibLoadingStrategy)
    (aXPCOMFile=aXPCOMFile@entry=0x7ffff781d160 "/home/komat/tmp12/crashes/libxul.so", aLibLoadingStrategy=aLibLoadingStrategy@entry=mozilla::LibLoadingStrategy::ReadAhead)
    at /builddir/build/BUILD/firefox-114.0.2/xpcom/glue/standalone/nsXPCOMGlue.cpp:323
#17 0x00005555555ae87d in mozilla::GetBootstrap(char const*, mozilla::LibLoadingStrategy)
    (aXPCOMFile=aXPCOMFile@entry=0x7ffff781d550 "/home/komat/tmp12/crashes/firefox", aLibLoadingStrategy=aLibLoadingStrategy@entry=mozilla::LibLoadingStrategy::ReadAhead)
    at /builddir/build/BUILD/firefox-114.0.2/xpcom/glue/standalone/nsXPCOMGlue.cpp:405
#18 0x00005555555a5e82 in InitXPCOMGlue(mozilla::LibLoadingStrategy) (aLibLoadingStrategy=aLibLoadingStrategy@entry=mozilla::LibLoadingStrategy::ReadAhead)
    at /builddir/build/BUILD/firefox-114.0.2/browser/app/nsBrowserApp.cpp:241
#19 0x00005555555a3239 in main(int, char**, char**) (argc=<optimized out>, argv=<optimized out>, envp=0x7fffffffd9e8)
    at /builddir/build/BUILD/firefox-114.0.2/browser/app/nsBrowserApp.cpp:434

Note that the build at instrumented/dist/bin is OK; only instrumented/dist/firefox is broken.

Please attach the non-broken (pre-elfhack) libxul.so (or, since it's going to be too large for Bugzilla, store it somewhere such as Google Drive or S3 and paste a link).

Flags: needinfo?(stransky)
Assignee: nobody → mh+mozilla

Boy, binutils's strip doesn't like this file... it's been running for 10 minutes and is not done yet.

This is most certainly related to the size of the binary being above 4GB (once stripped, elfhack doesn't break it).
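One quick way to check whether a given file is in this territory is to look at where its section header table sits, since that table is usually placed near the end of the file. The following is only a sketch: it assumes a little-endian ELF64 file, the helper names (`section_table_offset`, `exceeds_32_bits`, `make_header`) are made up for illustration, and the header it parses is synthetic rather than taken from the libxul.so in this bug.

```python
import struct

# ELF64 little-endian file header layout (64 bytes), per the ELF specification.
ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")
UINT32_MAX = 0xFFFFFFFF


def section_table_offset(header: bytes) -> int:
    """Return e_shoff (section header table offset) from an ELF64 LE header."""
    fields = ELF64_HEADER.unpack(header[:ELF64_HEADER.size])
    ident = fields[0]
    # e_ident: magic, then EI_CLASS == 2 (64-bit), EI_DATA == 1 (little-endian).
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("not a little-endian ELF64 header")
    return fields[6]  # e_shoff is the seventh field of the header


def exceeds_32_bits(offset: int) -> bool:
    """True when the offset cannot be stored in a 32-bit unsigned integer."""
    return offset > UINT32_MAX


def make_header(e_shoff: int) -> bytes:
    """Build a synthetic ELF64 header (illustration only, not a valid binary)."""
    ident = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9)
    return ELF64_HEADER.pack(ident, 3, 62, 1, 0, 64, e_shoff, 0, 64, 56, 0, 64, 0, 0)


header = make_header(5 * 1024**3)  # pretend the section table sits at 5 GiB
print(exceeds_32_bits(section_table_offset(header)))
```

To try it on a real file, read the first 64 bytes of the file in binary mode and pass them to `section_table_offset`.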

Yeah, that's what it is: there are plenty of functions related to offsets and sizes that use unsigned ints, in other words 32-bit values.
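To make the failure mode concrete, here is a small sketch of what happens to a 64-bit file offset when it passes through a 32-bit unsigned integer. This is not elfhack's code; the helper names (`as_unsigned_int`, `survives_32_bits`) are hypothetical, and the masking only mimics the modulo-2^32 wraparound of a C or C++ `unsigned int` on common platforms.

```python
# Illustration only: mimics storing a 64-bit value in a 32-bit unsigned int.
UINT32_MASK = 0xFFFFFFFF


def as_unsigned_int(value: int) -> int:
    """Keep only the low 32 bits, as a conversion to unsigned int would."""
    return value & UINT32_MASK


def survives_32_bits(value: int) -> bool:
    """True when the value is unchanged after the 32-bit round trip."""
    return as_unsigned_int(value) == value


four_gib = 4 * 1024**3
offset = four_gib + 0x1000  # an offset just past the 4 GiB mark

# The high bits are silently dropped, so the offset now points near the
# start of the file instead of past 4 GiB.
print(hex(offset), "->", hex(as_unsigned_int(offset)))
```

Any offset or size below 4 GiB round-trips unchanged, which would be consistent with the stripped binary not being affected.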

Summary: Elfhack break libxul.so for GCC 13.1.1/PGO → Elfhack doesn't work with binaries larger than 4GB

BTW, file doesn't like the file either:

libxul.so: ERROR: , dynamically linked, BuildID[sha1]=c837420d354edbd1d33f1d5872444760cc148be0 Note section size too big (163435456 > 67108864) (Invalid argument)

(The .gnu.build.attributes section is very large, not that it would make a difference.)

I'm pretty sure there are other theoretical problems in the code,
notably when a single section is larger than 4GB, but by the time
we reach that limit, bug 1839740 will have been fixed.

Can you confirm the attached patch fixes your problem?

Flags: needinfo?(stransky)

Sure, will test now.
Thanks.

Seems to be working.

Flags: needinfo?(stransky)
Pushed by mh@glandium.org: https://hg.mozilla.org/integration/autoland/rev/51b78cf4881b More properly handle files > 4GB in elfhack. r=gsvelto
Status: NEW → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → 116 Branch

Comment on attachment 9341684 [details]
Bug 1840931 - More properly handle files > 4GB in elfhack.

ESR Uplift Approval Request

  • If this is not a sec:{high,crit} bug, please state case for ESR consideration: Build failure in some configurations (notably --disable-install-strip, which is often used downstream)
  • User impact if declined: See above
  • Fix Landed on Version: 116
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): As far as Mozilla builds are concerned, it's a no-op. For builds that failed, the worst that can happen is that they keep failing or produce a broken libxul.so (but that's unlikely, that would already have been caught).
Attachment #9341684 - Flags: approval-mozilla-esr115?

Comment on attachment 9341684 [details]
Bug 1840931 - More properly handle files > 4GB in elfhack.

Approved for 115.1esr.

Attachment #9341684 - Flags: approval-mozilla-esr115? → approval-mozilla-esr115+