Closed Bug 629638 Opened 13 years ago Closed 13 years ago

elfhack doesn't support SHN_COMMON symbols that -flto adds in injected object file

Product/Component: Firefox Build System :: General
Type: defect
Hardware: All
OS: Linux
Priority: Not set
Severity: normal
Tracking: Not tracked
Status: RESOLVED FIXED
Target Milestone: mozilla2.0b12
Reporter: erus.iluvatar
Assignee: glandium
Attachments: 1 file, 3 obsolete files

User-Agent:       Mozilla/5.0 (X11; Linux x86_64; rv:2.0b11pre) Gecko/20110127 Firefox/4.0b11pre
Build Identifier: 

Building Firefox from the Mercurial sources fails.

Reproducible: Always

Steps to Reproduce:
1. Get the latest Firefox sources from the Mercurial repository
2. Run make -j1 -f client.mk
3. elfhack fails
Actual Results:  
/var/abs/local/yaourtbuild/firefox-hg/src/mozilla-central/ff-opt-obj/build/unix/elfhack/elfhack -b test.so
test.so: terminate called after throwing an instance of 'std::runtime_error'
  what():  Section index out of bounds

Expected Results:  
The build should complete without errors.

Archlinux x86_64

gcc 4.5.2

(what else do you need?)
ld 2.21 and coreutils 8.9, in case that helps.
Looking at bug 629639, here is my answer to the same question:
β readelf -r src/mozilla-central/ff-opt-obj/build/unix/elfhack/inject/*.o

File: src/mozilla-central/ff-opt-obj/build/unix/elfhack/inject/x86_64-noinit.o

Relocation section '.rela.text' at offset 0x13e8 contains 2 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000003  001200000002 R_X86_64_PC32     0000000000000000 relhack - 4
00000000000a  001300000002 R_X86_64_PC32     0000000000000000 elf_header - 4

Relocation section '.rela.eh_frame' at offset 0x1418 contains 1 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000020  000200000002 R_X86_64_PC32     0000000000000000 .text + 0

File: src/mozilla-central/ff-opt-obj/build/unix/elfhack/inject/x86_64.o

Relocation section '.rela.text' at offset 0x14c0 contains 3 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000003  001200000002 R_X86_64_PC32     0000000000000000 relhack - 4
00000000000a  001300000002 R_X86_64_PC32     0000000000000000 elf_header - 4
00000000003a  001400000002 R_X86_64_PC32     0000000000000000 original_init - 4

Relocation section '.rela.eh_frame' at offset 0x1508 contains 1 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000020  000200000002 R_X86_64_PC32     0000000000000000 .text + 0
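For readers unfamiliar with the readelf listing above, here is a minimal sketch of how the Info column breaks down on x86_64. It assumes the standard ELF64 layout (symbol table index in the upper 32 bits, relocation type in the lower 32 bits); the function name `decode_r_info` is made up for illustration and is not part of elfhack.

```python
from typing import Tuple

# Relocation type number for R_X86_64_PC32 in the x86-64 psABI.
R_X86_64_PC32 = 2


def decode_r_info(r_info: int) -> Tuple[int, int]:
    """Split an ELF64 r_info value into (symbol index, relocation type).

    Mirrors the ELF64_R_SYM and ELF64_R_TYPE macros from <elf.h>.
    """
    sym_index = r_info >> 32        # upper 32 bits: symbol table index
    rel_type = r_info & 0xFFFFFFFF  # lower 32 bits: relocation type
    return sym_index, rel_type


if __name__ == "__main__":
    # Info values copied from the x86_64.o listing in this report.
    entries = [
        ("relhack", 0x001200000002),
        ("elf_header", 0x001300000002),
        ("original_init", 0x001400000002),
    ]
    for name, info in entries:
        sym, rel = decode_r_info(info)
        print(f"{name}: symbol index {sym}, relocation type {rel}")
```

All three .rela.text entries decode to type 2 (R_X86_64_PC32), which matches the Type column readelf prints.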
See Also: → 629639
This is quite different. Could you attach your test.so?
See Also: 629639
Assignee: nobody → mh+mozilla
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
I'm not in front of the computer right now, but here it is, if you don't mind: http://ompldr.org/vNzZ5eQ/test.so
Version: unspecified → Trunk
(In reply to comment #4)
> I'm not in front of the computer right now, but here it is, if you don't mind:
> http://ompldr.org/vNzZ5eQ/test.so

elfhack doesn't fail for me on this file. Are you sure this is the one?
Wait, it fails with my elfhack, and I'm sure it's the right file.

Here is my elfhack binary, in case the problem comes from it: http://ompldr.org/vNzZ6aA/elfhack
(In reply to comment #6)
> Wait, it fails with my elfhack, and I'm sure it's the right file.
> 
> Here is my elfhack binary, in case the problem comes from it:
> http://ompldr.org/vNzZ6aA/elfhack

Works, too.
Weird. Where could this come from, and what should I do?
(In reply to comment #8)
> Weird. Where could this come from, and what should I do?

Double-check that these are really the files you get from build/unix/elfhack when it fails. Note that you need to generate test.so by hand, using the gcc command line that make prints, because make removes the file when elfhack fails.
Well, I added a cp $@ $@.wat before the elfhack -b line in the Makefile,
so I'm really sure it's the same file as the one elfhack -b fails on.

And ./elfhack -b *.wat gives the same "Section index out of bounds" error.
Does 'make check' on a copy of http://hg.mozilla.org/users/mh_glandium.org/elfhack/ have the same result?
By the way, stupid question: which mozilla-central revision are you building?
Building rev. 61563 fails, but 'make check' on your repo runs fine U_u
(In reply to comment #13)
> Building rev. 61563 fails, but 'make check' on your repo runs fine U_u

That means one of the options used when building in the mozilla tree triggers this. It would be helpful if you could isolate which one it is.
OK, I'll try removing my .mozconfig and re-enabling its options one by one after dinner.
OS: Linux → Windows CE
Hmm, it looks like the problem lies somewhere else:

β cat .mozconfig 
. $topsrcdir/browser/config/mozconfig

leads to:

/var/abs/local/yaourtbuild/firefox-hg/src/mozilla-central/obj-x86_64-unknown-linux-gnu/build/unix/elfhack/elfhack -b test.so
test.so: terminate called after throwing an instance of 'std::runtime_error'
  what():  Section index out of bounds
make[6]: *** [test.so] Abandon

Anything else to do?
OS: Windows CE → Linux
(In reply to comment #16)
> Hmm, it looks like the problem lies somewhere else:
> 
> β cat .mozconfig 
> . $topsrcdir/browser/config/mozconfig
> 
> leads to:
> 
> /var/abs/local/yaourtbuild/firefox-hg/src/mozilla-central/obj-x86_64-unknown-linux-gnu/build/unix/elfhack/elfhack
> -b test.so
> test.so: terminate called after throwing an instance of 'std::runtime_error'
>   what():  Section index out of bounds
> make[6]: *** [test.so] Abandon
> 
> Anything else to do?

Try changing the flags passed on the gcc command lines when building under build/unix/elfhack, and see if that makes a difference.
Still the same with CXXFLAGS='':


c++ -fno-rtti -Wall -Wpointer-arith -Woverloaded-virtual -Wsynth -Wno-ctor-dtor-privacy -Wno-non-virtual-dtor -Wcast-align -Wno-invalid-offsetof -Wno-variadic-macros -Werror=return-type -Wno-long-long -fno-strict-aliasing -fshort-wchar -pthread -pipe -fexceptions  -DNDEBUG -DTRIMMED -Os -freorder-blocks -fomit-frame-pointer -finline-limit=50 -fPIC -shared -Wl,-z,defs -Wl,-h,test.so -o test.so test.o
===
=== If you get failures below, please file a bug describing the error
=== and your environment (compiler and linker versions), and use
=== --disable-elf-hack until this is fixed.
===
/var/abs/local/yaourtbuild/firefox-hg/src/mozilla-central/obj-x86_64-unknown-linux-gnu/build/unix/elfhack/elfhack -b test.so
test.so: terminate called after throwing an instance of 'std::runtime_error'
  what():  Section index out of bounds
I just installed a fresh Arch Linux system, and it builds fine there from the mozilla-central tree :-/
Using the [testing] repo?
(In reply to comment #20)
> Using the [testing] repo?

Core and extra.
Even after upgrading to testing, it still works.
Okay, it must be something on my side, but still, wtf?
(In reply to comment #23)
> Okay, it must be something on my side, but still, wtf?

Your best bet would be to actually debug the issue. If you're around on IRC (irc.mozilla.org), I can lend a hand (nick glandium). You'd first need to check which index is being requested from Elf::getSection in elf.cpp, and from where (a backtrace from gdb would be useful).
After some debugging on IRC, it turns out it was due to -flto.
This patch is wrong because it won't work properly with arm builds that use -mthumb, but it should already fix your problem locally.
Summary: elfhack fails to build on Archlinux → elfhack doesn't support SHN_COMMON symbols that -flto adds in injected object file
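To illustrate why a SHN_COMMON symbol can end in "Section index out of bounds", here is a minimal sketch. The constants are the special section indexes from the ELF specification; `get_section` and `symbol_section` are hypothetical stand-ins written for this example and are not elfhack's actual code. The point is that SHN_COMMON (0xfff2) is a marker value in a symbol's st_shndx field, not a position in the section header table, so a plain bounds-checked lookup rejects it.

```python
# Special section indexes defined by the ELF specification.
SHN_UNDEF = 0x0000
SHN_LORESERVE = 0xFF00
SHN_ABS = 0xFFF1
SHN_COMMON = 0xFFF2
SHN_HIRESERVE = 0xFFFF


def get_section(sections, index):
    """Hypothetical bounds-checked lookup, loosely modeled on the error
    message reported in this bug."""
    if index >= len(sections):
        raise RuntimeError("Section index out of bounds")
    return sections[index]


def is_reserved_index(shndx):
    """True when shndx is a reserved marker rather than a real section."""
    return SHN_LORESERVE <= shndx <= SHN_HIRESERVE


def symbol_section(sections, st_shndx):
    """Return the section a symbol lives in, or None when st_shndx is
    SHN_UNDEF or one of the reserved values (SHN_ABS, SHN_COMMON, ...)."""
    if st_shndx == SHN_UNDEF or is_reserved_index(st_shndx):
        return None
    return get_section(sections, st_shndx)


if __name__ == "__main__":
    sections = ["<null>", ".text", ".data"]
    print(symbol_section(sections, 1))
    print(symbol_section(sections, SHN_COMMON))
    try:
        get_section(sections, SHN_COMMON)
    except RuntimeError as error:
        print(f"naive lookup failed: {error}")
```

A tool that wanted to support such symbols would need to special-case the reserved range before indexing, as `symbol_section` does here.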
This should work well enough.
Attachment #508094 - Attachment is obsolete: true
Attachment #508341 - Flags: review?(khuey)
Added missing -I flags.
Attachment #508341 - Attachment is obsolete: true
Attachment #508344 - Flags: review?(khuey)
Attachment #508341 - Flags: review?(khuey)
Comment on attachment 508344 [details] [diff] [review]
Build elfhack injected code with a limited set of CFLAGS

The elfhack injected code is written in C for portability, but it is meant to be as simple as possible. A number of build flags end up adding various things to objects (extra calls to accounting functions, etc.). That is not meant to happen to the injected code, and most of the time it is not supported by the pseudo linker in elfhack.
This patch reduces the chance of flags such as -pg or -flto affecting the injected code. Only target flags (-m*) and include flags (-I*) are kept. Both are necessary for Android builds; the former is also necessary for Maemo builds.
Attachment #508344 - Flags: approval2.0?
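The filtering rule described in the comment above can be sketched as follows. This is only an illustration of the idea (keep -m* and -I* flags, drop everything else), not the Makefile change from the attachment; the function name `filter_injected_cflags` and the sample flag string are invented for the example.

```python
import shlex

# Prefixes of the flags kept for the injected code: target selection (-m*)
# and include paths (-I*). Everything else is dropped.
KEPT_PREFIXES = ("-m", "-I")


def filter_injected_cflags(cflags: str) -> str:
    """Return only the -m* and -I* flags from a CFLAGS string.

    Assumption: include paths are written in the attached form (-Idir).
    A detached "-I dir" pair would lose its directory in this sketch.
    """
    kept = [flag for flag in shlex.split(cflags) if flag.startswith(KEPT_PREFIXES)]
    return " ".join(kept)


if __name__ == "__main__":
    # Hypothetical flag set mixing wanted and unwanted options.
    sample = "-Os -flto -pg -mthumb -march=armv7-a -I../include -fPIC"
    print(filter_injected_cflags(sample))
```

Under this rule, flags such as -flto and -pg never reach the injected object, while -mthumb and -march=* still do.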
Forgot to mention I pushed to try with success: b78c1ae46b43
Comment on attachment 508344 [details] [diff] [review]
Build elfhack injected code with a limited set of CFLAGS

Actually, it's wrong: the injected code was not compiled as Thumb on Android and Maemo.
Attachment #508344 - Flags: approval2.0?
Basically the same patch, except that the definition is moved after the inclusion of rules.mk (and the other definitions are moved along with it to keep them grouped).
Attachment #508344 - Attachment is obsolete: true
Attachment #508691 - Flags: review?(khuey)
Pushed to try as c84135e0aa22, this time with real success: the injected code is properly built with the -march flags on ARM.
Comment on attachment 508691 [details] [diff] [review]
Build elfhack injected code with a limited set of CFLAGS

See comment 28.
Attachment #508691 - Flags: approval2.0?
Attachment #508691 - Flags: approval2.0? → approval2.0+
Pushed:
http://hg.mozilla.org/mozilla-central/rev/2772a0cf36d1
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 4.0b12
Product: Firefox → Core
QA Contact: build.config → build-config
Target Milestone: Firefox 4.0b12 → mozilla2.0b12
Hardware: x86_64 → All
Blocks: 629639
Product: Core → Firefox Build System