Problem with "cannot create debug link section"



Release Engineering
3 years ago


(Reporter: Tomcat, Unassigned)






3 years ago
see bug about a problem with a slave with builds.

01:21 < pmoore|projectduty> Tomcat|sheriffduty: in those logs with the corrupted zips, i also see lines like:
01:22 < pmoore|projectduty>  /usr/bin/objcopy:dist/bin/stSlDtG3: cannot create debug link section `dist/bin/test_unlock_notify.dbg': Invalid operation
01:23 < pmoore|projectduty> i wonder if we need to update our version of objcopy - i see a lot of e.g. debian bugs with this problem, and the solution was to upgrade binutils package
01:23 < Tomcat|sheriffduty> pmoore|projectduty: yeah i will file a bug for this slave
01:24 < pmoore|projectduty> i wonder if we are just using a buggy version of code, and the problem is occasionally exhibited on a slave, but maybe is not a slave problem but a tools problem that happens sporadically
01:24 < pmoore|projectduty> or maybe once it happens, it leaves the machine in a bad state, so it looks like a slave problem
01:24 < pmoore|projectduty> just a thought, could be way off base
01:24 < Tomcat|sheriffduty> yeah
01:25 < pmoore|projectduty> might be worth raising a bug for the slave and also a separate bug for the root cause? or maybe we already have one…


3 years ago
Blocks: 1025863
Pmoore's guess looks probable:

looking at an ec2 instance (non spot but should be the same), I am getting:

[ ~]$ /usr/bin/objcopy --version
GNU objcopy version 20091009

and 2.20 was reported to have a bug that may be related to this situation. Here is that bug:

Based on the comments in the sourceware bug, an updated version resolved that issue. I'm guessing we need to play with puppet to do this. /me dives in deeper.
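
For comparing the objcopy build across instances, a minimal Python sketch could look like the following. It assumes the first line of `objcopy --version` carries a dotted binutils version, a date stamp, or both; the function names are illustrative and not part of any existing build tooling.

```python
import re
import subprocess


def parse_objcopy_version(first_line):
    """Return (dotted_version, date_stamp) found in a version banner line.

    Either element is None when it does not appear in the line.
    """
    dotted = re.search(r"\b(\d+\.\d+(?:\.\d+)*)", first_line)
    stamp = re.search(r"\b(\d{8})\b", first_line)
    return (
        dotted.group(1) if dotted else None,
        stamp.group(1) if stamp else None,
    )


def installed_objcopy_version(path="/usr/bin/objcopy"):
    """Run `objcopy --version` and parse the first line of its output."""
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=True
    )
    lines = result.stdout.splitlines()
    return parse_objcopy_version(lines[0] if lines else "")
```

Running `installed_objcopy_version()` on a known-good slave and on a failing one would show whether the two actually differ in binutils build, which is the question behind the "tools problem vs. slave problem" guess.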
so i am not sure if these add up. the issue reported
here -
and here -

refers to using --add-gnu-debuglink. That option doesn't seem to be in the output of 'make buildsymbols'

we do use the 'gold' binutils and that seems to be a culprit for such output like:
/usr/bin/objcopy:dist/bin/stBmHhWA: cannot create debug link section `dist/bin/': Invalid operation
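
To see how often this error shows up and on which files, the build logs can be scanned for it. The sketch below is only an illustration: the regular expression is derived from the two error lines quoted in this bug, and the function name is made up for the example.

```python
import re

# Pattern built from the objcopy error lines quoted in this bug.
DEBUGLINK_ERROR = re.compile(
    r"objcopy:\s*(?P<temp>\S+): cannot create debug link section "
    r"`(?P<target>[^']*)': (?P<reason>.+)$"
)


def find_debuglink_errors(log_lines):
    """Return (temp_file, debuglink_target, reason) for each matching log line."""
    hits = []
    for line in log_lines:
        match = DEBUGLINK_ERROR.search(line)
        if match:
            hits.append(
                (
                    match.group("temp"),
                    match.group("target"),
                    match.group("reason").strip(),
                )
            )
    return hits
```

An empty target such as `dist/bin/` in the second field stands out immediately in the result, which may help separate the two variants of the message seen so far.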

It looks like we install binutils against:
If objcopy is not a red herring, we can just set OBJCOPY in build/unix/mozconfig.linux. That said, why are we using gold? we shouldn't be.
(In reply to Mike Hommey [:glandium] from comment #3)
> If objcopy is not a red herring, we can just set OBJCOPY in
> build/unix/mozconfig.linux

To use the one that comes alongside gcc, which is from a recent binutils.
(In reply to Mike Hommey [:glandium] from comment #3)
> If objcopy is not a red herring, we can just set OBJCOPY in
> build/unix/mozconfig.linux. That said, why are we using gold? we shouldn't
> be.

/me finds `Bug 633269 - Use gold for linking on linux` looks like you were against this too back then :)

not sure if this is how things are today, I may be reading things incorrectly from puppet repo.
So far bld-linux64-spot-136 is the only slave I could find that was having this issue. There is the possibility that the zip was actually bad, or the instance created was an anomaly.

maybe related, it seems that was also having issues unpacking. not with zip but with tar. It had four failures in a row:

ERROR: Command failed. See logs for output.
 # ['tar', '--use-compress-program', 'pigz', '-xf', '/builds/mock_mozilla/cache/mozilla-centos6-x86_64/root_cache/cache.tar.gz', '-C', '/builds/mock_mozilla/mozilla-centos6-x86_64/root/']
program finished with exit code 2
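
One way to tell a damaged cache archive apart from a broken slave is to try reading the archive independently of the build. This is a rough sketch, assuming a gzip-compressed tar like the `cache.tar.gz` above; the function name is hypothetical, and it only walks the archive rather than extracting anything.

```python
import tarfile
import zlib


def archive_is_readable(path):
    """Return True if a gzip-compressed tar archive can be walked end to end."""
    try:
        with tarfile.open(path, "r:gz") as archive:
            # Iterating forces every member header to be read, which means
            # decompressing the stream up to the last member.
            for _member in archive:
                pass
    except (tarfile.TarError, EOFError, OSError, zlib.error):
        return False
    return True
```

If this returns True on the affected slave while tar still exits with code 2, the archive itself is probably not the problem.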

It appears like bld-linux64-spot-494 got 'un-stuck' and is running green again.
So there are a few ideas here:

1) The slave is to blame
    a) this was due to bad instances (bad AMI) -> Bug 1025842 certainly suggests that to be the case for bld-linux64-spot-494
    b) maybe ran out of disk space, this would explain how bld-linux64-spot-136 got stuck in a rut on the same builder

2) the zip was corrupt: unlikely as other slaves built fine for the rev at incident and the surrounding revs.

3) there is a bug with binutils: I want to say that this is a red herring as other slaves seem to be able to build just fine.

I am tempted to suggest this is number (1). Let's see what the result of Bug 1025842 is first as it might be related.

I am going to re-enable bld-linux64-spot-136. It looks like it has been terminated since being disabled so its state is lost and it should have a fresh start. Won't help with debugging what actually happened though.
 If this is a 'disk space' issue, we will have to act quickly on the spot(s) in question.
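
Since a spot instance can disappear before anyone looks at it, hypotheses (1b) and (2) could be checked quickly with something like the sketch below. It is an illustration only: the function names are invented for the example, and the paths to check would have to be supplied by whoever runs it.

```python
import shutil
import zipfile


def free_gib(path):
    """Return free space, in GiB, on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free / 2**30


def first_bad_zip_member(path):
    """Return the name of the first member that fails its CRC check, or None.

    Raises zipfile.BadZipFile if the file is not a zip archive at all.
    """
    with zipfile.ZipFile(path) as archive:
        return archive.testzip()
```

A near-zero result from `free_gib` on the build directory would point at (1b); a non-None result or a `BadZipFile` exception from `first_bad_zip_member` would point at (2).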
Depends on: 1025842
This *looks* solved, or if it's not, it is not a buildduty issue anymore.

Please either file new bugs for followup issues, or re-open and move to a different component if my assessment is wrong.
Last Resolved: 3 years ago
Resolution: --- → FIXED