Open Bug 1380557 Opened 8 years ago Updated 3 years ago

Support symbol table for debug version of artifact builds

Categories

(Firefox Build System :: General, enhancement)

Tracking

(firefox57 wontfix)

People

(Reporter: swu, Unassigned)

References

(Blocks 1 open bug)

Details

The current debug version of the artifact build doesn't contain a symbol table. It would be great to be able to download the symbol table for the debug version, so we can use GDB to debug with source code.
(In reply to Shian-Yow Wu [:swu] from comment #0)
> The current debug version of the artifact build doesn't contain a symbol table.
> It would be great to be able to download the symbol table for the debug
> version, so we can use GDB to debug with source code.
I believe that even the debug binaries are stripped in automation, because the unstripped versions are perhaps 1GB. Mozilla runs a symbol server, but I've never figured out how to make it work. (And it might be Visual Studio-only.)
The symbol server only has symbols for nightlies anyways, but taskcluster artifacts *do* have the full debug symbols in a separate zip, which we could optionally use.
We already have support for downloading the crashreporter-symbols files in artifact builds: http://searchfox.org/mozilla-central/source/python/mozbuild/mozbuild/artifacts.py#856. I think it was intended for use in automation. I don't know where it needs to get placed for the `minidump_stackwalker` integration to work. swu: can you see if this does what you need "out of the box", or suggest improvements that you need?
Flags: needinfo?(swu)
(In reply to Nick Alexander :nalexander from comment #3)
> We already have support for downloading the crashreporter-symbols files in
> artifact builds:
> http://searchfox.org/mozilla-central/source/python/mozbuild/mozbuild/artifacts.py#856.
> I think it was intended for use in automation.
>
> I don't know where it needs to get placed for the `minidump_stackwalker`
> integration to work.
>
> swu: can you see if this does what you need "out of the box", or suggest
> improvements that you need?
My wish is to debug the C++ code with "./mach run --debugger=gdb" or "./mach run --debugger=rr" on a slow laptop running Linux. So crashreporter-symbols may not be adequate for this purpose, and we might need the unstripped debug binaries?
Flags: needinfo?(swu)
Product: Core → Firefox Build System

Hi Nicholas,
I've heard that you recently worked on symbols and tests.
Do you have any update about this bug?

While running mochitests with debug artifact builds, we still get things like this:

21:45.64 GECKO(17917) Assertion failure: mRawPtr != nullptr (You can't dereference a NULL nsCOMPtr with operator->().), at /builds/worker/workspace/obj-build/dist/include/nsCOMPtr.h:859
21:45.64 GECKO(17917) #01: ??? (/mnt/desktop/gecko-dev/obj-firefox-artifact-debug/dist/bin/libxul.so + 0x2338a37)
21:45.64 GECKO(17917) #02: ??? (/mnt/desktop/gecko-dev/obj-firefox-artifact-debug/dist/bin/libxul.so + 0x33c41ff)

It looks like there is at least a path issue, where /builds/worker/workspace/obj-build/ is the path for the build machine, while the local artifacts are in /mnt/desktop/gecko-dev/obj-firefox-artifact-debug/.

Flags: needinfo?(n.nethercote)

Alexandre: are you asking about a particular platform, or in general? I ask because stack fixing is different on each platform.

  • For local builds on Linux, debug info is in the binary (the .so files and the executable).
  • For local builds on Windows, debug info is in separate PDB files.
  • For local builds on Mac, debug info is read from the .o files that are produced during the build.
  • On automation, debug info is read from Breakpad symbol files.

I don't know much about artifact builds so I don't know if any of that debug info gets incorporated. Linux is the easiest platform in general because the debug info is not in separate files. Breakpad symbols are cross-platform, though, so getting them working might be the easiest way to get things working on all platforms. The "full debug symbols in a separate zip" that Mike mentioned might be Breakpad symbols?

Flags: needinfo?(n.nethercote)

(In reply to Nicholas Nethercote [:njn] from comment #6)

> Alexandre: are you asking about a particular platform, or in general?

I was asking about Linux in particular, as that's where I am missing the stacks.

> I don't know much about artifact builds so I don't know if any of that debug info gets incorporated.

Could it be that we are missing some particular build flag when building the debug artifacts on automation?
Or is there something wrong in my local environment?
I was referred to bug 1624980; it looks like artifact builds are somewhat supported.
Could it be that fix-stacks isn't involved when running mochitests? That is where I was missing correct stack traces.

fix-stacks is definitely used when running mochitests. E.g. see here and here.

I think a build configured with ac_add_options --enable-debug should have debuginfo in it, but I'm not even sure if that's necessary. If you can find the binary files for your artifact build, run the file utility on libxul.so. When I do that I get this:

./o64/toolkit/library/build/libxul.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, BuildID[sha1]=f460d49b8bba92c4323fd18f4bbc6014fcb3c755, with debug_info, not stripped

The with debug_info indicates that debug info is present. Do you see that?
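
The same check can be scripted. The following is only a minimal sketch, not part of mach or mozbuild: the helper names has_debug_info and file_reports_debug_info are made up for illustration, and it assumes the file utility prints a comma-separated description like the one above.

```python
import subprocess


def has_debug_info(file_output: str) -> bool:
    """Return True if the text printed by `file` lists "with debug_info"."""
    # `file` describes an ELF object as a comma-separated list of properties.
    fields = [field.strip() for field in file_output.split(",")]
    return "with debug_info" in fields


def file_reports_debug_info(path: str) -> bool:
    """Run the `file` utility on `path` and inspect its description."""
    # Hypothetical convenience wrapper; requires `file` to be on PATH.
    result = subprocess.run(
        ["file", path], capture_output=True, text=True, check=True
    )
    return has_debug_info(result.stdout)
```

For example, file_reports_debug_info("obj-firefox-artifact-debug/dist/bin/libxul.so") would report whether that particular libxul.so carries debug info; the path is just the one used elsewhere in this bug.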

Flags: needinfo?(poirot.alex)

I just did an artifact build (well, download) on Linux and I see that libxul.so does not have debug info.

Better: can you try mach artifact install --symbols and then re-run your failing test and see if it works?

(In reply to Nicholas Nethercote [:njn] from comment #10)

> Better: can you try mach artifact install --symbols and then re-run your failing test and see if it works?

When running that, I can see
2:05.84 Not updating /mnt/desktop/gecko-dev/obj-firefox-artifact-debug/dist/bin/libxul.so

And the output of file on libxul.so is the same before and after:

$ file obj-firefox-artifact-debug/dist/bin/libxul.so
obj-firefox-artifact-debug/dist/bin/libxul.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, BuildID[sha1]=65274851da48de6f9cdcc7d448753eb873ec2aa4, not stripped
$ mach artifact install --symbols
$ file obj-firefox-artifact-debug/dist/bin/libxul.so
obj-firefox-artifact-debug/dist/bin/libxul.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, BuildID[sha1]=65274851da48de6f9cdcc7d448753eb873ec2aa4, not stripped

But the stack traces do change. Before running mach artifact install --symbols:

Initializing stack-fixing for the first stack frame, this may take a while...
 1:03.30 GECKO(50891) #01: ??? (/mnt/desktop/gecko-dev/obj-firefox-artifact-debug/dist/bin/libxul.so + 0x50a6909)
 1:03.30 GECKO(50891) #02: ??? (/mnt/desktop/gecko-dev/obj-firefox-artifact-debug/dist/bin/libxul.so + 0x1735a43)
 1:03.30 GECKO(50891) #03: ??? (/mnt/desktop/gecko-dev/obj-firefox-artifact-debug/dist/bin/libxul.so + 0x50a6f5c)
 1:03.30 GECKO(50891) #04: ??? (/mnt/desktop/gecko-dev/obj-firefox-artifact-debug/dist/bin/firefox-bin + 0x10c0e)
 1:03.30 GECKO(50891) #05: ??? (/mnt/desktop/gecko-dev/obj-firefox-artifact-debug/dist/bin/firefox-bin + 0x10fb5)
 1:03.31 GECKO(50891) #06: __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6 + 0x21b97)
 1:03.31 GECKO(50891) #07: ??? (/mnt/desktop/gecko-dev/obj-firefox-artifact-debug/dist/bin/firefox-bin + 0x10ade)
 1:03.31 GECKO(50891) #08: ??? (???:???)

and after:

Initializing stack-fixing for the first stack frame, this may take a while...
 1:23.60 GECKO(55500) #01: XRE_TermEmbedding() [toolkit/xre/nsEmbedFunctions.cpp:223]
 1:23.61 GECKO(55500) #02: mozilla::ipc::ScopedXREEmbed::Stop() [ipc/glue/ScopedXREEmbed.cpp:91]
 1:23.61 GECKO(55500) #03: XRE_InitChildProcess(int, char**, XREChildData const*) [toolkit/xre/nsEmbedFunctions.cpp:745]
 1:23.77 GECKO(55500) #04: content_process_main(mozilla::Bootstrap*, int, char**) [ipc/contentproc/plugin-container.cpp:57]
 1:23.77 GECKO(55500) #05: main [browser/app/nsBrowserApp.cpp:303]
fix-stacks error: failed to read breakpad symbols dir `/mnt/desktop/gecko-dev/obj-firefox-artifact-debug/dist/crashreporter-symbols/libc.so.6` for `/lib/x86_64-linux-gnu/libc.so.6`
fix-stacks note:  this is expected and harmless for system libraries on debug automation runs
 1:23.78 GECKO(55500) #06: __libc_start_main [/lib/x86_64-linux-gnu/libc.so.6 + 0x21b97]
 1:23.78 GECKO(55500) #07: ??? [/mnt/desktop/gecko-dev/obj-firefox-artifact-debug/dist/bin/firefox-bin + 0x10ade]
 1:23.78 GECKO(55500) #08: ??? (???:???)

So, while mach artifact install --symbols has not updated the .so, it downloaded the crashreporter-symbols folder, which seems to be used by your stack-fixing script.
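
To compare two runs like the ones above without reading every frame, something along these lines could help. It is a rough sketch only, not part of fix-stacks or mach: the regular expression and the name unresolved_frames are assumptions based solely on the frame format visible in these logs.

```python
import re

# Matches frames whose function name is still "???", for example
# "#01: ??? (/path/libxul.so + 0x50a6909)" or "#07: ??? [/path/firefox-bin + 0x10ade]".
# Assumption: frames look like "#NN: <function> (" or "#NN: <function> [".
UNRESOLVED_FRAME = re.compile(r"#\d+: \?\?\? [\[(]")


def unresolved_frames(log_lines):
    """Return the log lines whose stack frame was not symbolicated."""
    return [line for line in log_lines if UNRESOLVED_FRAME.search(line)]
```

Applied to the two traces above, the first run leaves every Firefox frame unresolved, while the second leaves only the last two frames unresolved.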

Should we communicate about mach artifact install --symbols? That's the first time I've heard of this command.
Maybe here: https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/Build_Instructions/Artifact_builds ?

Flags: needinfo?(poirot.alex)

Yay! I'm glad to hear it worked. Updating the docs sounds good -- are you happy to do it?

If you change the doc, please document ac_add_options --enable-artifact-build-symbols rather than mach artifact install --symbols. The latter is an implementation detail, the former is the supported interface.

(In reply to Mike Hommey [:glandium] from comment #13)

> If you change the doc, please document ac_add_options --enable-artifact-build-symbols rather than mach artifact install --symbols. The latter is an implementation detail, the former is the supported interface.

Please also note --enable-artifact-build-symbols=full, which tries to do the helpful thing for local artifact builds: download "full" crashreporter symbols rather than only breakpad symbols. See Bug 1525968, which made this work for Android targets; and Bug 1637388, which will make this work for non-Android targets. (The latter is waiting for me to update.)
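
To see which of these modes a given mozconfig asks for, a small check like the following could be used. This is a sketch under stated assumptions, not how mach or configure actually parses mozconfig files: the function name artifact_symbols_mode is invented for illustration, and only plain "ac_add_options <option>" lines are handled.

```python
def artifact_symbols_mode(mozconfig_text):
    """Return None, "default", or the explicit value such as "full"."""
    flag = "--enable-artifact-build-symbols"
    mode = None
    for raw_line in mozconfig_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        parts = line.split()
        if len(parts) != 2 or parts[0] != "ac_add_options":
            continue
        option = parts[1]
        if option == flag:
            mode = "default"
        elif option.startswith(flag + "="):
            # Keep whatever follows "=", for example "full".
            mode = option[len(flag) + 1:]
    return mode
```

If the option appears more than once, the last occurrence wins in this sketch; whether the real build system behaves the same way is not something this bug establishes.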

I've started with the following mention on MDN:

# Download debug info so that stack traces refer to source files and line numbers rather than library names and hex addresses
ac_add_options --enable-artifact-build-symbols

It sounds like the =full option should be documented once it works for desktop builds?

If the phrasing sounds good to you I'll also update the in-tree file.

Otherwise, this bug should be closed once we get this documented, right?

That's awful that we have almost the same documentation twice: once on MDN, and once for the in-tree docs. Can the former be replaced with a link to the latter?

Severity: normal → S3