Investigate why we're seeing Unified_foo_bar.cpp files in some Tinderbox crash stacks

RESOLVED WONTFIX


People

(Reporter: Ehsan, Unassigned)

(Reporter)

Description

5 years ago
See <https://tbpl.mozilla.org/php/getParsedLog.php?id=30894121&tree=Mozilla-Central#error0> from bug 931642 for example:

06:46:12     INFO -  Crash dump filename: /var/folders/dp/zsbthwfj6zq83v8r9hwpv6f000000w/T/tmpBs1N9x/minidumps/B3B46475-31A5-4CCD-9C41-67C345DA9DAD.dmp
06:46:12     INFO -  Operating system: Mac OS X
06:46:12     INFO -                    10.7.2 11C74
06:46:12     INFO -  CPU: amd64
06:46:12     INFO -       family 6 model 23 stepping 10
06:46:12     INFO -       2 CPUs
06:46:12     INFO -  Crash reason:  EXC_BAD_ACCESS / KERN_INVALID_ADDRESS
06:46:12     INFO -  Crash address: 0x0
06:46:12     INFO -  Thread 0 (crashed)
06:46:12     INFO -   0  XUL!mozilla::storage::Service::Observe(nsISupports*, char const*, char16_t const*) [nsTArray.h : 871 + 0x0]
06:46:12     INFO -      rbx = 0x00007fff74b3f630   r12 = 0x0000000103f088c0
06:46:12     INFO -      r13 = 0x0000000000000006   r14 = 0x0000000000000003
06:46:12     INFO -      r15 = 0x000000010d1099a0   rip = 0x00000001016e1493
06:46:12     INFO -      rsp = 0x00007fff5fbfecd0   rbp = 0x00007fff5fbfed40
06:46:12     INFO -      Found by: given as instruction pointer in context
06:46:12     INFO -   1  XUL!_ZThn8_N7mozilla7storage7Service7ObserveEP11nsISupportsPKcPKDs [Unified_cpp_storage_src1.cpp : 923 + 0x8]
06:46:12     INFO -      rbx = 0x0000000000000005   r12 = 0x000000010692b4f8
06:46:12     INFO -      r13 = 0x0000000000000000   r14 = 0x0000000000000000
06:46:12     INFO -      r15 = 0x0000000103dbc713   rip = 0x00000001016e14bd
06:46:12     INFO -      rsp = 0x00007fff5fbfed50   rbp = 0x00007fff5fbfed50
06:46:12     INFO -      Found by: call frame info

Ted, any idea what might be going on here?
Flags: needinfo?(ted)
A possibility is that CCACHE_CPP2 doesn't work as expected.
(as in, we might not end up using it as we think we are)
(Reporter)

Comment 3

5 years ago
(In reply to comment #2)
> (as in, we might not end up using it as we think we are)

I just remembered that a while ago I hit a problem where I had to set the CCACHE_CPP2 environment variable explicitly on my Mac to fix my builds, which probably means that whatever the build system is doing here is not working... :(
(Reporter)

Comment 4

5 years ago
Should we try to set that variable in our mozconfigs?
(In reply to :Ehsan Akhgari (needinfo? me!) from comment #4)
> Should we try to set that variable in our mozconfigs?

I think we should figure out why what is in config/config.mk doesn't work.
(Reporter)

Comment 6

5 years ago
(In reply to comment #5)
> (In reply to :Ehsan Akhgari (needinfo? me!) from comment #4)
> > Should we try to set that variable in our mozconfigs?
> 
> I think we should figure out why what is in config/config.mk doesn't work.

Do you have any ideas for me to pursue?
(Reporter)

Comment 8

5 years ago
Yeah we do hit the export command in config.mk.
I downloaded the symbols here and looked at them. I'm pretty sure this is a toolchain-level problem. The unified source files are definitely listed in the Breakpad symbol files:
FILE 449 hg:hg.mozilla.org/integration/mozilla-inbound:obj-firefox/accessible/src/base/Unified_cpp_accessible_src_base0.cpp:7257efcd7d50

Breakpad just reads the filenames directly out of the DWARF, so this must be coming from the compiler. I'll spin a local build and see if this reproduces there.
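
A minimal sketch of how one might scan a downloaded Breakpad symbol file for unified sources, assuming only the `FILE <number> <name>` record shape quoted above. The helper name `unified_files` and the two-line sample are illustrative; this is not part of Breakpad's own tooling.

```python
import re

# Matches a Breakpad "FILE <number> <name>" record.
FILE_RE = re.compile(r"^FILE (\d+) (.+)$")

def unified_files(sym_text):
    """Return {file_number: name} for FILE records naming a Unified_*.cpp source."""
    found = {}
    for line in sym_text.splitlines():
        m = FILE_RE.match(line)
        if m is None:
            continue
        number, name = int(m.group(1)), m.group(2)
        # Take whatever follows the last "/" as the basename.
        basename = name.rsplit("/", 1)[-1]
        # hg-annotated names carry a ":<revision>" suffix; drop it if present.
        basename = basename.split(":", 1)[0]
        if re.fullmatch(r"Unified_\w+\.cpp", basename):
            found[number] = name
    return found

# Sample built from FILE records quoted in this bug.
sample = "\n".join([
    "FILE 449 hg:hg.mozilla.org/integration/mozilla-inbound:obj-firefox/accessible/src/base/Unified_cpp_accessible_src_base0.cpp:7257efcd7d50",
    "FILE 16544 hg:hg.mozilla.org/integration/mozilla-inbound:storage/src/mozStorageService.cpp:7257efcd7d50",
])
print(sorted(unified_files(sample)))
```

Only the FILE table is inspected here; whether any function's line records actually point at those entries is a separate question.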
Flags: needinfo?(ted)
I see this on my local Mac build with no ccache involved:
FILE 42 /Users/luser/build/debug-mozilla-central/accessible/src/base/Unified_cpp_accessible_src_base0.cpp

$ clang --version
Apple LLVM version 4.2 (clang-425.0.27) (based on LLVM 3.2svn)
Target: x86_64-apple-darwin12.5.0
Thread model: posix

Running dsymutil on XUL and then dwarfdump on XUL.dSYM shows that this is baked into the debug data:
                 AT_decl_file( "/Users/luser/build/debug-mozilla-central/xpcom/io/Unified_cpp_xpcom_io0.cpp" )
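
To get a rough idea of how widespread this is, one could tally the AT_decl_file values in saved dwarfdump output. This is only a sketch: it assumes the attribute is printed in the form shown on the line above, and the function name `tally_decl_files` is made up for illustration.

```python
import re
from collections import Counter

# Matches lines such as:  AT_decl_file( "/path/to/file.cpp" )
# Whitespace around the parentheses is tolerated; other output styles are not handled.
DECL_FILE_RE = re.compile(r'AT_decl_file\s*\(\s*"([^"]+)"\s*\)')

def tally_decl_files(dwarfdump_text):
    """Count how often each path appears as an AT_decl_file value."""
    return Counter(DECL_FILE_RE.findall(dwarfdump_text))

# Sample line copied from the dwarfdump output quoted in this comment.
sample = '    AT_decl_file( "/Users/luser/build/debug-mozilla-central/xpcom/io/Unified_cpp_xpcom_io0.cpp" )\n'
for path, count in tally_decl_files(sample).most_common():
    print(count, path)
```

Feeding it the full text of `dwarfdump XUL.dSYM` would show which paths dominate, unified or otherwise.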
(Reporter)

Comment 11

5 years ago
What's AT_decl_file?  I'm not sure what comment 10 means to be honest.
(In reply to :Ehsan Akhgari (needinfo? me!) from comment #11)
> What's AT_decl_file?  I'm not sure what comment 10 means to be honest.

It means that the DWARF says that entities are declared in the unified file, rather than whatever files are #include'd from the unified file.
Looking at this a little more, what we have is that the .debug_info section says:

- Compilation unit: unified_file_foo.cpp
  - Function: foo
    Starts at PC: ...
    Ends at PC: ...
    ...other interesting information about foo...
  - Function: bar
    Starts at PC: ...
    Ends at PC: ...
    ...

irrespective of where |foo| and |bar| are actually defined.  (This was on my Mac without ccache, fwiw.)  The .debug_line section records those [start PC, end PC) combinations as coming from the expected files, but not .debug_info.

FWIW, this same sort of thing happens with GCC, too.  It would not surprise me to learn that Windows works the same way.  Unified sources appear to imply suboptimal crash reports. :(
Don't we use .debug_line to extract the data for breakpad?
Flags: needinfo?(ted)
(Reporter)

Comment 15

5 years ago
And don't we have this same problem for functions defined in headers for example without unified builds?
Okay, I think this is just some localized weirdness. A few things to note:
1) Frame 0 of the crash stack shows a correctly attributed inline function: it shows nsTArray.h as the file.
2) Frame 1 is "non-virtual thunk to mozilla::storage::Service::Observe(nsISupports*, char const*, char16_t const*)", so I assume the compiler created this function out of thin air and needed a place to attribute it, so it used the unified source file it was compiling.
3) The breakpad symbols for the actual function in frame 0 seem to accurately attribute its file:
FUNC 6e0630 483 0 mozilla::storage::Service::Observe(nsISupports*, char const*, char16_t const*)
6e0630 17 884 16544
6e0647 13 885 16544
6e065a c 364 16544
6e0666 11 364 16544
6e0677 17 887 16544
6e068e 5 889 16544
6e0693 4 566 16582
(I'll spare you the rest, it's a long function.)

From earlier in the file
FILE 16544 hg:hg.mozilla.org/integration/mozilla-inbound:storage/src/mozStorageService.cpp:7257efcd7d50
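
The lookup done by hand above can be sketched in a few lines. This assumes the Breakpad text layout `FUNC <address> <size> <parameter_size> <name>` followed by line records of the form `<address> <size> <line> <file_number>`; the function name `attribute_lines` is hypothetical. The FILE entry for 16582 is not quoted in this bug, so the sketch reports it as missing rather than guessing.

```python
from collections import Counter

def attribute_lines(sym_text):
    """Map each FUNC name to {source name: number of line records}."""
    files = {}      # file number -> source name
    per_func = {}   # function name -> Counter of file numbers
    current = None
    for line in sym_text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "FILE" and len(parts) >= 3:
            files[int(parts[1])] = line.split(None, 2)[2]
        elif parts[0] == "FUNC" and len(parts) >= 5:
            current = line.split(None, 4)[4]
            per_func[current] = Counter()
        elif current is not None and len(parts) == 4:
            try:
                int(parts[0], 16)            # address (hex)
                int(parts[1], 16)            # size (hex)
                int(parts[2])                # source line (decimal)
                file_number = int(parts[3])  # index into the FILE table
            except ValueError:
                continue  # some other record type, e.g. PUBLIC
            per_func[current][file_number] += 1
    return {
        func: {files.get(n, f"<FILE {n} not in input>"): c for n, c in counts.items()}
        for func, counts in per_func.items()
    }

# Records copied from this comment.
sample = "\n".join([
    "FILE 16544 hg:hg.mozilla.org/integration/mozilla-inbound:storage/src/mozStorageService.cpp:7257efcd7d50",
    "FUNC 6e0630 483 0 mozilla::storage::Service::Observe(nsISupports*, char const*, char16_t const*)",
    "6e0630 17 884 16544",
    "6e0647 13 885 16544",
    "6e065a c 364 16544",
    "6e0666 11 364 16544",
    "6e0677 17 887 16544",
    "6e068e 5 889 16544",
    "6e0693 4 566 16582",
])
for func, by_file in attribute_lines(sample).items():
    print(func, by_file)
```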

I don't think there's a real bug here, except that "compiler-generated thunks will get weird unified source file names".
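
For anyone triaging similar stacks, a small sketch for spotting thunk frames from the mangled name alone. It relies on the Itanium C++ ABI prefixes (`_ZTh` for non-virtual thunks, `_ZTv` for virtual thunks, `_ZTc` for covariant return thunks); the helper names are made up for illustration.

```python
import re

# Special-name prefixes from the Itanium C++ ABI mangling scheme.
THUNK_KINDS = {
    "h": "non-virtual thunk",
    "v": "virtual thunk",
    "c": "covariant return thunk",
}

def thunk_kind(symbol):
    """Return a description if the mangled symbol is a thunk, else None."""
    # Mach-O symbol tables add one extra leading underscore, so accept both forms.
    m = re.match(r"_?_ZT([hvc])", symbol)
    return THUNK_KINDS[m.group(1)] if m else None

def nonvirtual_this_adjustment(symbol):
    """Return the `this` adjustment of a non-virtual thunk, else None."""
    # An "n" before the digits marks a negative offset.
    m = re.match(r"_?_ZTh(n?)(\d+)_", symbol)
    if m is None:
        return None
    value = int(m.group(2))
    return -value if m.group(1) else value

# Frame 1 from the crash stack in the description.
frame1 = "_ZThn8_N7mozilla7storage7Service7ObserveEP11nsISupportsPKcPKDs"
print(thunk_kind(frame1), nonvirtual_this_adjustment(frame1))
```

A frame that is flagged this way and attributed to a Unified_*.cpp file is most likely the harmless case described above.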
Flags: needinfo?(ted)
(Reporter)

Comment 17

5 years ago
So would you recommend a WONTFIX here?
Sounds right. I don't think these are harmful, and it's not broken in the general case.
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → WONTFIX

Comment 19

5 years ago
(In reply to :Ehsan Akhgari (needinfo? me!) from comment #15)
> And don't we have this same problem for functions defined in headers for
> example without unified builds?

I suspect that is correct. I get tons of backtraces from |make mozmill| runs on a DEBUG build of Thunderbird, and I am often puzzled to see functions attributed to header files (though some of them really ARE defined in headers). Very confusing.
So sometimes "compiler-generated thunks will get weird header file names".

In the end, then, is the backtrace in the Apple OS X case mentioned here just a nuisance we have to somehow work with if we go the unified compilation route?
(I noticed that disabling unified compilation was suggested on the dev-platform mailing list, anyway.)
(Reporter)

Comment 20

5 years ago
(In reply to comment #19)
> In the end, then, is the backtrace in the Apple OS X case mentioned here just
> a nuisance we have to somehow work with if we go the unified compilation
> route?
> (I noticed that disabling unified compilation was suggested on the
> dev-platform mailing list, anyway.)

For now we're going to disable those builds on aurora/beta/release, so this will only be an issue on Nightly.  Given the investigation performed here, it should not cause a lot of trouble even on Nightly.