Closed
Bug 1268328
Opened 9 years ago
Closed 8 years ago
Crash reports from Rust code are inadequate
Categories
(Toolkit :: Crash Reporting, defect)
Toolkit
Crash Reporting
Tracking
()
RESOLVED
FIXED
People
(Reporter: n.nethercote, Unassigned)
References
(Blocks 1 open bug)
Details
(Whiteboard: [stylo])
Bug 1266309 is about a Rust crash (well, abort). The crash report is here:
https://crash-stats.mozilla.com/report/index/fe3a0851-557e-4ef8-be95-4c1782160417
The crash report is missing some vital data.
- The function name is not demangled.
- The source file is not listed, which means there is also no link to the source.
Also, Rust symbols end with 'h' followed by a 16 digit hex number. If this number changes frequently it will mess with Socorro's signature-based grouping. Perhaps we could strip it?
Comment 1•9 years ago
|
||
I notice that the Rust frames are unwound via the frame pointer tactic, rather than CFI. Could it just be that we're not building Rust code with -g?
Comment 3•9 years ago
|
||
Thanks for the needinfo, dbaron.
The crash report in the description is on windows. There's some upstream discussion of the issue on mac in https://github.com/rust-lang/rust/issues/24346
It...kinda looks like we don't pass -g to rustc. I'll fix that.
Comment 4•9 years ago
|
||
No, I haven't looked at this at all. I seem to recall having a discussion with jrmuizel about llvm's debug info support in the past, this may be partially an issue with that.
rillian: you can test this in local builds by doing a build, then either running `./mach buildsymbols`, or just running $objdir/dist/host/bin/dump_syms $objdir/toolkit/library/$libxul, and looking for those functions in the output.
Updated•9 years ago
|
Flags: needinfo?(ted)
Comment 5•9 years ago
|
||
Looking at the symbol file from the crash in comment 0 (http://symbols.mozilla.org/xul.pdb/59C3D44296B248DFBCBCD486778345432/xul.sym, 20MB), it looks like we don't have source line info for these functions:
PUBLIC 3c560 0 mp4parse_read
If you look at the PDB dumping code in Breakpad:
https://dxr.mozilla.org/mozilla-central/rev/86730d0a82093d705e44f33a34973d28b269f1ea/toolkit/crashreporter/google-breakpad/src/common/windows/pdb_source_line_writer.cc#379
You can see that we ask DIA for `SymTagFunction`, which results in printing `FUNC` lines, and fall back to `SymTagPublicSymbol`, which prints `PUBLIC` lines if we can't find a function.
Also, re: demangling, I'm not surprised. We get the mangled names from DIA and call `get_undecoratedNameEx` to demangle it, which isn't likely to support Rust function names any time soon:
https://dxr.mozilla.org/mozilla-central/rev/86730d0a82093d705e44f33a34973d28b269f1ea/toolkit/crashreporter/google-breakpad/src/common/windows/pdb_source_line_writer.cc#972
If there's a way we can detect Rust symbols and demangle them we could fix that in Breakpad.
Comment 6•9 years ago
|
||
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #4)
> rillian: you can test this in local builds by doing a build, then either
> running `./mach buildsymbols`, or just running
> $objdir/dist/host/bin/dump_syms $objdir/toolkit/library/$libxul, and looking
> for those functions in the output.
Thanks. Using `rustc -g` on linux I see FILE entries for the source, but still only two PUBLIC function entries:
PUBLIC 34f70d0 0 mp4parse_get_track_audio_info
PUBLIC 34f7140 0 mp4parse_get_track_video_info
These are defined in
FILE 6935 $(topsrcdir)/media/libstagefright/binding/mp4parse/capi.rs
but are the only symbols to show up there either way.
Reporter | ||
Comment 7•9 years ago
|
||
Another problem with Rust crash reports is discussed in bug 1271529 -- currently lots of crashes get conflated unhelpfully into a single "rust_panic" signature.
Comment 8•9 years ago
|
||
So bug 1268617 should have changed the state of affairs here significantly. Looking at the generated symbol files we seem to have sensible function names and source line info.
Comment 9•9 years ago
|
||
Although unfortunately bug 1272278 (manifesting as bug 1272738) means we're probably not seeing the benefit in actual crash reports yet, since all the Windows crash reports are probably broken.
Comment 10•8 years ago
|
||
So I only just now noticed bug 1291650, which is about crashes on 32-bit Linux on CPUs without SSE2, like:
https://crash-stats.mozilla.com/report/index/28af2d67-07ad-4d75-bba5-921c02160801
That crash report looks mostly usable, with the caveat that frame 0 is in a source file in libcore, so we don't have a useful link to the source. If we fixed that (partially covered by bug 1275424), that crash report would be no worse than equivalent C++ crash reports.
I poked around crash-stats to see if we had other reported crashes in mp4parse:
https://crash-stats.mozilla.com/search/?product=Firefox&signature=~mp4parse&_sort=-date&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-signature
There aren't very many, but there are a few. The most frequent (all of 11 crashes when I ran that query) are in mp4parse_read, all on Windows. These crash reports fall squarely into the "not very helpful" category, they all look something like:
https://crash-stats.mozilla.com/report/index/aec4c22b-f836-46b6-bb4d-22c762160830
We don't have any source line info for any of the frames, and any frames that aren't using C linkage are not even demangled. That's pretty poor.
Moving down the list we see a few OOM crashes in mp4parse_new, which all look roughly like:
https://crash-stats.mozilla.com/report/index/e80213b5-3c79-4069-8483-017042160831
These are also not a stellar example of a crash report, as there's no source line info for the mp4parse_new frame, and we appear to have mucked up the unwind there somehow. The "raw dump" tab tells me that the stackwalker actually found frame 3 (the mp4parse_new frame) by scanning, which would indicate that the frame *before* it was missing unwind info, which is odd.
After that we have a couple of signatures with only a single report each:
mp4parse_get_track_info: https://crash-stats.mozilla.com/report/index/b90e3581-156b-4ed7-89de-c55f22160826
This is on OS X, and has a useful stack but no source info in the top frame, so it'd be pretty hard to diagnose.
mp4parse_new: https://crash-stats.mozilla.com/report/index/083d0c90-b410-4a95-a9db-1a71f2160827
This is on Windows and actually looks nicer than any of the other crash reports here. Maybe because the top frame is actually in Rust code from m-c, and not something inlined or monomorphized from the standard library? In any event, it's got source info that links to VCS, so that's pretty great.
Comment 11•8 years ago
|
||
It's annoying to have to rely on bugs in our Rust code to track improvements here, so I wrote some patches on bug 1300152.
Comment 12•8 years ago
|
||
Related: https://github.com/rust-lang/rust/issues/36452
Rust's libstd crate (that ships with the compiler) doesn't include debug symbols, which means that any non-generic methods that we link from it won't have function names or source line info when linked into libxul.
See Also: → https://github.com/rust-lang/rust/issues/36452
Comment 13•8 years ago
|
||
(In reply to Jed Davis [:jld] {⏰UTC-6} from comment #1)
> I notice that the Rust frames are unwound via the frame pointer tactic,
> rather than CFI. Could it just be that we're not building Rust code with -g?
We are now building with -g, which is good, but per above libstd is not, so that's a problem for unwinding. Rust apparently omits frame pointers on most archs:
https://github.com/rust-lang/rust/blob/c87ba3f1222ba20d491e8ed76a04977283280742/src/librustc_back/target/mod.rs#L267-L268
Comment 14•8 years ago
|
||
I think bug 1311458 should fix most of what ails us here. Looking at some random Win32/Win64 nightly symbols it looks like we have decent-looking function names for Gecko Rust code as well as source line info:
FUNC 3e820 1ad 0 read<&mut mp4parse::BMFFBox<mp4parse_capi::mp4parse_io>>
3e820 9 1491 1387
On Mac/Linux the symbols demangle slightly differently:
FUNC 4055d00 269 0 {{inlined-root}}::read<mp4parse::BMFFBox<mp4parse::BMFFBox<mp4parse_capi::mp4parse_io>>>
4055d00 17 397 7075
FUNC 3621670 210 0 {{inlined-root}}::read<mp4parse::BMFFBox<mp4parse::BMFFBox<mp4parse_capi::mp4parse_io>>>
3621670 17 397 6675
This is because rustc uses GCC-style C++ name mangling, and dump_syms uses `cxx_demangle` if it finds a mangled name, and so we get sort-of-C++-looking symbol names back. We could fix dump_syms to not do this (upstream Breakpad took a patch to handle Swift symbols that way: https://codereview.chromium.org/2147523005/ ) and then do the Rust demangling ourselves in symbolstore.py in order to get function names that match across platforms.
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #13)
> (In reply to Jed Davis [:jld] {⏰UTC-6} from comment #1)
> > I notice that the Rust frames are unwound via the frame pointer tactic,
> > rather than CFI. Could it just be that we're not building Rust code with -g?
>
> We are now building with -g, which is good, but per above libstd is not, so
> that's a problem for unwinding. Rust apparently omits frame pointers on most
> archs:
> https://github.com/rust-lang/rust/blob/
> c87ba3f1222ba20d491e8ed76a04977283280742/src/librustc_back/target/mod.
> rs#L267-L268
To follow up on that--apparently enabling debuginfo currently forces the use of frame pointers in rustc: https://github.com/rust-lang/rust/issues/36452#issuecomment-254678281 . If that ever gets fixed we'll be unhappy on Win32 unless someone fixes LLVM to support writing FPO data on Win32. Other platforms should be fine, as we'll use CFI for non-Windows and Win64 has its own unwind tables that LLVM does have support for.
Comment 15•8 years ago
|
||
I reproduced the crash from bug 1318943 with the latest nightly, which was built with Rust 1.14 (which has debuginfo in libstd), and the crash report looks a lot better:
https://crash-stats.mozilla.com/report/index/95603eb6-4f4d-49a1-9752-3ff1c2161129
Comment 16•8 years ago
|
||
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #13)
> (In reply to Jed Davis [:jld] {⏰UTC-6} from comment #1)
> > I notice that the Rust frames are unwound via the frame pointer tactic,
> > rather than CFI. Could it just be that we're not building Rust code with -g?
>
> We are now building with -g, which is good, but per above libstd is not, so
> that's a problem for unwinding. Rust apparently omits frame pointers on most
> archs:
> https://github.com/rust-lang/rust/blob/
> c87ba3f1222ba20d491e8ed76a04977283280742/src/librustc_back/target/mod.
> rs#L267-L268
To follow up on this point: if you look at the "raw dump" tab of the crash from comment 15, you can see that since bug 1311458 landed we have debug info to unwind through Rust's libstd.
Updated•8 years ago
|
Whiteboard: [stylo]
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #16)
> To follow up on this point: if you look at the "raw dump" tab of the crash
> from comment 15, you can see that since bug 1311458 landed we have debug
> info to unwind through Rust's libstd.
Looks adequate to me. Can this bug be marked as fixed?
Flags: needinfo?(ted)
Comment 18•8 years ago
|
||
Yes, I think this is sufficiently fixed as filed. We can move the still-open dependencies to a followup bug for improvements to what we currently have.
Status: NEW → RESOLVED
Closed: 8 years ago
Flags: needinfo?(ted)
Resolution: --- → FIXED
Updated•8 years ago
|
Comment 19•8 years ago
|
||
Filed bug 1348896 to track those.
You need to log in
before you can comment on or make changes to this bug.
Description
•