Open Bug 1398533 Opened 5 years ago Updated 7 days ago

Support inlined functions in crash reports

Categories

(Toolkit :: Crash Reporting, enhancement)

enhancement
Not set
normal

Tracking

()

People

(Reporter: jrmuizel, Unassigned)

References

(Depends on 1 open bug, Blocks 4 open bugs)

Details

Attachments

(2 files)

Because there's so much inlining in rust code, crash reports can be hard to interpret because of all of the missing stack frames. The dwarf debug info contains the information required to reconstruct a full stack and it would be nice if we could get it. Bug 524410 contains a first step needed to be able to do this.
Depends on: 524410
Blocks: 1397407
Blocks: 1348896
It would be really nice to get this fixed sooner rather than later. It's quite frustrating to get crash stacks like [1] which is missing the line number on the most useful stack frame because the Option::unwrap() got inlined into the caller. So it just points to result.rs:0 which is not helpful.

[1] https://treeherder.mozilla.org/logviewer.html#?job_id=165238515&repo=try&lineNumber=2115
Duplicate of this bug: 1492539
We continue to get bit by this problem. See bug 1508714 for the most recent one.

Anthony, what's needed to move this forward?
Flags: needinfo?(ajones)
Blocks: 1508714
Blocks: 1509817
Blocks: 1521095

(In reply to Jeff Muizelaar [:jrmuizel] from comment #3)

We continue to get bit by this problem. See bug 1508714 for the most recent
one.

Anthony, what's needed to move this forward?

It is on our radar but is a fair bit of work.

Flags: needinfo?(ajones)

I looked at the crash in bug 1521095 as part of the work in bug 524410. Part of the problem with those particular crashes is that rustc doesn't seem to be generating line information that spans the crashing instruction. Fixing that would require significant compiler changes, if the problem is even fixable. (ISTR a recent llvm-dev discussion about precise debug info and somebody saying that for certain transformations like CSE, LLVM is not interested in assigning debug information to the newly-created instruction, or something like that.)

Here's another crash report that's currently unusable because of missing inline callframes: https://crash-stats.mozilla.com/report/index/02d5f21e-3c44-403d-af8b-fd4890190329

 	static void webrender::scene_builder::SceneBuilder::run() 	gfx/wr/webrender/src/scene_builder.rs:0

I don't know for certain because Windows, but on Linux a line number of 0 has typically meant that there's not even real debug/line information covering the instruction at this PC. The current inlined function work is for non-Windows, but I'd be skeptical if there's something we could do here short of fixing rustc.

I've seen line number 0 show up as macro expansion. I'll take a look at the raw dump to see what visual studio thinks.

Flags: needinfo?(jmuizelaar)

Visual studio didn't do any better. I think the debug information is probably just no good.
Here's a crash that could use better output: https://crash-stats.mozilla.org/report/index/89167b96-0d9f-48e0-ae05-a85460190510#tab-details

It reports a panic in:
webrender::renderer::TextureResolver::bind src/libcore/option.rs:312

I'm interested in the line number in the caller and not in option.rs.

Flags: needinfo?(jmuizelaar)
Duplicate of this bug: 1565229

Here's a good example: https://crash-stats.mozilla.org/report/index/92b985d5-4f64-419c-89f4-2b54a0200506

The stack from the crash report is basically useless. Here's what visual studio gives:

 	[Inline Frame] xul.dll!MOZ_Crash(const char * aFilename, int aLine, const char * aReason) Line 332	C++
>	xul.dll!RustMozCrash(const char * aFilename, int aLine, const char * aReason) Line 16	C++
 	xul.dll!mozglue_static::panic_hook(core::panic::PanicInfo * info) Line 89	Unknown
 	xul.dll!core::ops::function::Fn::call<fn(core::panic::PanicInfo*),(core::panic::PanicInfo*)>(void(*)(core::panic::PanicInfo *) *, core::panic::PanicInfo *) Line 72	Unknown
 	xul.dll!std::panicking::rust_panic_with_hook() Line 478	Unknown
 	xul.dll!std::panicking::begin_panic_handler() Line 375	Unknown
 	xul.dll!core::panicking::panic_fmt() Line 84	Unknown
 	xul.dll!core::result::unwrap_failed() Line 1188	Unknown
 	[Inline Frame] xul.dll!core::result::Result<webrender::renderer::GpuCacheTexture, webrender::renderer::RendererError>::unwrap(core::result::Result<webrender::renderer::GpuCacheTexture, webrender::renderer::RendererError> self) Line 956	Unknown
 	[Inline Frame] xul.dll!webrender::renderer::Renderer::prepare_gpu_cache(webrender::frame_builder::Frame * self) Line 3717	Unknown
 	[Inline Frame] xul.dll!webrender::renderer::{{impl}}::render_impl::{{closure}}(webrender::renderer::{{impl}}::render_impl::closure-1) Line 3433	Unknown
 	[Inline Frame] xul.dll!webrender::profiler::TimeProfileCounter::profile(webrender::renderer::{{impl}}::render_impl::closure-1 self) Line 467	Unknown
 	xul.dll!webrender::renderer::Renderer::render_impl(core::option::Option<euclid::size::Size2D<i32, webrender_api::units::DevicePixel>> self) Line 3412	Unknown
 	xul.dll!webrender::renderer::Renderer::render(euclid::size::Size2D<i32, webrender_api::units::DevicePixel> self) Line 3192	Unknown
 	xul.dll!webrender_bindings::bindings::wr_renderer_render(webrender::renderer::Renderer * renderer, int width, int height, webrender::renderer::RendererStats * out_stats, thin_vec::ThinVec<euclid::rect::Rect<i32, webrender_api::units::DevicePixel>> * out_dirty_rects) Line 594	Unknown
 	xul.dll!mozilla::wr::RendererOGL::UpdateAndRender(const mozilla::Maybe<mozilla::gfx::IntSizeTyped<mozilla::gfx::UnknownUnits>> & aReadbackSize, const mozilla::Maybe<mozilla::wr::ImageFormat> & aReadbackFormat, const mozilla::Maybe<mozilla::Range<unsigned char>> & aReadbackBuffer, mozilla::wr::RendererStats * aOutStats) Line 155	C++
 	xul.dll!mozilla::wr::RenderThread::UpdateAndRender(mozilla::wr::WrWindowId aWindowId, const mozilla::layers::BaseTransactionId<mozilla::VsyncIdType> & aStartId, const mozilla::TimeStamp & aStartTime, bool aRender, const mozilla::Maybe<mozilla::gfx::IntSizeTyped<mozilla::gfx::UnknownUnits>> & aReadbackSize, const mozilla::Maybe<mozilla::wr::ImageFormat> & aReadbackFormat, const mozilla::Maybe<mozilla::Range<unsigned char>> & aReadbackBuffer) Line 487	C++
 	xul.dll!mozilla::wr::RenderThread::HandleFrameOneDoc(mozilla::wr::WrWindowId aWindowId, bool aRender) Line 369	C++
 	[Inline Frame] xul.dll!mozilla::detail::RunnableMethodArguments<unsigned long long,bool>::applyImpl(mozilla::layers::APZCTreeManager * o, void(mozilla::layers::IAPZCTreeManager::*)(unsigned __int64, bool) args, mozilla::Tuple<StoreCopyPassByConstLRef<unsigned long long>,StoreCopyPassByConstLRef<bool>> &, std::integer_sequence<unsigned long long,0,1>) Line 1168	C++
Duplicate of this bug: 1665367

As I understand, this is not specific to Rust. Would it make sense to rephrase the title of this bug? (When I filed 1665367, I tried to find an existing bug, but failed).

Yes, this should be clearer.

Summary: Rust crash reports would be benefit from supporting inlined functions → Support inlined functions in crash reports
Depends on: 1636194
No longer depends on: 1636194
Blocks: 1731262
Depends on: 1721278
Blocks: 1721278
No longer depends on: 1721278
Blocks: 1738381
No longer blocks: 1738381
Blocks: 1765277

:mstange started working on this by adding support to dump_syms (see his PR #392).

I've created a draft PR in rust-minidump to include inline information in the minidump JSON. There's an example minidump JSON in the PR. I'd appreciate feedback on the JSON format.

Some work in Socorro is necessary in order to consume this information and to include the new stack frames in the crash signature and in the HTML tables that show the call stack.

Depends on: 1779631

So, summing up the activities that still need to be handled before we can start outputting inline directives for our builds:

  • Make sure the new much larger symbol files can be handled by the symbol server (Tecken). Storage is going to go up 4 or 5 times and so will upload/download times for new symbols
  • Make sure that Eliot (aka the other symbol server) can handle files with INLINE/INLINE_ORIGIN directives (maybe this was already done and I just don't remember?)
  • Deploy a version of rust-minidump's stack walker to Socorro that supports INLINE/INLINE_ORIGIN directives
  • Adjust Socorro's UI to take into account inlined frames
  • Adjust the signature generation to work well with inlined functions, this probably requires running the new stack walker for a while on the staging server, adjusting the signature generation lists as we go until we're satisfied that we can move it to production. Even assuming everything works correctly we can't deploy straight to production because many signatures will change once we have inlined frames, so it would cause plenty of trouble with our existing bugs
  • Adjust fix-stacks so that it processes INLINE/INLINE_ORIGIN directives

Am I missing something?

(In reply to Gabriele Svelto [:gsvelto] from comment #19)

So, summing up the activities that still need to be handled before we can start outputting inline directives for our builds:

Some of these are already tracked in the dependency tree of this bug:

  • Make sure the new much larger symbol files can be handled by the symbol server (Tecken). Storage is going to go up 4 or 5 times and so will upload/download times for new symbols

no bug yet (and I'm not sure what this involves concretely, other than shipping the change, monitoring it, and backing it out if needed)

  • Make sure that Eliot (aka the other symbol server) can handle files with INLINE/INLINE_ORIGIN directives (maybe this was already done and I just don't remember?)

bug 1779633

  • Deploy a version of rust-minidump's stack walker to Socorro that supports INLINE/INLINE_ORIGIN directives

bug 1779630

  • Adjust Socorro's UI to take into account inlined frames

no bug yet / could be done in this bug

  • Adjust the signature generation to work well with inlined functions, this probably requires running the new stack walker for a while on the staging server, adjusting the signature generation lists as we go until we're satisfied that we can move it to production. Even assuming everything works correctly we can't deploy straight to production because many signatures will change once we have inlined frames, so it would cause plenty of trouble with our existing bugs

no bug yet / could be done in this bug

  • Adjust fix-stacks so that it processes INLINE/INLINE_ORIGIN directives

bug 1781819

Am I missing something?

in-tree rust-minidump, bug 1781930

I think that's all.

You need to log in before you can comment on or make changes to this bug.