Closed Bug 1614928 Opened 5 years ago Closed 3 years ago

symbolicate API doesn't return line numbers

Categories

(Eliot :: General, enhancement, P2)

enhancement

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: willkg, Unassigned)

References

Details

The symbolicate API v5 is described here:

https://tecken.readthedocs.io/en/latest/symbolication.html#what-it-is

The stacks section in the response is a list of frame objects. The frame objects don't appear to include the line number (and possibly some other information).

https://github.com/mozilla-services/tecken/blob/dd97609c90cd7da57e23f076710daab40883406f/tecken/symbolicate/views.py#L351-L358

The sym file parsing is part of Tecken, so line number information should be available and maybe we just need to improve the parser and then include it in the response.

This bug covers adding line number information to the symbolication v5 api response.

The symbolication API is used by perf.html and I don't think they need this. However, it's also used to symbolicate stacks for PHC and they need line numbers. Further, we're going to want stacks with line numbers when we symbolicate crash pings in Telemetry. I'm a little surprised this hasn't come up already.

Making this a P2.

Type: task → enhancement
Priority: -- → P2

As a side note, we might want to switch to a Rust parser for sym files. calixte says we're using this one here:

https://docs.rs/symbolic-debuginfo/6.1.4/symbolic_debuginfo/breakpad/struct.BreakpadObject.html

That's maintained by the Sentry folks. It has Python bindings, too:

https://github.com/getsentry/symbolic#python-usage

Can we ditch our home-grown sym parser and switch to that?

Moving bugs to Tecken product.

Component: Tecken → Symbolication
Product: Socorro → Tecken
Depends on: 1621638
Assignee: nobody → willkg
Status: NEW → ASSIGNED

There are a couple of questions here that need answering:

  1. If we add line numbers, how does that symbol info affect caching in redis? My concern here is that it means we have to do a rewrite.
  2. If we add line numbers, does that require us to do this in a v6?

Line number records in the SYM files look like this:

address size line filenum

For example, this is line 59 in file 4 and it's at address c184 and lasts for 7 bytes:

c184 7 59 4

See: https://chromium.googlesource.com/breakpad/breakpad/+/master/docs/symbol_files.md#line-records

Tecken currently skips those lines--it doesn't parse them at all. To add support for this, we need to:

  1. add support to the SYM file parser for line numbers
  2. completely change the address -> symbol lookup code to account for line numbers
  3. change how caching works

There are tons of line number records in the SYM files I looked at, so this adds a ton more information in the symbol maps. Maybe 10x more.

I think this is a non-trivial rewrite and I'm concerned that it'll change the performance characteristics enough that we'll need to rethink caching and work on other optimizations.

(In reply to Will Kahn-Greene [:willkg] ET needinfo? me from comment #1)

The symbolication API is used by perf.html and I don't think they need this.

We don't need it yet, but we will at some point. This is tracked in issue #820.
A related feature request is inline stacks, though those can be treated as orthogonal.
I had an intern last year who was implementing a strawman v6 symbol API in Rust which would be used for profiler symbolication for local builds only (not for downloaded builds, those would still use the regular symbol server), and that strawman v6 API included both inline stack information and file+line information per inline stack frame. However, this work was not finished, and it is not in use today. I am planning to pick it up at some point in the future, in bug 1563318.

(In reply to Will Kahn-Greene [:willkg] ET needinfo? me from comment #5)

I think this is a non-trivial rewrite and I'm concerned that it'll change the performance characteristics enough that we'll need to rethink caching and work on other optimizations.

It may be worth talking to the author of this blog post: https://blog.sentry.io/2019/06/13/building-a-sentry-symbolicator

The post mentions that they used a symcache representation which was optimized for quick lookup of function name + file + line based on an address. However, symcaches seem to be no longer in use, see e.g. this PR which removes some references to their existence. It would be interesting to find out what the replacement is and what the problems with it were.

The PR you mentioned drops use of symcache and cficache from sentry, but not from symbolic and the PR (May 2019) happened before they wrote that blog post (June 2019). I think they were moving code around between projects.

I spent last week seeing if I could slide symbolic into Tecken under bug #1621638 which this is blocked on. I was planning to write a blog post about my experiences including how symbolication using it would work.

I think rewriting the symbolicate APIs using symbolic is the way forward here, but it's a medium-sized project, so I spent some time looking at other options--hence comment #5.

I'll know more about the plans here in the next week or two after talking over my experiences with my team. It's good to know that perf.html has plans that need line numbers.

If you need inline stack information via the symbolicate API, we should open up a new bug for that. I think it needs some additional things than what we're planning here.

(In reply to Will Kahn-Greene [:willkg] ET needinfo? me from comment #7)

The PR you mentioned drops use of symcache and cficache from sentry, but not from symbolic and the PR (May 2019) happened before they wrote that blog post (June 2019). I think they were moving code around between projects.

Oh, great, thanks.

The performance team is also looking for line numbers in frames from the symbolication API. I added Doug to the cc: list accordingly.

Depends on: 1636210

Unassigning myself from this. It'll get fixed when we do bug #1636210.

Assignee: willkg → nobody
Status: ASSIGNED → NEW

We finished the work in bug #1636210 and symbolication API v5 has line numbers. Marking as FIXED.

Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Component: Symbolication → General
Product: Tecken → Eliot
You need to log in before you can comment on or make changes to this bug.