The missing symbol list has empty code id and code path fields
Categories
(Tecken :: General, defect, P2)
Tracking
(Not tracked)
People
(Reporter: gsvelto, Assigned: willkg)
References
Details
I'm looking at the Windows ones specifically. I get a list that looks like this:
debug_file,debug_id,code_file,code_id
gpapi.pdb,90A5C54A6976C0BA95739590B4C441D41,,
sysfer.pdb,29E108EB51E24DC5945E77676A9214041,,
atidxx64.pdb,E6F8E64F9F9D485890C2BF6A221904F61,,
QIPCAP64.pdb,66D447DDC7564E189CAD2B9AED0F320B1,,
...
Note how the last two fields are empty. This is preventing our scraping script from retrieving the missing symbols.
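To make the impact concrete, here is a minimal sketch of the kind of check a scraping script might run on this list. It assumes the list is CSV with the four-column header shown above; `split_by_code_info` is a hypothetical helper for illustration, not part of the actual scraping script.

```python
import csv
import io

# Two rows copied from the list above; the code_file and code_id columns are empty.
SAMPLE = """\
debug_file,debug_id,code_file,code_id
gpapi.pdb,90A5C54A6976C0BA95739590B4C441D41,,
sysfer.pdb,29E108EB51E24DC5945E77676A9214041,,
"""


def split_by_code_info(csv_text):
    """Split rows into (usable, unusable).

    A row counts as usable only when both code_file and code_id are set,
    since both are needed to look up the original binary.
    """
    usable, unusable = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["code_file"] and row["code_id"]:
            usable.append(row)
        else:
            unusable.append(row)
    return usable, unusable


usable, unusable = split_by_code_info(SAMPLE)
print(len(usable), "usable,", len(unusable), "without code info")
```

With the list as it stands, every row lands in the second bucket, which is why the scraper has nothing to fetch.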
Comment 1 • 3 years ago (Reporter)
Also in the generic list I don't see any Linux/macOS symbols. Maybe it's a related issue?
Comment 2 • 3 years ago (Reporter)
Or maybe we only store the code ID and file if they've been requested from Tecken? That would be a problem because AFAIK we never request those, hence we'd never store them in the missing symbols list... chicken and egg. Will, is that correct? Do we only store the code ID and file in the missing symbol list if they were requested and the server replied with a 404? I see we're doing it here but I don't know where code_id and code_file are coming from. Requests from Socorro should be for .sym files so they should not contain the code_id or code_file, just the debug_id and debug_file.
Comment 3 • 3 years ago (Reporter)
Never mind, there's something really funny going on. If I look for the missing .sym file for bug 1751571 I can see that it's been requested. So far so good. If I look up the code id, I can see that it's also been requested. Here's the catch: the fields I get are for the debug_id and debug_file, not for the code_id and code_file. It seems that Tecken mixes them up, hence the code_id and code_file never appear as such, but they appear as debug_id and debug_file entries.
Comment 4 • 3 years ago (Reporter)
Note, in the case of bug 1751571 the different fields should be like this:
- code_path: DWrite.dll
- code_id: 5D9F4C0E198000
- debug_path: DWrite.pdb
- debug_id: CF7FD86D2CE048138B109F92739257821
Comment 5 • 3 years ago (Assignee)
I didn't write the download API or the missing symbol code, so I'm not sure what's going on. I'll look into it.
Comment 6 • 3 years ago (Assignee)
The missing symbols list comes from the download_missingsymbol table. That table is populated when a download API request can't find the symbol requested. Symbols are requested using the download API:
https://tecken.readthedocs.io/en/latest/download.html
Which has this url scheme:
GET /<DEBUG_FILENAME>/<DEBUG_ID>/<SYMBOL_FILE>
It also takes two querystring parameters: code_id and code_filename.
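As a rough illustration of that URL scheme, the sketch below builds a download URL from the values given in comment 4. The helper names and the base URL are hypothetical placeholders. The querystring parameter names follow this comment and should be checked against the download API documentation linked above, which is why the file-name parameter is left configurable.

```python
from urllib.parse import quote, urlencode


def sym_filename(debug_file):
    """Derive the symbol file name: foo.pdb -> foo.sym, anything else -> name + '.sym'."""
    if debug_file.lower().endswith(".pdb"):
        return debug_file[:-4] + ".sym"
    return debug_file + ".sym"


def build_symbol_url(base_url, debug_file, debug_id,
                     code_file=None, code_id=None,
                     code_file_param="code_filename"):
    """Build /<DEBUG_FILENAME>/<DEBUG_ID>/<SYMBOL_FILE> plus optional code info.

    code_file_param is an assumption taken from this thread; verify the
    exact parameter name against the server documentation before relying on it.
    """
    parts = (debug_file, debug_id, sym_filename(debug_file))
    url = base_url.rstrip("/") + "/" + "/".join(quote(p) for p in parts)
    params = {}
    if code_file:
        params[code_file_param] = code_file
    if code_id:
        params["code_id"] = code_id
    if params:
        url += "?" + urlencode(params)
    return url


# Placeholder host; substitute the real symbol server.
BASE = "https://symbols.example.invalid"

without_code = build_symbol_url(BASE, "DWrite.pdb", "CF7FD86D2CE048138B109F92739257821")
with_code = build_symbol_url(BASE, "DWrite.pdb", "CF7FD86D2CE048138B109F92739257821",
                             code_file="DWrite.dll", code_id="5D9F4C0E198000")
print(without_code)
print(with_code)
```

The first URL carries only the debug information; the second also carries the code file and code ID in the querystring.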
Socorro uses the download API as the last url in its symbols_urls list so when the Socorro processor runs the stackwalker on minidumps and a symbol is missing, it'll result in an item added to the download_missingsymbol table.
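The behaviour described so far can be modelled with a toy in-memory sketch. This is not Tecken's code: `record_miss` is a hypothetical stand-in that assumes every request is a miss and that the row is filled purely from the URL path and querystring, using the parameter names mentioned above.

```python
from urllib.parse import parse_qs, urlsplit


def record_miss(missing, request_url):
    """Append a (debug_file, debug_id, code_file, code_id) row for a missed request.

    Toy model: the first two path segments are always stored as the debug
    file and debug ID, and code info comes only from the querystring.
    """
    parts = urlsplit(request_url)
    debug_file, debug_id, _symbol_file = parts.path.lstrip("/").split("/")
    query = parse_qs(parts.query)
    row = (
        debug_file,
        debug_id,
        query.get("code_filename", [""])[0],
        query.get("code_id", [""])[0],
    )
    missing.append(row)
    return row


missing = []
host = "https://symbols.example.invalid"  # placeholder host

# A stackwalker request without the querystring leaves the code fields empty.
bare = record_miss(missing, host + "/DWrite.pdb/CF7FD86D2CE048138B109F92739257821/DWrite.sym")

# The same request with the querystring fills them in.
full = record_miss(
    missing,
    host + "/DWrite.pdb/CF7FD86D2CE048138B109F92739257821/DWrite.sym"
    "?code_filename=DWrite.dll&code_id=5D9F4C0E198000",
)

# A debugger-style request for the binary itself is stored under the debug columns.
binary = record_miss(missing, host + "/DWrite.dll/5D9F4C0E198000/DWrite.dll")

for row in missing:
    print(row)
```

Under these assumptions the model reproduces both observations from the thread: empty code fields for .sym requests that omit the querystring, and code IDs showing up as debug_id entries when the binary itself is requested.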
I think rust-minidump's HttpSymbolSupplier isn't adding the code_id and code_filename querystring params:
This is what we had in ye olde stackwalker:
I'll write up a rust-minidump issue for this.
Comment 7 • 3 years ago (Assignee)
I wrote it up here: https://github.com/luser/rust-minidump/issues/479
Comment 8 • 3 years ago (Assignee)
Going back a bit, I can't speak to what's requesting symbols from Tecken and switching the code_id/code_filename for the debug_id/debug_filename.
Eliot also uses Tecken to download symbols, but it doesn't know anything about code_id or code_filename.
Maybe someone has a script that's using code_id/code_filename for downloading symbols? I know we talked about something like that a while back (or my memory is bad on this).
Maybe someone is sending symbolication requests that use code_id/code_filename instead of debug_id/debug_filename?
I'll look at server logs tomorrow and see if I can see anything interesting.
Comment 9 • 3 years ago (Reporter)
Thanks for chasing this particular wild goose for me 🙏 It went a lot farther than I had anticipated. I had thought of every possible place but not the stack walker.
Comment 10 • 3 years ago (Reporter)
(In reply to Will Kahn-Greene [:willkg] ET needinfo? me from comment #8)
> Maybe someone has a script that's using code_id/code_filename for downloading symbols? I know we talked about something like that a while back (or my memory is bad on this).
People opening minidumps in Visual Studio or windbg would do that. They'd try to fetch the DLL/EXE and use the code ID for that.
Comment 11 • 3 years ago (Assignee)
(In reply to Gabriele Svelto [:gsvelto] from comment #10)
> (In reply to Will Kahn-Greene [:willkg] ET needinfo? me from comment #8)
> > Maybe someone has a script that's using code_id/code_filename for downloading symbols? I know we talked about something like that a while back (or my memory is bad on this).
> People opening minidumps in Visual Studio or windbg would do that. They'd try to fetch the DLL/EXE and use the code ID for that.
Oh, right! I found the bug: bug #1746940. I added this as a use case there.
Comment 12 • 3 years ago (Reporter)
Aria cooked up the 0.10.0 release of rust-minidump, which contains my fix for this problem. Can you pull it into Socorro? Most of the changes compared with bug 1752424 are under-the-hood stability improvements (many of which were contributed!). There were also some small changes here and there but nothing that should affect the output in a significant way.
Comment 13 • 3 years ago (Reporter)
Also I'll keep manually scraping the missing symbols until the fix is in place.
Comment 14 • 3 years ago (Assignee)
I updated rust-minidump last week and the latest missing symbols report includes code id and code filename, so I think we're good here.
Comment 15 • 3 years ago (Reporter)
FYI this is confirmed to be working: the latest scraping task has nicely filled up its output artifact with symbols for various missing libraries.