In https://bugzilla.mozilla.org/show_bug.cgi?id=1361809 we're ultimately trying to remove all handling of missing symbols from the Socorro processor. Instead, the plan is to have stackwalker also query symbols.mozilla.org, which logs all 404s and so takes over the job of eventually producing a CSV file of the symbols that were missing. Because stackwalker needs to emit slightly more information about which symbols couldn't be looked up, we now need to pack that into the URL as query string parameters. As the stackwalker symbolicates, it knows the name of the file in S3 (e.g. `foo.pdb/946C0C63132015DD88/foo.sym`), but it also has a code_file and a code_id. These need to be sent as query string parameters, i.e. `GET foo.pdb/946C0C63132015DD88/foo.sym?code_file=CODE_FILE&code_id=CODE_ID`. Once the stackwalker has this functionality we can, through configuration in the Socorro processor, add a third URL to the invocation of stackwalker.
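To make the URL shape concrete, here is a rough Python sketch of the request URL described above. It is only an illustration, not the stackwalker code (which is C++): the helper name `build_symbol_url` and the sample `code_file`/`code_id` values are made up, and it assumes the symbol path is already known in the `foo.pdb/<debug id>/foo.sym` form.

```python
from urllib.parse import quote, urlencode


def build_symbol_url(base_url, sym_path, code_file=None, code_id=None):
    """Hypothetical helper: append code_file/code_id to a symbol URL.

    sym_path is the existing symbol path, e.g.
    'foo.pdb/946C0C63132015DD88/foo.sym'.
    """
    # Percent-encode the path but keep the '/' separators intact.
    url = base_url.rstrip("/") + "/" + quote(sym_path.lstrip("/"))

    # Only send the parameters we actually have values for.
    params = {}
    if code_file:
        params["code_file"] = code_file
    if code_id:
        params["code_id"] = code_id

    if params:
        url += "?" + urlencode(params)
    return url


if __name__ == "__main__":
    # Illustrative values only.
    print(
        build_symbol_url(
            "https://symbols.mozilla.org",
            "foo.pdb/946C0C63132015DD88/foo.sym",
            code_file="foo.dll",
            code_id="ABC123",
        )
    )
```

When neither value is available the URL is left exactly as it is today, so a server that ignores the query string should see no difference.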
Ted, dare I assign this to you? You might notice from my language above that I'm not entirely sure what this entails for the innards of stackwalker, but I'm eager to learn and to help. By the way, stackwalker is currently invoked with two `--symbols-url=...` flags (see https://github.com/mozilla/socorro/blob/bf9c1d86c17ea91d83f565bc051a01ec2400467b/socorro/processor/breakpad_transform_rules.py#L516-L517). Considering that https://symbols.mozilla.org will soon work the same as https://s3-us-west-2.amazonaws.com/org.mozilla.crash-stats.symbols-public/v1, we could consider changing Socorro to use symbols.mozilla.org directly for the public symbols (the first of the two flags). In theory, we're pretty certain the http_symbol_supplier.cc code copes with redirects through curl. I don't yet know what the difference in performance and availability is going to be.
Assignee: nobody → ted
Commit pushed to master at https://github.com/mozilla/socorro https://github.com/mozilla/socorro/commit/9cba3bd800c2d9d53aa8e679c06559affc0326bd bug 1363177 - send code_id and code_file in query string when fetching symbols via HTTP (#3775) This will allow the symbol server to store the info necessary for missing symbols so we can fetch them from the Microsoft symbol server.
One last thing I'd like to do... Once this goes into stage, the URLs hitting our S3 buckets are going to change. I should be able to see that if I temporarily enable S3 bucket logging and inspect the URLs being requested.
I did some logging on the S3 symbol buckets, downloaded some of those logs and lo and behold there's a bunch of GET requests that have the query string. E.g. GET /org.mozilla.crash-stats.symbols-public/v1/sysfer.pdb/28695B7D959F48ACB087879FD7CDC4211/sysfer.sym?code_file=sysfer.dll&code_id=587B937A7b000 Also, I looked at a bunch of random rows in `missing_symbols_20170515 WHERE date_processed='2017-05-16'` and the data there looks equally sane. I.e. no '?code_file' within the 'debug_file' column.
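For anyone wanting to repeat that spot check, here is a small Python sketch of how a logged request path can be split into its parts. The function name `parse_symbol_request` is made up, and it assumes the last three path segments are always `<debug_file>/<debug_id>/<sym_file>`, as in the example above.

```python
from urllib.parse import parse_qs, urlsplit


def parse_symbol_request(request_path):
    """Hypothetical helper: split a logged symbol request into its parts."""
    parts = urlsplit(request_path)
    query = parse_qs(parts.query)

    # Assumption: the path ends in <debug_file>/<debug_id>/<sym_file>.
    segments = parts.path.strip("/").split("/")
    if len(segments) < 3:
        raise ValueError("not a symbol request: %r" % request_path)
    debug_file, debug_id, sym_file = segments[-3:]

    return {
        "debug_file": debug_file,
        "debug_id": debug_id,
        "sym_file": sym_file,
        # parse_qs returns lists; take the first value if present.
        "code_file": query.get("code_file", [None])[0],
        "code_id": query.get("code_id", [None])[0],
    }


if __name__ == "__main__":
    logged = (
        "/org.mozilla.crash-stats.symbols-public/v1/sysfer.pdb/"
        "28695B7D959F48ACB087879FD7CDC4211/sysfer.sym"
        "?code_file=sysfer.dll&code_id=587B937A7b000"
    )
    print(parse_symbol_request(logged))
```

Because the query string is separated before the path is split, `debug_file` never ends up containing `?code_file`, which is the same property checked in the `missing_symbols` rows.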
Status: NEW → RESOLVED
Last Resolved: 8 months ago
Resolution: --- → FIXED