Closed Bug 419879 Opened 12 years ago Closed 6 years ago

provide a way for third parties to upload symbols (for binary extensions/plugins)

Categories

(Socorro :: General, task)

Tracking

(Not tracked)

RESOLVED WORKSFORME
Future

People

(Reporter: ted, Unassigned)

Details

Given that a large portion of our topcrashers involve plugins and binary components in extensions, we should provide a simple way for their developers to upload symbols that the processor can use to provide better stack traces. I don't know what the best way to do this is. Maybe just a web form that could be POSTed to? This would make it easy to use curl or some other command-line tool to upload symbols as part of a build process. We might need to protect against someone trying to flood it with crap, or people trying to overwrite already uploaded symbols, but I think both of those are manageable. I'm not sure if upload size will be a problem. Most of the .sym files from Firefox are less than 1 MB in size, except for libxul, which is usually something like 30+ MB, but I don't expect anyone to be distributing extensions or plugins that are that big. We should be able to just add a new symbols_other directory (or something like that) that's writable by whatever we set up for this, so there wouldn't be any conflicts with the Firefox symbols etc.
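
A minimal sketch of the POST-based upload floated above, written in Python rather than curl. The endpoint URL and the `symbols` form field name are assumptions made for illustration; nothing in this bug defines them. The function only builds the request, so a build script could inspect it before sending it with `urllib.request.urlopen`.

```python
import urllib.request
import uuid


def build_symbol_upload_request(url, filename, payload):
    """Build (but do not send) a multipart/form-data POST carrying one symbol file.

    `url` and the "symbols" field name are hypothetical; a real upload form
    would define its own. `filename` is assumed to contain no quote characters.
    """
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode("ascii"),
        (
            'Content-Disposition: form-data; name="symbols"; '
            f'filename="{filename}"\r\n'
        ).encode("utf-8"),
        b"Content-Type: application/octet-stream\r\n\r\n",
        payload,
        f"\r\n--{boundary}--\r\n".encode("ascii"),
    ])
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )
```

A build step could then call `urllib.request.urlopen(build_symbol_upload_request(...))` once a real endpoint exists.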

This would benefit both us and the people making the symbols available, as we'd get better stack traces, and they'd get actual stack traces from users crashing in their extension or plugin.
We also need to think about the security implications of letting someone upload "symbols" and letting our open source, code-available processor use them. Without an understanding of how our processor works, I can envision someone trying to use this inappropriately, but if it's a non-issue, I'm more than happy to be wrong. ;)
CCing mento. The code makes heavy use of the STL, so I think that buffer overruns and the like are not that big of an issue.

Keep in mind we already allow the entire world to submit binary dumps to be processed by this same code. I'd be more worried about someone exploiting that code, honestly.
i would love symbols for java and flash, they'd be the two biggest things. note: i expect we'd only get their public symbols, not private symbols, but it'd still be somewhat helpful.

it should be fairly hard to stomp on someone else's symbols, and we definitely shouldn't let anything be stomped. we should have a provision for someone to ask us to delete or replace a set of symbols.

it'd also be good to provide some way for people to browse to see a list of which extensions/versions/platforms have provided symbols. It might be best for this not to be available to the general public. When people upload symbols, we should give them a private random token which they can use to make revoke, replace, or browse requests. (Curl shouldn't have any problem with this, right?)
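
One way the private random token could work, as a sketch only: the server hands the token to the uploader once and stores just a digest of it, then compares digests in constant time on later revoke/replace/browse requests. The function names here are hypothetical and the storage side is left out.

```python
import hashlib
import hmac
import secrets


def issue_upload_token():
    """Return (token, digest). The token goes to the uploader; only the digest is stored."""
    token = secrets.token_urlsafe(32)
    digest = hashlib.sha256(token.encode("ascii")).hexdigest()
    return token, digest


def token_matches(presented, stored_digest):
    """Check a presented token against the stored digest without leaking timing."""
    candidate = hashlib.sha256(presented.encode("utf-8")).hexdigest()
    return hmac.compare_digest(candidate, stored_digest)
```

Since the token is a plain URL-safe string, passing it from curl as a form field or header should not be a problem.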

Not that I'm advocating it, but in some ways, this feels like WebDAV :).

As for security, generally I agree w/ ted, I'd be more worried about the .dmp processor, and I'd hope that we can have Coverity scan it.
ok, spoke w/ ted, it turns out there is a security edge. it involves the source server. It should be fairly easy to write a filter which has a fairly strict source server acceptance list and anything it doesn't recognize gets stuck in a pending bin w/ a report sent to someone. The filters are of two forms:

RED = reject
GREEN = accept

things which don't get flagged RED or GREEN will get stuck in the pending YELLOW bin :).
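
A rough sketch of that three-way filter, assuming the source server reference can be reduced to a URL and matched by host name. Both host lists are placeholders; the real RED and GREEN entries would have to come from whoever runs the filter.

```python
from urllib.parse import urlsplit

# Hypothetical lists, for illustration only.
RED_HOSTS = {"blocked.example"}
GREEN_HOSTS = {"trusted.example"}


def classify_source_server(url):
    """Return "RED" (reject), "GREEN" (accept) or "YELLOW" (pending review)."""
    host = (urlsplit(url).hostname or "").lower()
    if host in RED_HOSTS:
        return "RED"
    if host in GREEN_HOSTS:
        return "GREEN"
    # Anything unrecognized lands in the pending bin and should be reported.
    return "YELLOW"
```

RED is checked first so that a host listed in both sets by mistake is still rejected.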

and yes, the repository for these symbols can't be the same url as the standard ones because the trust relationship is different.

our instructions for configuring this/these symbol servers should include the commands to disable the source server (this is a one way flag, once you're disabled, your session will not let you continue) and should use a *different* local file path for the symbols (otherwise the poisoned output will be dangerous the next go around).

and of course, we should have at least a scary pointer to something which explains what can happen if you use the symbols and it turns out they weren't safe :(.
I wasn't expecting to accept PDB files here, just Breakpad-formatted .sym files. My thought was to write an MDC doc about how to run dump_syms, and make available statically linked copies of dump_syms for all our tier-1 platforms. Vendors would be required to dump the symbols to Breakpad's format, and then upload them via a web form.
Target Milestone: --- → Future
I changed my mind and think that uploading PDB files will probably get us more traction than requiring people to run dump_syms. I don't think timeless' security concerns are valid here, since we're not using symsrv, just the DIA SDK to parse PDB files.

We should probably do two things though:
1) Put this behind auth of some kind, hand out logins to third parties that need/want them.
2) Refuse to overwrite existing symbols. New versions of modules should have new debug IDs anyway, so they shouldn't have to overwrite.
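
Point 2 could be sketched roughly as below, assuming symbols are stored under the usual Breakpad layout of `<debug file>/<debug id>/<name>.sym`. The `store_symbols` helper and its arguments are made up for this example, and it does not validate its inputs.

```python
import os


def store_symbols(root, debug_file, debug_id, sym_text):
    """Write a .sym file into the store, refusing to replace an existing one."""
    # Breakpad names the .sym after the debug file with a trailing .pdb removed.
    if debug_file.lower().endswith(".pdb"):
        sym_name = debug_file[:-4] + ".sym"
    else:
        sym_name = debug_file + ".sym"
    target_dir = os.path.join(root, debug_file, debug_id)
    os.makedirs(target_dir, exist_ok=True)
    target = os.path.join(target_dir, sym_name)
    # Mode "x" creates the file exclusively and raises FileExistsError
    # if symbols for this debug ID were already uploaded.
    with open(target, "x", encoding="utf-8") as handle:
        handle.write(sym_text)
    return target
```

A second upload for the same debug ID fails with `FileExistsError`, which the upload page could turn into an error response.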
ccing mcoates and clyon for infrasec feedback.
We'd like to take a look when this is implemented. A big security decision is whether the uploaded data is processed/executed and whether we can do this securely.

The second issue is the security of the web upload page. We can review this and ensure we are good. Please file a security review request when it's ready so I have all the needed data: https://wiki.mozilla.org/WebAppSec/Security_Review_Request

Considerations:
Ensuring uploads can't clobber each other
Prevent uploads from editing other directories
Deciding on access restrictions to upload feature - individual accounts vs shared secret
Implementing alerting to detect bad behavior
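
For the first two considerations, one possible approach is to give every account its own directory and to validate each path component before it reaches the filesystem. This is a sketch under assumptions: the allowed character sets and the debug ID pattern (at least 32 hex digits) are guesses, and `upload_destination` is a hypothetical name.

```python
import re
from pathlib import PurePosixPath

# Assumed rules: names start with an alphanumeric character and contain no
# path separators; debug IDs are at least 32 hex digits.
NAME_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9._+-]*")
DEBUG_ID_RE = re.compile(r"[0-9A-Fa-f]{32,}")


def upload_destination(account, debug_file, debug_id):
    """Return the relative directory for an upload, or raise ValueError."""
    if not NAME_RE.fullmatch(account):
        raise ValueError("bad account name")
    if not NAME_RE.fullmatch(debug_file):
        raise ValueError("bad debug file name")
    if not DEBUG_ID_RE.fullmatch(debug_id):
        raise ValueError("bad debug id")
    # Each account writes only beneath its own top-level directory.
    return PurePosixPath(account) / debug_file / debug_id.upper()
```

Because no component may contain `/` or begin with `.`, values such as `../firefox` are rejected before any path is built.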
The uploaded data is processed using the dump_syms tool, which is part of Google Breakpad.
Right, uploaded PDB files would be processed by dump_syms first:
http://code.google.com/p/google-breakpad/source/browse/trunk/src/common/windows/pdb_source_line_writer.cc#797

This code uses Microsoft's Debug Interface Access SDK to read PDB files and output some parts of them in a textual format.

(In reply to comment #8)
> We'd like to take a look when this is implemented.  A big security decision is
> whether the uploaded data is processed/executed and whether we can do this
> securely.

The output of dump_syms will be parsed by some code server-side, but it's a fairly simple parse:
http://code.google.com/p/google-breakpad/source/browse/trunk/src/processor/basic_source_line_resolver.cc#63
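
To give a feel for how simple that parse is, here is a rough Python sketch of reading the main record types from dump_syms output. It is not the Breakpad code linked above: it only handles MODULE, FILE, FUNC and PUBLIC records in their basic form, ignores line records and everything else, and raises `ValueError` on records with too few fields.

```python
def parse_sym(text):
    """Parse the basic records of a Breakpad .sym file into plain Python data."""
    module = None
    files = {}
    functions = []
    publics = []
    for line in text.splitlines():
        kind = line.split(" ", 1)[0]
        if kind == "MODULE":
            # MODULE <os> <arch> <debug id> <debug file name>
            _, os_name, arch, debug_id, name = line.split(" ", 4)
            module = {"os": os_name, "arch": arch, "debug_id": debug_id, "name": name}
        elif kind == "FILE":
            # FILE <number (decimal)> <path>
            _, number, path = line.split(" ", 2)
            files[int(number)] = path
        elif kind == "FUNC":
            # FUNC <address (hex)> <size (hex)> <parameter size (hex)> <name>
            _, address, size, _param_size, name = line.split(" ", 4)
            functions.append((int(address, 16), int(size, 16), name))
        elif kind == "PUBLIC":
            # PUBLIC <address (hex)> <parameter size (hex)> <name>
            _, address, _param_size, name = line.split(" ", 3)
            publics.append((int(address, 16), name))
    return {"module": module, "files": files, "functions": functions, "publics": publics}
```

Names are kept exactly as they appear in the file, which is why they need escaping before being shown anywhere.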

Some of the input data will be included in the output of the stackwalk tool, and presented in the Socorro web UI (function names, file names). I certainly hope that we are properly escaping that data in the web UI already, but we can sanity-check that.
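
The escaping concern could be sanity-checked with something like the following sketch. The `render_frame` helper and the table-cell markup are invented for the example and are not how the Socorro web UI is actually written; the point is only that symbol-derived strings pass through `html.escape` before reaching a page.

```python
import html


def render_frame(function_name, file_name, line_number):
    """Render one stack frame as HTML, escaping all symbol-derived text."""
    return "<td>{}</td><td>{}:{}</td>".format(
        html.escape(function_name),
        html.escape(file_name),
        int(line_number),
    )
```

A function name such as `<script>alert(1)</script>` then shows up as inert text rather than markup.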

I think if we limit access by putting it behind auth, and implement some logging for accountability, our actual risk is pretty low.

Anyway, I'll look into putting together a prototype and we can hash out more of the details. Right now I am manually dumping symbols from PDB files provided by Adobe and a few other people and hand-copying the symbols to the symbol server, which is not a good solution.
Component: Socorro → General
Product: Webtools → Socorro
Is this still wanted?
Yes. Right now I have to manually upload symbols for new Flash Player versions.
We gave Adobe access to upload their own symbols. bsmedberg is designing a REST API for uploading symbols for use by B2G partners which will make this better in the general case.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WORKSFORME