Closed
Bug 529946
Opened 15 years ago
Closed 12 years ago
Query access to backend data
Categories
(Socorro :: General, task)
Socorro
General
Tracking
(Not tracked)
RESOLVED
FIXED
16
People
(Reporter: dmandelin, Assigned: laura)
Details
(Whiteboard: [Q22012wanted][qa-])
I like to do historical analysis on crash report data. Direct query access (SQL or whatever) to the backend would make it a lot easier. Currently I'm using Python scripts to drive the web interface and parse results, but that's clunky and inefficient.
Comment 1•15 years ago
|
||
Some concern that long-running queries might impact the ability of the server to timely accept incoming data; and that allowing free-form queries would lead to such a state. Would it be reasonable to add csv format from the web interface and use that output? I believe at least some pages already provide that.
Updated•15 years ago
|
Assignee: nobody → server-ops
Component: Socorro → Server Operations
Product: Webtools → mozilla.org
QA Contact: socorro → mrz
Version: Trunk → other
Comment 2•15 years ago
|
||
mrz - can we get dave access to this data somehow? Aravind? If it helps them to query the database directly via some read-only account, or give them a stage db that contains a copy, I'm all for it. Possible?
Comment 3•15 years ago
|
||
I share Frank's concern. Is there a way to do this and cut off David if it becomes a problem? David, you'd need to know access will be cut off if it's service impacting (shoot first, ask questions later). Aravind, what do you think?
Assignee: server-ops → aravind
Reporter | ||
Comment 4•15 years ago
|
||
(In reply to comment #3) > David, you'd need to know access will be cut off if it's service impacting > (shoot first, ask questions later). Of course. Or maybe you could run my jobs at a lower priority if possible. Btw, in the past I have done some historical bug analysis by writing scripts that query the web page, e.g., one query for each day in a 3-month time period. You guys never complained about that :-) so I assume it wasn't service impacting. What I'm really asking for in this bug is a way to do that kind of thing that is less clunky (some sort of query language or map-reduce expression vs. a patchwork of HTML parsing scripts) and hopefully faster because it's more direct. Ideally, it would be less perf-impacting overall but with same or better perf since I wouldn't be touching the web/HTML parts as much.
Comment 5•15 years ago
|
||
I think this would be better achieved through the python middleware layer we have deployed. @lars: is this feasible, would it make sense to provide access like this to folks through the middleware layer?
Comment 6•15 years ago
|
||
+1 on comment #5
Comment 7•15 years ago
|
||
(In reply to comment #5) Another thing the Postgres folks advocated was creating a special user/password. This will allow us to have different max memory, query timeout, etc for this privileged user to fine tune resource usage.
Updated•14 years ago
|
Assignee: aravind → nobody
Component: Server Operations → Socorro
Product: mozilla.org → Webtools
QA Contact: mrz → socorro
Assignee | ||
Comment 8•14 years ago
|
||
Let's hold this on rewriting the middleware (1.7) and review for 1.8. Should be a generally useful feature.
Target Milestone: --- → 1.8
Assignee | ||
Comment 10•13 years ago
|
||
dmandelin: for right now, if you have a query file a bug for Metrics. We'll likely solve this with ES ad hoc queries.
Target Milestone: 2.0 → 2.2
Reporter | ||
Comment 11•13 years ago
|
||
(In reply to comment #10) > dmandelin: for right now, if you have a query file a bug for Metrics. We'll > likely solve this with ES ad hoc queries. OK, thanks.
Assignee | ||
Updated•13 years ago
|
Target Milestone: 2.2 → 2.3
Updated•13 years ago
|
Target Milestone: 2.3 → 2.3.1
Comment 12•13 years ago
|
||
(In reply to David Mandelin from comment #0) > I like to do historical analysis on crash report data. Direct query access > (SQL or whatever) to the backend would make it a lot easier. Currently I'm > using Python scripts to drive the web interface and parse results, but > that's clunky and inefficient. Just a note that we do drop a CSV file (which is a sanitized dump of the main reports table) nightly: https://crash-analysis.mozilla.com/crash_analysis/20111005/20111005-pub-crashdata.csv.gz Given the crash ID in that file, you can access individual reports like so: https://crash-stats.mozilla.com/dumps/446e67d0-def4-43e7-8228-7d62e2110316.jsonz
Assignee | ||
Comment 13•13 years ago
|
||
We have plans for this, but not until early 2012. De-targeting until then. If you need something specific let us know.
Whiteboard: [1Q2012]
Target Milestone: 2.3.1 → ---
Assignee | ||
Updated•13 years ago
|
Whiteboard: [1Q2012] → [Q12012wanted]
Assignee | ||
Comment 14•13 years ago
|
||
Basic plans here are: - Add a VM to run SQL against (some subset of? secondary?) PostgreSQL, depending on DBA advice - Add a VM to run queries against HBase - Open up our API internally for other people to run apps/scripts against
Assignee: nobody → laura
Assignee | ||
Updated•13 years ago
|
Target Milestone: --- → 2.4.2
Updated•13 years ago
|
Component: Socorro → General
Product: Webtools → Socorro
Assignee | ||
Comment 15•12 years ago
|
||
Re item 2, above: Dave: You now have access to query HBase via Hive (looks like SQL, maps to MapReduce) and Pig (uses its own query language, called Pig Latin, maps to MapReduce). Details coming in https://bugzilla.mozilla.org/show_bug.cgi?id=708452 Re item 1: Do you also need access to run SQL against the secondary PostgreSQL instance? If so I'll get a bug on file for that. Re item 3: We're planning on opening the API internally around the end of Q1. Let me know if you have any questions.
Assignee | ||
Updated•12 years ago
|
Target Milestone: 2.4.2 → 2.4.3
Updated•12 years ago
|
Target Milestone: 2.4.3 → 2.4.4
Assignee | ||
Comment 16•12 years ago
|
||
Done except for the API part, revisit in a month or so.
Target Milestone: 2.4.4 → 2.6
Assignee | ||
Updated•12 years ago
|
Target Milestone: 4 → 5
Assignee | ||
Updated•12 years ago
|
Whiteboard: [Q12012wanted] → [Q22012wanted]
Target Milestone: 5 → 9
Assignee | ||
Updated•12 years ago
|
Target Milestone: 9 → 13
Assignee | ||
Updated•12 years ago
|
Target Milestone: 13 → 14
Assignee | ||
Updated•12 years ago
|
Target Milestone: 14 → 16
Assignee | ||
Updated•12 years ago
|
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Updated•12 years ago
|
Whiteboard: [Q22012wanted] → [Q22012wanted][qa-]
You need to log in
before you can comment on or make changes to this bug.
Description
•