Closed Bug 879822 Opened 10 years ago Closed 9 years ago
Store Bugzilla data in Elastic Search
This is a tracking bug for the steps required to set up an ElasticSearch database containing the same kind of data as metrics' internal cluster, but publicly available and thus restricted to public bugs only. Background copied from bug 872363: metrics has an ES cluster that contains not only the current state of Bugzilla, but past history as well. ElasticSearch provides a better method of performing certain types of analyses, and use of it doesn't impact the main Bugzilla database. However, it contains *some* information on confidential bugs, so it is behind LDAP, which means that any tools that use it generally must be behind LDAP as well, which limits their usefulness. It would thus be really great to have a publicly accessible ES cluster which contains *only* data on public bugs (or to word it another way, *no* data on confidential bugs of any type). mcoates has already signed off on this idea from a security standpoint, provided that bugs with any limitations on visibility are excluded, and that all data concerning bugs that are made confidential some time after creation (i.e. were at one point public, but no longer) is removed in a timely fashion.
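A minimal sketch of the two security conditions above, in Python. The record layout (a `bug_id` plus a `groups` list naming whatever restricts visibility) and the in-memory `public_index` dict standing in for the public cluster are assumptions for illustration only; they are not the schema of the metrics cluster or of any existing ETL code.

```python
from typing import Dict, Iterable, List, TypedDict


class BugRecord(TypedDict):
    bug_id: int
    groups: List[str]  # hypothetical field: visibility restrictions; empty means public
    status: str


def is_public(bug: BugRecord) -> bool:
    """Treat a bug as public only if it has no visibility restriction of any kind."""
    return len(bug["groups"]) == 0


def sync_public_index(public_index: Dict[int, BugRecord],
                      changed: Iterable[BugRecord]) -> None:
    """Apply a batch of changed bugs to the stand-in public index.

    Public bugs are inserted or updated. A bug that now carries any
    restriction is removed, which covers bugs that were public when
    first indexed and were made confidential later.
    """
    for bug in changed:
        if is_public(bug):
            public_index[bug["bug_id"]] = bug
        else:
            public_index.pop(bug["bug_id"], None)
```

In a real cluster the removal step would also have to delete the stored history for that bug, not just its current state, and "timely" would depend on how often the sync runs.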
Redefining this bug a little to reflect expanded scope. Since metrics would like to decommission their Bugzilla ES cluster entirely, we will need a replacement. This will be a FULL mirror of Bugzilla data, including confidential bugs, and hence any applications using it MUST be behind LDAP. This is how the current metrics-run cluster works. This is in addition to the cluster containing only public bugs. We can use the same ETL software for both; I'm not sure if we will need two separate machines for adequate processing power, though. Filing request for machine separately.
Summary: ElasticSearch cluster for public bugs → Store Bugzilla data in Elastic Search
What will it mean to be behind LDAP? What LDAP accounts will have access to it?
Josh, there are now two parts. The first is a public ES cluster with confidential bugs removed. The second ES cluster includes confidential bugs, but with summary and comments removed; it will remain behind LDAP. I believe anyone with LDAP access should be allowed access to this data, just like the current Metrics setup. Security may have another opinion. Both clusters will be fed by one common piece of ETL code (https://github.com/klahnakoski/Bugzilla-ETL).
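To illustrate how one piece of ETL code could feed both clusters, here is a rough Python sketch. It is not taken from Bugzilla-ETL: the `route` function, the `groups` field used to detect confidential bugs, and the choice to strip `summary` and `comments` only from confidential bugs (rather than from every bug) are all assumptions.

```python
from typing import Any, Dict, Optional, Tuple

Bug = Dict[str, Any]

# Fields named in the comment above as removed from confidential bugs.
REDACTED_FIELDS = ("summary", "comments")


def route(bug: Bug) -> Tuple[Optional[Bug], Bug]:
    """Return (document for the public cluster, document for the LDAP cluster).

    The public document is None for a confidential bug, meaning nothing
    is sent to the public cluster for it.
    """
    confidential = bool(bug.get("groups"))  # hypothetical marker of restricted visibility
    if confidential:
        private_doc = {k: v for k, v in bug.items() if k not in REDACTED_FIELDS}
        return None, private_doc
    # Public bugs go to both clusters unchanged (copied so callers can mutate safely).
    return dict(bug), dict(bug)
```

Keeping the routing decision in a single function means the public/confidential rule is defined once, so the two clusters cannot drift apart on what counts as confidential.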
Yeah, as Kyle said, right now the metrics cluster is accessible by anyone with an LDAP account. We can revisit this if necessary. At least Kyle and I will need access to it for starters.
Please add suggestions
Attachment #828667 - Flags: feedback?(mcote)
Comment on attachment 828667, Architecture Drawing (v1): I think this is good; however, I would much prefer the diagram to be in a standard picture format like PNG.
Attachment #828667 - Flags: feedback?(mcote) → feedback+
Just found out that we *can* create Gliffy diagrams in mana; you just have to have the correct permissions for the space. I suggest creating your own space (in the drop-down, top right) and using Gliffy to publish the diagram there.
Added port numbers and client/server relationships. Showed the LDAP machines as a cloud, and Kyle's ES cluster specifically.
Attachment #828667 - Attachment is obsolete: true
This looks a little off in LibreOffice. Can you export to a PNG?
PNG is easier to view.
Attachment #8337726 - Attachment is obsolete: true
\o/ An example of a public dashboard: http://claw.cs.uwaterloo.ca/~okononen/
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED