Closed
Bug 860256
Opened 11 years ago
Closed 11 years ago
Populate OrangeFactor's dev PHX ES DB with data from the ES SCL3 production instance
Categories
(Tree Management Graveyard :: OrangeFactor, defect)
Tracking
(Not tracked)
RESOLVED
INCOMPLETE
People
(Reporter: emorley, Unassigned)
Details
* OrangeFactor currently uses the Metrics ES instance, to which I have direct access using MPT-VPN.
* Bug 772503 wants us to move to a new SCL3 prod ES instance, to which we're not being allowed access (bug 849161 comment 15 onwards).
* In order to work on OrangeFactor I need access to recent production data - so we'll need to periodically mirror the prod ES DB to dev.

TBD:
* Automatic cron vs a script I can run by hand on brasstacks (machine on which the prod OF instance runs, which does have access to prod).
* How many weeks worth of records to copy across (think we'll only need the last 2-4 weeks worth of some of the tables or whatever ES calls them).
Comment 1•11 years ago
Adding jakem (webops) and removing the DBAs: the ES DB isn't one that the DB team administers; webops does.
Updated•11 years ago
Assignee: server-ops-database → server-ops-webops
Component: Server Operations: Database → Server Operations: Web Operations
QA Contact: cshields → nmaul
Updated•11 years ago
Flags: needinfo?(emorley)
Whiteboard: [triaged 20130411]
Reporter
Comment 2•11 years ago
(In reply to Ed Morley [:edmorley UTC+1] from comment #0)
> TBD:
> * Automatic cron vs a script I can run by hand on brasstacks (machine on
> which the prod OF instance runs, which does have access to prod).
> * How many weeks worth of records to copy across (think we'll only need the
> last 2-4 weeks worth of some of the tables or whatever ES calls them).

For the moment, let's make this a manual script that can be run from brasstacks. The last 4 weeks worth of records would be great.

jgriffin, I'm guessing some of the ES tables will be needed wholesale? I can't find a schema checked in anywhere, do you know which we'll need all of?
Flags: needinfo?(emorley) → needinfo?(jgriffin)
Comment 3•11 years ago
ES doesn't use schemas, per se. The indices OrangeFactor uses are: logs, tbpl, bugs, and bzcache.
Flags: needinfo?(jgriffin)
Reporter
Comment 4•11 years ago
So I'm guessing we'll need all of bugs and bzcache, but just the last 4 weeks of logs and tbpl?
Comment 5•11 years ago
We should just need all of bzcache; the bugs index contains a different view of the starred bugs from TBPL, so we can just copy the most recent 4 weeks of that, as well as logs and tbpl.
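For whoever ends up writing the manual script, the per-index selection described above could be sketched roughly as follows. This is only an illustration: `COPY_WINDOWS`, `build_query`, and especially the `date` field name are assumptions, since the thread does not say which field each index uses for its timestamp.

```python
from datetime import datetime, timedelta, timezone

# Indices listed in comment 3, with the windows suggested in comment 5.
# None means "copy the whole index".
COPY_WINDOWS = {
    "bzcache": None,
    "bugs": timedelta(weeks=4),
    "logs": timedelta(weeks=4),
    "tbpl": timedelta(weeks=4),
}


def build_query(index, now, date_field="date"):
    """Return an ES query body selecting the documents to copy for `index`.

    `date_field` is a hypothetical field name; the real indices may each
    use a different one.
    """
    window = COPY_WINDOWS[index]
    if window is None:
        # Whole index: no date restriction.
        return {"query": {"match_all": {}}}
    cutoff = (now - window).strftime("%Y-%m-%d")
    # Only documents dated on or after the cutoff.
    return {"query": {"range": {date_field: {"gte": cutoff}}}}


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    for name in COPY_WINDOWS:
        print(name, build_query(name, now))
```

Keeping the windows in one table means the 4-week figure can be changed in a single place if it turns out to be too much or too little data.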
Reporter
Comment 6•11 years ago
From IRC, before I lose it:

phrawzty: you can't really, like, "mirror" elasticsearch indexes.
phrawzty: assuming the raw data is still available, you can re-index it into a new instance, for example.
phrawzty: then you have two indices
phrawzty: with the same data :)
phrawzty: in terms of keeping them live, generally speaking the most sane approach is to have the client - which is to say the thing that's actually feeding ES - communicate with both instances
phrawzty: if re-indexing from the original data is not possible, then there are two options
phrawzty: one is to literally copy the entire file system contents of a complete index. this is fine for small indices that aren't sharded across multiple nodes
phrawzty: in The Real World, however, that's not generally possible
edmorley: yeah I can imagine, particularly given the size of the indices here
phrawzty: which leads to the second option, which is performing a scroll re-index from one ES instance directly to another
phrawzty: this generally works, but it can be.. quirky, especially if strange things are being done to the source index. it works, but it can be finicky, is what i'm trying to say.
phrawzty: does.. does that help at all ?
edmorley: yes thank you :-)
edmorley: I'll have a think about this
edmorley: maybe modifying the client is the easiest thing here
edmorley: and then scheduling something to purge the dev instance periodically
phrawzty: so you can set an expire time on indexed documents in ES
edmorley: oh
phrawzty: dunno if that is interesting to your use case, but there you go
edmorley: it may indeed be, thank you for that
phrawzty: http://www.elasticsearch.org/guide/reference/mapping/ttl-field/

--

Moving back to OF so we can decide on the best strategy for this. In addition, I think I'm going to not make this block bug 772503 / bug 848834 any longer, given that OF has been holding them up long enough as it is.
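To make the "scroll re-index" option a bit more concrete, here is a minimal sketch of the copy loop, not a tested tool. It assumes hits shaped like ES search/scroll results of this era (`_type`, `_id`, `_source`) and the newline-delimited format of the bulk API. `fetch_page` and `send_bulk` are hypothetical callables standing in for the actual HTTP requests to the source and destination instances, which are left out here.

```python
import json


def hits_to_bulk(hits, dest_index):
    """Convert one page of scroll hits into a bulk-API request body."""
    lines = []
    for hit in hits:
        # Action line: index the document under the same type and id.
        action = {"index": {"_index": dest_index,
                            "_type": hit["_type"],
                            "_id": hit["_id"]}}
        lines.append(json.dumps(action, sort_keys=True))
        # Source line: the document itself.
        lines.append(json.dumps(hit["_source"], sort_keys=True))
    # The bulk body is newline-delimited and must end with a newline.
    return "\n".join(lines) + "\n" if lines else ""


def copy_index(fetch_page, send_bulk, dest_index):
    """Drain a scroll and forward every page to the destination.

    fetch_page() returns the next list of hits (empty when the scroll is
    exhausted); send_bulk(body) posts a bulk body to the destination.
    Both are hypothetical stand-ins supplied by the caller.
    """
    copied = 0
    while True:
        hits = fetch_page()
        if not hits:
            break
        send_bulk(hits_to_bulk(hits, dest_index))
        copied += len(hits)
    return copied
```

Reusing the source `_id` means re-running the copy overwrites documents rather than duplicating them, which seems like the safer default for a script run by hand.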
Assignee: server-ops-webops → nobody
Component: Server Operations: Web Operations → Orange Factor
Product: mozilla.org → Testing
QA Contact: nmaul
Version: other → Trunk
Reporter
Updated•11 years ago
Summary: Periodically mirror OrangeFactor's production ES SCL3 DB to the dev PHX ES instance → Populate OrangeFactor's dev PHX ES DB with data from the ES SCL3 production instance
Whiteboard: [triaged 20130411]
Reporter
Comment 7•11 years ago
Production: elasticsearch-zlb.webapp.scl3.mozilla.com:9200
Dev: elasticsearch-zlb.dev.vlan81.phx.mozilla.com:9200
Comment 8•11 years ago
If we opt for the write-everything-to-two-ES-instances approach, we'll need to update logparser and bzcache as well so they know how to do this.
Comment 9•11 years ago
And technically this *does* block the move because we need log data for OF to be useful.
Reporter
Comment 10•11 years ago
(In reply to Mark Côté ( :mcote ) from comment #9)
> And technically this *does* block the move because we need log data for OF
> to be useful.

This bug is only about the dev/staging instance, which we're not currently using, so it doesn't block :-)
Comment 11•11 years ago
Yeah, agreed that this shouldn't block us from getting off of the metrics cluster--except that this bug was still blocking on bug 848834, which itself blocks bug 772503. :) I've cleared the blocker list, since as you say we can really do this anytime. We'll come back to this sometime after the migration.

Also agreed that we probably don't have to actually import any data to the dev cluster (unlike the prod cluster (bug 8705590)), as long as we get tbpl and the logparser writing to the dev cluster as well as prod, and (optionally) get the bzcache refresh script writing to dev as well (but only for the last 4 weeks, not 6 months).

We also need to be sure that tbpl, logparser, and the bzcache refresher can all gracefully handle failed writes of various types (host not found, port not open, errors when submitting data, etc.), since this will be a *dev* system and may not always be up or fully functional.
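As a rough illustration of the "gracefully handle failed writes" requirement, a dual-write wrapper might look something like the sketch below. `write_to_both`, `write_prod` and `write_dev` are hypothetical names; the real clients (tbpl, logparser, the bzcache refresher) each have their own write paths, which are not shown in this bug.

```python
import logging

log = logging.getLogger("dual_write")


def write_to_both(doc, write_prod, write_dev):
    """Write `doc` to prod, then best-effort to dev.

    write_prod and write_dev are caller-supplied callables (hypothetical
    stand-ins for the real ES writes). Returns True if the dev write also
    succeeded, False if it failed and was skipped.
    """
    # Prod failures are real failures, so let them propagate.
    write_prod(doc)
    try:
        write_dev(doc)
    except Exception as exc:
        # Deliberately broad: host not found, refused connections and
        # submission errors should all be non-fatal for the dev copy.
        log.warning("dev ES write failed, continuing: %s", exc)
        return False
    return True
```

Writing to prod first means a broken dev instance can never delay or prevent the production write; the cost is that dev may silently lag, so the warning log is the only signal that it has fallen behind.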
No longer blocks: 848834
Reporter
Comment 12•11 years ago
I think we just need to give up on this idea for now - short/medium term much easier just to tunnel to prod data :-)
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → INCOMPLETE
Assignee
Updated•10 years ago
Product: Testing → Tree Management
Updated•4 years ago
Product: Tree Management → Tree Management Graveyard