Closed Bug 860256 Opened 11 years ago Closed 11 years ago

Populate OrangeFactor's dev PHX ES DB with data from the ES SCL3 production instance

Categories

(Tree Management Graveyard :: OrangeFactor, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: emorley, Unassigned)

Details

* OrangeFactor currently uses the Metrics ES instance, to which I have direct access using MPT-VPN.
* Bug 772503 wants us to move to a new SCL3 prod ES instance, to which we're not being allowed access (bug 849161 comment 15 onwards).
* In order to work on OrangeFactor I need access to recent production data - so we'll need to periodically mirror the prod ES DB to dev.

TBD:
* Automatic cron vs a script I can run by hand on brasstacks (the machine on which the prod OF instance runs, which does have access to prod).
* How many weeks' worth of records to copy across (I think we'll only need the last 2-4 weeks' worth of some of the tables, or whatever ES calls them).
adding jakem (webops) and removing the DBAs, the ES db isn't one that the db team administers, webops does.
Assignee: server-ops-database → server-ops-webops
Component: Server Operations: Database → Server Operations: Web Operations
QA Contact: cshields → nmaul
Flags: needinfo?(emorley)
Whiteboard: [triaged 20130411]
(In reply to Ed Morley [:edmorley UTC+1] from comment #0)
> TBD:
> * Automatic cron vs a script I can run by hand on brasstacks (the machine on
> which the prod OF instance runs, which does have access to prod).
> * How many weeks' worth of records to copy across (I think we'll only need
> the last 2-4 weeks' worth of some of the tables, or whatever ES calls them).

For the moment, let's make this a manual script that can be run from brasstacks.

The last 4 weeks' worth of records would be great.

jgriffin, I'm guessing some of the ES tables will be needed wholesale? I can't find a schema checked in anywhere; do you know which we'll need all of?
Flags: needinfo?(emorley) → needinfo?(jgriffin)
ES doesn't use schemas, per se.  The indices OrangeFactor uses are:  logs, tbpl, bugs, and bzcache.
Flags: needinfo?(jgriffin)
So I'm guessing we'll need all of bugs and bzcache, but just the last 4 weeks of logs and tbpl?
We should just need all of bzcache; the bugs index contains a different view of the starred bugs from TBPL, so we can just copy the most recent 4 weeks of that, as well as logs and tbpl.
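Putting the two comments above together, the copy plan could be captured in a small table: all of bzcache, and only the most recent 4 weeks of bugs, logs, and tbpl. A minimal sketch, assuming this plan; the `COPY_PLAN` structure and `cutoff_date` helper are hypothetical illustrations, not part of OrangeFactor:

```python
# Per-index copy plan from the discussion above: bzcache is copied
# wholesale, while bugs, logs, and tbpl only need the last 4 weeks.
# COPY_PLAN and cutoff_date are hypothetical, not OrangeFactor code.

import datetime

COPY_PLAN = {
    "bzcache": None,  # None means copy the whole index
    "bugs": 4,        # number of weeks of history to copy
    "logs": 4,
    "tbpl": 4,
}

def cutoff_date(weeks, today=None):
    """Return the earliest date (inclusive) to copy, or None for a full copy."""
    if weeks is None:
        return None
    today = today or datetime.date.today()
    return today - datetime.timedelta(weeks=weeks)
```

A manual script run from brasstacks could iterate over `COPY_PLAN` and build a date-range filter for each index that has a week limit.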
From IRC, before I lose it:

phrawzty: you can't really, like, "mirror" elasticsearch indexes.
phrawzty: assuming the raw data is still available, you can re-index it into a new instance, for example.
phrawzty: then you have two indices
phrawzty: with the same data :)
phrawzty: in terms of keeping them live, generally speaking the most sane approach is to have the client - which is to say the thing that's actually feeding ES - communicate with both instances
phrawzty: if re-indexing from the original data is not possible, then there are two options
phrawzty: one is to literally copy the entire file system contents of a complete index.  this is fine for small indices that aren't sharded across multiple nodes
phrawzty: in The Real World, however, that's not generally possible
edmorley: yeah I can imagine, particularly given the size of the indices here
phrawzty: which leads to the second option, which is performing a scroll re-index from one ES instance directly to another
phrawzty: this generally works, but it can be.. quirky, especially if strange things are being done to the source index.  it works, but it can be finicky, is what i'm trying to say.
phrawzty: does.. does that help at all ?
edmorley: yes thank you :-)
edmorley: I'll have a think about this
edmorley: maybe modifying the client is the easiest thing here
edmorley: and then scheduling something to purge the dev instance periodically
phrawzty: so you can set an expire time on indexed documents in ES
edmorley: oh
phrawzty: dunno if that is interesting to your use case, but there you go
edmorley: it may indeed be, thank you for that
phrawzty: http://www.elasticsearch.org/guide/reference/mapping/ttl-field/
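The scroll re-index approach phrawzty describes can be sketched directly against ES's HTTP API (the scan/scroll endpoints from the 0.90-era ES that was current at the time). The hostnames are the prod/dev ones given in this bug; the `date` field name and all of the helper functions are assumptions for illustration, not OrangeFactor code:

```python
# Sketch of a scroll re-index: scan documents out of the production
# instance in batches and bulk-index each batch into dev. Stdlib only;
# index layout, the "date" field, and the helpers are hypothetical.

import json
import urllib.request

PROD = "http://elasticsearch-zlb.webapp.scl3.mozilla.com:9200"
DEV = "http://elasticsearch-zlb.dev.vlan81.phx.mozilla.com:9200"

def range_query(field, since):
    """Match documents whose `field` is on or after the `since` date string."""
    return {"query": {"range": {field: {"gte": since}}}}

def bulk_payload(hits):
    """Convert scroll hits into a newline-delimited _bulk request body."""
    lines = []
    for hit in hits:
        lines.append(json.dumps({"index": {"_index": hit["_index"],
                                           "_type": hit["_type"],
                                           "_id": hit["_id"]}}))
        lines.append(json.dumps(hit["_source"]))
    return "\n".join(lines) + "\n"

def http(method, url, body=None):
    """Send one JSON-over-HTTP request and decode the response."""
    req = urllib.request.Request(url, data=body and body.encode(), method=method)
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

def scroll_copy(index, query):
    """Scan `index` on prod and bulk-index each batch of hits into dev."""
    resp = http("POST", "%s/%s/_search?search_type=scan&scroll=5m" % (PROD, index),
                json.dumps(query))
    while True:
        resp = http("POST", "%s/_search/scroll?scroll=5m" % PROD,
                    resp["_scroll_id"])
        hits = resp["hits"]["hits"]
        if not hits:
            break
        http("POST", "%s/_bulk" % DEV, bulk_payload(hits))
```

If the dev copies should also be purged periodically, the legacy `_ttl` mapping field linked above could be set on the copied documents instead of scheduling a separate cleanup job.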

--

Moving back to OF so we can decide on the best strategy for this. In addition, I think I'm going to not make this block bug 772503 / bug 848834 any longer, given that OF has been holding them up long enough as it is.
Assignee: server-ops-webops → nobody
Component: Server Operations: Web Operations → Orange Factor
Product: mozilla.org → Testing
QA Contact: nmaul
Version: other → Trunk
Summary: Periodically mirror OrangeFactor's production ES SCL3 DB to the dev PHX ES instance → Populate OrangeFactor's dev PHX ES DB with data from the ES SCL3 production instance
Whiteboard: [triaged 20130411]
Production: elasticsearch-zlb.webapp.scl3.mozilla.com:9200
Dev: elasticsearch-zlb.dev.vlan81.phx.mozilla.com:9200
If we opt for the write-everything-to-two-ES-instances approach, we'll need to update logparser and bzcache as well so they know how to do this.
And technically this *does* block the move because we need log data for OF to be useful.
(In reply to Mark Côté ( :mcote ) from comment #9)
> And technically this *does* block the move because we need log data for OF
> to be useful.

This bug is only about the dev/staging instance, which we're not currently using, so it doesn't block :-)
Yeah, agreed that this shouldn't block us from getting off of the metrics cluster, except that this bug was still blocking on bug 848834, which itself blocks bug 772503. :)  I've cleared the blocker list, since as you say we can really do this anytime.  We'll come back to this sometime after the migration.

Also agreed that we probably don't have to actually import any data to the dev cluster (unlike the prod cluster (bug 8705590)), as long as we get tbpl and the logparser writing to the dev cluster as well as prod, and (optionally) get the bzcache refresh script writing to dev as well (but only for the last 4 weeks, not 6 months).  We also need to be sure that tbpl, logparser, and the bzcache refresher can all gracefully handle failed writes of various types (host not found, port not open, errors when submitting data, etc.), since this will be a *dev* system and may not always be up or fully functional.
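The dual-write requirement above (prod writes must never be broken by a flaky dev cluster) could be wrapped in a small client-side shim. A minimal sketch, assuming a writer object with an `index` method; the `DualWriter` class and writer interface are hypothetical, not the actual tbpl/logparser/bzcache code:

```python
# Sketch of the dual-write idea: the client writes to prod and also
# attempts dev, but any dev failure is logged and swallowed so a down
# or broken dev cluster never breaks the prod write path.
# DualWriter and its writer interface are hypothetical.

import logging

log = logging.getLogger("dualwrite")

class DualWriter(object):
    def __init__(self, prod, dev):
        self.prod = prod  # must succeed; exceptions propagate
        self.dev = dev    # best effort; failures are logged and ignored

    def index(self, doc):
        self.prod.index(doc)  # a prod failure is still fatal
        try:
            self.dev.index(doc)
        except Exception:
            # dev may be unreachable, refusing connections, or
            # rejecting the write; log it and carry on
            log.warning("dev ES write failed; continuing", exc_info=True)
```

The same wrapper pattern would apply to tbpl, logparser, and the bzcache refresher, covering the failure modes listed above (host not found, port not open, errors when submitting data).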
No longer blocks: 848834
I think we just need to give up on this idea for now - short/medium term much easier just to tunnel to prod data :-)
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → INCOMPLETE
Product: Testing → Tree Management
Product: Tree Management → Tree Management Graveyard