[bunker] test elasticsearch-reindex plugin

RESOLVED INCOMPLETE

Status

Infrastructure & Operations
WebOps: IT-Managed Tools
P4
normal
RESOLVED INCOMPLETE
5 years ago
4 years ago

People

(Reporter: phrawzty, Assigned: zounese)

Tracking

Details

(Reporter)

Description

5 years ago
Let's test the elasticsearch-reindex[1] plugin to see if it can handle our re-indexing processes in a more efficient (read: faster) way.

I'm unclear as to whether this plugin needs to be installed on all of the nodes, just the master(s), or otherwise, so that needs to be determined as well.  Since the ES bunker nodes cannot speak to the Internet, these plugins cannot (and should not) be installed via the command-line tool.  In the past, what I've done is install ES on a test machine (localhost), install the plugins there, and then package whatever got downloaded into an RPM.  Sort of janky, but that's one approach.
(Assignee)

Comment 1

5 years ago
It looks like this plugin needs to be installed on an HTTP-enabled master, so I installed it on node62 using the ES plugin utility (which, I believe, literally copies the files over to the plugin directory). I ran a test using logstash-old-2013.08.27, which is roughly 16 GB, and the reindex process averaged 15 GB/hr. That is still slow, but it would work if we start a reindex immediately when the new index is created. If a reindex takes around 12 hours, we're going to need twice as much storage, which seems feasible if we get the disk upgrade. I'll continue testing with a new index, and I'll write the necessary automation to make this happen. I suggest we run the script on node62 rather than have the admin trigger the run.
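For reference, the back-of-envelope throughput math above can be sketched as follows; the sizes and rates are the ones observed in this comment, and the constant-rate assumption is a simplification (comment 4 notes the rate actually drops for large indices):

```python
def estimate_reindex_hours(index_size_gb: float, rate_gb_per_hr: float) -> float:
    """Estimate wall-clock hours to reindex an index of the given size.

    Assumes throughput stays constant; for 90 GB+ indices it reportedly
    drops to ~11-12 GB/hr, so treat this as a lower bound.
    """
    if rate_gb_per_hr <= 0:
        raise ValueError("rate must be positive")
    return index_size_gb / rate_gb_per_hr

# The 16 GB test index at the observed 15 GB/hr:
print(round(estimate_reindex_hours(16, 15), 2))   # 1.07 hours
# A hypothetical ~180 GB daily index at the same rate would take ~12 hours,
# which is where the "twice as much storage" concern comes from:
print(round(estimate_reindex_hours(180, 15), 1))  # 12.0 hours
```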
(Assignee)

Comment 2

5 years ago
I am currently running a demo reindex of logstash-new-2013.08.28 on node62 in a screen session. I'm working on a Python script that cron can call when a new index is made.
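The actual script isn't attached to this bug, but as a sketch, a cron-callable wrapper might derive the daily index pair like this. The naming pattern is taken from the index names mentioned in this bug; the script path in the comment is purely hypothetical:

```python
from datetime import date

def daily_index_pair(day: date) -> tuple:
    """Return the (old, new) index names for a given day, following the
    logstash-old-YYYY.MM.DD / logstash-new-YYYY.MM.DD pattern in this bug."""
    stamp = day.strftime("%Y.%m.%d")
    return ("logstash-old-" + stamp, "logstash-new-" + stamp)

# cron could run the wrapper daily, just after the new index is created, e.g.:
#   5 0 * * * /usr/local/bin/reindex_daily.py   (hypothetical path)
print(daily_index_pair(date(2013, 8, 28)))
# → ('logstash-old-2013.08.28', 'logstash-new-2013.08.28')
```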
(Reporter)

Comment 3

5 years ago
Since I mentioned in comment #0 that these nodes cannot speak to the web directly, I can only assume that the installation workflow described in comment #1 is incomplete - please provide further details on exactly what files you put where, and how.  Please understand that it is important to be rigorous with these details because, should this solution prove to be acceptable, we will need to replicate it in a more concrete, production-ready way.
Assignee: server-ops-webops → ezounes
Flags: needinfo?(ezounes)
(Assignee)

Comment 4

5 years ago
(In reply to Daniel Maher [:phrawzty] from comment #3)
> Since I mentioned in comment #0 that these nodes cannot speak to the web
> directly, I can only assume that the installation workflow described in
> comment #1 is incomplete - please provide further details on exactly what
> files you put where, and how.  Please understand that it is important to be
> rigorous with these details because we will - should this solution prove to
> be acceptable - need to replicate this in a more concrete, production-ready
> way.

Indeed. I'll share exactly what I've done so far to install this plugin, how it is used, and my thoughts for how it should be run.

First of all, the plugin can be found at the following GitHub page:
https://github.com/karussell/elasticsearch-reindex

I installed openjdk-1.7 locally and built the package using apache-maven; the instructions are in the README. I then copied it over to node62 and node63 and installed it using /usr/share/java/elasticsearch/bin/plugin -url file:<plugin_package> -install <name>

The plugin will modify the ES REST api and expose a new route which can start the reindex process. In our case, the request will look like this:

curl -XPUT 'http://node62.bunker.scl3.mozilla.com:9200/logstash-old-YYYY.MM.DD/<doc_type>/_reindex?searchIndex=logstash-new-YYYY.MM.DD' -d '{<optional_filter>}'

The only downside to this plugin is that it leaves the HTTP connection open until it finishes, and it will only reindex one document type at a time. Thus, I propose we trigger multiple reindexes at the same time, which should speed up the reindex process significantly. At the moment it averages 15 GB/hr; however, for 90 GB+ indices it slows to about 11-12 GB/hr. Considering that the bulk of our logs will be of the document type "zlb-access", we shouldn't expect a huge increase in speed. I'll have an actual estimate in a day or so. I'll update the notes with the changes I've made to the ES mapping which make the parallel reindex possible. I'm also working on a simple Python script to do the reindex; we can call it from cron in the same way Bunktate was called. I'll have more updates on this tomorrow.
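The parallel per-type approach proposed above could be sketched roughly as follows. Since each request blocks until its reindex finishes and handles a single document type, one thread per type issues the requests concurrently. The host mirrors the curl example; the document-type list is an assumption for illustration (only "zlb-access" is named in this bug):

```python
import threading
import urllib.request

ES_HOST = "node62.bunker.scl3.mozilla.com:9200"  # HTTP-enabled master, per comment 1
DOC_TYPES = ["zlb-access", "syslog"]  # assumed; "zlb-access" dominates volume

def reindex_url(old_index, new_index, doc_type):
    """Build the _reindex route exposed by the plugin (mirrors the curl above)."""
    return "http://%s/%s/%s/_reindex?searchIndex=%s" % (
        ES_HOST, old_index, doc_type, new_index)

def trigger(url):
    """Issue one reindex request; blocks until the plugin finishes."""
    req = urllib.request.Request(url, data=b"{}", method="PUT")
    urllib.request.urlopen(req)

def reindex_all_types(old_index, new_index):
    """Run one reindex per document type concurrently."""
    threads = [
        threading.Thread(target=trigger,
                         args=(reindex_url(old_index, new_index, t),))
        for t in DOC_TYPES
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```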
Flags: needinfo?(ezounes)
(Reporter)

Updated

5 years ago
Blocks: 911124
(Assignee)

Comment 5

5 years ago
I'm going to package the plug-in even if we don't end up using it. I've proposed a new strategy which consists of relocate + reindex. We only need to reindex if we care about how many shards indices are stored on; if not, then we don't need the plugin. See bug 914787.
Depends on: 914787
(Reporter)

Comment 6

4 years ago
The plugin was packaged and tested by :zounese to some degree; however, as Bunker is now cancelled, this bug is moot.
Status: NEW → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → INCOMPLETE