Closed Bug 960172 Opened 11 years ago Closed 11 years ago

Upgrade ElasticSearch PHX1 Development cluster to 0.90

Categories

(Infrastructure & Operations :: IT-Managed Tools, task)

x86
macOS
task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: cliang, Assigned: cliang)

References

Details

(Whiteboard: [change - configuration])

Attachments

(1 file)

The version of Elasticsearch running on the development cluster in PHX1 should be upgraded from 0.20.5 to 0.90.10.  This will involve a full shutdown of the cluster.  I'd like to do this on January 30th, starting at 10 AM Pacific Time.  

Please let me know if you have issues / concerns with doing this upgrade.

If you have been CC'ed on this bug, I believe that you either have an index on the ES development cluster or could probably CC the correct person / people to this bug:

       autolog : jgriffin
          logs : emorley (OF testing?)
          mdn* : lcrouch
    mozillians : hoosteeno
   *inputindex : willkg
         sumo* : willkg
          xtag : aschaar [the 'really' index looks like a copy of xtag?] 

There have been a number of breaking changes in the intermediate versions, so there may need to be some application changes as a result of this upgrade.
(In reply to C. Liang [:cyliang] from comment #0)
>           logs : emorley (OF testing?)

We're not actively using the dev instance, so this is fine with us whenever :-)
>       autolog : jgriffin

We aren't routinely using the dev instance either.
cc:ing Ricky since losing ES for a while has implications for SUMO.

For Input, this all sounds fine to me! I look forward to our new 0.90.10 overlords!
cc:ing Mike before I forget again.
needinfo? :jezdez to verify for MDN, but it looks like we only use this cluster for -dev and -stage servers so it should be fine.
Flags: needinfo?(jezdez)
:groovecoder Yep, 0.90.10 sounds great!
Flags: needinfo?(jezdez)
No issues for SUMO for taking the *dev* cluster offline for a bit. For prod, we'll have to figure out what we can do to minimize or avoid any downtime.
(In reply to Ricky Rosario [:rrosario, :r1cky] from comment #7)
> No issues for SUMO for taking the *dev* cluster offline for a bit. For prod,
> we'll have to figure out what we can do to minimize or avoid any downtime.

Likewise for Mozillians.org.
Blocks: 948920
0.90.10 rpm built and put in the Mozilla yum repository

bburton@genericadm: ~
$ yum info elasticsearch    [16:28:03]
Loaded plugins: downloadonly, rhnplugin, security
This system is receiving updates from RHN Classic or RHN Satellite.
Available Packages
Name        : elasticsearch
Arch        : x86_64
Version     : 0.90.10
Release     : 2.el6
Size        : 16 M
Repo        : mozilla
Summary     : A distributed, highly available, RESTful search engine
URL         : http://www.elasticsearch.com
License     : ASL 2.0
Description : A distributed, highly available, RESTful search engine
Ugh.  This did not go well: this cluster is still on 0.20.x.  We may need to wait to see what comes of https://github.com/elasticsearch/elasticsearch/issues/4936 before upgrading again.  (The gist shows the errors that were appearing in the logs.)  I tried to cajole it, but even without loading the plugin, shard initialization would freeze somewhere.  After waiting an hour (with no decrease in unassigned shards), I ended up reverting everything (plugins and all) to 0.20.x versions and a backup copy of the indexes I'd made before the upgrade attempt started.
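For anyone watching a recovery like this one, here is a minimal sketch of how the "no decrease in unassigned shards" check could be automated. It assumes the cluster health endpoint (`/_cluster/health`) returns an `unassigned_shards` count; the function names, the polling interval, and the window size are illustrative choices, not what was actually run during this upgrade.

```python
import json
import time
from urllib.request import urlopen


def shards_stalled(samples, min_samples=3):
    """True if the newest unassigned-shard count is nonzero and no lower
    than the count at the start of the last `min_samples` readings."""
    if len(samples) < min_samples:
        return False
    recent = samples[-min_samples:]
    return recent[-1] > 0 and recent[-1] >= recent[0]


def fetch_unassigned(base_url):
    """Read unassigned_shards from the cluster health endpoint (assumed field name)."""
    with urlopen(base_url.rstrip("/") + "/_cluster/health") as resp:
        health = json.loads(resp.read().decode("utf-8"))
    return health["unassigned_shards"]


def wait_for_recovery(base_url, interval=60, max_polls=60, window=10):
    """Poll until all shards are assigned (True) or progress stalls (False)."""
    samples = []
    for _ in range(max_polls):
        samples.append(fetch_unassigned(base_url))
        if samples[-1] == 0:
            return True
        if shards_stalled(samples, min_samples=window):
            return False
        time.sleep(interval)
    return False
```

Calling `wait_for_recovery("http://localhost:9200")` against one node (the URL is a placeholder) would give a yes/no answer instead of an hour of watching the logs by hand.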
Group: infra
There's a new version of the plugin that should, hopefully, address this issue.  Attempt #2 will, again, involve a full shutdown of the cluster.  I'd like to do this on Tuesday, February 11th, starting at 8 AM Pacific Time.  

Please let me know if you have issues / concerns with doing this upgrade at that time.
The upgrade process seems to have gone much more smoothly this time.  The cluster is currently reporting as healthy.  Can folks please test their applications and let me know if things are working?  (As I mentioned before, there have been a number of breaking changes in the intermediate versions, so there may need to be some application changes as a result of this upgrade. )
(In reply to C. Liang [:cyliang] from comment #12)
> The upgrade process seems to have gone much more smoothly this time.  The
> cluster is currently reporting as healthy.  Can folks please test their
> applications and let me know if things are working?  (As I mentioned before,
> there have been a number of breaking changes in the intermediate versions,
> so there may need to be some application changes as a result of this
> upgrade. )

What applications are using that cluster? Can you print out the list of indexes it has?
On support-dev, we got this brand new indexing error:

MapperParsingException[No analyzer found for [snowball-english] from path [_analyzer]]

Traceback:

IndexingTaskError: Traceback (most recent call last):
  File "/data/www/support-dev.allizom.org/kitsune/kitsune/search/tasks.py", line 162, in index_task
    cls.index(cls.extract_document(id_), id_=id_)
  File "/data/www/support-dev.allizom.org/kitsune/kitsune/wiki/models.py", line 817, in index
    super(cls, cls).index(document, **kwargs)
  File "/data/www/support-dev.allizom.org/kitsune/kitsune/search/models.py", line 130, in index
    super(SearchMappingType, cls).index(*args, **kwargs)
  File "/data/www/support-dev.allizom.org/kitsune/vendor/src/elasticutils/elasticutils/__init__.py", line 1873, in index
    force_insert=force_insert)
  File "/data/www/support-dev.allizom.org/kitsune/vendor/src/pyelasticsearch/pyelasticsearch/client.py", line 96, in decorate
    return func(*args, query_params=query_params, **kwargs)
  File "/data/www/support-dev.allizom.org/kitsune/vendor/src/pyelasticsearch/pyelasticsearch/client.py", line 344, in index
    query_params)
  File "/data/www/support-dev.allizom.org/kitsune/vendor/src/pyelasticsearch/pyelasticsearch/client.py", line 255, in send_request
    self._raise_exception(resp, prepped_response)
  File "/data/www/support-dev.allizom.org/kitsune/vendor/src/pyelasticsearch/pyelasticsearch/client.py", line 269, in _raise_exception
    raise error_class(response.status_code, error_message)
ElasticHttpError: (400, u'MapperParsingException[No analyzer found for [snowball-english] from path [_analyzer]]')
Attached file indexes.txt
List of indexes
I've uploaded a list of indexes, in case that's helpful for anyone.  The original post lists a rough breakdown of the indexes into "buckets", but I'm not 100% sure which app is using which bucket.
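In case it helps with matching indexes to owners, here is a rough sketch that sorts a list of index names into the buckets from the original post. The patterns are copied from that post; the grouping function and the "(unmatched)" label are made up for illustration. The list of names would have to come from the cluster itself (for example, the attached indexes.txt).

```python
from fnmatch import fnmatch

# Bucket patterns as written in the original post of this bug.
BUCKETS = ["autolog", "logs", "mdn*", "mozillians", "*inputindex", "sumo*", "xtag"]


def group_indexes(index_names, patterns=BUCKETS):
    """Assign each index name to the first pattern it matches."""
    groups = {pattern: [] for pattern in patterns}
    groups["(unmatched)"] = []
    for name in sorted(index_names):
        for pattern in patterns:
            if fnmatch(name, pattern):
                groups[pattern].append(name)
                break
        else:
            # No bucket claimed this index; it needs an owner.
            groups["(unmatched)"].append(name)
    return groups


if __name__ == "__main__":
    sample = ["sumo-dev_sumo-20130913", "xtag", "really"]
    for bucket, names in group_indexes(sample).items():
        if names:
            print(bucket, names)
```

Anything that lands in "(unmatched)", such as the 'really' index, is a candidate for asking around about ownership.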


Ricky:

Quick question, to make sure I'm looking at the correct set of indexes: are you working with the sumo-dev_sumo-20130913 indexes?  I do see these settings:

      "index.analysis.analyzer.snowball-english.type" : "snowball",
      "index.analysis.analyzer.snowball-english.language" : "English",
(In reply to C. Liang [:cyliang] from comment #17)
> Ricky:
> 
> Quick question, to make sure I'm looking at the correct set of indexes: are
> you working with the sumo-dev_sumo-20130913 indexes?  I do see these
> settings:
> 
>       "index.analysis.analyzer.snowball-english.type" : "snowball",
>       "index.analysis.analyzer.snowball-english.language" : "English",

Correct! I just triggered a reindex so we *might* be good now. I'll test a little more. Thanks!
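For reference, a small sketch of how one might check which analyzers an index defines, given flattened settings like the two lines quoted above. It assumes each analyzer property is a single trailing key segment (such as `type` or `language`); the function name is hypothetical, and list-valued settings would need extra handling.

```python
import re

# Matches keys such as index.analysis.analyzer.<name>.<property>
ANALYZER_KEY = re.compile(
    r"^index\.analysis\.analyzer\.(?P<name>.+)\.(?P<prop>[^.]+)$"
)


def defined_analyzers(flat_settings):
    """Collect analyzer definitions from a flattened settings mapping."""
    analyzers = {}
    for key, value in flat_settings.items():
        match = ANALYZER_KEY.match(key)
        if match:
            analyzers.setdefault(match.group("name"), {})[match.group("prop")] = value
    return analyzers


# The two settings quoted in this bug for the sumo-dev index.
settings = {
    "index.analysis.analyzer.snowball-english.type": "snowball",
    "index.analysis.analyzer.snowball-english.language": "English",
}
print(defined_analyzers(settings))
```

If the analyzer named in a MapperParsingException is missing from the result for the index being written to, the index was probably created without those settings and needs to be recreated or reindexed.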
input-dev and input-stage look ok.
After reindexing support-dev, things seem to be just fine. support-stage looks good so far too.

Creating pages in MDN stage works as well: https://developer.allizom.org/en-US/search?q=jezdez
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED