Fix incorrect field types for "published" field in ES on staging and production

RESOLVED FIXED

Status

Webmaker
MakeAPI
4 years ago
4 years ago

People

(Reporter: pomax, Assigned: jbuck)

Tracking

Details

(Whiteboard: preworkweek)

(Reporter)

Description

4 years ago
According to https://gist.github.com/mjschranz/910b0023ef11b833caa0, the Elasticsearch field type for "published", which should be boolean, is actually "object" on production and a multi-type on staging. This prevents us from using the flag in work that separates saving from publishing.
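For illustration only, a minimal sketch of how a mismatch like this could be detected from the "properties" part of a mapping. The function name and the sample mapping fragments are made up for this example and are not taken from the gist. It assumes that object fields appear in a mapping with nested "properties" and no explicit "type" key.

```python
def mismatched_fields(properties, expected):
    """Return {field: actual_type} for fields whose mapped type differs from expected."""
    problems = {}
    for field, want in expected.items():
        entry = properties.get(field)
        if entry is None:
            actual = "missing"
        elif "type" in entry:
            actual = entry["type"]
        else:
            # Assumption: an entry with nested "properties" and no "type" is an object field.
            actual = "object" if "properties" in entry else "unknown"
        if actual != want:
            problems[field] = actual
    return problems


# Hypothetical mapping fragments, NOT the real staging or production mappings.
broken = {
    "published": {"properties": {"value": {"type": "string"}}},
    "title": {"type": "string"},
}
fixed = {
    "published": {"type": "boolean"},
    "title": {"type": "string"},
}

expected = {"published": "boolean", "title": "string"}
print(mismatched_fields(broken, expected))
print(mismatched_fields(fixed, expected))
```

Running a check like this against the mapping returned by each environment would show whether "published" has the intended type everywhere.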

We need to investigate what is required to update this type on staging and production. Elasticsearch is rather odd in that you can't update a field's type and then do a rolling reindex. Based on the Elasticsearch docs, it seems you'd actually want to make a new index with the correct mapping, insert all the data from the old index into the new index, and then switch over indices.

The "solution" elasticsearch says to use is to consider the to-change field "lost", and to tack on a new field instead with the type that you need. This pollutes the schema, and only passes the buck in terms of a proper solution to "the next time we run into this".

ref: http://www.elasticsearch.org/blog/changing-mapping-with-zero-downtime/

Depending on how much time it would take to have ES generate a new index off of the MakeAPI data, we could simply say "makeapi is now in maintenance for a few hours" and regenerate the index with an updated mapping.
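As a rough sketch of the first step in that approach, this is what the request body for creating the replacement index might look like, with "published" mapped as boolean from the start. The function name and the "make" document type are assumptions for illustration. Older Elasticsearch versions nest the mapping under a document type name, newer ones do not, so the sketch supports both shapes.

```python
import json


def create_index_body(doc_type=None):
    """Build a create-index body that maps "published" as a boolean.

    doc_type is optional: pass a type name (hypothetically "make") for
    Elasticsearch versions that nest mappings under a document type.
    """
    mapping = {"properties": {"published": {"type": "boolean"}}}
    if doc_type is not None:
        mapping = {doc_type: mapping}
    return {"mappings": mapping}


# The body would be sent when creating the new index; copying the documents
# over from the old index is a separate step that is not shown here.
print(json.dumps(create_index_body("make"), indent=2))
```

Only the one field under discussion is mapped here; a real replacement index would need the rest of the schema as well.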
(Reporter)

Updated

4 years ago
Flags: needinfo?(jon)
Flags: needinfo?(david.humphrey)
Flags: needinfo?(cade)
I think Matt is already doing something about this.
Flags: needinfo?(cade) → needinfo?(schranz.m)
Well, what matters here is how we decide to handle bug 919710. We could just do what my patch there does and remove this field from the schema, which would resolve the inconsistencies, although it would make the flag 100% obsolete rather than leaving it in the partial state it's in now.

Chris, with the work you're doing soon to drop Mongo and take a more manual approach to it all with MySQL, I'm thinking we could just avoid worrying about this for now, because it will wind up going away then? In the end, I'm unsure.
Flags: needinfo?(schranz.m) → needinfo?(cade)
(Reporter)

Updated

4 years ago
Depends on: 945865
(Reporter)

Comment 3

4 years ago
I've filed https://bugzilla.mozilla.org/show_bug.cgi?id=945865 as a blocker here, since that's the general blocking case for this specific field. https://bugzilla.mozilla.org/show_bug.cgi?id=919710 is a patch that exists only because we don't have a migration strategy at the moment, so I strongly recommend we shelve 919710 and figure out how to do 945865 first.
(Assignee)

Updated

4 years ago
Assignee: nobody → jon
Status: NEW → ASSIGNED
Flags: needinfo?(jon)
Flags: needinfo?(david.humphrey)
Flags: needinfo?(cade)
(Assignee)

Comment 4

4 years ago
So, I think no matter what, we'll need to have some sort of downtime to create the alias, which appears to be the proper way to do things: http://stackoverflow.com/questions/13851044/is-there-a-smarter-way-to-reindex-elasticsearch

Updated

4 years ago
Whiteboard: preworkweek
(Assignee)

Comment 5

4 years ago
According to Brett, we see the least traffic on weekend mornings. That'd be an ideal time to schedule the downtime.
(Assignee)

Updated

4 years ago
No longer depends on: 945865
(Assignee)

Comment 7

4 years ago
Hooray, this is all done on staging and prod now: https://gist.github.com/jbuck/8636152

Etherpad for WIP things I did: https://etherpad.mozilla.org/FqWy448rQR

There's now an alias backing the index, so this should make updates slightly easier in the future.
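As a sketch of why the alias helps: a later mapping change can be done by building a new index and then repointing the alias in a single request to the `_aliases` endpoint. The alias and index names below are hypothetical, not the ones actually used on staging or prod.

```python
import json


def alias_swap_actions(alias, old_index, new_index):
    """Build an `_aliases` request body that moves an alias between indices.

    Both actions travel in one request, so clients querying the alias do not
    see a moment where it points at nothing.
    """
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


# Hypothetical names for illustration only.
body = alias_swap_actions("makes", "makes_v1", "makes_v2")
print(json.dumps(body, indent=2))
```

Because the application only ever talks to the alias, no client configuration has to change when the underlying index is replaced.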
Status: ASSIGNED → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → FIXED
(Assignee)

Updated

4 years ago
Blocks: 964067