Closed Bug 944816 Opened 11 years ago Closed 10 years ago

Fix incorrect field types for "published" field in ES on staging and production

Categories

(Webmaker Graveyard :: MakeAPI, defect)

x86_64
Windows 7
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: michiel, Assigned: jon)

References

Details

(Whiteboard: preworkweek)

According to https://gist.github.com/mjschranz/910b0023ef11b833caa0, the elasticsearch field type for "published", which should be boolean, is actually type "object" on production, and a multi-type on staging. This is preventing us from making use of the flag in work that separates saving from publishing.
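To make the mismatch concrete, here is a minimal sketch of how the mapped type of a field could be read out of a mapping's "properties" section. The helper name `mapped_type` and both sample mappings are hypothetical, invented for illustration; they are not the contents of the gist above, and the exact shape of a mapping response varies between Elasticsearch versions.

```python
# Hypothetical helper: report how a field is mapped in an Elasticsearch
# "properties" dict. The sample data below is invented for illustration.

def mapped_type(properties, field):
    """Return the mapped type name for `field`, or None if it is unmapped."""
    spec = properties.get(field)
    if spec is None:
        return None
    # Object fields carry nested "properties" and may omit an explicit "type".
    if "type" not in spec and "properties" in spec:
        return "object"
    return spec.get("type")


# What we want: "published" mapped as a boolean.
expected = {"published": {"type": "boolean"}}

# Assumed shape of a broken mapping where "published" became an object.
production_like = {"published": {"properties": {"value": {"type": "boolean"}}}}

print(mapped_type(expected, "published"))
print(mapped_type(production_like, "published"))
```

Running the same check against staging and production would show whether both environments agree before and after any fix.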

We need to investigate what is required to update this type on staging and production. Elasticsearch is rather odd in that you can't update a field's type and then do a rolling reindex. Based on the Elasticsearch docs, it seems you'd actually want to make a new index with the correct mapping, insert all the data from the old index into the new index, and then switch over indices.

The "solution" elasticsearch says to use is to consider the to-change field "lost", and to tack on a new field instead with the type that you need. This pollutes the schema, and only passes the buck in terms of a proper solution to "the next time we run into this".

ref: http://www.elasticsearch.org/blog/changing-mapping-with-zero-downtime/

Depending on how much time it would take to have ES generate a new index off of the MakeAPI data, we could simply say "makeapi is now in maintenance for a few hours" and regenerate the index with an updated mapping.
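A rough sketch of what regenerating the index could involve, assuming the documents are re-sent through the bulk API. It only builds the request bodies and does not talk to a cluster. The index name `makes_v2`, the type name `make`, and both function names are assumptions made up for this example. The `_type` key reflects the Elasticsearch versions of the time; newer versions dropped mapping types.

```python
import json


def create_index_body(doc_type):
    """Body for creating the replacement index with `published` as boolean."""
    return {
        "mappings": {
            doc_type: {"properties": {"published": {"type": "boolean"}}}
        }
    }


def bulk_payload(index, doc_type, docs):
    """Build a newline-delimited bulk body: one action line, one source line."""
    lines = []
    for doc in docs:
        # Keep the id in the action line and out of the document source.
        source = {key: value for key, value in doc.items() if key != "_id"}
        action = {
            "index": {"_index": index, "_type": doc_type, "_id": str(doc["_id"])}
        }
        lines.append(json.dumps(action))
        lines.append(json.dumps(source))
    # The bulk API expects every line, including the last, to end in a newline.
    return "".join(line + "\n" for line in lines)


# Hypothetical document standing in for a record from the MakeAPI data store.
sample_docs = [{"_id": 1, "title": "example make", "published": True}]

print(json.dumps(create_index_body("make"), indent=2))
print(bulk_payload("makes_v2", "make", sample_docs), end="")
```

The mapping has to be in place before the first document is indexed; otherwise Elasticsearch infers the field type from whatever value arrives first.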
Flags: needinfo?(jon)
Flags: needinfo?(david.humphrey)
Flags: needinfo?(cade)
I think Matt is doing something about this already..
Flags: needinfo?(cade) → needinfo?(schranz.m)
Well, what matters here is how we decide to handle bug 919710. We could just do what my patch there does and remove this field from the schema, which would solve the inconsistencies, although by making the flag 100% obsolete rather than leaving it in its current partial state.

With the work you're doing soon, Chris, to dump Mongo and take a more manual approach to it all with MySQL, I'm thinking we could just avoid worrying about this for now, because it will wind up going away then? In the end I'm unsure.
Flags: needinfo?(schranz.m) → needinfo?(cade)
Depends on: 945865
I've filed https://bugzilla.mozilla.org/show_bug.cgi?id=945865 as a blocker here, since that's the general blocking case for this specific field. https://bugzilla.mozilla.org/show_bug.cgi?id=919710 is a patch that exists only because we don't have a migration strategy at the moment, so I strongly recommend we shelve 919710 and figure out how to do 945865 first.
Assignee: nobody → jon
Status: NEW → ASSIGNED
Flags: needinfo?(jon)
Flags: needinfo?(david.humphrey)
Flags: needinfo?(cade)
So, I think no matter what, we'll need to have some sort of downtime to create the alias, which appears to be the proper way to do things: http://stackoverflow.com/questions/13851044/is-there-a-smarter-way-to-reindex-elasticsearch
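For reference, a minimal sketch of the alias switch itself, assuming an alias already exists and points at the old index. The body is what would be sent to the `_aliases` endpoint; the alias and index names (`makes`, `makes_v1`, `makes_v2`) and the function name are hypothetical. The initial downtime comes from the fact that an alias cannot share a name with an existing index, so the index the application currently queries by name has to be replaced once before this kind of swap becomes possible.

```python
import json


def alias_swap_actions(alias, old_index, new_index):
    """Body for POST /_aliases: move `alias` from old_index to new_index."""
    return {
        "actions": [
            # Both actions are sent in one request, so clients querying the
            # alias never see it pointing at nothing.
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


print(json.dumps(alias_swap_actions("makes", "makes_v1", "makes_v2"), indent=2))
```

Once the application reads and writes through the alias, later mapping changes only need a new index plus one of these requests.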
Whiteboard: preworkweek
According to brett, we see the least amount of traffic on weekend mornings. That'd be an ideal time to schedule the downtime.
No longer depends on: 945865
Hooray, this is all done on staging and prod now: https://gist.github.com/jbuck/8636152

Etherpad for WIP things I did: https://etherpad.mozilla.org/FqWy448rQR

There's now an alias backing the index, so this should make updates slightly easier in the future.
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Blocks: 964067