Closed
Bug 838638
Opened 11 years ago
Closed 11 years ago
overhaul search to use separate mapping types
Categories
(support.mozilla.org :: Search, defect, P1)
Tracking
(Not tracked)
RESOLVED
FIXED
2013Q1
People
(Reporter: willkg, Assigned: willkg)
References
Details
(Whiteboard: u=dev c=search p=10 s=2013.9)
Way back when we switched from bucketed search (where the code would do three separate ES searches and then show the results for the kb, then the results for the support forum (aka questions), then the results for the contributor forum (aka forums)) to unified search (where the code would do one single ES search and get back all the data sorted by _score), we decided to implement it using a unified mapping type (aka doctype). The primary reason for this (as I recall) was problems with searching across multiple mapping types (aka doctypes) with ElasticUtils which was at the time using pyes.

Times have changed! ElasticUtils now uses pyelasticsearch and it's possible to search across multiple mapping types.

The current implementation of a unified doctype and the code surrounding it is pretty adamant about things and not very flexible. It'll be difficult to add new mapping types and increasingly difficult to adjust the existing ones. Plus we wrote a bunch of scaffolding for Kitsune that does the same thing that ElasticUtils can do, so we've got a lot of additional code we should remove.

This bug covers:

1. updating to ElasticUtils/pyelasticsearch
2. overhauling search to use multiple mapping types again (one for each Django model)
3. removing the scaffolding in Kitsune that also exists in ElasticUtils
Comment 1•11 years ago
I think we should do it this quarter!
Whiteboard: u=dev c=search p= s=
Target Milestone: --- → 2013Q1
Comment 3•11 years ago (Assignee)
I've already covered item 1 from the list. That was easy-ish and in a different bug. I'm going to tackle item 2 now. I think it entails:

Stage 1:

1. write a KitsuneMappingType class that handles Kitsune peculiarities like the index we write to and keeps track of all registered mapping types
2. write DocumentMappingType, QuestionMappingType and ThreadMappingType subclasses
3. rewrite the command line indexing code to use the MappingType classes
4. rewrite the admin indexing code to use the MappingType classes
5. fix the cron jobs that do indexing
6. rewrite anything else that either does indexing or looks at index stats
7. rewrite all the indexing-related tests

Stage 2:

1. rewrite the view code to search across the various doctypes (this is one step, but is actually a non-trivial project)
2. adjust the "articles like this" code to use the right classes

Because we're writing all new MappingType-based indexing code that is parallel to the SearchMixin-based indexing code, I think we can have both simultaneously. That should make it easier to transition from the old ways to the new ways without taking search down and without writing a bunch of transition code.

Ricky, Mike: Does that look right? Am I missing anything?
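For illustration only, here is a minimal sketch of what "a base class that knows the write index and keeps track of all registered mapping types" could look like. It does not use ElasticUtils and is not the code from the PR; the index name, the doctype names, the `register_mapping_type` decorator and the `get_mapping_types` helper are all assumptions made up for this example.

```python
class KitsuneMappingType:
    """Hypothetical base class; not the actual Kitsune or ElasticUtils code."""

    # Registry of concrete mapping types, keyed by doctype name.
    _registry = {}

    # Assumed index name; the real write index would come from settings.
    index_name = "sumo-example"

    @classmethod
    def get_index(cls):
        """Return the index this mapping type writes to."""
        return cls.index_name

    @classmethod
    def get_mapping_type_name(cls):
        """Return the doctype name; subclasses must override this."""
        raise NotImplementedError


def register_mapping_type(cls):
    """Class decorator that records a mapping type in the registry."""
    KitsuneMappingType._registry[cls.get_mapping_type_name()] = cls
    return cls


def get_mapping_types():
    """Return all registered mapping types, sorted by doctype name."""
    return [cls for _, cls in sorted(KitsuneMappingType._registry.items())]


@register_mapping_type
class DocumentMappingType(KitsuneMappingType):
    @classmethod
    def get_mapping_type_name(cls):
        return "wiki_document"  # assumed doctype name


@register_mapping_type
class QuestionMappingType(KitsuneMappingType):
    @classmethod
    def get_mapping_type_name(cls):
        return "questions_question"  # assumed doctype name


@register_mapping_type
class ThreadMappingType(KitsuneMappingType):
    @classmethod
    def get_mapping_type_name(cls):
        return "forums_thread"  # assumed doctype name
```

With a registry like this, the command line, admin and cron indexing code could all loop over `get_mapping_types()` instead of hard-coding the three models, which is what would make adding a fourth mapping type cheap.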
Comment 4•11 years ago (Assignee)
Stage 1 is more complicated than I thought. In order for the tests to pass, I need to keep the existing indexing system and build a parallel indexing system alongside it rather than replace the existing indexing system with the new one.

Stage 2 is interesting in that creating an S with a mapping type doesn't let us search across doctypes, so we have to create an untyped S. That's fine. It just means we can't put any business logic in our mapping type classes. We didn't really do that before, so it's not a big deal.
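As a rough picture of what an untyped search across doctypes boils down to at the Elasticsearch level: Elasticsearch releases of that era accepted a comma-separated list of types in the search URL, so one request can cover several doctypes and return hits ranked together by _score. The sketch below only builds the request path and body; it does not call ElasticUtils, and the function name, index name and doctype names are assumptions for the example.

```python
import json


def build_search_request(index, doctypes, query_text):
    """Return (path, body) for one search spanning several doctypes.

    Hypothetical helper for illustration; not part of Kitsune or ElasticUtils.
    """
    # Several types are addressed by joining them with commas in the URL.
    path = "/{}/{}/_search".format(index, ",".join(doctypes))
    # A plain query_string query; hits from all types come back in one list.
    body = {"query": {"query_string": {"query": query_text}}}
    return path, json.dumps(body, sort_keys=True)


if __name__ == "__main__":
    path, body = build_search_request(
        "sumo-example",
        ["wiki_document", "questions_question", "forums_thread"],
        "firefox crashes",
    )
    print(path)
    print(body)
```

Because the search is not tied to a single mapping type class, each hit has to be inspected for its doctype before deciding how to render it, which is why business logic can't live on the mapping type classes.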
Comment 5•11 years ago (Assignee)
In the interests of time, I think I'm going to go with the original plan and we'll expect all the tests to fail after the stage 1 commit which will require us to do manual testing (which we do anyhow). I'll write a command to make that easier.

As a side note, while I'm overhauling search, I'm fixing the infrastructure to allow for the following:

1. document-level boosting at index time
2. setting index settings when we create the index
3. indexing things other than support questions, wiki documents and forum threads
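To make the first two items concrete, here is a hedged sketch of the request body an index-creation call could carry: settings and mappings supplied together at creation time, with a per-document boost field declared in the mapping. It assumes the `_boost` mapping field that older Elasticsearch releases supported (it was deprecated in later versions); the function names, the boost field name and the shard/replica defaults are made up for the example and are not taken from Kitsune.

```python
def with_document_boost(properties, boost_field="indexed_boost"):
    """Wrap field properties in a mapping that declares a document-level boost.

    Assumes the legacy `_boost` mapping field; `boost_field` is a hypothetical name.
    """
    return {
        # Documents without the boost field fall back to a neutral boost of 1.0.
        "_boost": {"name": boost_field, "null_value": 1.0},
        "properties": properties,
    }


def build_create_index_body(mappings, number_of_shards=5, number_of_replicas=1):
    """Return the body for creating an index with settings and mappings at once."""
    return {
        "settings": {
            "index": {
                "number_of_shards": number_of_shards,
                "number_of_replicas": number_of_replicas,
            }
        },
        "mappings": mappings,
    }


if __name__ == "__main__":
    body = build_create_index_body(
        {"wiki_document": with_document_boost({"title": {"type": "string"}})}
    )
    print(body)
```

Passing settings at creation time matters because some of them, such as the shard count, can't be changed on an existing index.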
Comment 6•11 years ago (Assignee)
I'm pretty sure I'm done with this:

Stage 1:

1. write a SearchMappingType class that handles Kitsune peculiarities like the index we write to and keeps track of all registered mapping types
2. write DocumentMappingType, QuestionMappingType and ThreadMappingType subclasses
3. rewrite the command line indexing code to index both to the unified doctype and the new MappingType-based doctypes (i.e. we're indexing everything twice for a bit)
4. rewrite the admin indexing code to index both to the unified doctype and the new MappingType-based doctypes
5. fix the cron jobs that do indexing
6. rewrite anything else that either does indexing or looks at index stats

I'm left with this:

7. write tests for MappingType-based doctype indexing

I need to write some tests that exercise the new indexing code to make sure it's working properly. Once I'm done with that, I'll move on to stage 2.
Comment 7•11 years ago (Assignee)
Done! PR: https://github.com/mozilla/kitsune/pull/1345
Comment 8•11 years ago (Assignee)
The deployment plan is this:

1. ask the other devs to land anything incoming now
2. after that, rebase against master and run tests
3. land the code
4. deploy stage 1 commit to support.allizom.org (stage site)
5. go into admin and create new index

   wait forever because staging server is slow

   meanwhile, make sure the site continues to work

   * search from the front page
   * aaq suggestions
   * related documents in the kb
   * questions/stats/ histogram

6. when that's done, deploy stage 2 commit to support.allizom.org (stage site)

   make sure the site continues to work

   * search from the front page
   * aaq suggestions
   * related documents in the kb
   * questions/stats/ histogram

If that's all good, let's do production, but during "off hours". Possibly later tonight (Wednesday May 8th).
Comment 9•11 years ago (Assignee)
Adjusted plan:

1. ask the other devs to land anything incoming now
2. after that, rebase against master and run tests
3. land the code
4. deploy stage 1 commit to support.allizom.org (stage site)
5. go into admin and create new index

   wait forever because staging server is slow

   meanwhile, make sure the site continues to work

   * search from the front page
   * aaq suggestions
   * related documents in the kb
   * questions/stats/ histogram
   * https://support.mozilla.org/en-US/products/firefox/get-started
   * run qa staging tests

6. when that's done, deploy stage 2 commit to support.allizom.org (stage site)

   make sure the site continues to work

   * search from the front page
   * aaq suggestions
   * related documents in the kb
   * questions/stats/ histogram
   * https://support.mozilla.org/en-US/products/firefox/get-started
   * run qa staging tests

If that's all good, let's do production, but during "off hours". Possibly later tonight (Wednesday May 8th).
Comment 10•11 years ago (Assignee)
Landed in production:

https://github.com/mozilla/kitsune/commit/e9c14eb
https://github.com/mozilla/kitsune/commit/72a50c1
https://github.com/mozilla/kitsune/commit/32966f5
https://github.com/mozilla/kitsune/commit/8726ee3

YAY!!!!!11!

Making this a 10 pointer because it took me the greater part of this quarter to unblock this work and then finally do it.
Status: NEW → RESOLVED
Closed: 11 years ago
Priority: -- → P1
Resolution: --- → FIXED
Whiteboard: u=dev c=search p= s= → u=dev c=search p=10 s=2013.9
Comment 11•11 years ago
lulz 10pt omg