Closed
Bug 1065380
Opened 11 years ago
Closed 10 years ago
rework bigrams/trigrams generation to use shingle analyzer
Categories
(Input Graveyard :: Backend, defect)
Tracking
(Not tracked)
RESOLVED
WONTFIX
People
(Reporter: willkg, Unassigned)
Details
We currently generate bigrams "by hand" in Python.
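For context, hand-rolled n-gram generation in Python tends to look something like the sketch below. This is not the actual Input code. The function names and the whitespace tokenization are assumptions made for illustration only.

```python
def ngrams(tokens, n):
    """Return every run of n consecutive tokens, joined with spaces."""
    # A list shorter than n yields an empty range and so an empty list.
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bigrams(text):
    # Assumption: naive tokenization (lowercase, split on whitespace).
    # Real code would likely also handle punctuation and stopwords.
    return ngrams(text.lower().split(), 2)


def trigrams(text):
    # Same idea with a window of three tokens.
    return ngrams(text.lower().split(), 3)
```

Every document pays for this extra pass in Python before it is sent to Elasticsearch, and adding trigrams means a second pass over the same tokens.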
Elasticsearch has a shingle token filter that can generate bigrams for us, and it works alongside stopword handling and the other analysis steps. Blog post here:
http://www.elasticsearch.org/blog/searching-with-shingles/
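A rough sketch of what the index settings could look like, written as the Python dict we would pass when creating the index. The names `shingle_filter` and `shingle_analyzer` and the helper `build_shingle_settings` are hypothetical. The filter options (`min_shingle_size`, `max_shingle_size`, `output_unigrams`) are the ones the shingle token filter documents, but the values here are untested guesses for our data.

```python
def build_shingle_settings(min_size=2, max_size=3, output_unigrams=False):
    """Build an index "analysis" settings dict with a shingle analyzer.

    Hypothetical helper: the filter and analyzer names are made up.
    """
    return {
        "analysis": {
            "filter": {
                "shingle_filter": {
                    "type": "shingle",
                    "min_shingle_size": min_size,   # 2 = bigrams
                    "max_shingle_size": max_size,   # 3 = trigrams
                    "output_unigrams": output_unigrams,
                }
            },
            "analyzer": {
                "shingle_analyzer": {
                    "type": "custom",
                    "tokenizer": "standard",
                    # Filters run in order; shingles are built last.
                    "filter": ["lowercase", "shingle_filter"],
                }
            },
        }
    }
```

A field mapped with this analyzer would get bigrams and trigrams at index time with no Python-side n-gram code. A stopword filter could be added to the chain ahead of `shingle_filter`, though how it interacts with shingles would need testing.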
We should switch. That would probably cut down on indexing time, and it would make it possible for us to do trigrams, which we weren't generating because it was so expensive.
Comment 1 • 10 years ago (Reporter)
We decided to remove bigram/trigram computation code. That work is being done in bug #1215105. Closing this as WONTFIX.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → WONTFIX
Updated • 9 years ago (Assignee)
Product: Input → Input Graveyard