Closed Bug 1065380 Opened 11 years ago Closed 10 years ago

rework bigrams/trigrams generation to use shingle analyzer

Categories

(Input Graveyard :: Backend, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: willkg, Unassigned)

Details

We currently generate bigrams "by hand" in Python. Elasticsearch has a shingle filter that can do this for us and also handle stopwords and the other analysis steps. Blog post here: http://www.elasticsearch.org/blog/searching-with-shingles/ We should switch. That would probably cut down on indexing time, and it would make it possible for us to generate trigrams, which we weren't doing because it was so expensive.
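For illustration only, here is a rough sketch of the two approaches being compared. The helper `word_ngrams` is a hypothetical stand-in for the "by hand" Python code (it is not the actual Input code), and the names `shingle_filter` and `shingle_analyzer` are made up for this example. The settings dict uses the standard Elasticsearch shingle token filter options (`min_shingle_size`, `max_shingle_size`, `output_unigrams`) and would be passed as index settings at index creation time.

```python
def word_ngrams(text, n):
    """Return word-level n-grams of ``text`` as space-joined strings.

    Hypothetical stand-in for generating bigrams/trigrams by hand.
    """
    tokens = text.lower().split()
    # Slide a window of n tokens across the token list.
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Assumed index settings: a shingle filter emitting bigrams and trigrams.
# "shingle_filter" and "shingle_analyzer" are illustrative names only.
SHINGLE_SETTINGS = {
    "analysis": {
        "filter": {
            "shingle_filter": {
                "type": "shingle",
                "min_shingle_size": 2,
                "max_shingle_size": 3,
                "output_unigrams": False,
            }
        },
        "analyzer": {
            "shingle_analyzer": {
                "type": "custom",
                "tokenizer": "standard",
                "filter": ["lowercase", "shingle_filter"],
            }
        },
    }
}
```

With settings along these lines, the n-grams would be produced by Elasticsearch during analysis rather than computed in Python before indexing.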
We decided to remove the bigram/trigram computation code instead. That work is being done in bug #1215105. Closing this as WONTFIX.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → WONTFIX
Product: Input → Input Graveyard