Closed Bug 927235 Opened 12 years ago Closed 12 years ago

Exact-name search not working

Categories

(Marketplace Graveyard :: Search, defect, P2)

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: cvan, Assigned: robhudson)

Details

We've gotten complaints on IRC about this, and I've been noticing it a lot in the past week or so.

https://marketplace.firefox.com/app/wiadomosci?src=dt-pl-popular

Try searching for this app by its name "INTERPIA.PL":

https://marketplace.firefox.com/api/v1/apps/search/?dev=firefoxos&lang=en-US&q=interpia.pl
https://marketplace.firefox.com/api/v1/apps/search/?dev=firefoxos&lang=en-US&q=INTERPIA.PL

No luck. Now try its slug:

https://marketplace.firefox.com/api/v1/apps/search/?dev=firefoxos&lang=en-US&q=wiadomosci

That works.

/me wonders if this was a recent regression
https://marketplace.firefox.com/app/movistar-recomienda also not working. /me wonders if related to `default_locale` not being `en`
Assignee: nobody → robhudson.mozbugs
(In reply to Christopher Van Wiemeersch [:cvan] from comment #0)
> Try searching for this app by its name "INTERPIA.PL":
>
> https://marketplace.firefox.com/api/v1/apps/search/?dev=firefoxos&lang=en-US&q=interpia.pl
> https://marketplace.firefox.com/api/v1/apps/search/?dev=firefoxos&lang=en-US&q=INTERPIA.PL

It's "interia.pl" btw, not "interpia.pl". This works:

https://marketplace.firefox.com/api/v1/apps/search/?dev=firefoxos&lang=en-US&q=interia.pl
(In reply to Rob Hudson [:robhudson] from comment #3)
> (In reply to Christopher Van Wiemeersch [:cvan] from comment #0)
> > Try searching for this app by its name "INTERPIA.PL":
> >
> > https://marketplace.firefox.com/api/v1/apps/search/?dev=firefoxos&lang=en-US&q=interpia.pl
> > https://marketplace.firefox.com/api/v1/apps/search/?dev=firefoxos&lang=en-US&q=INTERPIA.PL
>
> It's "interia.pl" btw, not "interpia.pl". This works:
>
> https://marketplace.firefox.com/api/v1/apps/search/?dev=firefoxos&lang=en-US&q=interia.pl

Ah good call. However, I'm still seeing the issue described in comment 2:

3 results:
https://marketplace.firefox.com/api/v1/apps/search/?_user=---&lang=es&q=movistar-recomienda&region=None

1 result:
https://marketplace.firefox.com/api/v1/apps/search/?_user=---&lang=en-US&q=movistar-recomienda&region=None
(In reply to Christopher Van Wiemeersch [:cvan] from comment #2)
> Interesting…
>
> https://marketplace.firefox.com/api/v1/apps/search/?_user=---&lang=es&q=movistar-recomienda&region=None
>
> makes it show up. Notice lang=es

Using this app as the example, whose name is "Recomienda", neither of these works as expected:

https://marketplace.firefox.com/api/v1/apps/search/?lang=en-US&q=recomienda
https://marketplace.firefox.com/api/v1/apps/search/?lang=es&q=recomienda

I'm looking at the actual query sent to elasticsearch to get a hint as to why.
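For anyone repeating the comparison above, here is a minimal sketch that builds the same search URLs for several locales so the results can be diffed side by side. The helper name `search_url` is made up for illustration; only the endpoint and the `lang` and `q` parameters come from the URLs in this bug.

```python
from urllib.parse import urlencode

# Endpoint taken from the URLs quoted in this bug.
BASE = "https://marketplace.firefox.com/api/v1/apps/search/"


def search_url(query, lang):
    """Hypothetical helper: build a search URL for one locale."""
    return BASE + "?" + urlencode({"lang": lang, "q": query})


# Same query, two locales, as in the comment above.
urls = [search_url("recomienda", lang) for lang in ("en-US", "es")]
for url in urls:
    print(url)
```

Fetching each URL and comparing the returned slugs would show whether `lang` alone changes the result set.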
(In reply to Christopher Van Wiemeersch [:cvan] from comment #4)
> 3 results:
> https://marketplace.firefox.com/api/v1/apps/search/?_user=---&lang=es&q=movistar-recomienda&region=None

I actually just get one result here. And the one result doesn't include the "Recomienda" app because that app is excluded in region=2 and the query it is performing is coming through with '{"not": {"filter": {"term": {"region_exclusions": 2}}}}'.

> 1 result:
> https://marketplace.firefox.com/api/v1/apps/search/?_user=---&lang=en-US&q=movistar-recomienda&region=None

I get zero on this one. Same reason about the excluded regions coming through to the backend.
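To make the effect of that filter concrete, here is a rough in-memory sketch of what a "not term" filter on `region_exclusions` does to a result set. The function names and the sample documents are assumptions for illustration (only the filter shape and the fact that the app is excluded in region 2 come from the comment above); this is not the Marketplace code or the Elasticsearch implementation.

```python
import json


def region_exclusion_filter(region_id):
    """Build the filter fragment quoted in the comment above."""
    return {"not": {"filter": {"term": {"region_exclusions": region_id}}}}


def apply_filter(docs, flt):
    """Hypothetical stand-in for the search backend: drop every document
    whose region_exclusions list contains the filtered region id."""
    excluded = flt["not"]["filter"]["term"]["region_exclusions"]
    return [d for d in docs if excluded not in d.get("region_exclusions", [])]


# Sample documents are made up; "example-app" is not a real listing.
docs = [
    {"slug": "movistar-recomienda", "region_exclusions": [2]},
    {"slug": "example-app", "region_exclusions": []},
]

flt = region_exclusion_filter(2)
print(json.dumps(flt))
print([d["slug"] for d in apply_filter(docs, flt)])
```

An app excluded in the requester's region never reaches the results, no matter how well its name or slug matches the query text.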
One discovery is this: For searches that contain characters like "." or "-", elasticsearch splits them while indexing. So "movistar-recomienda" is actually stored as 2 tokens in the index, based on the slug:

$ curl -s $ES/apps/_analyze -d 'movistar-recomendia' | json
{
    "tokens": [
        {
            "end_offset": 8,
            "position": 1,
            "start_offset": 0,
            "token": "movistar",
            "type": "<ALPHANUM>"
        },
        {
            "end_offset": 19,
            "position": 2,
            "start_offset": 9,
            "token": "recomendia",
            "type": "<ALPHANUM>"
        }
    ]
}

For the name field, since the name is only in the Spanish locale, it uses a different analyzer which stores "movistar" and "recomendi" b/c of different stemming rules.

A couple things I've noted:

1. For the app slug, tokenizing is probably not what we want. I can update the mapping so app slugs don't get tokenized and an exact match query can work better for slugs. However, this may break the opposite way: if "movistar-recomendia" is stored and we search for "recomendia", it won't match because the slug wasn't stored as 2 tokens. For slugs, I'm curious if this might be an ok trade-off, especially since "recomendia" may match other fields, like the name on which the slug is usually based.

2. For names, I don't think the above trade-off makes sense. I think we do want tokenizing, and the trade-off for names is that matches will need to match on the tokens, not the original string (and technically wouldn't be an "exact" match).
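The trade-off in point 1 can be sketched in a few lines of Python. This is a crude stand-in, assuming an analyzer that lowercases and splits on any non-alphanumeric character; the real Elasticsearch analyzers have more nuanced rules (and stemming, for the name field), and the function names here are made up for illustration.

```python
import re


def tokenize(text):
    """Rough approximation of an analyzer: lowercase and split on
    non-alphanumeric characters, keeping character offsets."""
    return [
        {"token": m.group(0).lower(),
         "start_offset": m.start(),
         "end_offset": m.end()}
        for m in re.finditer(r"[^\W_]+", text)
    ]


def matches_tokenized(query, stored):
    """Tokenized field: every query token must be among the stored tokens."""
    stored_tokens = {t["token"] for t in tokenize(stored)}
    query_tokens = [t["token"] for t in tokenize(query)]
    return bool(query_tokens) and all(t in stored_tokens for t in query_tokens)


def matches_untokenized(query, stored):
    """Untokenized field: only the whole string matches."""
    return query.lower() == stored.lower()


slug = "movistar-recomendia"  # spelled as in the _analyze call above
print([t["token"] for t in tokenize(slug)])
print(matches_tokenized("recomendia", slug))    # partial slug, tokenized field
print(matches_untokenized("recomendia", slug))  # partial slug, untokenized field
print(matches_untokenized(slug, slug))          # full slug, untokenized field
```

With the tokenized field a partial slug matches but the match is not exact; with the untokenized field only the full slug matches, which is the trade-off described in point 1.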
I've experimented with changing the app_slug analyzer and running some queries and can't reproduce the problems described above. I believe the ultimate cause of confusion is that the app in question is not showing up because of region exclusions. Is it possible to find another example that is more clear that exact searching is not working?
This looks good to me:

https://marketplace.firefox.com/api/v1/apps/search/?_user=---&lang=es&q=movistar&region=es

and

https://marketplace.firefox.com/api/v1/apps/search/?_user=---&lang=en-US&q=movistar&region=es

return the same thing.

This was likely my imagination. If I ever have issues again, I'll reopen, but I think we're good here.

Thanks, Rob, for investigating this. Much appreciated!
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → WORKSFORME