Closed Bug 778437 Opened 12 years ago Closed 5 years ago

Display localized content before English one

Categories

(support.mozilla.org :: Search, defect, P2)

Tracking

(Not tracked)

RESOLVED WONTFIX
Future

People

(Reporter: scoobidiver, Unassigned)

References

Details

(Whiteboard: u=user c=search p= s=2013.backlog)

When you search for a term that is the same in your language and in English, the localized content should be displayed first (localized articles now, and localized questions in the future - F9).

See https://support.mozilla.org/fr/search?q_tags=desktop&product=desktop&q=babylon: the right answer is https://support.mozilla.org/fr/kb/resoudre-problemes-firefox-logiciels-malveillants, which shows up only on the 4th page.
Scoobi, correct me if I'm wrong, but the article you have linked doesn't have the word "babylon" in it, right? We'd need to add that as a keyword, and then it would show up higher, because we rate KB articles higher than forum threads. Also, currently we don't show any English articles on locale pages.
(In reply to Kadir Topal [:atopal] from comment #1)
> Scoobi, correct me if I'm wrong, but the article you have linked doesn't
> have the word "babylon" in it, right?
You haven't checked keywords for this article.

> because we rate KB articles higher than forum threads
That's not true, at least for this example. The first KB article shows up in the third page.
Scoobi, you are right, I didn't see that the keywords in French included much, much more than the English version.

Will, do we base the weight of a keyword on the total number of words in the keyword section? Or in other words: do we assign a lower priority to keywords that are 1 in 20 than to keywords that are, say, 1 in 2?
Kadir: Talk to Matt. He can run the searches and look at the score explanation and tell you what it's doing.

Alternatively, if you want me to do it, we should add a 1 point research bug to the sprint.
Matt, did you maybe try this already?
Kadir: All input fields have a normalization applied based on the ratio of entries in the field. If you have 30 keywords and you match 1, it will have a lower score than matching 1 out of 2 keywords. That is why shorter articles tend to rank higher in search results. There are things we can do to combat this. One of them is going to be to clean up keywords, as they were previously being used for a different purpose. Verdi and I have already done some work here. Does that answer the question?
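To make the normalization concrete, here is a minimal sketch. It assumes a Lucene-style length norm of 1/sqrt(number of terms in the field), as in Lucene's classic TF-IDF similarity; the thread does not say which formula SUMO's search actually uses, and `keyword_field_score` is a hypothetical helper, not SUMO code.

```python
import math


def keyword_field_score(keywords, query_term):
    """Toy score for a single field: sqrt(term frequency) * length norm.

    Assumption: the length norm is 1 / sqrt(number of terms in the field).
    IDF, boosts and every other field are deliberately left out.
    """
    if not keywords:
        return 0.0
    freq = sum(1 for kw in keywords if kw == query_term)
    length_norm = 1.0 / math.sqrt(len(keywords))
    return math.sqrt(freq) * length_norm


# Two made-up keyword fields that both match "babylon" exactly once.
short_field = ["babylon", "malware"]                      # 1 match out of 2
long_field = ["babylon"] + [f"kw{i}" for i in range(29)]  # 1 match out of 30

print(keyword_field_score(short_field, "babylon"))
print(keyword_field_score(long_field, "babylon"))
```

Under this assumed formula, the field with 2 keywords scores higher than the field with 30, even though both contain the search term once.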
(In reply to Matt G, Mozilla SUMO (irc: Matt_G) from comment #6)
> All input fields have a normalization applied based on the ratio of entries in
> the field.
In my view, the more synonyms (even partial ones, like Babylon for malware) an article has as keywords, the more likely it is to be hit. So I don't think there should be a normalization for keywords, and maybe not for the title either. I am OK with it for the content.

> One of them is going to be the clean up keywords as they were being used for a
> different purpose previously
Except for Mobile and Sync articles, I think keywords are currently well chosen and only fine-tuning is required.

(In reply to Kadir Topal [:atopal] from comment #1)
> we rate KB articles higher than forum threads.
If that were true, it would be the solution to this problem. Not so long ago, search results showed the first ten articles and then the forum threads. Now they are completely mixed up.
Matt, that does answer the question indeed. Thanks!

Scoobi has a point with the keyword field. Keywords will always only be used once and unlike other text the words are not part of statements or sentences. It might be worth changing our scoring for that field.
Please do not make that change. At least not yet. Right now keywords are one of the best tools we have for search tuning. In cases where an article is not ranking as high as we'd like, we can manipulate the content fields only so much. In some cases we need to use the keyword field for "keyword bombing" to boost the scores. Once we have true tailored searches, it won't be an issue. For now though, I'd say we leave it as is.
Scoobi, you were right, we don't give higher weight based on document type, that was just a misconception on my part. 

So, for this bug, there are two options: 

* We can stop showing English forum threads on localized pages
* You can keyword bomb the article, as Matt explained

I wouldn't want to stop showing English forum threads, since for a number of locales English as the fallback is actually okay, and certainly better than having no results at all. That only leaves the manual solution of changing the frequency of keywords.

Scoobi, I'd close this bug based on the information above. Let me know if you have another opinion on this.
(In reply to Kadir Topal [:atopal] from comment #10)
> That only leaves the manual solution of changing the frequency of keywords.
I don't understand what you mean, since kw1 currently has less weight among the keywords kw1, kw2, kw3, kw4 ... kwn than kw1 alone. In my view, kw1 should have the same weight regardless of the number of keywords.
Sorry Scoobi, I missed this. What I meant is: if you want a KB article to show up for a specific keyword, you can just type that keyword several times. This is a hack until we have true tailored searches, but it should give you the result for specific cases.
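A rough sketch of why repeating a keyword can help, under the same kind of assumption as Matt's description: the score grows with sqrt(term frequency) and shrinks with sqrt(field length). `toy_score` and the keyword lists are hypothetical, and whether repeated keywords are really counted depends on how the field is analyzed, which this thread does not specify.

```python
import math


def toy_score(keywords, query_term):
    """Assumed scoring: sqrt(term frequency) / sqrt(number of terms in the field)."""
    if not keywords:
        return 0.0
    freq = keywords.count(query_term)
    return math.sqrt(freq) / math.sqrt(len(keywords))


# Made-up keyword field: 30 keywords, one of which is "babylon".
base = ["babylon"] + [f"kw{i}" for i in range(29)]

# "Keyword bombing": the same field with "babylon" typed three more times.
bombed = base + ["babylon"] * 3

print(toy_score(base, "babylon"))
print(toy_score(bombed, "babylon"))
```

In this toy model the repeated keyword raises the term frequency faster than it lengthens the field, so the bombed field scores higher for that one term.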
I am not asking for a hack for myself, because I know the KB well enough to find the appropriate article. But this behavior pollutes the French support forum even though the answers are in the KB (about one thread a day about uninstalling Babylon!).
I don't think this is still an issue - if an article isn't localized then the English version shows up.
(In reply to Feer56 (Andrew T.) from comment #15)
> I don't this is still an issue
It's still an issue. See https://support.mozilla.org/fr/search?q=sec_error_expired_issuer_certificate (first French article is #3) or https://support.mozilla.org/fr/search?q=incredibar (first French article is #12).

> if an article isn't localized then the English version shows up.
It's not the point of this bug. It's not about browsing articles but searching with a term that exists both in a language and in English.
Yeah, we should get to this in this quarter. Setting the flag accordingly. 

And btw, we don't display English articles in non-English searches, just English forum posts.
Priority: -- → P2
Whiteboard: u=user c=search p= s=2013.backlog
Target Milestone: --- → 2013Q3
I think this might be fixed already? See, for example:

https://support.mozilla.org/fr/search?q=babylon&esab=a&w=1

I don't see any English articles there. I think we may have changed our filtering along the way to only show documents in the locale searched?
(In reply to Ricky Rosario [:rrosario, :r1cky] from comment #18)
> I think this might be fixed already?
No. There are no unlocalized articles in search results (only when navigating). This bug is about English questions that show up above French articles. See comment 16.
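The ordering this bug asks for could be sketched as below. This is only an illustration of the request, not how SUMO search works: the `Result` type, the `localized_first` helper, the titles and the scores are all made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Result:
    title: str
    locale: str
    score: float


def localized_first(results, user_locale):
    """Put results in the user's locale first, then order by descending score."""
    return sorted(results, key=lambda r: (r.locale != user_locale, -r.score))


# Hypothetical result set for a search on /fr/: English threads outscore the French article.
results = [
    Result("English forum thread A", "en-US", 9.1),
    Result("French KB article", "fr", 4.2),
    Result("English forum thread B", "en-US", 7.5),
]

for r in localized_first(results, "fr"):
    print(r.locale, r.score, r.title)
```

With this sort key the French article is listed first despite its lower score, and the English threads stay available as a fallback below it.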
Ah, so bug 885092 is related.
Depends on: 885092
Or duplicated... the problem that we are trying to solve is the same: Threads in English seem to be:
- Not necessary
AND/OR
- Too high on the ranking.
Unblocking because I misunderstood.
No longer depends on: 889890
I am not asking to remove English questions like in bug 885092 because they can be useful for locales with few articles localized. See https://support.mozilla.org/id/search?q=MyStart&product=firefox
I know Scooby...That said...I am.

Taking 3 countries that were mentioned to be fairly anglophone: Netherlands, Germany and Denmark. The first one with a fairly complete localization. The second one half way through and Danish with a fairly non localized KB.

Results:

Netherlands: 12% of the people leave the search results page. But only 2.5% of users click on forum results.
German: 14% of the people leave the search results page. Only 1% of the users actually click on forum results.
Danish: 22% of the people leave the page. 4% of the people select a forum result (for context, that is 73 out of 1600).

With the data about the helpful votes that I shared in the other thread...I advocate for showing results only in the language of the user. If this doesn't exist...don't show it. Or we should auto-translate it.
Adding a new feature has a cost but removing a feature (displayed English questions) that helps some people is not the right thing to do for me.
Maintaining features has 2 hidden costs too (not necessarily present in every case):
- Technical cost: When a feature adds complexity to the code that needs to be kept in mind whenever a new feature is added. Not necessarily a real example: If articles have tags, whenever we modify how search works, we need to consider that articles are tagged. 

- Usability cost: Users not using that particular feature face extra complexity of the product because of it. As a user, you want to have only those features that you need...anything else present in the UI is noise and distraction that works against your positive experience with the site.

We need to be very critical of features that help only 1% of users, because they potentially damage the experience of the other 99%. We need to think about whether we can tackle the problem the feature solves with a different solution, or whether the pain it generates is even worth it.

In this case, if we remove the non-localized results, a user who finds no results in their language but knows English could still find help by changing the language of SUMO. Perhaps we could even suggest switching to English when no results are shown.

Keep in mind that a lot of people don't speak a second language, and showing random results in English (random because they don't understand what the results say) could cause serious confusion. We don't want that.
Considering only the French case, which has few unlocalized articles, I am not against removing English questions from Related questions and search results, because they show questions wrongly posted in French, and that encourages users to ask new questions in French by hacking the locale in /en-US/questions/new. See https://support.mozilla.org/forums/contributors/709017, which was posted soon after the AAQ string localization was implemented but before bug 871589.
Q3 is over... => Q4
Target Milestone: 2013Q3 → 2013Q4
Moving 2013Q4 bugs to the Future since we didn't care enough about them in 2014Q1.
Target Milestone: 2013Q4 → Future
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → WONTFIX