Closed Bug 489386 Opened 15 years ago Closed 15 years ago

Search in Swedish locale for "firefox consumes" suggests bogus English "firefox consumer" article

Categories

(support.mozilla.org :: General, defect)

x86
All
defect
Not set
major

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: stephend, Assigned: ecooper)

References

()

Details

(Whiteboard: sumo_only)

Attachments

(2 files, 1 obsolete file)

If you set your language in Tiki to Swedish (Svenska) and search for "firefox consumes" (https://support-stage.mozilla.org/tiki-newsearch.php?locale=en&q=firefox+consumes&where=all&sa=&filter_lang=1&l=en&en_too=1&lastmodif=0&type=0&author=), you get the results below. (Once Swedish is set as your preferred language in SUMO's preferences, it persists and apparently overrides the locale in the URL.)

Actual Results:

About 2 search results for "firefox consumes" i Svenska and for "firefox consumer" i English

as the suggested results.

Expected Results:

"firefox consumer" shouldn't show up as a suggested result
Assignee: nobody → smirkingsisyphus
Target Milestone: --- → 1.0.2
(In reply to comment #1)
> This is with language-English:
> 
> https://support-stage.mozilla.org/tiki-newsearch.php?locale=en&q=Private+data+is+used+by+the+browser+to+enhance+your+experience&where=f&sa=&filter_lang=1&l=en&en_too=1&lastmodif=0&type=0&author=

&where=f is searching forum threads only, so that might explain why you're not
getting any results for this.
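
When comparing search URLs like the ones in this bug, it can help to pull the query parameters apart. A minimal Python sketch using only the standard library; the helper name `search_params` is made up for illustration, and the only parameter meaning taken from this thread is that where=f restricts the search to forum threads.

```python
from urllib.parse import urlsplit, parse_qs

def search_params(url):
    """Return the query parameters of a search URL as a flat dict."""
    query = urlsplit(url).query
    # keep_blank_values=True so empty parameters such as sa= are not dropped
    parsed = parse_qs(query, keep_blank_values=True)
    # Each parameter appears once in these URLs, so keep the first value
    return {name: values[0] for name, values in parsed.items()}

url = ("https://support-stage.mozilla.org/tiki-newsearch.php"
       "?locale=en&q=firefox+consumes&where=f&sa=&filter_lang=1&l=en&en_too=1")
params = search_params(url)
print(params["q"], params["where"], params["en_too"])
```

Diffing two such dicts makes it easy to see which parameter (for example `where` or `locale`) differs between a URL that reproduces the problem and one that does not.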
I can't seem to replicate this (on staging).

This is what I'm doing:

1. Log in
2. Set preferred language to Swedish.
3. Return to KB home page via 'Knowledge Base' link
4. Search for 'firefox consumes' in the search box from the homepage

I end up with <https://support-stage.mozilla.org/tiki-newsearch.php?where=d&locale=sv-SE&q=firefox+consumes&sa=>, which doesn't reflect the bug summary. That is, I get no results with no recommendations.

Honestly, I think I'm just slow. Can I get a better STR?
I think the key is the filter "All" (KB/forums) in the temporary search UI we have: http://screencast.com/t/DDEXPYPuHu
Just clicking on this link while being logged out is a solid way to reproduce it here: https://support-stage.mozilla.org/tiki-newsearch.php?where=d&locale=sv-SE&q=firefox+consumes&sa=
Attached image: screenshot of what I reproduce (obsolete)
The screencast is wonky for me, so I can't view it properly.

The attached screenshot is what I see when visiting the following two URLs. The first is from my STR, and the second is the one from the bug summary. Both give identical results.

https://support-stage.mozilla.org/tiki-newsearch.php?locale=sv-SE&q=firefox+consumes&where=all&sa=&filter_lang=1&l=sv-SE&en_too=1&lastmodif=0&type=0&author=

https://support-stage.mozilla.org/tiki-newsearch.php?locale=en&q=firefox+consumes&where=all&sa=&filter_lang=1&l=en&en_too=1&lastmodif=0&type=0&author=

:(
This is happening because Google (where we get our translations, apparently) is telling us that 'firefox consumes' in Swedish is 'firefox consumer' in English.

http://translate.google.com/translate_t?sl=sv&tl=en&ie=UTF8&hl=en&text=firefox+consumes
Wait - so this is a feature? Are we still using Google for some services even though the search engine is Sphinx?

Of course, there is no such word as "consumes" in Swedish...
(In reply to comment #8)
> Wait - so this is a feature? Are we still using Google for some services even
> though the search engine is Sphinx?
> 
> Of course, there is no such word as "consumes" in Swedish...

Sphinx can't translate Swedish to English, or anything to another language. Google is used to translate the query when the search is set to search in English as well.
Isn't it better to just search for the exact same phrase? That's what I'd expect as a user, and not some automagical (and sometimes failing) Google translation.
(In reply to comment #10)
> Isn't it better to just search for the exact same phrase? That's what I'd
> expect as a user, and not some automagical (and sometimes failing) Google
> translation.

Maybe? 

If you expect users to search in their native language a majority of the time, then it makes sense to use a translated term. For example, searching English articles for the term 'enthält' will return no results. However, searching English articles for the term 'contains' will.

If you don't expect users to search in their native language, then it's probably unnecessary.  

It's probably worth mentioning that translate.google.com has an 'auto' feature where it tries to detect the 'served' language. So, if you used 'firefox consumes' from a Swedish search, it would see that the originating language is English and not mistranslate (i.e., it would use 'firefox consumes' for the English search). This would probably be more error-prone than the current method, however.
So after a group discussion we think the behavior should be as follows:

- search for "foo" in Swedish etc
- if nothing returned, search for both literal "foo" and translated-to-English "foo" in English locale.

David, does that work for you?
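
The behavior proposed above can be sketched roughly as follows. This is a Python illustration, not the SUMO code: `search` and `translate` are hypothetical callables standing in for the real search backend and translation service, and the index contents are toy data.

```python
def search_with_fallback(query, locale, search, translate):
    """Search in the user's locale; fall back to English only if empty.

    `search(term, locale)` and `translate(term, source_locale)` are
    hypothetical stand-ins, not real SUMO or Sphinx functions.
    """
    results = search(query, locale)
    if results or locale == "en":
        return results
    # Nothing found locally: try the literal query and its translation in English
    terms = [query]
    translated = translate(query, locale)
    if translated and translated != query:
        terms.append(translated)
    merged = []
    for term in terms:
        for hit in search(term, "en"):
            if hit not in merged:  # avoid listing the same article twice
                merged.append(hit)
    return merged

# Toy data: only the literal English phrase matches an article
INDEX = {("en", "firefox consumes"): ["article-1"]}

def fake_search(term, locale):
    return list(INDEX.get((locale, term), []))

def fake_translate(term, source_locale):
    return "firefox consumer"  # the mistranslation reported in this bug

found = search_with_fallback("firefox consumes", "sv-SE", fake_search, fake_translate)
print(found)
```

With the literal term included in the English search, the mistranslated term no longer hides the article that matches what the user typed.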
Target Milestone: 1.0.2 → 1.1
FWIW, top Swedish search terms in April so far:

  #  Term                          Count   Share
  1. Unspecified                  48,921   94.3%
  2. Uniques exceeded                191    0.4%
  3. startsida                        53    0.1%
  4. bokmärken                        42    0.1%
  5. favoriter                        24    0.0%
  6. vista                            22    0.0%
  7. bookmarks                        22    0.0%
  8. flikar                           22    0.0%
  9. cookies                          20    0.0%
 10. language                         19    0.0%
 11. rss                              15    0.0%
 12. export bookmarks                 14    0.0%
 13. exportera bokmärken              14    0.0%
 14. hjälp/om mozilla firefox         13    0.0%
 15. firefox 2                        13    0.0%
 16. filhämtaren                      13    0.0%
 17. ftp                              13    0.0%
 18. java                             12    0.0%
 19. windows 98                       12    0.0%
 20. ssl                              12    0.0%
 21. mail                             12    0.0%
 22. crash                            11    0.0%
 23. bokmärke                         11    0.0%
 24. cache                            11    0.0%
 25. proxy                            11    0.0%
 26. vista 64                         10    0.0%
 27. pdf                              10    0.0%
 28. change language                  10    0.0%
 29. popup                            10    0.0%
 30. rensa                            10    0.0%
(In reply to comment #12)
> So after a group discussion we think the behavior should be as follows:
> 
> - search for "foo" in Swedish etc
> - if nothing returned, search for both literal "foo" and translated-to-English
> "foo" in English locale.
> 
> David, does that work for you?

FWIW, I have a patch ready for this. Just waiting for a nod from djst.
David gave it the thumbs up at this morning's meeting, so please post the patch.
(In reply to comment #12)
> So after a group discussion we think the behavior should be as follows:
> 
> - search for "foo" in Swedish etc
> - if nothing returned, search for both literal "foo" and translated-to-English
> "foo" in English locale.
> 
> David, does that work for you?

I was about to commit this, but a question hit me while I was reviewing it. Why don't we always include results for both 'foo' and 'foo-translated-to-English' in searches from non-English locales? That functionality is already built into tiki-newsearch.php (the en_too parameter is turned on by default).

Or am I misreading this? This is what I get for missing the meeting. :( 

I basically just want to make sure I'm not misinterpreting something, really.
(In reply to comment #16)
> I was about to commit this, but a question hit me while I was reviewing this.
> Why don't we always just want to just include results for both 'foo' and
> 'foo-translated-to-English' to searches from non-English locales? That
> functionality is already built into tiki-newsearch.php (en_too parameter is
> turned on by default).

What you're describing is exactly what we agreed would be the long-term solution (SUMO 1.1); the plan was to do just the foo-translated-to-English thing for now because that was supposed to be the easy fix.

Are you suggesting that doing both 'foo' + 'foo-translated-to-English' is also an easy fix? In that case I'd say go for it!!
Right now we search for:
in locale: "foo"
in English "foo translated to English"

Proposed change:
in locale: "foo"
in English: "foo" + "foo translated to English"
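
The difference between the two plans above can be written out as a list of (locale, term) searches. A small Python sketch; `build_plan` is a hypothetical helper for illustration and does not correspond to anything in tiki-newsearch.php.

```python
def build_plan(query, locale, translated, proposed=True):
    """Return the (locale, term) searches to run for one user query.

    `translated` is the query as translated to English by whatever
    translation service is in use (assumed to be supplied by the caller).
    """
    plan = [(locale, query)]
    if proposed and translated != query:
        # Proposed change: also search English for the literal query
        plan.append(("en", query))
    plan.append(("en", translated))
    return plan

current = build_plan("firefox consumes", "sv-SE", "firefox consumer", proposed=False)
proposed = build_plan("firefox consumes", "sv-SE", "firefox consumer", proposed=True)
print(current)
print(proposed)
```

The check `translated != query` just avoids running the same English search twice when the translation leaves the query unchanged.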
Attached patch v1
This patch searches for 'foo' and 'foo-translated-to-english' in English after searching for 'foo' in the query's original language.

Laura, the searches are run against Sphinx and then combined into one set so as not to interfere with the get_bests_3 algorithm.

Also, I left the 'About 266 search results for "pour" dans Français and for "for" dans English' message at the top of the search results alone, but that may need to be changed.
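
As a rough illustration of combining several result sets into one before ranking, here is a Python sketch. It is not the patch itself: the (doc_id, weight) shape of the matches and the rule of keeping the highest weight for a duplicate are assumptions made for this example, and `combine_matches` is a made-up name.

```python
def combine_matches(*result_sets):
    """Merge several lists of (doc_id, weight) pairs into one list.

    A document matched by more than one query is kept once, with its
    highest weight (an assumption for this sketch).
    """
    best = {}
    for matches in result_sets:
        for doc_id, weight in matches:
            if doc_id not in best or weight > best[doc_id]:
                best[doc_id] = weight
    # Highest weight first; ties broken by doc_id for a stable order
    return sorted(best.items(), key=lambda item: (-item[1], item[0]))

# Toy matches for the literal and the translated English query
literal = [(101, 3.0), (102, 1.5)]
translated = [(102, 2.0), (103, 0.5)]
combined = combine_matches(literal, translated)
print(combined)
```

Handing a single merged list to the ranking step means that step does not need to know how many queries produced the matches.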
Attachment #373982 - Attachment is obsolete: true
Attachment #375272 - Flags: review?(laura)
Attachment #375272 - Flags: review?(laura) → review+
This is in r25958/r25959 and ready for testing.
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
I used a comparison of front page search results to test this since forum topics can obfuscate things sometimes.

https://support.mozilla.com/tiki-newsearch.php?where=d&locale=sv-SE&q=firefox+consumes&sa= 
vs.
https://support-stage.mozilla.org/tiki-newsearch.php?where=d&locale=sv-SE&q=firefox+consumes&sa=

I think getting an actual English result for 'firefox consumes' on stage shows the behavior we want. On prod, you don't get anything.

If this is about the translation message, I mentioned in comment 19 that it hadn't been changed, but probably should be.
David: I need to know if we want to change the output Eric mentions in comment 19 (the last paragraph), before I verify this bug.

Thanks!
Verified FIXED per today's meeting -- we decided it didn't matter at this time.
Status: RESOLVED → VERIFIED
Whiteboard: sumo_only