Closed Bug 1106638 Opened 10 years ago Closed 9 years ago

Remove WP.pl searchplugin

Categories

(Mozilla Localizations :: pl / Polish, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: stef, Assigned: stef)

References

Details

(Keywords: productization)

Attachments

(4 files, 3 obsolete files)

Attached patch rm wp-pl, add ddg-pl (obsolete) — Splinter Review
After discussion on the community-poland mailing list (https://mail.mozilla.org/pipermail/community-poland/2014/000322.html), we would like to replace the WP.pl searchplugin with DDG for fx36+.

WP is using Google results and doesn't offer much value - it is better to use the default Google searchplugin directly. DuckDuckGo (esp. with POST requests, disabled redirects and the results region set) may be an interesting and sufficiently different addition for our users.

This patch removes wp-pl.xml, adds ddg-pl.xml (basically en-US version with request method changed + license + kd, kg and kl params) and updates list.txt accordingly.
Attachment #8530973 - Flags: review?(francesco.lodolo)
I need to put this on hold for a bit, since we're still adding DDG at build time (it's going to be fixed, hopefully for Firefox 37).

I also need to clarify with a few people if it's OK to have a localized variant of DDG (I think it is, as long as we keep them in sync).
Trying to summarize the changes compared to the en-US version (@stef, feel free to correct if I missed/misinterpreted any).

The Polish version has an MPL license header and uses the standard Mozilla format instead of OpenSearch (big fan of this change). Then there are some additional parameters (see https://duckduckgo.com/params).

Suggestions:
* (added) kl=pl-pl - set the region

Search:
* switched method from GET to POST
* (added) kd=-1 - disable redirect
* (added) kg=p - set 'address bar' as POST
* (added) kl=pl-pl - set the region
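
To make the difference between the two request shapes concrete, here is a minimal sketch in plain JavaScript. It is not the search service's own code: the helper name `buildSearchRequest`, the endpoint `https://duckduckgo.com/` and the `q` parameter for the search terms are assumptions for illustration, while `kl`, `kd` and `kg` are the parameters listed above.

```javascript
// Hypothetical helper (not a Firefox API): describes the request a
// searchplugin with the given method and parameters would trigger.
function buildSearchRequest(method, terms, extraParams) {
  // "q" as the search-terms parameter is an assumption for this sketch.
  const params = new URLSearchParams({ q: terms, ...extraParams });
  const endpoint = "https://duckduckgo.com/"; // assumed endpoint
  if (method === "POST") {
    // With POST the parameters travel in the request body, so the URL stays bare.
    return { method, url: endpoint, body: params.toString() };
  }
  // With GET the parameters are appended to the URL as a query string.
  return { method, url: endpoint + "?" + params.toString(), body: null };
}

const plParams = { kl: "pl-pl", kd: "-1", kg: "p" };
const getRequest = buildSearchRequest("GET", "test", plParams);
const postRequest = buildSearchRequest("POST", "test", plParams);
console.log(getRequest.url);
console.log(postRequest.url, postRequest.body);
```

With GET the search terms end up in the URL (and therefore in history and the location bar); with POST the URL carries nothing and the same parameters move to the body.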

While I definitely get the region, I'm not sure about the other two parameters and the switch GET->POST (in terms of compliance with the agreements).

1. Does kg have any practical effect in the searchplugin? I tried this version, then removed the parameter, but couldn't see any visible difference.
2. Isn't the redirect supposed to increase user's privacy? Am I misunderstanding this feature?

Nits on the XML itself:
* I'd use the same order we use for other searchplugins, because it makes it easier to find things in the file (so images before the search urls).
* We can drop the <searchform> element and switch to the attribute on the search Url element.

Making also explicit NI to mconnor after our discussion.
Depends on: 1105092
Flags: needinfo?(mconnor)
(In reply to Francesco Lodolo [:flod] from comment #2)
> While I definitely get the region, I'm not sure about the other two
> parameters and the switch GET->POST (in terms of compliance with the
> agreements).

Changing GET to POST is a huge win for users' privacy: the search terms are then no longer part of the URL, and as such not part of the "metadata", which, as we know since Snowden, is not treated as personal data by the US and many other countries.
DDG without POST is not much of a win for users' privacy, as anyone sitting between the browser and the DDG server, including your provider, knows what you were searching for just by analyzing the URLs.
Forgot: additionally, with POST and https, one would first have to break the TLS encryption to know what you were searching for. https with GET does not make much sense with search engines, as the search parameters in the URL remain plain text - and this is the most private part from a user perspective, not the received results.
(In reply to Adrian Kalla [:adriank] from comment #3)
> Changing GET to POST is a huge win for users privacy

I definitely get the privacy improvement for the user, my doubts are about making sure that this is OK for all involved parties. In that case we might want to apply the same fix to en-US (for example I'm not sure why we don't have a searchform anywhere in en-US).
(In reply to Francesco Lodolo [:flod] from comment #2)
> While I definitely get the region, I'm not sure about the other two
> parameters and the switch GET->POST (in terms of compliance with the
> agreements).

I know nothing about agreements but POST seems to be officially supported by DDG.

The most visible change for the user would be the absence of search urls and tons of redirects in the browser history, which should lead to a "cleaner experience" in the address bar. We have concluded that, unexpected as it might be, it may be an interesting and differentiating factor aligned with DDG's general push for privacy.

> 1. Does kg have any practical effect in the searchplugin? I tried this
> version, then removed the parameter, but couldn't see any visible difference.

It sets the search field on the search results page to use POST among other things (ie no ia=meanings param/redirect when search for test).

> 2. Isn't the redirect supposed to increase user's privacy? Am I
> misunderstanding this feature?

Redirects are supposed to do many things and could also increase user privacy if you rewrite http referrer with redirecting server correctly. Since POST searches do not leak sensitive information via referrer we opt to disable redirects to allow users to do things like copy original search result urls.

> * I'd use the same order we use for other searchplugins, because it makes
> easier to find things in the file (so images before the search urls).

No problem, I just use line wrapping in some editors and pushing images down makes it more readable for me.

> * We can drop the <searchform> element and switch to the attribute on the
> search Url element.

Nothing against that but some documentation would help a lot in general.
(In reply to Adrian Kalla [:adriank] from comment #3)
(In reply to Adrian Kalla [:adriank] from comment #4)

Adrian: just to clarify, you are referring to the status of all of the data after search session (like logs on ddg server) and not to the security of HTTP headers in transit, right?
(In reply to Stefan Plewako [:stef] from comment #6)
> It sets the search field on the search results page to use POST among other
> things (ie no ia=meanings param/redirect when search for test).
Tried without but was still getting a POST request even when I tried to perform a new search from the results page. Can you try too without the parameter? Not sure if it was just some cache issue on my side.

> Redirects are supposed to do many things and could also increase user
> privacy if you rewrite http referrer with redirecting server correctly.
> Since POST searches do not leak sensitive information via referrer we opt to
> disable redirects to allow users to do things like copy original search
> result urls.
OK, makes more sense now.
 
> Nothing against that but some documentation would help a lot in general.
Tried to collect a bit of information, not sure how useful it is.

I suppose the en-US version doesn't have a SearchForm element because that's not part of the OpenSearch standard. 

SearchForm url should be used for empty searches (bug 325913) to point the user to a working search form. Not sure if there are other usages. The switch from element to attribute of the search Url was done in bug 990799 (and bug 557665).
(In reply to Francesco Lodolo [:flod] from comment #8)
> (In reply to Stefan Plewako [:stef] from comment #6)
> > It sets the search field on the search results page to use POST among other
> > things (ie no ia=meanings param/redirect when search for test).
> Tried without but was still getting a POST request even when I tried to
> perform a new search from the results page. Can you try too without the
> parameter? Not sure if it was just some cache issue on my side.

Strange… cookies maybe? I tested this on the latest Nightly with a fresh profile and e10s disabled, installing a modified plugin (name and param changed) with this code:

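Components.utils.import("resource://gre/modules/Services.jsm");
// Install the modified plugin from a local file as a MozSearch engine;
// alert(1) signals success, alert(2) signals failure.
Services.search.addEngine(
  "file:///file/path/ddg-pl.xml",
  Components.interfaces.nsISearchEngine.TYPE_MOZSEARCH,
  "",
  false,
  {
    onSuccess: function () { alert(1); },
    onError: function () { alert(2); }
  }
);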

(i.e. from the web console on about:addons), then setting DDG as the default search engine and performing the test.

The only difference is that ia=meanings is no longer hidden now that I retest this (this can be verified by setting the param that disables opening instant answers, kz=-1), but this doesn't change much (it should be reported to and fixed on the DDG side IMO).
Attached patch rm wp-pl, add ddg-pl (obsolete) — Splinter Review
Updated patch.
Attachment #8530973 - Attachment is obsolete: true
Attachment #8530973 - Flags: review?(francesco.lodolo)
Attachment #8533254 - Flags: review?(francesco.lodolo)
(In reply to Stefan Plewako [:stef] from comment #10)
> Created attachment 8533254 [details] [diff] [review]

With incorrect ShortName, will fix that later.
(In reply to Adrian Kalla [:adriank] from comment #3)
> DDG without POST is not much of a win for users privacy, as anyone sitting
> between the browser and the DDG server, including your provider, knows what
> you were searching for just by analyzing the URLs.

This is incorrect - HTTPS protects the entirety of the HTTP request, including the requested URL. There are user experience downsides to searching with POST (described on the DDG Privacy page: https://duckduckgo.com/privacy), so I don't think using it by default in Firefox is the right tradeoff.
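
A rough illustration of that point, under simplified assumptions: for an HTTPS request, an on-path observer typically learns the server hostname (for example from DNS or the TLS handshake) but not the path, query string or body, which are sent inside the encrypted channel. The function name and the shape of the returned object are hypothetical, and this deliberately ignores side channels such as traffic timing and size.

```javascript
// Hypothetical model of what someone between the browser and the server
// can read directly from a request. Simplified: ignores traffic analysis.
function visibleToOnPathObserver(requestUrl, body) {
  const url = new URL(requestUrl);
  if (url.protocol === "https:") {
    // Path, query string and body all travel inside the TLS channel.
    return { host: url.hostname, path: null, query: null, body: null };
  }
  // Plain HTTP: everything is readable, for GET and POST alike.
  return {
    host: url.hostname,
    path: url.pathname,
    query: url.search,
    body: body === undefined ? null : body
  };
}

console.log(visibleToOnPathObserver("https://duckduckgo.com/?q=test"));
console.log(visibleToOnPathObserver("http://example.com/?q=test"));
```

In this model the GET query string is hidden over HTTPS just like a POST body, so the remaining differences between the two methods are on the client (history, location bar) and in referrers, not on the wire.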
(In reply to :Gavin Sharp [email: gavin@gavinsharp.com] from comment #12)
> There are user experience downsides to
> searching with POST (described on the DDG Privacy page:
> https://duckduckgo.com/privacy)

Which ones?
(In reply to Stefan Plewako [:stef] from comment #13)
> (In reply to :Gavin Sharp [email: gavin@gavinsharp.com] from comment #12)
> > There are user experience downsides to
> > searching with POST (described on the DDG Privacy page:
> > https://duckduckgo.com/privacy)
> 
> Which ones?

From DDG Privacy document: "You can turn on POST requests on our settings page, but it has its own issues. POST requests usually break browser back buttons, and they make it impossible for you to easily share your search by copying and pasting it out of your Web browser's address bar."
(In reply to Zibi Braniecki [:gandalf] from comment #14)
> From DDG Privacy document: "You can turn on POST requests on our settings
> page, but it has its own issues. POST requests usually break browser back
> buttons, and they make it impossible for you to easily share your search by
> copying and pasting it out of your Web browser's address bar."

Maybe POST requests usually break browser back buttons, but in this case I see the back button being far from broken.

The usefulness of sharing by url is already considerably limited by continuous scrolling and by how search engines depend on factors other than the url to customize displayed content in general.

As mentioned earlier, that (with the other changes) should make it an interesting tradeoff (cleaner browser addressbar and overall experience).
(In reply to Stefan Plewako [:stef] from comment #15)
> (In reply to Zibi Braniecki [:gandalf] from comment #14)
> > From DDG Privacy document: "You can turn on POST requests on our settings
> > page, but it has its own issues. POST requests usually break browser back
> > buttons, and they make it impossible for you to easily share your search by
> > copying and pasting it out of your Web browser's address bar."

In addition to those, there's also the potential downside that searches don't appear in your global history. This can be considered a benefit for some users, but probably not most (even in Poland). There are also already other ways to control that (either by configuring DDG or Firefox).

> As mentioned earlier that (with other changes) should make it interesting
> tradeoff (cleaner browser addressbar and overall experience).

There's nothing cleaner about the overall experience with this setting enabled, as far as I can tell, and I think "cleaner location bar" is a tenuous benefit.
As a pretty basic rule, I'm not in favour of landing a separate DDG plugin for any locale.  If these changes have general user benefit, they should be discussed and implemented for all locales, not cherry-picked for specific locales.

On the changes themselves:

* Switching to POST would break the search call-outs in the address bar, which is a new feature I'd prefer to not break.
* We're already storing search history from search bar/about:home/about:newtab separate from browser history, so I'm doubly unconvinced on the value.  Especially if it has negative interactions with the back button (even if it's rare).
* I'm not at all convinced killing the redirect is a good idea.  Tracking clicks is how search engines optimize results, especially the long tail, so it's hardly a huge win.
* On a personal level, I'm not at all convinced that using MozSearch in preference to OpenSearch is a good idea.  Especially now that <searchform> is deprecated.
* region detection should be automatic.  I don't think we need to force this on our side, but if we do we should fix it globally.
(In reply to :Gavin Sharp [email: gavin@gavinsharp.com] from comment #16)
> In addition to those, there's also the potential downside that searches
> don't appear in your global history. This can be considered a benefit for
> some users, but probably not most (even in Poland).

Keep in mind that most users will use the default Google and nothing else (esp. in Poland). With an interesting searchplugin we may be able to attract some privacy-oriented group (not a very popular topic in Poland), but it will be a slow process.

> There are also already other ways to control that (either by configuring DDG or Firefox)

I don't know how I could control that while using the provided searchplugin.

> There's nothing cleaner about the overall experience with this setting
> enabled, as far as I can tell, and I think "cleaner location bar" is a
> tenuous benefit.

Either there is nothing or it is tenuous. Most users may not interact with the address bar too much, but I don't think this group is the primary DDG target.

(In reply to Mike Connor [:mconnor] from comment #17)
> As a pretty basic rule, I'm not in favour of landing a separate DDG plugin
> for any locale.  If these changes have general user benefit, they should be
> discussed and implemented for all locales, not cherry-picked for specific
> locales.
> 
> On the changes themselves:
> 
> * Switching to POST would break the search call-outs in the address bar,
> which is a new feature I'd prefer to not break.

I do not really know what search call-outs are, but this sounds like a bug in that feature.

> * We're already storing search history from search
> bar/about:home/about:newtab separate from browser history, so I'm doubly
> unconvinced on the value.  Especially if it has negative interactions with
> the back button (even if it's rare).

I don't know how exactly and in every detail we are storing browser and other histories but searches seem to be presented exactly as every other regular website visit.

> * I'm not at all convinced killing the redirect is a good idea.  Tracking
> clicks is how search engines optimize results, especially the long tail, so
> it's hardly a huge win.

For search engines it may indeed be the simplest way. For users it is hardly the best (as mentioned earlier).

> * On a personal level, I'm not at all convinced that using MozSearch in
> preference to OpenSearch is a good idea.  Especially now that <searchform>
> is deprecated.

No problem here - if OpenSearch is preferred, we should use it.

> * region detection should be automatic.  I don't think we need to force this
> on our side, but if we do we should fix it globally.

Region detection seems to be automatic (for me, the correct flag is shown next to the toggle) but regional results are disabled. I don't see a way to change this behavior on our side other than setting the region param explicitly.

All the changes (besides critical region activation part) are intended to attract users by differentiating it from Google searchplugin - because DDG cannot do this directly via search results quality for Poland and "privacy" may not be strong enough.
(In reply to Stefan Plewako [:stef] from comment #18)
> I don't know how could I control that while using provided searchplugin.

You can configure Firefox to not remember any history at all, or you can visit https://duckduckgo.com/settings#privacy.

> All the changes (besides critical region activation part) are intended to
> attract users by differentiating it from Google searchplugin - because DDG
> cannot do this directly via search results quality for Poland and "privacy"
> may not be strong enough.

I think DDG without the additional tweaks you're suggesting here is already plenty differentiated from Google.
Attached patch rm wp-pl, add ddg-pl (obsolete) — Splinter Review
If I understand correctly, the POST-related changes won't be accepted. Therefore, here is the minimal option: a patch that removes the wp-pl searchplugin and adds ddg-pl.

There is only one important change in ddg-pl when compared to the mozilla-central/en-US version: a region param (name "kl", value "pl-pl") is added.
Attachment #8533254 - Attachment is obsolete: true
Attachment #8533254 - Flags: review?(francesco.lodolo)
Attachment #8545214 - Flags: review?(francesco.lodolo)
So, two things:

First, I'm going to escalate this to DDG, since this sounds like region detection may be busted.  I may have followup questions.  This should work without hardcoding things.

Second, if we do need to specify kl=, I'd rather not fork the search plugin, but instead implement this as a pref-based MozParam (which, yeah, will involve switching to MozSearch format), and localizers can set this pref if necessary as a locale-specific pref.  We can't use ab-CD directly, looking at the parameter docs, but doing this means we only have to worry about one search plugin.  After the infinite versions of Yahoo for locales, I'd prefer to draw the line here.  Francesco, does that make sense to you as well?
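
A minimal sketch of the pref-based idea, written in plain JavaScript rather than as actual searchplugin XML: one shared engine definition, with `kl` filled in from a locale-specific preference only when a locale sets it. The pref name `example.ddg.region`, the helper `resolveParams` and the plain-object pref store are all hypothetical and not existing Firefox prefs or APIs.

```javascript
// Hypothetical resolver: merges fixed parameters with parameters whose
// values come from preferences. Unset or empty prefs add nothing.
function resolveParams(baseParams, prefBackedParams, prefs) {
  const resolved = { ...baseParams };
  for (const { name, pref } of prefBackedParams) {
    const value = prefs[pref];
    if (value !== undefined && value !== "") {
      resolved[name] = value;
    }
  }
  return resolved;
}

// One shared definition for every locale (names are illustrative only).
const prefBacked = [{ name: "kl", pref: "example.ddg.region" }];

// A locale that sets the pref gets the region parameter...
console.log(resolveParams({ q: "test" }, prefBacked, { "example.ddg.region": "pl-pl" }));
// ...and a locale that does not set it sends the plain request.
console.log(resolveParams({ q: "test" }, prefBacked, {}));
```

The point of the design is that locales differ only in a pref value, so there is still a single search plugin file to keep in sync.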
Flags: needinfo?(mconnor)
(In reply to Mike Connor [:mconnor] from comment #21)
> After the infinite versions of Yahoo for locales, I'd prefer
> to draw the line here.  Francesco, does that make sense to you as well?

Absolutely yes. Experience has proven that having multiple versions of the same searchplugin scattered across l10n repositories is a good way to create inconsistencies, and in some cases delay updates requested from partners.

Some teams (e.g. the Polish one) are definitely better than others in keeping up with these updates, but if we manage to fix the main issues with a centralized XML file, all the better.
I fully understand the maintainability issues, but this is not the right place or time to solve this problem.
Please allow us to ship a localized searchplugin or to remove the inappropriate en-US version from pl builds.
FWIW, my preference is for DDG to fix this on the server side.  This should Just Work already, that it apparently isn't is a bug that impacts every locale.  I'm surprised this is the first I've heard of it, frankly.

For followup with DDG, I assume you get the same results on a new profile?  Can you send me screenshots of the results with/without the kl= parameter?
(In reply to Mike Connor [:mconnor] from comment #24)
> FWIW, my preference is for DDG to fix this on the server side.

I have nothing against fixing it server side and single searchplugin file for all locales but my preference is to have this fixed ASAP.

> This should
> Just Work already, that it apparently isn't is a bug that impacts every
> locale.  I'm surprised this is the first I've heard of it, frankly.

Unfortunately, this just doesn't work and it isn't something new. I have no data about how it impacts other locales, but I have been signaling the issue to Mozilla since 2014.09.29…

> For followup with DDG, I assume you get the same results on a new profile?

Yes.
(In reply to Mike Connor [:mconnor] from comment #24)
> I'm surprised this is the first I've heard of it, frankly.

This has been a known issue with DDG for years, and it is one of the reasons why DDG usage outside English-speaking countries is more or less nonexistent.


> FWIW, my preference is for DDG to fix this on the server side.

I think they do it on purpose, as otherwise they would need to GeoIP the user and maybe they don't want to do it because of privacy concerns?


> This should
> Just Work already, that it apparently isn't is a bug that impacts every
> locale.

Imho, it impacts *every* locale, except English - just compare how many non-English results you get for searches in a non-English language with and without this setting. It impacts e.g. German too.


> For followup with DDG, I assume you get the same results on a new profile? 

I see the same results as Stef does - and the results are currently *useless* for Polish speaking people.
(In reply to Adrian Kalla [:adriank] from comment #28)
> > FWIW, my preference is for DDG to fix this on the server side.
> 
> I think they do it on purpose, as otherwise they would need to GeoIP the
> user and maybe they don't want to do it because of privacy concerns?


What I wonder, though, is why they don't:

1) Respect the language the user is searching in. When I type "Polska" I expect results in Polish, when I type "Poland" I expect results in English - exactly like Google does.
2) Use the browser locale to determine the search language. They already use it for the UI...
(In reply to Mike Connor [:mconnor] from comment #21)
> Second, if we do need to specify kl=, I'd rather not fork the search plugin,
> but instead implement this as a pref-based MozParam (which, yeah, will
> involve switching to MozSearch format)

FWIW, no need to switch to MozSearch to use a <MozParam>. Our parser is namespace-agnostic and no one cares to schema-validate these AFAIK.
As a quick update, I met with DDG today to discuss the concerns raised here.  Here's the quick update:


1) Forcing regions is not desirable as this will break a number of features on the site.
2) Language detection is on their roadmap, and the plan is to enable this for Polish as soon as possible, likely by next week. Once that happens I'll be asking for feedback.  This implementation will be replaced by a superior algorithm in the future, but based on the feedback here the current version represents a significant improvement.  Browser locale is an indicator, but their data is that it's not a 1:1 mapping (i.e. there's lots of English and other language searches in .pl), so they're focused on search terms as the key signal.
3) Stronger integration with local Wikipedia instances is progressing as well, and will help with various pieces like the definitions.

Given all this, we shouldn't make technical changes at this time, and look at how things work once the Polish language detection feature is enabled.
As a followup update, DDG has enabled #2 above for Polish.  Adrian and Stefan, can you take a look on your end?  My take is that this is a big improvement, and I don't see a major experience delta between region on/off.
Flags: needinfo?(akalla)
(In reply to Mike Connor [:mconnor] from comment #32)
I still see a big difference between searches with and without the region setting.
Can you give an example (i.e. search term)?
Warszawa
(In reply to Mike Connor [:mconnor] from comment #32)
> As a followup update, DDG has enabled #2 above for Polish.  Adrian and
> Stefan, can you take a look on your end?  My take is that this is a big
> improvement, and I don't see a major experience delta between region on/off.

To be honest: I don't see much improvement - my overall feeling is even that it is worse than it was before...

See the attached results for "Warszawa" without the region set to Poland: only 3 of the Top10 results are of any use to a Polish-speaking person who is not looking for the Warszawa restaurant in Santa Monica, CA or a song called Warszawa.

By comparison: after setting the region to Poland, not only are all Top10 results relevant, but so are at least all of the Top30(!!!).
Flags: needinfo?(akalla)
I'd ask to remove the DDG search plugin from Firefox releases in Polish until either DDG seriously fixes this or we add the region setting to the plugin. As of now, the horrible user experience continues - and any user who tries out DDG now, will probably not ever try it again...
(In reply to Mike Connor [:mconnor] from comment #32)
> As a followup update, DDG has enabled #2 above for Polish.  Adrian and
> Stefan, can you take a look on your end?  My take is that this is a big
> improvement, and I don't see a major experience delta between region on/off.

To add more: it seems like the mechanism works better, if more words than just one are used when searching, e.g.:

If you are looking for "Walesa" or "Jaruzelski", you get *no* Polish results in the Top15.
But if you search for "Lech Walesa" or "Wojciech Jaruzelski", it looks much better.
The same goes for "miasto Warszawa" instead of "Warszawa", but honestly: no one will use the term "miasto Warszawa" in a search engine...
(In reply to Adrian Kalla [:adriank] from comment #38)
> To add more: it seems like the mechanism works better, if more words than
> just one are used when searching

...but not always. Let's say someone wants to know more about the history of Europe and writes "historia Europy": without the region set to Poland, most of the results in the Top10 are either not about the history of Europe or are of poor quality. If you set the region to Poland, only one of the Top10 results seems to be a bit out of place (but only a bit).
Okay, so my main takeaway here is that proper nouns are still pretty flawed.  Are multi-word searches not involving names/places better?  I noted a big shift for "Polska" compared to previous results.  But I don't read or write Polish.

This is also something that needs tuning, I suspect.  Language processing is hard, and takes iteration.

Let me cycle back to DDG with this feedback.  Thanks!
Hey guys, I know it's not much, but I'd like to point out that this meticulous work in this bug is a major step forward for DDG in being able to serve users outside of en-US locale.

So while it may feel painful at times, I believe that we are helping DDG tremendously here. Thanks a lot Adrian, Stef and Mike!
(In reply to Mike Connor [:mconnor] from comment #40)
> Okay, so my main takeaway here is that proper nouns are still pretty flawed.
> Are multi-word searches not involving names/places better?

Yes, multi-word searches not involving names, places or other proper nouns are now less broken when compared to regional search results. At least that holds for the few I have tested so far; it is not that easy to find one.

It would also be nice if DDG used a font with Polish diacritical characters and didn't suggest incorrect terms (i.e. without diacritical characters).
Hi all, DDG has pushed the following fixes, and would appreciate any feedback from the local folks:

> 1) PL wikipedia is now live, so it will appear as Instant answers.
> 2) PL language detection is now significantly improved.
> 3) We now use locale info as a signal such that if you type in an ambiguous term (e.g. a proper noun referenced in many languages), it will prefer that locale.

Thanks in advance!
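
To picture what "locale as a signal for ambiguous terms" could mean, here is a purely illustrative toy in JavaScript. It is not DDG's algorithm, which is not described in this bug: the per-language scores, the `margin` threshold and the function name are all made up for the example.

```javascript
// Toy model only: pick a result language from text-based scores, and let
// the browser locale win when the text alone is not decisive.
function pickResultLanguage(textScores, localeLanguage, margin = 0.1) {
  const ranked = Object.entries(textScores).sort((a, b) => b[1] - a[1]);
  if (ranked.length === 0) {
    return localeLanguage; // nothing to go on except the locale
  }
  const [bestLang, bestScore] = ranked[0];
  const localeScore = localeLanguage in textScores ? textScores[localeLanguage] : 0;
  // Ambiguous term: the locale's language scores nearly as well as the best guess.
  if (bestScore - localeScore <= margin) {
    return localeLanguage;
  }
  return bestLang;
}

// An ambiguous single word: scores are close, so the locale decides.
console.log(pickResultLanguage({ en: 0.5, pl: 0.45 }, "pl"));
// A clearly English query: the text signal outweighs the locale.
console.log(pickResultLanguage({ en: 0.9, pl: 0.1 }, "pl"));
```

The search terms stay the primary signal here, and the locale only breaks near-ties, which matches the description above only in spirit.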
(In reply to Mike Connor [:mconnor] from comment #43)
> > 1) PL wikipedia is now live, so it will appear as Instant answers.
> > 2) PL language detection is now significantly improved.

Was this done only for pl or for all languages?
More than PL, but not "all" necessarily.  It's an ongoing process.
(In reply to Mike Connor [:mconnor] from comment #43)
> > 1) PL wikipedia is now live, so it will appear as Instant answers.

And it doesn't use HTTPS :(
I don't think there's any search engine that links to Wikipedia using HTTPS.  Google doesn't, Bing doesn't, etc.  It's only recently that they've had enough SSL termination capacity for us to switch our search plugins, so it's going to be a process.
(In reply to Mike Connor [:mconnor] from comment #47)
> I don't think there's any search engine that links to Wikipedia using HTTPS.

DDG for en-US.
Comment on attachment 8545214 [details] [diff] [review]
rm wp-pl, add ddg-pl

Review of attachment 8545214 [details] [diff] [review]:
-----------------------------------------------------------------

Clearing flag for now, since this bug is not actionable at the moment.
Attachment #8545214 - Flags: review?(francesco.lodolo)
Attached patch rm wp-pl — Splinter Review
OK, things are much better, let's remove WP and sort out the remaining issues with DDG in separate bugs.
Attachment #8545214 - Attachment is obsolete: true
Attachment #8579983 - Flags: review?(francesco.lodolo)
Comment on attachment 8579983 [details] [diff] [review]
rm wp-pl

Review of attachment 8579983 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good, thanks (need to amend the bug subject at this point).
Attachment #8579983 - Flags: review?(francesco.lodolo) → review+
Summary: Replace WP.pl searchplugin with DDG → Remove WP.pl searchplugin
Depends on: 1145189
Depends on: 1145194