Closed Bug 1109120 Opened 9 years ago Closed 9 years ago

implement more reliable geoIP-based search defaults

Categories

(Firefox :: General, defect)

defect
Not set
normal
Points:
2

Tracking

()

VERIFIED FIXED
Firefox 37
Iteration:
37.3 - 12 Jan
Tracking Status
firefox36 + verified
firefox37 + verified

People

(Reporter: Gavin, Assigned: markh)

References

Details

Attachments

(3 files, 2 obsolete files)

Bug 1102416 used the time zone offset to guesstimate market location, which is unreliable an inaccurate. We should either use system region information or a GeoIP-based lookup.
Flags: qe-verify+
Flags: firefox-backlog+
Broadening this a bit - not only useful for search.

The current requirement is for country-level precision (i.e. be able to answer "which country am I in?" on the client side).

The two options I see:
- GeoIP based (Firefox makes a request to a Mozilla geo server to get a country code)
- OS API based (e.g. bug 1102295, bug 1102297)
Component: Search → General
Summary: implement more reliable geo-based search defaults → implement reliable mechanism for client-side knowledge of geography
Morphing back!
Assignee: nobody → mhammond
Summary: implement reliable mechanism for client-side knowledge of geography → implement more reliable geoIP-based search defaults
I think this does as we discussed, and should be self-explanatory enough I won't bother summarizing it.  But I will call-out:

* Default timeout is 2000ms, but locally I'm seeing ~700ms, making me think that maybe 2000ms is too optimistic.  It's hard to know what the correct tradeoff is.

* We might want to consider expiring the telemetry probes in ~40 or so?  At the moment they never expire.  Both :gavin and :markh are in alert_emails.

* I *think* I've got the probe types correct:
** timeout is a simple linear type given the relatively short timeout we have specified.
** The "mismatched" probes are flags as they are only set when the condition matches.
** The "timed out" probe is also a flag as that is only set on a timeout.
** The "success" probe is a boolean as it is always set on both success and failure.  There's no attempt to record failure reasons other than timeout.
** hrm - I just thought - if we get a successful response but a null country-code in the JSON, telemetry will report success (although the core logic will remain correct).  Not sure if this is worth handling though...
Attachment #8538279 - Flags: feedback?(gavin.sharp)
This has test changes:

* xpcshell search tests set the geoip endpoint to an empty string, meaning no attempt is made to do the geoip thing there (ie, IIUC, the timezone check *is* still made for these tests).

* mochi tests set browser.search.isUS=true and browser.search.countryCode="US" which both (a) avoids geoip lookup and timezone logic and (b) ensures the tests pass when run outside the US.

Tests in toolkit/components/search/tests/ and browser/components/search/test/ both pass locally for me, full try at https://treeherder.mozilla.org/#/jobs?repo=try&revision=86759f5d7394
Attachment #8538279 - Attachment is obsolete: true
Attachment #8538279 - Flags: feedback?(gavin.sharp)
Attachment #8538289 - Flags: feedback?(gavin.sharp)
Try shows talos and "crash" tests are both failing with an attempt to contact our geoip endpoint (ie, "Non-local network connections are disabled and a connection attempt to geo.mozilla.org (63.245.215.82) was made.")  A quick search didn't make it obvious how to handle this, but I'll look further tomorrow if noone else beats me to it ;)
Status: NEW → ASSIGNED
Points: --- → 2
New push that also sets those 2 prefs for reftests is at https://treeherder.mozilla.org/#/jobs?repo=try&revision=05e11b2473f9.

Talos is a little trickier - according to https://developer.mozilla.org/en-US/docs/Mozilla/QA/Automated_testing, Talos config is outside of mozilla-central, and it's less clear that a simple "disable it" is really what we want there - so Gavin, it would be awesome if you can chase that up.
Attached patch geoipSplinter Review
One more xpcshell test needed the prefs love - green try at https://treeherder.mozilla.org/#/jobs?repo=try&revision=a6855aaeb2e8 - except for talos.

From IRC, Joel says something like this patch (plus some "deployment" magic in talos.json) would work and might even suit his deployment plans :)
Attachment #8538458 - Flags: feedback?(jmaher)
Comment on attachment 8538458 [details] [diff] [review]
geoip

Review of attachment 8538458 [details] [diff] [review]:
-----------------------------------------------------------------

this patch looks fine syntax wise!  Do we need to define these since we run Talos without any network access?  I blindly assume that setting these will prevent popups or other code from executing.

Deployment story:
land this on Talos repo
update testing/talos/talos.json in m-c (or related branch)
* I have a few pending changes which have already landed on talos and will be picked up on the next talos.json update

question- can we land these prefs now and deploy now without other changes landing?  Most likely I won't personally update talos right away, but if I were inclined tonight or tomorrow it would be good to know.
Attachment #8538458 - Flags: feedback?(jmaher) → feedback+
Iteration: --- → 37.2
(In reply to Mark Hammond [:markh] from comment #3)
> * Default timeout is 2000ms, but locally I'm seeing ~700ms, making me think
> that maybe 2000ms is too optimistic.  It's hard to know what the correct
> tradeoff is.

Let's see what the telemetry says on Nightly/Aurora and decide based on that.

> * We might want to consider expiring the telemetry probes in ~40 or so?  At
> the moment they never expire.  Both :gavin and :markh are in alert_emails.

I think that's fine.

> ** hrm - I just thought - if we get a successful response but a null
> country-code in the JSON, telemetry will report success (although the core
> logic will remain correct).  Not sure if this is worth handling though...

Might be worth a separate "request succeeded but response was null" flag to detect issues with our GeoIP service.
Comment on attachment 8538289 [details] [diff] [review]
0001-Bug-1109120-use-a-geoip-xhr-request-for-more-reliabl.patch

typo: servuce

>diff --git a/toolkit/components/search/nsSearchService.js b/toolkit/components/search/nsSearchService.js

>+// The preferences related to country determination.
>+const PREF_IS_US = "browser.search.isUS";
>+const PREF_COUNTRY_CODE = "browser.search.countryCode";
>+// Prefs that hold the geoip endpoint and timeout.
>+const PREF_GEOIP_ENDPOINT = "browser.search.geoip.endpoint";
>+const PREF_GEOIP_TIMEOUT = "browser.search.geoip.timeout";

I actually regret setting these constants up when I originally wrote this file - the indirection is more confusing than useful, and optimizes for changing the pref name, which never happens. So I would just hardcode the pref name strings where they are used.

>+// A less hacky method that tries to determine our country-code via an XHR
>+// geoip lookup.
>+// If this succeeds and we are using an en-US locale, we set the pref used by
>+// the hacky method above, so isUS() can avoid the hacky timezone method.
>+// If it fails or we are not en-US, we don't touch that pref so isUS() will
>+// use a timezone fallback if appropriate.

isUS() returns false if the locale is not en-US, so it's a bit confusing that this implies that it might fallback to timezones in that case. I would just omit mention of the "not en-US" case.

>+function fetchCountryCode() {

>+    let request = Cc["@mozilla.org/xmlextras/xmlhttprequest;1"].
>+                  createInstance(Ci.nsIXMLHttpRequest);

If you stick a Cu.importGlobalProperties(["XMLHttpRequest"]); at the top of this file, you should be able to just use new XMLHttpRequest() here.

>+      let histogram = Services.telemetry.getHistogramById("SEARCH_SERVICE_COUNTRY_FETCH_MS");
>+      histogram.add(took);

Getting rid of "histogram" and just using "Services.telemetry.getHistogramById("SEARCH_SERVICE_COUNTRY_FETCH_MS").add(took)" would probably make all of this code clearer.

>+    request.open("GET", endpoint, true);
>+    request.responseType = "json";
>+    request.send();

Discussion in bug 1112753 might require tweaking this slightly.

>   _asyncInit: function SRCH_SVC__asyncInit() {
>     return Task.spawn(function() {
>       LOG("_asyncInit start");
>       try {
>+        yield ensureKnownCountryCode();

This should probably be checkForSyncCompletion(ensureKnownCountryCode());
Attachment #8538289 - Flags: feedback?(gavin.sharp) → feedback+
Depends on: 1112753
Comment on attachment 8538458 [details] [diff] [review]
geoip

(In reply to Joel Maher (:jmaher) from comment #8)

> this patch looks fine syntax wise!  Do we need to define these since we run
> Talos without any network access?

Yep, exactly - we want to avoid the search service doing a geoip lookup at first startup on a profile.

> I blindly assume that setting these will
> prevent popups or other code from executing.

No popups or other UI - this is just about preventing that geoip request.

> Deployment story:
> land this on Talos repo
> update testing/talos/talos.json in m-c (or related branch)
> * I have a few pending changes which have already landed on talos and will
> be picked up on the next talos.json update
> 
> question- can we land these prefs now and deploy now without other changes
> landing?  Most likely I won't personally update talos right away, but if I
> were inclined tonight or tomorrow it would be good to know.

Yes, we can land these prefs now - the only real risk with that is that the pref names might change - but that seems unlikely given the f+ on the "real" patch (and is easy to undo anyway)

If I understand your deployment comments above, you'd be fine with me landing just those prefs changes, then you would take care of the actual deployment (ie, updating the json with the revision id) - so I'll request review now, verify with Gavin he's happy for me to make the push, then needinfo you for the actual deployment (which must happen before the "real" patch here lands)
Attachment #8538458 - Flags: review?(jmaher)
Comment on attachment 8538458 [details] [diff] [review]
geoip

Review of attachment 8538458 [details] [diff] [review]:
-----------------------------------------------------------------

please land away, I will be updating talos tomorrow! So if you have hesitations about this getting deployed on inbound/trunk do speak up :)
Attachment #8538458 - Flags: review?(jmaher) → review+
All feedback comments addressed.  I think this is ready for review apart from the geoip pref - ie:

// XXX - before landing, this pref needs to change to https://location.services.mozilla.com/v1/country?key=%MOZILLA_API_KEY%
pref("browser.search.geoip.url", "https://location.stage.mozaws.net/v1/country?key=%MOZILLA_API_KEY%");

Try at https://treeherder.mozilla.org/#/jobs?repo=try&revision=6a687730981d - but as discussed above, expect red talos.
Attachment #8538289 - Attachment is obsolete: true
Attachment #8538903 - Flags: feedback?(gavin.sharp)
(In reply to Joel Maher (:jmaher) from comment #12)
> please land away, I will be updating talos tomorrow!

Thanks!  https://hg.mozilla.org/build/talos/rev/316dcd6a2063

> So if you have
> hesitations about this getting deployed on inbound/trunk do speak up :)

That should be fine.

I just make a new try push that (I hope) specifies that new revision for talos - https://treeherder.mozilla.org/#/jobs?repo=try&revision=df5b7cf41254
try push looks good.  One question - Will android be affected by this?
For the URL, you can change it over to the production host location.services.mozilla.com now, we just did an emergency release to get this deployed.

Populating a real MOZ_MOZILLA_API_KEY for desktop builds is tracked in https://bugzilla.mozilla.org/show_bug.cgi?id=1113606. I'm not sure whom to assign this to, so if anyone here knows, please re-assign that bug so it gets attention.
Comment on attachment 8538903 [details] [diff] [review]
0001-Bug-1109120-use-a-geoip-xhr-request-for-more-reliabl.patch

>diff --git a/toolkit/components/search/tests/xpcshell/head_search.js b/toolkit/components/search/tests/xpcshell/head_search.js

>+do_register_cleanup(() => Services.prefs.clearUserPref("browser.search.geoip.url"));

Is this necessary for xpcshell tests? I forget how their profiles work.

We should really have some test coverage for this in general (testing that the fallback to timezone works correctly, and that things also work correctly if the synchronous initialization is triggered, etc.). We can do that relatively easily by setting the lookup pref to e.g. "data:application/json,{...}", I think?
Attachment #8538903 - Flags: feedback?(gavin.sharp) → review+
Better test coverage would spin up an http server to actually cause a load to occur I suppose (and test the "failure to load" cases).
Iteration: 37.2 → 37.3
https://hg.mozilla.org/mozilla-central/rev/413a320a2bbb
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 37
QA Contact: petruta.rasa
Depends on: 1115440
Comment on attachment 8538903 [details] [diff] [review]
0001-Bug-1109120-use-a-geoip-xhr-request-for-more-reliabl.patch

Approval Request Comment
[Feature/regressing bug #]: Fx locale-based search defaults
[User impact if declined]: Users not in the US may get US defaults, and vice-versa
[Describe test coverage new/current, TBPL]: landed with tests
[Risks and why]: Reasonably low
[String/UUID change made/needed]: None
Attachment #8538903 - Flags: approval-mozilla-aurora?
Attachment #8538903 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Depends on: 1116217
Depends on: 1117158
Depends on: 1117979
How should I proceed in order to verify that this bug is fixed? Thanks!
(In reply to Petruta Rasa [QA] [:petruta] from comment #25)
> How should I proceed in order to verify that this bug is fixed? Thanks!

Start Firefox with a clean profile.  Very soon after starting you should see a pref "browser.search.countryCode" be created that reflects the country you are testing from and "browser.search.isUS" be a boolean that is false if country code is not US, true if it is.
Also: assuming that requests succeeds (and doesn't time out), whether you get Yahoo or not should no longer depend on your timezone - if countryCode is "US", your default should be Yahoo, and otherwise it should be Google.
Thanks for the details.

I verified using Developer Edition 36.0a2 and Nightly 37.0a1 2014-01-06 under Ubuntu 12.04 LTS 32-bit, Windows 7 64-bit and Mac OS X 10.9.5 and for country code = RO the results are correct. Also, the search engine is Google even if the timezone is set to US.

Marking as verified.
Status: RESOLVED → VERIFIED
Blocks: 1115575
Depends on: 1116404
Depends on: 1116383
You need to log in before you can comment on or make changes to this bug.