Map between ISO 3166 country code and UN M.49 region code

NEW
Unassigned

Status

P2
normal
a year ago
5 months ago

People

(Reporter: tnguyen, Unassigned)

Tracking

(Blocks: 1 bug)

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: pwphish-threathit)

Attachments

(2 attachments, 2 obsolete attachments)

(Reporter)

Description

a year ago
Following bug 1351147 and bug 1331138, we should not send region information in ISO 3166 format to the server because that info may be too specific.
We decided that UN M.49 should be specific enough, so we have to create a map between those two standards.
I put this bug under the Safe Browsing component (the only one that may use this feature), but if anyone feels that another component (internationalization, location, or anything else) would be a better fit, please feel free to move it.
Priority: -- → P3
(Reporter)

Updated

a year ago
Summary: Map betwen ISO 3166 and UN M.49 → Map between ISO 3166 and UN M.49
(Reporter)

Comment 1

a year ago
Probably we should use supranational regions [1]
https://en.wikipedia.org/wiki/UN_M.49
rather than geographical country regions (may be too specific)
https://unstats.un.org/unsd/methodology/m49/
(Reporter)

Comment 2

a year ago
Created attachment 8890754 [details]
country-codes.csv
(Reporter)

Updated

a year ago
Attachment #8890754 - Attachment is obsolete: true
(Reporter)

Comment 3

a year ago
Created attachment 8890755 [details]
country-codes.csv
(In reply to Thomas Nguyen[:tnguyen] ni plz from comment #1)
> Probably we should use supranational regions [1]
> https://en.wikipedia.org/wiki/UN_M.49

Good idea. How about starting with the following?

002 	Africa
419 	Latin America and the Caribbean
021 	Northern America
142 	Asia
150 	Europe
009 	Oceania

We could map all of the countries to one of these 6 regions and then revisit this mapping later if it turns out it's not fine-grained enough to be useful.
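A minimal sketch of what such a lookup could look like, assuming a plain hash map keyed by ISO 3166-1 alpha-2 code. The function name `RegionForCountry` and the handful of sample countries are hypothetical and only illustrate the shape of the table; this is not the attached patch or the full mapping.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical helper (not from the patch): maps an ISO 3166-1 alpha-2
// country code to one of the six M.49 region codes listed above.
// Only a few sample countries are included for illustration.
// Returns an empty string when the country is not in the table.
std::string RegionForCountry(const std::string& aCountry) {
  static const std::unordered_map<std::string, std::string> kRegions = {
      {"NG", "002"},  // Nigeria       -> Africa
      {"BR", "419"},  // Brazil        -> Latin America and the Caribbean
      {"US", "021"},  // United States -> Northern America
      {"CA", "021"},  // Canada        -> Northern America
      {"JP", "142"},  // Japan         -> Asia
      {"FR", "150"},  // France        -> Europe
      {"AU", "009"},  // Australia     -> Oceania
  };
  auto it = kRegions.find(aCountry);
  return it == kRegions.end() ? std::string() : it->second;
}
```

Returning an empty string for unknown codes lets the caller leave the region out rather than guess.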
(Reporter)

Comment 5

a year ago
Created attachment 8891869 [details]
convert.csv

CSV that converts ISO 3166 codes to continental/sub-continental codes; we should use the sub-continental code.
Attachment #8890755 - Attachment is obsolete: true
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
(Reporter)

Updated

a year ago
Summary: Map between ISO 3166 and UN M.49 → Map between ISO 3166 country code and UN M.49 region code
(Reporter)

Updated

a year ago
Assignee: nobody → tnguyen
Status: NEW → ASSIGNED

Comment 8

a year ago
mozreview-review
Comment on attachment 8891915 [details]
Bug 1372456 - Convert ISO 3166 code to UN M49 region code

https://reviewboard.mozilla.org/r/162924/#review168410

::: toolkit/components/url-classifier/nsUrlClassifierUtils.h:62
(Diff revision 2)
>  
>    void CleanupHostname(const nsACString & host, nsACString & _retval);
>  
>    nsresult ReadProvidersFromPrefs(ProviderDictType& aDict);
>  
> +  // Our country code stored in preference with ISO 3166 standard is too

I would suggest rephrasing this comment to:

"Country codes are too granular and could lead to identification of users who live in small countries. This is a mapping from countries to large M.49 regions."

::: toolkit/components/url-classifier/nsUrlClassifierUtils.cpp:19
(Diff revision 2)
>  #include "mozilla/Sprintf.h"
>  #include "mozilla/Mutex.h"
>  
>  #define DEFAULT_PROTOCOL_VERSION "2.2"
>  
> +// Table to look up UN M.49 region code from ISO 3166 alpha 2 country code.

This is too granular, see my proposal in comment 4.

::: toolkit/components/url-classifier/nsUrlClassifierUtils.cpp:797
(Diff revision 2)
>  NS_IMETHODIMP
>  nsUrlClassifierUtils::Observe(nsISupports *aSubject, const char *aTopic,
>                                const char16_t *aData)
>  {
>    if (0 == strcmp(aTopic, NS_PREFBRANCH_PREFCHANGE_TOPIC_ID)) {
> +    if(!strcmp(aTopic, "browser.search.countryCode")) {

I'm not sure using the search country makes sense. A lot of users set their browser to US English, but they live in completely different places.

This indicator is probably only useful if it matches the IP address of the user.
Attachment #8891915 - Flags: review?(francois) → review-
(Reporter)

Comment 9

a year ago
mozreview-review
Comment on attachment 8891915 [details]
Bug 1372456 - Convert ISO 3166 code to UN M49 region code

https://reviewboard.mozilla.org/r/162924/#review168618

::: toolkit/components/url-classifier/nsUrlClassifierUtils.cpp:797
(Diff revision 2)
>  NS_IMETHODIMP
>  nsUrlClassifierUtils::Observe(nsISupports *aSubject, const char *aTopic,
>                                const char16_t *aData)
>  {
>    if (0 == strcmp(aTopic, NS_PREFBRANCH_PREFCHANGE_TOPIC_ID)) {
> +    if(!strcmp(aTopic, "browser.search.countryCode")) {

Basically, browser.search.countryCode stores the country code we fetched from the geoip service. But you are right: sometimes we are blocked from fetching the current location and fall back to the last known location.
Assuming that we are not sending reports frequently, I will change this to fetch the location directly from the geoip service.
(Reporter)

Comment 10

a year ago
I discussed this with the geo guy:
"mmh. a day is awfully short. if this is every firefox desktop browser, we'd probably get in the order of 7000 requests per second that way. we are currently doing at most 1000 requests per second to serve everyone else
if we could do week long caching it would probably be fine"

Not sure if we could get any location information from the OS.
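For reference, week-long caching could be sketched roughly as below. The class `CachedCountryCode` and its methods are hypothetical names, and the sketch keeps everything in memory with `std::chrono::steady_clock`; a real implementation would presumably persist a wall-clock timestamp (e.g. in a pref) so the cache survives restarts.

```cpp
#include <chrono>
#include <string>

// Hypothetical in-memory cache, for illustration only: remembers the last
// country code and reports when it is old enough to be fetched again.
class CachedCountryCode {
 public:
  using Clock = std::chrono::steady_clock;

  // Default lifetime of one week, as suggested in the quote above.
  explicit CachedCountryCode(
      std::chrono::hours aTtl = std::chrono::hours(24 * 7))
      : mTtl(aTtl) {}

  // Records a freshly fetched country code and when it was fetched.
  void Store(const std::string& aCode, Clock::time_point aNow) {
    mCode = aCode;
    mFetchedAt = aNow;
    mHasValue = true;
  }

  // True when nothing is cached yet or the cached value has expired.
  bool NeedsRefresh(Clock::time_point aNow) const {
    return !mHasValue || (aNow - mFetchedAt) >= mTtl;
  }

  const std::string& Code() const { return mCode; }

 private:
  std::chrono::hours mTtl;
  std::string mCode;
  Clock::time_point mFetchedAt{};
  bool mHasValue = false;
};
```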
(In reply to Thomas Nguyen[:tnguyen] ni plz from comment #10)
> I discussed with geo's guy
> "mmh. a day is awfully short. if this is every firefox desktop browser, we'd
> probably get in the order of 7000 requests per second that way. we are
> currently doing at most 1000 requests per second to serve everyone else
> if we could do week long caching it would probably be fine"

I'm not sure where the 7000 requests per second comes from, but a report will only be sent when

1. a user encounters a Safe Browsing warning page, and
2. that user has opted into sharing more info with Google.

I would assume that #2 is going to be at best 10% of users (probably less) and I don't think a given user sees a warning page even once a day. Bug 1364611 will tell us the rate of #1.

> not sure if we could find any information from OS system

I don't know, but it needs to be at least as good as what could be determined from the IP address.
(Reporter)

Comment 12

a year ago
Thanks, Francois,
Indeed, the Mozilla Location Service (MLS) server cannot handle too many requests, so the solution we can use at the moment is:
- When we need to share information with Google (the user encounters a Safe Browsing warning page and decides to report), we send an async HTTP request to MLS.
- After getting a response from MLS, we read country_code and continue sending the threat information to the Google server.
It may take a little delay to get response from MLS (even timed out).
(In reply to Thomas Nguyen[:tnguyen] ni plz from comment #12)
> It may take a little delay to get response from MLS (even timed out).

If it times out (or we can't get a valid response) we can simply omit the region code. A report without a country is still useful.
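The flow discussed here (look up the country, map it to a region, and omit the region on any failure) could be sketched as follows. All names (`ThreatReport`, `BuildReport`, the two callbacks) are hypothetical, and the lookup is synchronous here only to keep the example short; the real request to MLS would be asynchronous.

```cpp
#include <functional>
#include <string>

// Hypothetical report type, for illustration only.
struct ThreatReport {
  std::string url;
  bool hasRegionCode = false;  // false means the region is omitted
  std::string regionCode;
};

// aLookup stands in for the MLS request: it returns true and fills in the
// country code on success, or returns false on timeout / invalid response.
// aToRegion maps a country code to an M.49 region code and returns an empty
// string when the country is unknown.
ThreatReport BuildReport(
    const std::string& aUrl,
    const std::function<bool(std::string&)>& aLookup,
    const std::function<std::string(const std::string&)>& aToRegion) {
  ThreatReport report;
  report.url = aUrl;

  std::string country;
  if (aLookup(country)) {
    std::string region = aToRegion(country);
    if (!region.empty()) {
      report.hasRegionCode = true;
      report.regionCode = region;
    }
  }
  // On any failure the report is still returned, just without a region.
  return report;
}
```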
Whiteboard: pwphish-threathit

Updated

9 months ago
Assignee: tnguyen → dlee
I just thought of two edge cases we should make sure we handle properly:

1. If a user uses a VPN, we should use the VPN IP address for the geo lookup.
2. If a user uses a proxy (e.g. a Tor proxy), we should use the proxy IP address for the geo lookup.

Basically, we should not expose the real IP address if a user has taken steps to hide / spoof it.

I doubt we can write automated tests for this, so it's probably just a matter of testing these two things manually.
Assignee: dimidiana → nobody
Status: ASSIGNED → NEW
Priority: P3 → P2
The country code will be added to the ClientReport API like this:

  // Details about the user that encountered the threat.
  message UserInfo {
    // The UN M.49 region code associated with the user's location.
    optional string region_code = 1;

    // Unique user identifier defined by the client.
    optional bytes user_id = 2;
  }

  // Details about the user that encountered the threat.
  optional UserInfo user_info = 22;