1331138 - [meta] Add opt-in reporting of matched URLs to Google (ThreatHit API)

Reporter

Description

•

7 years ago

Chrome allows users to opt into reporting the malware/phishing URLs (known to Safe Browsing) they encounter in the wild. This allows Google to make better decisions on which entries to send to clients first and to measure how effective their protection is. They can also make use of the referrer information to "go up the chain".

Google tells us that Firefox users have a different usage pattern when it comes to downloads and they believe it might be the case for browsing as well. If we let our users report this data, we could help them make the list better for Firefox users as well.

We should add this feature (disabled by default) and let users opt-in by clicking a checkbox on the Safe Browsing interstitial pages. The UI would therefore be very similar to how we let users report information about TLS errors they receive, but it would be a separate setting since it sends the data to Google instead of Mozilla.

List of information to send:

- unique browser identifier (randomized, stable for a week only)
- country/region information (granularity to be confirmed)
- full URL (without query string)
- hash that matched
- referrer chain

Also, we should stand up a proxy to hide our users' IP addresses.
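
As a rough illustration of two items in the list above (the URL without its query string, and an identifier that is stable for at most a week), here is a minimal standalone C++ sketch. It is not Gecko code: `StripQueryAndFragment`, `WeekIndex` and `RotatingBrowserId` are hypothetical names, dropping the fragment along with the query is an assumption, and so is aligning the rotation to fixed 7-day windows counted from the Unix epoch.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <string>

// Drop the query string (and, as an assumption, the fragment) from a URL.
std::string StripQueryAndFragment(const std::string& url) {
  const std::size_t cut = url.find_first_of("?#");
  return cut == std::string::npos ? url : url.substr(0, cut);
}

// Index of the fixed 7-day window that contains `unixSeconds`.
constexpr std::int64_t kSecondsPerWeek = 7 * 24 * 60 * 60;

std::int64_t WeekIndex(std::int64_t unixSeconds) {
  return unixSeconds / kSecondsPerWeek;
}

// Hypothetical holder for a random identifier that is regenerated whenever
// the week index changes, so reports cannot be linked across windows.
class RotatingBrowserId {
 public:
  std::uint64_t Get(std::int64_t nowUnixSeconds) {
    const std::int64_t week = WeekIndex(nowUnixSeconds);
    if (!mHasId || week != mWeek) {
      std::random_device rd;
      // Combine two 32-bit draws into one 64-bit identifier.
      mId = (static_cast<std::uint64_t>(rd()) << 32) | rd();
      mWeek = week;
      mHasId = true;
    }
    return mId;
  }

 private:
  bool mHasId = false;
  std::int64_t mWeek = 0;
  std::uint64_t mId = 0;
};
```

With fixed windows the identifier lives for at most seven days, and for less if it was first generated mid-window. A real implementation would also need to persist the value across restarts, which this sketch leaves out.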

François Marier [:francois]

Reporter

Comment 1

•

7 years ago

I put the private documentation we have on the Intranet Safe Browsing page (see URL field above).

For the region information, we'll have to work with Google to change the API, but perhaps we could be sending a UN M.49 code (https://en.wikipedia.org/wiki/UN_M.49) since that could be as granular as we want.
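
To illustrate the granularity point, here is a small standalone C++ sketch of coarsening a UN M.49 country code to its sub-region or region. The function name `CoarsenM49`, the `RegionGranularity` enum, the two-entry table and the fallback to 001 (World) are all assumptions made for this example; the real API field and its granularity are still to be agreed with Google.

```cpp
#include <map>
#include <string>

enum class RegionGranularity { Country, SubRegion, Region };

struct M49Parents {
  const char* subRegion;
  const char* region;
};

// Return the M.49 code to report for `countryCode` at the requested
// granularity. Unknown countries fall back to 001 (World).
std::string CoarsenM49(const std::string& countryCode, RegionGranularity g) {
  // Tiny excerpt of the M.49 hierarchy, for illustration only.
  static const std::map<std::string, M49Parents> kParents = {
      {"250", {"155", "150"}},  // France -> Western Europe -> Europe
      {"124", {"021", "019"}},  // Canada -> Northern America -> Americas
  };
  const auto it = kParents.find(countryCode);
  if (it == kParents.end()) {
    return "001";
  }
  switch (g) {
    case RegionGranularity::Country:
      return countryCode;
    case RegionGranularity::SubRegion:
      return it->second.subRegion;
    case RegionGranularity::Region:
      return it->second.region;
  }
  return "001";
}
```

Because every level of the hierarchy has its own numeric code, the same field could carry a country, a sub-region or a continent without changing its type.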

URL: https://intranet.mozilla.org/SafeBrow...

Thomas Nguyen (:tnguyen)

Updated

•

7 years ago

Assignee: nobody → tnguyen

Thomas Nguyen (:tnguyen)

Comment 2

•

7 years ago

(In reply to François Marier [:francois] from comment #0)

> List of information to send:
> 
> - unique browser identifier (randomized, stable for a week only)
> - country/region information (granularity to be confirmed)
> - full URL (without query string)
> - hash that matched
> - referrer chain
> 
> Also, we should stand up a proxy to hide our users' IP addresses.

The protobuf used for Safe Browsing does not seem to cover all of the information we have to send to the server.

Only the following information is included in the report request:

  // The platform type reported.
  optional PlatformType platform_type = 2;

  // The threat entry responsible for the hit. Full hash should be reported for
  // hash-based hits.
  optional ThreatEntry entry = 3;

    // The URL of the resource.
    optional string url = 1;

    // The type of source reported.
    optional ThreatSourceType type = 2;

    // The remote IP of the resource in ASCII format. Either IPv4 or IPv6.
    optional string remote_ip = 3;

    // Referrer of the resource. Only set if the referrer is available.
    optional string referrer = 4;


There is no ASN or country/region field. Could you please double-check whether we have to follow [1]? If so, only the information relevant to Safe Browsing v4 (hash, URL, IP, referrer) could be sent to the server.
[1] http://searchfox.org/mozilla-central/rev/30fcf167af036aeddf322de44a2fadd370acfd2f/toolkit/components/url-classifier/chromium/safebrowsing.proto#201-246

Thomas Nguyen (:tnguyen)

Updated

•

7 years ago

Flags: needinfo?(francois)

Thomas Nguyen (:tnguyen)

Comment 3

•

7 years ago

I see that the protobuf report structure used in Chromium's client-side phishing detection [1] is very similar to our requirements and contains ASN and country fields.
Is this something we can use?
[1] https://cs.chromium.org/chromium/src/components/safe_browsing/csd.proto?q=ClientSafeBrowsingReportRequest&dr=CSs&l=11

François Marier [:francois]

Reporter

Comment 4

•

7 years ago

(In reply to Thomas Nguyen[:tnguyen] ni plz from comment #2)
> There's no ASN and country/region. Could you please double check if we have
> to follow the [1], then only information which are relevant to SafeBrowsing
> v4: hash, url, ip, referrer could be sent to server. 

Google will be extending the protobuf to allow us to report country/region as well as the temporary unique identifier.

Given that this is not yet available, we can start with just the ip, hash, url, referrer chain, etc.

(In reply to Thomas Nguyen[:tnguyen] ni plz from comment #3)
> I see [1], the protobuf report structure using in Chromium client side
> phishing detection is very similar to the requirement and contains asn +
> country
> Is this something we can use?

No, we won't be reporting the same data as Chrome.

Flags: needinfo?(francois)

Thomas Nguyen (:tnguyen)

Comment 5

•

7 years ago

We may have to trace the redirect history: each time a channel redirects, we should collect the remote address, the referrer (if any), and the URL of the channel.
Then we may have to put them into an array of structs.
I am writing a struct and adding an array of that struct to store the redirect information.
It looks like this:
  // Members use nsCString rather than the abstract nsACString so that the
  // struct owns its strings.
  struct ThreatHitTraceResource {
    // The URL of the resource.
    nsCString url;

    // The remote IP of the resource in ASCII format. Either IPv4 or IPv6.
    nsCString remoteIp;

    // Referrer of the resource. Only set if the referrer is available.
    nsCString referrer;
  };

  typedef nsTArray<ThreatHitTraceResource> ThreatHitResources;
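
To make the collection step concrete, here is a minimal standalone sketch of how such an array could be filled as redirects happen. It uses `std::string` and `std::vector` as stand-ins for the Gecko string and array types, and `TraceResource`, `RedirectTracer` and `OnRedirect` are hypothetical names; the real hook would live in the channel or LoadInfo code and is not shown here.

```cpp
#include <string>
#include <utility>
#include <vector>

// Stand-in for ThreatHitTraceResource using standard library types.
struct TraceResource {
  std::string url;       // The URL of the resource.
  std::string remoteIp;  // Remote IP in ASCII form, IPv4 or IPv6.
  std::string referrer;  // Left empty when no referrer is available.
};

using TraceResources = std::vector<TraceResource>;

// Hypothetical collector: one entry is appended per redirect, in order.
class RedirectTracer {
 public:
  void OnRedirect(std::string url, std::string remoteIp,
                  std::string referrer = "") {
    mChain.push_back(
        {std::move(url), std::move(remoteIp), std::move(referrer)});
  }

  const TraceResources& Chain() const { return mChain; }

 private:
  TraceResources mChain;
};
```

Appending in redirect order keeps the chain readable from the first request to the final blocked resource, which is the order a report would presumably need.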

Hi Patrick,
Do you have any concerns if I extend LoadInfo and store this information there, like we did with the principal array for the redirect chain [1]?
[1] https://searchfox.org/mozilla-central/rev/7419b368156a6efa24777b21b0e5706be89a9c2f/netwerk/base/LoadInfo.h#147

Flags: needinfo?(mcmanus)

Patrick McManus [:mcmanus]

Comment 6

•

7 years ago

sgtm

Flags: needinfo?(mcmanus)

Thomas Nguyen (:tnguyen)

Updated

•

7 years ago

Depends on: 1351146

Thomas Nguyen (:tnguyen)

Updated

•

7 years ago

Depends on: 1351147

François Marier [:francois]

Reporter

Updated

•

7 years ago

Status: NEW → ASSIGNED

François Marier [:francois]

Reporter

Comment 7

•

7 years ago

Google has added a new field to specify the region that the user is in:

https://intranet.mozilla.org/index.php?title=SafeBrowsing&action=historysubmit&diff=182519&oldid=182360

François Marier [:francois]

Reporter

Updated

•

7 years ago

Depends on: 1358536

François Marier [:francois]

Reporter

Updated

•

7 years ago

Summary: Add opt-in reporting of matched URLs to Google → [meta] Add opt-in reporting of matched URLs to Google

François Marier [:francois]

Reporter

Updated

•

7 years ago

Depends on: 1372456

François Marier [:francois]

Reporter

Updated

•

7 years ago

Summary: [meta] Add opt-in reporting of matched URLs to Google → [meta] Add opt-in reporting of matched URLs to Google (ThreatHIt API)

François Marier [:francois]

Reporter

Comment 8

•

7 years ago

Unassigning Thomas from the meta bug since the work is happening in the dependent bugs.

Assignee: tnguyen → nobody

URL: https://intranet.mozilla.org/SafeBrow... → https://mana.mozilla.org/wiki/display...

Status: ASSIGNED → UNCONFIRMED

Ever confirmed: false

Thomas Nguyen (:tnguyen)

Updated

•

7 years ago

Depends on: 1387364

Ethan Tseng [:ethan]

Updated

•

7 years ago

Summary: [meta] Add opt-in reporting of matched URLs to Google (ThreatHIt API) → [meta] Add opt-in reporting of matched URLs to Google (ThreatHit API)

Ethan Tseng [:ethan]

Updated

•

7 years ago

Status: UNCONFIRMED → NEW

Ever confirmed: true

François Marier [:francois]

Reporter

Updated

•

7 years ago

Depends on: 1385156

François Marier [:francois]

Reporter

Updated

•

7 years ago

Depends on: 1414051

François Marier [:francois]

Reporter

Updated

•

7 years ago

Depends on: 1414056

Daniel Veditz [:dveditz]

Comment 9

•

7 years ago

> since it sends the data to Google instead of Mozilla.

Make sure we update our privacy policy if necessary. The current text is pretty broad so maybe it covers it.

François Marier [:francois]

Reporter

Updated

•

6 years ago

Depends on: 1442780

Sylvestre Ledru [:Sylvestre]

Updated

•

6 years ago

Keywords: meta

Dimi Lee [:dimi]

Updated

•

6 years ago

Priority: P2 → P3

Sylvestre Ledru [:Sylvestre]

Updated

•

5 years ago

Type: defect → task

BMO Automation

Updated

•

2 years ago

Severity: normal → S3