Closed Bug 1331139 Opened 7 years ago Closed 7 years ago

Update download protection for V4

Categories

(Toolkit :: Safe Browsing, defect, P2)

defect

Tracking

()

RESOLVED FIXED
mozilla54
Tracking Status
firefox54 --- fixed

People

(Reporter: francois, Assigned: tnguyen)

References

()

Details

(Whiteboard: #sbv4-m5)

Attachments

(2 files)

Download protection currently relies on V2 lists that will be going away:

- goog-badbinurl-shavar
- goog-downloadwhite-digest256

We need to move over to the V4 equivalents once they exist.
The V4 equivalents are documented on the Intranet (see URL field above).
Assignee: francois → nobody
Status: ASSIGNED → NEW
Assignee: nobody → dlee
Status: NEW → ASSIGNED
Assignee: dlee → tnguyen
See Also: → 1337260
Whiteboard: #sbv4-m5
Attachment #8837075 - Flags: review?(francois)
I added the table goog-badbinurl-proto,goog-downloadwhite-proto to google4 list.
Manually observed that the lists were updated successfully (but the completion gethash result is still ignored).
Changed a bit to make download protection works on both V2 and V4 simultaneously.
Comment on attachment 8837075 [details]
Bug 1331139 - Enable download protection for V4 lists.

https://reviewboard.mozilla.org/r/112332/#review115196

The first thing I'd suggest is splitting the Chromium-related changes into its own patch. That can be just "Bug 1331139 - Sync safebrowsing.proto from Chromium".

This will also need telemetry to keep track of the matches we see:

- `APPLICATION_REPUTATION_ALLOWLIST_MATCH`
- `APPLICATION_REPUTATION_BLOCKLIST_MATCH`

Both of these will be enumerated types with these values:

- 0 = no match
- 1 = both V2 and V4
- 2 = V2 only
- 3 = V4 only

Also, we need to make sure that we only send a telemetry ping when both V2 and V4 are enabled, like we do in bug 1311931.

::: modules/libpref/init/all.js:5098
(Diff revision 2)
>  #else
>  pref("urlclassifier.phishTable", "googpub-phish-shavar,test-phish-simple");
>  #endif
>  
>  // Tables for application reputation.
> -pref("urlclassifier.downloadBlockTable", "goog-badbinurl-shavar");
> +pref("urlclassifier.downloadBlockTable", "goog-badbinurl-shavar,goog-badbinurl-proto");

We need to restrict that to nightly only, like we do for the other lists.

::: modules/libpref/init/all.js:5104
(Diff revision 2)
>  
>  #ifdef XP_WIN
>   // Only download the whitelist on Windows, since the whitelist is
>   // only useful for suppressing remote lookups for signed binaries which we can
>   // only verify on Windows (Bug 974579). Other platforms always do remote lookups.
> -pref("urlclassifier.downloadAllowTable", "goog-downloadwhite-digest256");
> +pref("urlclassifier.downloadAllowTable", "goog-downloadwhite-digest256,goog-downloadwhite-proto");

ditto

::: toolkit/components/downloads/ApplicationReputation.cpp:350
(Diff revision 2)
> +LookupTablesInPrefs(const nsACString& tables, const char* aPref)
> +{
> +  nsAutoCString prefList;
> +  Preferences::GetCString(aPref, &prefList);
> +
> +  nsACString::const_iterator begin, iter, end;

nit: that might be more concise as a `for` loop

::: toolkit/components/downloads/ApplicationReputation.cpp:353
(Diff revision 2)
> +  Preferences::GetCString(aPref, &prefList);
> +
> +  nsACString::const_iterator begin, iter, end;
> +  tables.BeginReading(begin);
> +  tables.EndReading(end);
> +  while (begin != end) {

It seems like this loop assumes that there's a maximum of two tables in the pref. That's not something we should rely on.

Given that most of the time `tables` will only consist of a single entry, I propose the following pseudo-code:

```
all_tables = tables.split(",")
foreach table in all_tables:
    if FindInReadable(prefList, table):
       return true

return false
```

::: toolkit/components/url-classifier/chromium/README.txt:8
(Diff revision 2)
> -"package mozilla.safebrowsing;"
>  
> +1. Add "package mozilla.safebrowsing;"
>  to avoid naming pollution. We use this source file along with protobuf compiler (protoc) to generate safebrowsing.pb.h/cc for safebrowsing v4 update and hash completion. The current generated files are compiled by protoc 2.6.1 since the protobuf library in gecko is not upgraded to 3.0 yet.
>  
> +2. Remove Chrome specific element

What Chrome-specific things did you remove?

Did they prevent compilation or cause any problems?
Attachment #8837075 - Flags: review?(francois) → review-
Comment on attachment 8837075 [details]
Bug 1331139 - Enable download protection for V4 lists.

https://reviewboard.mozilla.org/r/112332/#review115196

> It seems like this loop assumes that there's a maximum of two tables in the pref. That's not something we should rely on.
> 
> Given that most of the time `tables` will only consist of a single entry, I propose the following pseudo-code:
> 
> ```
> all_tables = tables.split(",")
> foreach table in all_tables:
>     if FindInReadable(prefList, table):
>        return true
> 
> return false
> ```

I think this is similar, exept during "split" table. We have a loop to iterate the string (finding ',') to split table, and during that if we find a new table that is in prefList, then return true. This should be better.
This should be applied for multiple tables, not only 2.
Do you need I write a gtest for this (I probably have to make this function public in namspace to test)?

> What Chrome-specific things did you remove?
> 
> Did they prevent compilation or cause any problems?

https://chromium.googlesource.com/chromium/src.git/+/master/components/safe_browsing_db/safebrowsing.proto#335
They did not cause any problems, but the struct will be never used. This is designed for chrome brower.
Thanks you for your review.
I am working on bug 1337260 to add the telemetry about download protection (and I am waiting for bug 1311931 is landed first), but the probes should be added in url-classifier, not in ApplicationReputation, would that cover the probes you recommended?
They look very similar to me,
(In reply to Thomas Nguyen[:tnguyen] ni plz from comment #7)
> I think this is similar, exept during "split" table. We have a loop to
> iterate the string (finding ',') to split table, and during that if we find
> a new table that is in prefList, then return true. This should be better.
> This should be applied for multiple tables, not only 2.
> Do you need I write a gtest for this (I probably have to make this function
> public in namspace to test)?

The current loop is a little hard to parse so I would suggest either rewriting it along the lines of what I proposed or adding a test to prove that it works with all of the cases.

> https://chromium.googlesource.com/chromium/src.git/+/master/components/
> safe_browsing_db/safebrowsing.proto#335
> They did not cause any problems, but the struct will be never used. This is
> designed for chrome brower.

If it doesn't cause any problems, let's avoid modifying the file we get from Chrome and try to stay as close as possible the "upstream" copy. It's fine if there are things we don't use in there.

(In reply to Thomas Nguyen[:tnguyen] ni plz from comment #8)
> I am working on bug 1337260 to add the telemetry about download protection
> (and I am waiting for bug 1311931 is landed first), but the probes should be
> added in url-classifier, not in ApplicationReputation, would that cover the
> probes you recommended?

Ah, you're right, I had forgotten about these. Let's merge these two bugs together because there's no point in landing download protection V4 if we're not measuring it. Also, the names I suggested in comment 6 are probably better than the ones from bug 1337260.
Attachment #8837075 - Flags: review?(francois)
Attachment #8839845 - Flags: review?(francois)
Comment on attachment 8839845 [details]
Bug 1331139 - Sync safebrowsing.proto from Chromium.

https://reviewboard.mozilla.org/r/114422/#review116814

::: toolkit/components/url-classifier/nsUrlClassifierUtils.cpp:234
(Diff revision 2)
>    { "goog-malware-proto",  MALWARE_THREAT},            // 1
>    { "googpub-phish-proto", SOCIAL_ENGINEERING_PUBLIC}, // 2
>    { "goog-unwanted-proto", UNWANTED_SOFTWARE},         // 3
>    { "goog-phish-proto", SOCIAL_ENGINEERING},           // 5
>  
> +  // For application reputation

This should be in the other patch. We want to isolate the chromium changes so that this patch is only syncing with the upstream chromium file.
Attachment #8839845 - Flags: review?(francois) → review-
Comment on attachment 8837075 [details]
Bug 1331139 - Enable download protection for V4 lists.

https://reviewboard.mozilla.org/r/112332/#review116818

Looks all good! Only a few minor things to fix.

::: toolkit/components/downloads/ApplicationReputation.cpp:104
(Diff revision 4)
> +  eBothMatch = eV2Match | eV4Match,
> +};
> +
> +MOZ_MAKE_ENUM_CLASS_BITWISE_OPERATORS(TelemetryMatchInfo)
> +
> +// Helper function that look up all tables of a string which seperated

typo: `separated`

::: toolkit/components/downloads/ApplicationReputation.cpp:104
(Diff revision 4)
> +  eBothMatch = eV2Match | eV4Match,
> +};
> +
> +MOZ_MAKE_ENUM_CLASS_BITWISE_OPERATORS(TelemetryMatchInfo)
> +
> +// Helper function that look up all tables of a string which seperated

Here's a clearer way to explain this:

"Given a comma-separated list of tables which matched a URL, check to see if at least one of these tables is present in the given pref."

::: toolkit/components/downloads/ApplicationReputation.cpp:136
(Diff revision 4)
> +      }
> +    }
> +  }
> +
> +  bool shouldRecordTelemetry = isV2Enabled && isV4Enabled;
> +  TelemetryMatchInfo aInfo = TelemetryMatchInfo::eNoMatch;

This variable name should not have an `a` prefix because it's not a parameter to the function.

`telemetryInfo` would work.

::: toolkit/components/downloads/ApplicationReputation.cpp:156
(Diff revision 4)
> +      continue;
> +    }
> +    found = true;
> +    // We are checking if the table found is V2 or V4 to record telemetry
> +    // Both V2 and V4 begin with "goog" but V4 ends with "-proto"
> +    if (shouldRecordTelemetry) {

The top-level `if` statement (`shouldRecordTelemetry`) is not necessary if you move the `if (!shouldRecordTelemetry)` before this one.

::: toolkit/components/downloads/ApplicationReputation.cpp:167
(Diff revision 4)
> +        }
> +      }
> +    }
> +  }
> +
> +  if (!shouldRecordTelemetry) {

Move this before the previous `if` statement and then you don't need to check `if (shouldRecordTelemetry)` there.

::: toolkit/components/downloads/test/gtest/TestLookupTable.cpp:16
(Diff revision 4)
> +using namespace mozilla;
> +using namespace mozilla::downloads;
> +
> +TEST(PendingLookup, LookupTablesInPrefs)
> +{
> +  EXPECT_EQ(NS_OK, Preferences::SetCString("gtest.test", "goog-badbinurl-proto,goog-downloadwhite-proto,goog-badbinurl-sharva"));

typo: `sharva` -> `shavar`

::: toolkit/components/downloads/test/gtest/TestLookupTable.cpp:37
(Diff revision 4)
> +  EXPECT_EQ(NS_OK, Preferences::SetCString("gtest.test", ""));
> +
> +  result = LookupTablesInPrefs(NS_LITERAL_CSTRING("goog-phish-proto,goog-badbinurl-proto,goog-phish-sharva"), "gtest.test");
> +  ASSERT_FALSE(result);
> +
> +  Preferences::ClearUser("gtest.test");

I would suggest two more test cases:

- `gtest.test = "goog-badbinurl-proto"` and `LookupTablesInPrefs("goog-badbinurl-proto")`
- `gtest.test = "goog-badbinurl-proto"` and `LookupTablesInPrefs("goog-phish-proto,goog-badbinurl-proto,goog-phish-shavar")`

::: toolkit/components/telemetry/Histograms.json:130
(Diff revision 4)
> +    "alert_emails": ["safebrowsing-telemetry@mozilla.org"],
> +    "expires_in_version": "60",
> +    "kind": "enumerated",
> +    "n_values": 4,
> +    "bug_numbers": [1331139],
> +    "description": "Application reputation block list matching result (0 = no match, 1 = match only V2, 2 = match only V4, 3 = match both V2 and V4)"

Let's change the first part of this description to:

"For each Application Reputation lookup against both the V2 and V4 Google lists, note which version of the list returned a match (...)"

::: toolkit/components/telemetry/Histograms.json:138
(Diff revision 4)
> +    "alert_emails": ["safebrowsing-telemetry@mozilla.org"],
> +    "expires_in_version": "60",
> +    "kind": "enumerated",
> +    "n_values": 4,
> +    "bug_numbers": [1331139],
> +    "description": "Application reputation allow list matching result (0 = no match, 1 = match only V2, 2 = match only V4, 3 = match both V2 and V4)"

ditto
Attachment #8837075 - Flags: review?(francois) → review-
Comment on attachment 8839845 [details]
Bug 1331139 - Sync safebrowsing.proto from Chromium.

https://reviewboard.mozilla.org/r/114422/#review116920
Attachment #8839845 - Flags: review?(francois) → review+
Comment on attachment 8837075 [details]
Bug 1331139 - Enable download protection for V4 lists.

https://reviewboard.mozilla.org/r/112332/#review116922

All good. The only thing you need to change before landing is that the two probes should have a sligthly different description. One should be "block list" and the other one should be "allow list".

With that change, please land. r+

::: toolkit/components/telemetry/Histograms.json:130
(Diff revisions 4 - 5)
>      "alert_emails": ["safebrowsing-telemetry@mozilla.org"],
>      "expires_in_version": "60",
>      "kind": "enumerated",
>      "n_values": 4,
>      "bug_numbers": [1331139],
> -    "description": "Application reputation block list matching result (0 = no match, 1 = match only V2, 2 = match only V4, 3 = match both V2 and V4)"
> +    "description": "For each Application Reputation lookup against both the V2 and V4 Google lists, note which version of the list returned a match (0 = no match, 1 = match only V2, 2 = match only V4, 3 = match both V2 and V4)"

"version of the block list"

::: toolkit/components/telemetry/Histograms.json:138
(Diff revisions 4 - 5)
>      "alert_emails": ["safebrowsing-telemetry@mozilla.org"],
>      "expires_in_version": "60",
>      "kind": "enumerated",
>      "n_values": 4,
>      "bug_numbers": [1331139],
> -    "description": "Application reputation allow list matching result (0 = no match, 1 = match only V2, 2 = match only V4, 3 = match both V2 and V4)"
> +    "description": "For each Application Reputation lookup against both the V2 and V4 Google lists, note which version of the list returned a match (0 = no match, 1 = match only V2, 2 = match only V4, 3 = match both V2 and V4)"

"version of the allow list"
Attachment #8837075 - Flags: review?(francois) → review+
Thanks for your review Francois :).
Keywords: checkin-needed
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/2bd28e8cf65f
Enable download protection for V4 lists. r=francois
https://hg.mozilla.org/integration/autoland/rev/c2fbddc3469a
Sync safebrowsing.proto from Chromium. r=francois
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/2bd28e8cf65f
https://hg.mozilla.org/mozilla-central/rev/c2fbddc3469a
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla54
Depends on: 1358333
You need to log in before you can comment on or make changes to this bug.