Closed Bug 1515806 Opened 5 years ago Closed 5 years ago

Deploy new lists based on Disconnect's sub-category tags

Categories

(Cloud Services :: Server: Shavar, enhancement)

enhancement
Not set
normal

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: englehardt, Assigned: ckolos)

References

(Blocks 1 open bug)

Details

We'll add support for these tags in:
* https://github.com/mozilla-services/shavar-prod-lists/pull/47
* https://github.com/mozilla-services/shavar-list-creation-config/pull/35
* https://github.com/mozilla-services/shavar-list-creation/pull/62

The only tags we are going to support at the moment are fingerprinting and cryptomining. I've tentatively named the new lists based on these categories:

* base-fingerprinting-track-digest256
* content-fingerprinting-track-digest256
* base-cryptomining-track-digest256
* content-cryptomining-track-digest256

The `base-*` lists are pulled from domains in the Advertising, Analytics, Social, and Disconnect top-level categories. The `content-*` lists pull only from the Content category and are currently empty (i.e., there are no tagged domains in that category).
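
To illustrate the split described above, here is a minimal sketch of how the four lists could be derived from a tagged blocklist. The input shape (category -> domain -> set of tags), the function name, and the example domains are assumptions made for illustration; this is not the actual shavar-list-creation code or the Disconnect JSON format.

```python
BASE_CATEGORIES = ("Advertising", "Analytics", "Social", "Disconnect")
CONTENT_CATEGORIES = ("Content",)
SUPPORTED_TAGS = ("fingerprinting", "cryptomining")


def build_tag_lists(blocklist):
    """Return {list_name: sorted domains} for every supported tag.

    `blocklist` is a hypothetical mapping: category -> {domain: set_of_tags}.
    Only domains carrying the tag are included; untagged domains are skipped.
    """
    lists = {}
    for tag in SUPPORTED_TAGS:
        for prefix, categories in (("base", BASE_CATEGORIES),
                                   ("content", CONTENT_CATEGORIES)):
            domains = set()
            for category in categories:
                for domain, tags in blocklist.get(category, {}).items():
                    if tag in tags:
                        domains.add(domain)
            lists[f"{prefix}-{tag}-track-digest256"] = sorted(domains)
    return lists


# Made-up input: no Content domain is tagged, so content-* lists stay empty.
example = {
    "Advertising": {"ads.example": set(), "fp.example": {"fingerprinting"}},
    "Analytics": {"miner.example": {"cryptomining"}},
    "Content": {"cdn.example": set()},
}
lists = build_tag_lists(example)
```

With this input, the two `content-*` lists come out empty, which matches the current state described above.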
Blocks: 1513490
New lists are added by:

1) Creating a source repo for them and configuring them in the list-creation script configs (https://github.com/mozilla-services/shavar-list-creation-config)

2) Once that's complete and working appropriately, you will need to create a server config for the relevant deployment domain. Start with stage. (https://github.com/mozilla-services/shavar-server-list-config)

3) Having done that and tested in stage, you can create the prod config. Unfortunately, this step requires manual intervention for activation. However, creating these new lists *should* also exercise the code that has landed and is running in stage to "automatically" pick up new lists. In this *one* case, I will need to deploy a new version of shavar for this to happen in -prod.
Is there a reason to bundle categories into the base-* lists instead of offering them based on the Disconnect categories?

We've already split up the original base-track lists in bug 1425075 to allow for a clear category-based settings API in GeckoView. I fear that we would have to repeat the same procedure for each bundled list we add in order to maintain consistency.
Flags: needinfo?(senglehardt)
We'd like the option to block or apply restrictions only to domains that contain one of the category tags. Thus, we want to create separate lists for each tag that don't include non-tagged domains from the top-level categories (e.g., Analytics, etc). Note that the ads/analytics/etc lists will still contain the domains that have tags, so the creation of these new lists won't impact those.

I don't see a use case where we'd want to further split the lists. E.g., domains in the top-level ads category that have the fingerprinting tag, and domains in the top-level Social category that have the cryptomining tag. It's technically possible to do. Is this something you'd like to support?
Flags: needinfo?(senglehardt) → needinfo?(esawin)

All PRs for stage are merged and :ckolos bounced the shavar stage server to pick up the new lists ...

(shavar)groovetop:shavar lcrouch$ curl -d" " 'https://shavar.stage.mozaws.net/list?client=foo&appver=1&pver=2.2'
...
base-cryptomining-track-digest256
base-fingerprinting-track-digest256
...
content-cryptomining-track-digest256
content-fingerprinting-track-digest256
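
The same check can be scripted. The sketch below assumes the `/list` endpoint returns one list name per line, as the curl output above suggests; the function names are made up for illustration, and `fetch_list_names` is defined but not called here because it needs network access.

```python
import urllib.request

EXPECTED = {
    "base-cryptomining-track-digest256",
    "base-fingerprinting-track-digest256",
    "content-cryptomining-track-digest256",
    "content-fingerprinting-track-digest256",
}


def parse_list_names(body):
    """Split a response body into a set of non-empty list names."""
    return {line.strip() for line in body.splitlines() if line.strip()}


def missing_lists(body, expected=EXPECTED):
    """Return the expected list names that the response does not contain."""
    return sorted(set(expected) - parse_list_names(body))


def fetch_list_names(url):
    """POST a single-space body (like `curl -d" "`) and parse the reply."""
    req = urllib.request.Request(url, data=b" ", method="POST")
    with urllib.request.urlopen(req, timeout=30) as resp:
        return parse_list_names(resp.read().decode("utf-8"))
```

An empty result from `missing_lists` means all four new lists are being advertised by the server.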

:rbillings will run through staging tests of https://testrail.stage.mozaws.net/index.php?/suites/view/354&group_by=cases:section_id&group_id=1169&group_order=asc

Flags: needinfo?(rbillings)

(In reply to Steven Englehardt [:englehardt] from comment #4)

We'd like the option to block or apply restrictions only to domains that
contain one of the category tags. Thus, we want to create separate lists
for each tag that don't include non-tagged domains from the top-level
categories (e.g., Analytics, etc). Note that the ads/analytics/etc lists
will still contain the domains that have tags, so the creation of these new
lists won't impact those.

I don't see a use case where we'd want to further split the lists. E.g.,
domains in the top-level ads category that have the fingerprinting tag, and
domains in the top-level Social category that have the cryptomining tag.
It's technically possible to do. Is this something you'd like to support?

I'm trying to sketch out how the new lists would translate to GeckoView settings and what the expected behaviour would be on the app side.

In GeckoView, we currently have settings to toggle anti-tracking based on the categories: ads, analytics, social and content [1].
I see two options to handle tag sub-categories:

  1. Add one setting per tag sub-category (fingerprinting: CATEGORY_FINGERPRINTING, cryptomining: CATEGORY_CRYPTOMINING).
    What would be the expected behavior here? Should all pages that include the tag be blocked, or only pages that fit into the selected anti-tracking category containing the tag?
    E.g., should CATEGORY_AD + CATEGORY_FINGERPRINTING block "ads-track-digest256" and "base-fingerprinting-track-digest256" or rather "ads-track-digest256" and "ads-fingerprinting-track-digest256"? (fictional constants/lists)

  2. Add one setting per top-level domain category of the sub-category (CATEGORY_AD_FINGERPRINTING, etc.).
    This would be a clearer and more explicit settings API; however, it would require dedicated lists for each category per sub-category (as we do now for the top-level domain categories).
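
To make option 1 concrete, here is a small sketch of one possible reading, in which each tag setting maps to its `base-*` list regardless of which top-level categories are enabled. The flag values, the flag names beyond those mentioned above, and the category list names other than "ads-track-digest256" are assumptions for illustration; this is not the GeckoView API.

```python
from enum import IntFlag


class Category(IntFlag):
    # Hypothetical bit values; the real GeckoView constants may differ.
    AD = 1
    ANALYTIC = 2
    SOCIAL = 4
    CONTENT = 8
    FINGERPRINTING = 16  # fictional constant from the comment above
    CRYPTOMINING = 32    # fictional constant from the comment above


# One table per setting (option 1): tag settings use the bundled base-* lists.
TABLES = {
    Category.AD: "ads-track-digest256",
    Category.ANALYTIC: "analytics-track-digest256",
    Category.SOCIAL: "social-track-digest256",
    Category.CONTENT: "content-track-digest256",
    Category.FINGERPRINTING: "base-fingerprinting-track-digest256",
    Category.CRYPTOMINING: "base-cryptomining-track-digest256",
}


def tracking_tables(categories):
    """Return the comma-separated tables for the enabled category flags."""
    return ",".join(name for flag, name in TABLES.items() if flag in categories)


selected = tracking_tables(Category.AD | Category.FINGERPRINTING)
```

Under this reading, CATEGORY_AD + CATEGORY_FINGERPRINTING would block "ads-track-digest256" and "base-fingerprinting-track-digest256". Option 2 would instead need one table per (category, tag) pair.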

Maybe I'm misunderstanding what the new tag-based system is trying to accomplish. Or is this rather a desktop vs. GeckoView misconception, since desktop doesn't offer the same detailed anti-tracking settings as GeckoView does?

[1] https://searchfox.org/mozilla-central/source/mobile/android/geckoview/src/main/java/org/mozilla/geckoview/TrackingProtection.java

Flags: needinfo?(esawin)

I verified the new lists are on stage and completed e2e testing.

Flags: needinfo?(rbillings)

(In reply to Rebecca Billings [:rbillings] from comment #7)

I verified the new lists are on stage and completed e2e testing.

I also checked that with the patch in Bug 1513490, I can get all the lists from stage server via SafeBrowsing update.

Verified list content using test pages https://mozilla.github.io/tracking-test/cryptomining.html and https://mozilla.github.io/tracking-test/fingerprinting.html. New files for the lists are also created correctly.

Thanks :rbillings. I've opened PRs to create, publish, and serve these new lists in the shavar production environment. The order of operations is:

  1. Merge https://github.com/mozilla-services/shavar-list-creation-config/pull/37 and https://github.com/mozilla-services/shavar-prod-lists/pull/43

This will start creating the new cryptomining and fingerprinting lists and publish them to the production s3 bucket. It will also trigger list update downloads from Firefox clients.

  2. Merge https://github.com/mozilla-services/shavar-server-list-config/pull/20

This will configure production shavar to serve the new lists.

  3. Restart production shavar service

This will make the new lists available for Firefox clients to download.

Note: While the lists will be available, Firefox clients won't download them unless/until you:

  1. Add base-cryptomining-track-digest256,base-fingerprinting-track-digest256 to browser.safebrowsing.provider.mozilla.lists in about:config
  2. Wait for Firefox to check for new lists to download (this can be forced by setting browser.safebrowsing.provider.mozilla.nextupdatetime = 0 and restarting Firefox)

That should make the lists appear in the Firefox Caches/Firefox/Profiles/<profile>/safebrowsing directory.

To make Firefox use the lists to block cryptomining and fingerprinting resources, add base-cryptomining-track-digest256,base-fingerprinting-track-digest256 to urlclassifier.trackingTable in about:config.
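
Both prefs hold comma-separated list names, so the edit amounts to appending the two new names without duplicating existing entries. A minimal sketch of that string manipulation follows; the helper name and the starting pref value are illustrative assumptions, and the script does not touch a Firefox profile.

```python
NEW_TABLES = (
    "base-cryptomining-track-digest256",
    "base-fingerprinting-track-digest256",
)


def add_tables(pref_value, new_tables=NEW_TABLES):
    """Append list names to a comma-separated pref value, skipping duplicates."""
    tables = [t.strip() for t in pref_value.split(",") if t.strip()]
    for name in new_tables:
        if name not in tables:
            tables.append(name)
    return ",".join(tables)


# Illustrative starting value only; check the actual pref in about:config.
updated = add_tables("test-track-simple,base-track-digest256")
```

Running `add_tables` on an already updated value returns it unchanged, so the edit is safe to repeat.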

:ckolos - can we do this early next week for e2e testing in the prod environment?

Flags: needinfo?(ckolos)

I just merged the PRs to start creating and publishing the cryptomining & fingerprinting lists to the production environment. I will watch the next Jenkins output in 20 minutes to make sure it succeeded before merging the PR to update the shavar server list config.

From the production job, output:

Tracking protection(tracking-protection-base-fingerprinting): publishing 44 items; file size 1429
Tracking protection(tracking-protection-content-fingerprinting): publishing 44 items; file size 1429
Tracking protection(tracking-protection-base-cryptomining): publishing 37 items; file size 1205
Tracking protection(tracking-protection-content-cryptomining): publishing 37 items; file size 1205

...

tracking-protection-base-fingerprinting looks like it hasn't been uploaded to s3://net-mozaws-prod-shavar/tracking/base-fingerprinting-track-digest256
Uploaded to s3: tracking-protection-base-fingerprinting
tracking-protection-content-fingerprinting looks like it hasn't been uploaded to s3://net-mozaws-prod-shavar/tracking/content-fingerprinting-track-digest256
Uploaded to s3: tracking-protection-content-fingerprinting
tracking-protection-base-cryptomining looks like it hasn't been uploaded to s3://net-mozaws-prod-shavar/tracking/base-cryptomining-track-digest256
Uploaded to s3: tracking-protection-base-cryptomining
tracking-protection-content-cryptomining looks like it hasn't been uploaded to s3://net-mozaws-prod-shavar/tracking/content-cryptomining-track-digest256
Uploaded to s3: tracking-protection-content-cryptomining
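
For anyone who wants to check a job log like this automatically, here is a rough sketch that extracts the per-list item counts from the "publishing" lines. The regular expression is inferred only from the output quoted above, so treat the log format as an assumption.

```python
import re

# Matches lines such as:
# Tracking protection(tracking-protection-base-fingerprinting): publishing 44 items; file size 1429
PUBLISH_RE = re.compile(
    r"^Tracking protection\((?P<section>[\w-]+)\): "
    r"publishing (?P<items>\d+) items; file size (?P<size>\d+)$"
)


def published_items(log_text):
    """Return {section_name: item_count} for every publishing line found."""
    result = {}
    for line in log_text.splitlines():
        match = PUBLISH_RE.match(line.strip())
        if match:
            result[match.group("section")] = int(match.group("items"))
    return result


sample_log = """\
Tracking protection(tracking-protection-base-fingerprinting): publishing 44 items; file size 1429
Tracking protection(tracking-protection-base-cryptomining): publishing 37 items; file size 1205
Uploaded to s3: tracking-protection-base-fingerprinting
"""
counts = published_items(sample_log)
```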

:ckolos - if you can r+ and merge https://github.com/mozilla-services/shavar-server-list-config/pull/20, then production shavar server can start serving these lists as soon as it's bounced.

Commits pushed to master at https://github.com/mozilla-services/shavar-server-list-config

https://github.com/mozilla-services/shavar-server-list-config/commit/95baba589a050b3e4e193682c97c394cf3e5ee00
bug 1515806 - serve cryptomining and fingerprinting lists

https://github.com/mozilla-services/shavar-server-list-config/commit/0a57c24246d82dc8c7619eefc995a6154c570c2d
Merge pull request #20 from mozilla-services/serve-cryptomining-and-fingerprinting-lists-1515806

bug 1515806 - serve cryptomining and fingerprinting lists

This was done and servers restarted. New lists show up in available lists.

Status: NEW → UNCONFIRMED
Ever confirmed: false
Flags: needinfo?(ckolos)
Status: UNCONFIRMED → NEW
Ever confirmed: true
Assignee: nobody → ckolos
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED

Verified lists on production.

Status: RESOLVED → VERIFIED
Commits pushed to master at https://github.com/mozilla-services/shavar-server-list-config

https://github.com/mozilla-services/shavar-server-list-config/commit/d50ac07c4a8575b725c0d587a0d0cb549598d4f0
bug 1515806 - fix CM and FP lists redirect_url_base

https://github.com/mozilla-services/shavar-server-list-config/commit/fb8d85f73c033435051190051bd53917eddaf047
Merge pull request #23 from mozilla-services/fix-cryptomining-and-fingerprinting-configs-1515806

bug 1515806 - fix CM and FP lists redirect_url_base