Document the Shavar blocklist server API by which clients can fetch multiple lists

RESOLVED INVALID

Status

Cloud Services
Server: Other
RESOLVED INVALID
3 years ago
3 years ago

People

(Reporter: edwong, Assigned: telliott)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

(Reporter)

Description

3 years ago
We need client and server agreement on how the client will fetch multiple lists from the shavar server.
(Reporter)

Updated

3 years ago
Assignee: nobody → telliott
(Reporter)

Updated

3 years ago
Blocks: 1143530
(Assignee)

Comment 1

3 years ago
I misspoke in the meeting. Shavar does support getting multiple lists in a single query, so we'll support that. rtilder will have the details.
(Reporter)

Updated

3 years ago
Flags: needinfo?(rtilder)
The wire protocol spec which we implement: https://developers.google.com/safe-browsing/developers_guide_v2
Flags: needinfo?(rtilder)
I think the question is more about how we can configure shavar to serve multiple lists given the current setup, which plunks data from one list into a single bucket.
Blocks: 1037568
Flags: needinfo?(rtilder)
Blocks: 1145857
(Reporter)

Comment 4

3 years ago
In speaking with Toby, there will be a query string param (e.g. ?client=shumway) that will be used to return specific lists from shavar server.  We need bug tasks to track that work in the server and a schedule when that will be live in production.
(In reply to Edwin Wong [:edwong] from comment #4)
> In speaking with Toby, there will be a query string param (e.g.
> ?client=shumway) that will be used to return specific lists from shavar
> server.  We need bug tasks to track that work in the server and a schedule
> when that will be live in production.

That is not in the protocol. The way that a client requests multiple lists from the same server is already supported and in use. For example, we download 4 different lists from the Google safebrowsing endpoint. Please see the example request body at https://developers.google.com/safe-browsing/developers_guide_v2#HTTPRequestForDataBody:

googpub-phish-shavar;a:1-3,5,8:s:4-5
acme-white-shavar;a:1-7:s:1-2

In this example, the client requests data for two lists. It then lists the chunks it already has for each list type.
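
As a rough illustration of the request body format quoted above, here is a minimal Python sketch that builds one line per list from the chunk numbers a client already holds. The function names (compress_chunks, build_request_body) and the input structure are hypothetical and not taken from any client or from the shavar server; only the line format follows the example.

```python
def compress_chunks(chunk_numbers):
    """Collapse chunk numbers into ranges, e.g. [1, 2, 3, 5, 8] -> "1-3,5,8"."""
    ranges = []
    for n in sorted(set(chunk_numbers)):
        if ranges and n == ranges[-1][1] + 1:
            ranges[-1][1] = n  # extend the current run
        else:
            ranges.append([n, n])  # start a new run
    return ",".join(
        str(lo) if lo == hi else f"{lo}-{hi}" for lo, hi in ranges
    )


def build_request_body(lists):
    """Build one "name;a:...:s:..." line per list.

    `lists` (hypothetical structure) maps a list name to a pair
    (add_chunks, sub_chunks) that the client already has.
    """
    lines = []
    for name, (add_chunks, sub_chunks) in lists.items():
        parts = []
        if add_chunks:
            parts.append("a:" + compress_chunks(add_chunks))
        if sub_chunks:
            parts.append("s:" + compress_chunks(sub_chunks))
        lines.append(name + ";" + ":".join(parts))
    return "\n".join(lines) + "\n"


# Reproduces the two-list example from the protocol guide.
body = build_request_body({
    "googpub-phish-shavar": ([1, 2, 3, 5, 8], [4, 5]),
    "acme-white-shavar": ([1, 2, 3, 4, 5, 6, 7], [1, 2]),
})
print(body, end="")
```

Because every list is just another line in the same body, adding a new list to the request does not require a new query parameter.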
The client needs to know two pieces of information:

1) The name of the shumway list. The last part of the list name determines how the list is parsed and must be either -digest256 or -shavar. Since the list is small, -digest256 is perfectly fine. How about shumway-allow-digest256?

2) The endpoint for getting list updates and gethash responses for this list. How about keeping the same endpoint as for the other mozilla list, https://tracking.services.mozilla.com/downloads?client=SAFEBROWSING_ID&appver=%VERSION%&pver=2.2 for updates and https://tracking.services.mozilla.com/gethash?client=SAFEBROWSING_ID&appver=%VERSION%&pver=2.2 for gethash?

If the endpoint is the same, the client will make a single request for whichever lists it needs.
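
To make the endpoint proposal concrete, here is a small Python sketch that fills in the two URLs above. The client ID and version values are made-up placeholders standing in for SAFEBROWSING_ID and %VERSION%, and build_url is a hypothetical helper, not part of any client code.

```python
from urllib.parse import urlencode

BASE = "https://tracking.services.mozilla.com"


def build_url(path, client_id, app_version, pver="2.2"):
    """Return the update or gethash URL with the three query parameters."""
    query = urlencode({"client": client_id, "appver": app_version, "pver": pver})
    return f"{BASE}/{path}?{query}"


# "example-client" and "38.0" are placeholder values for illustration only.
update_url = build_url("downloads", "example-client", "38.0")
gethash_url = build_url("gethash", "example-client", "38.0")
print(update_url)
print(gethash_url)
```

With a shared endpoint, the same update URL serves every list; which lists are fetched is decided by the request body, not by the URL.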

Thanks,
Monica
Flags: needinfo?(joshmoz)
Flags: needinfo?(cpeterson)
(In reply to [:mmc] Monica Chew (please use needinfo) from comment #6)
> 1) The name of the shumway list. The last part of the list name determines
> how the list is parsed and must be either -digest256 or -shavar. Since the
> list is small, -digest256 is perfectly fine. How about
> shumway-allow-digest256?

Monica, how small is a "small" allow list? I estimate the initial Shumway list will have a few hundred entries.

I think "shumway-allow-digest256" is a good name. We may also have a "shumway-block-digest256" list for content sites that are broken by the allow-list.


> 2) The endpoint for getting list updates and gethash responses for this
> list. How about keeping the same endpoint as for the other mozilla list,
> https://tracking.services.mozilla.com/
> downloads?client=SAFEBROWSING_ID&appver=%VERSION%&pver=2.2 for updates and
> https://tracking.services.mozilla.com/
> gethash?client=SAFEBROWSING_ID&appver=%VERSION%&pver=2.2 for gethash?
> 
> If the endpoint is the same, the client will make a single request for
> whichever lists it needs.

Yes, I think this is what we want.
Flags: needinfo?(cpeterson) → needinfo?(mmc)
(In reply to Chris Peterson [:cpeterson] from comment #7)
> (In reply to [:mmc] Monica Chew (please use needinfo) from comment #6)
> > 1) The name of the shumway list. The last part of the list name determines
> > how the list is parsed and must be either -digest256 or -shavar. Since the
> > list is small, -digest256 is perfectly fine. How about
> > shumway-allow-digest256?
> 
> Monica, how small is a "small" allow list? I estimate the initial Shumway
> list will have a few hundred entries.

That's very small indeed. -digest256 requires 32 bytes per entry so a few hundred is completely fine. The phishing and malware lists hosted by Google (in -shavar form, which requires 4 bytes per entry) take about 10 MB with 1 MB of thrashing daily.
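
A back-of-the-envelope Python sketch of the per-entry sizes mentioned above. It assumes 300 entries as a stand-in for "a few hundred", assumes the 4-byte -shavar entry is the leading prefix of the SHA-256 digest, hashes a made-up string without the URL canonicalization a real client would apply, and ignores chunk framing overhead.

```python
import hashlib

DIGEST256_BYTES = 32      # full SHA-256 digest per entry
SHAVAR_PREFIX_BYTES = 4   # hash prefix per entry


def entry_hashes(expression):
    """Return (full 32-byte digest, 4-byte prefix) for one entry."""
    full = hashlib.sha256(expression.encode("utf-8")).digest()
    return full, full[:SHAVAR_PREFIX_BYTES]


def list_size_bytes(entry_count, bytes_per_entry):
    """Raw payload size, excluding any chunk headers."""
    return entry_count * bytes_per_entry


full, prefix = entry_hashes("example.com/")  # hypothetical entry
assumed_entries = 300  # assumption standing in for "a few hundred"
digest_total = list_size_bytes(assumed_entries, DIGEST256_BYTES)
shavar_total = list_size_bytes(assumed_entries, SHAVAR_PREFIX_BYTES)
print(len(full), len(prefix), digest_total, shavar_total)
```

Under these assumptions the -digest256 list is under 10 KB of hash data, which is negligible next to the multi-megabyte phishing and malware lists.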

> I think "shumway-allow-digest256" is a good name. We may also have a
> "shumway-block-digest256" list for content sites that are broken by the
> allow-list.
Flags: needinfo?(mmc)

Comment 9

3 years ago
Responding to the needinfo request, I'm going to be unable to work on this at all for the next week or so. Apologies.
Flags: needinfo?(joshmoz)
I am still uncertain what this bug is about.  The summary is greatly at odds with the content of the comments.

Adding another list to serve via the shavar service is simple: populate a data file somewhere, add another stanza to the configuration file for the list to serve, and point that stanza at the data file.
Flags: needinfo?(rtilder)
(Assignee)

Comment 11

3 years ago
This bug is a mess and contains a lot of errors. I'm closing it in favor of Bug 1147462, which is what is actually needed at this time. 

After having talked to the various folks, the API in question is well-understood.
Status: NEW → RESOLVED
Last Resolved: 3 years ago
Resolution: --- → INVALID