Closed Bug 1164303 Opened 9 years ago Closed 8 years ago

Use SafeBrowsing to power inadjacency matching

Categories

(Firefox :: New Tab Page, defect)

defect
Not set
normal

Tracking

()

RESOLVED WONTFIX
Tracking Status
firefox41 --- affected

People

(Reporter: Mardak, Assigned: mzhilyaev)

Details

(Whiteboard: [story])

Bug 1105376 is a feature where we choose not to show tiles in the context of other sites' tiles. Bug 1159884 is trying a "simple" approach: hardcode a list of sites and check the top tiles against it. We can reuse the tests from that bug when refactoring to use SafeBrowsing.

There have been multiple suggestions to use SafeBrowsing to allow for more dynamic updating.

gcp, is it possible to give SafeBrowsing a local list, e.g., a data/resource/chrome URI? This doesn't provide the dynamic updates, but would allow us to start using the API and make it easier to switch to a remotely hosted list later.

Are there tools we can use to convert a list of sites into a static file that we can package as part of Firefox to use with the SafeBrowsing API?
Flags: needinfo?(gpascutto)
SafeBrowsing can be fed the updates via JavaScript. I presume you can read in your data via JS, so that should work. We use this right now to create testing databases: https://dxr.mozilla.org/mozilla-central/source/toolkit/components/url-classifier/SafeBrowsing.jsm?from=SafeBrowsing.jsm&case=true#199
i.e. it just shoves the string data into the nsIUrlClassifierDBService.updateStream() call.

We support 2 formats: SafeBrowsing v2 
https://developers.google.com/safe-browsing/developers_guide_v2
and a simpler text based protocol:
https://dxr.mozilla.org/mozilla-central/source/toolkit/components/url-classifier/ProtocolParser.cpp#231
which is what our tests use and what the code linked above uses. It's not documented, but hopefully self-explanatory if you read up on how SafeBrowsing updates work.

Given that there's a lot of talk about moving to SafeBrowsing v3 or v4, I don't think you want to deal with writing a tool to format your own updates into v2 right now. We've wanted to deprecate the text-based protocol the tests use a few times because nothing else was using it (so we were testing something that's different from what production uses), but this seems like a fine use case for it, so we can certainly keep it alive until we've moved to v3/v4 and/or you're ready to do dynamic updates.

So summarizing: you can do the updates with basic text/string data in a simple format that you feed in via JS. The addMozEntries function linked above creates 3 SafeBrowsing databases with one URL each by just manipulating JS strings.
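As a rough illustration of the "just manipulating JS strings" approach, here is a minimal sketch of building such an update for a list of sites. The chunk layout is an assumption modeled on what addMozEntries appears to do (the text format is undocumented), and buildSimpleUpdate is a hypothetical helper, not an existing Firefox function. The resulting string is the kind of data that would be passed to nsIUrlClassifierDBService.updateStream().

```javascript
// Sketch only: the layout below is an assumption modeled on addMozEntries;
// buildSimpleUpdate is a hypothetical helper, not part of Firefox.
function buildSimpleUpdate(tableName, sites, chunkNum = 1) {
  // One entry per line. Entries are assumed to be plain ASCII "host/path"
  // strings, so the JS string length equals the payload length in bytes.
  const data = sites.join("\n");
  return (
    "n:1000\n" +                // assumed: delay before the next update
    "i:" + tableName + "\n" +   // table that the following chunk applies to
    // assumed "add" chunk header: chunk number, hash size, payload length
    "a:" + chunkNum + ":32:" + data.length + "\n" +
    data
  );
}

// Example with placeholder sites and a placeholder table name.
const update = buildSimpleUpdate("example-table-simple", [
  "example.com/",
  "example.org/",
]);
console.log(update);
```

Keeping the string construction in a pure function like this would make it easy to unit-test separately from the code that feeds the update into the database service.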

SafeBrowsing needs to be taught what to do with the new tables, though :-)
Flags: needinfo?(gpascutto)
Our usage of the list would be to look up whether a given new tab tile is in the "negative adjacency db", not to classify URLs when loading a page.

So it looks like, following addMozEntries, we construct an update string and update "negative-adjacency-simple", then later check with:

const dbservice = Cc["@mozilla.org/url-classifier/dbservice;1"].getService(Ci.nsIUrlClassifierDBService);
// principal: the tile URL converted to a principal; cb: callback invoked with the lookup result
dbservice.lookup(principal, "negative-adjacency-simple", cb);

maksik, looks like we'll need to make sure the negative adjacency code is structured to allow async checks/lookups.
(In reply to Ed Lee :Mardak from comment #2)
> maksik, looks like we'll need to make sure the negative adjacency code is
> structured to allow async checks/lookups.

If you don't need to "double-check" the SafeBrowsing entries against a remote server, then you can use this:
https://dxr.mozilla.org/mozilla-central/source/netwerk/base/nsIURIClassifier.idl#63

(It's also what Tracking Protection uses)
Oh neat. We could probably use classifyLocalWithTables, which is synchronous and can be limited to the negative adjacency table.

https://dxr.mozilla.org/mozilla-central/source/netwerk/base/nsIURIClassifier.idl#66
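A synchronous check along these lines might look like the sketch below. It assumes classifyLocalWithTables takes a URI plus a comma-separated table list and returns a comma-separated string of the tables that matched (empty when nothing matches). The isInadjacent helper and the fake classifier object are hypothetical and only stand in for the real nsIURIClassifier service, which would receive an nsIURI rather than a plain string.

```javascript
// Sketch only: isInadjacent is a hypothetical helper, and `classifier` is
// assumed to expose classifyLocalWithTables(uri, tables) returning a
// comma-separated string of matching table names ("" when none match).
function isInadjacent(classifier, uri, table) {
  const matches = classifier.classifyLocalWithTables(uri, table);
  return matches.split(",").includes(table);
}

// Stand-in for the real service so the sketch can run outside Firefox.
// It treats a single hardcoded URL as listed in whatever table is requested.
const fakeClassifier = {
  classifyLocalWithTables(uri, tables) {
    return uri === "http://blocked.example/" ? tables : "";
  },
};

const table = "negative-adjacency-simple"; // table name used earlier in this bug
console.log(isInadjacent(fakeClassifier, "http://blocked.example/", table));
console.log(isInadjacent(fakeClassifier, "http://ok.example/", table));
```

Passing the classifier in as a parameter keeps the tile-filtering logic testable without a running browser; whether the real code should be structured that way is a design choice for the implementer.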
(In reply to Ed Lee :Mardak from comment #2)
> So it looks like following the addMozEntries, we construct an update string
> and update "negative-adjacency-simple" then later to check..

BTW, the naming convention is: organization-listtype-format

So far we've used "mozpub-*-*" for our own lists (tracking protection and shumway). Unless you have a reason for using something else, I'd suggest something along the lines of "mozpub-negadjacency-simple"
Blocks: 1145418
Assignee: nobody → mzhilyaev
Iteration: --- → 41.3 - Jun 29
Summary: Use SafeBrowsing to power negative adjacency matching → Use SafeBrowsing to power ubadjainncy matching
Whiteboard: .?
Summary: Use SafeBrowsing to power ubadjainncy matching → Use SafeBrowsing to power inadjacency matching
Blocks: 1176364
No longer blocks: 1145418
Iteration: 41.3 - Jun 29 → ---
Whiteboard: .? → [story]
Depends on: 1176741
Depends on: 1176742
No longer depends on: 1176741
Blocks: 1189754
No longer blocks: 1176364
Blocks: 1193831
No longer blocks: 1189754
I don't believe this is required anymore given we no longer have sponsored tiles.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WONTFIX