Use a heuristic rule to determine if a channel should be classified
Categories
(Toolkit :: Safe Browsing, enhancement, P1)
Tracking
| | Tracking | Status |
| --- | --- | --- |
| firefox68 | --- | fixed |
People
(Reporter: dimi, Assigned: dimi)
References
(Blocks 3 open bugs)
Details
Attachments
(7 files)
We plan to use heuristics to determine whether a channel should be classified, instead of requiring the caller to specify it manually.
Link to the design document:
https://docs.google.com/document/d/1byFAOCVhg5669AwbU5i0cfs_MrzTojXTiK2EMn_-Kno/edit
Comment 1•6 years ago
In this patch, we move from a model where URL classification is opt-in
(by specifying LOAD_CLASSIFY_URI) to a model where it is enforced by
default unless a channel opts out or is deemed to be a critical channel
based on simple heuristics.
The heuristics exempt a channel from classification if it is triggered
by the system principal, except when it is a top-level load, which is
still classified.
To ensure critical channels are never classified, we also exempt
channels flagged as "BeConservative" (implemented in bug 1321783).
A new load flag, LOAD_BYPASS_URL_CLASSIFIER, is introduced
for the same reason.
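As a minimal sketch of the decision order described above, with the real nsIChannel/nsILoadInfo lookups replaced by plain boolean parameters; the function name and parameters are illustrative, not the code in the patch:

```cpp
// Illustrative only: the heuristic described in this patch, with plain
// boolean inputs standing in for the real channel/LoadInfo queries.
static bool ShouldClassifyChannel(bool aTriggeredBySystemPrincipal,
                                  bool aIsTopLevelDocumentLoad,
                                  bool aIsBeConservative,
                                  bool aHasBypassUrlClassifierFlag) {
  // Explicit opt-out via LOAD_BYPASS_URL_CLASSIFIER always wins.
  if (aHasBypassUrlClassifierFlag) {
    return false;
  }
  // Critical channels marked "BeConservative" (bug 1321783) are never
  // classified.
  if (aIsBeConservative) {
    return false;
  }
  // System-triggered loads are exempt, except top-level document loads,
  // which are still classified.
  if (aTriggeredBySystemPrincipal && !aIsTopLevelDocumentLoad) {
    return false;
  }
  // Everything else is classified by default.
  return true;
}
```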
Comment 2•6 years ago
nsIChannel.LOAD_CLASSIFY_URI is no longer required, so we can remove it
from the codebase.
At the same time, we add a new LOAD_BYPASS_URL_CLASSIFIER load flag so
that a channel creator can force a channel to bypass the URL classifier
check.
The call sites that should use the new LOAD_BYPASS_URL_CLASSIFIER flag
are addressed in the other patches.
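Under the new model a channel creator no longer asks for classification; it only opts out. A hedged sketch of such an opt-out follows; the helper name is hypothetical, while LOAD_BYPASS_URL_CLASSIFIER is the flag added by this patch:

```cpp
#include "nsIChannel.h"

// Illustrative helper: opt an existing channel out of URL classification.
// Classification is now the default, so this flag is the explicit opt-out
// that replaces the old opt-in LOAD_CLASSIFY_URI.
static void BypassUrlClassifier(nsIChannel* aChannel) {
  nsLoadFlags flags = 0;
  aChannel->GetLoadFlags(&flags);
  aChannel->SetLoadFlags(flags | nsIChannel::LOAD_BYPASS_URL_CLASSIFIER);
}
```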
Comment 3•6 years ago
This patch makes the flag exempt channels from classification, but it
does not update the call sites that should set it.
See Bug 1442496 for the list of call sites that should use this flag.
Comment 4•6 years ago
We want to ensure that SafeBrowsing update requests are never
classified, so that we can recover from a bad SafeBrowsing database.
SafeBrowsing lookup requests are also critical: if they are blocked,
SafeBrowsing stops working entirely.
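As a hedged sketch (not the actual patch, and with a hypothetical helper name), a SafeBrowsing channel creator could mark its update and lookup channels as critical roughly like this, combining the new opt-out flag with the "BeConservative" attribute from bug 1321783:

```cpp
#include "nsCOMPtr.h"
#include "nsIChannel.h"
#include "nsIHttpChannelInternal.h"

// Illustrative only: keep a SafeBrowsing update/lookup channel out of URL
// classification and on conservative network settings.
static void MarkSafeBrowsingChannelAsCritical(nsIChannel* aChannel) {
  // Never classify this channel, so an update can still repair a bad
  // SafeBrowsing database.
  nsLoadFlags flags = 0;
  aChannel->GetLoadFlags(&flags);
  aChannel->SetLoadFlags(flags | nsIChannel::LOAD_BYPASS_URL_CLASSIFIER);

  // "BeConservative" (bug 1321783) additionally keeps experimental
  // network features away from this critical channel.
  nsCOMPtr<nsIHttpChannelInternal> internal = do_QueryInterface(aChannel);
  if (internal) {
    internal->SetBeConservative(true);
  }
}
```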
Comment 5•6 years ago
Some test cases are chrome-scope test cases, so the resources they load
are triggered by the system principal, which exempts those resources
from classification.
So in this patch, we use a new window for the tracker test frame.
The window creation code is adapted from test_privatebrowsing_trackingprotection.html.
Comment 6•6 years ago
This patch adds an xpcshell test which covers all combinations of the
parameters used by the heuristic classification algorithm.
Comment 7•6 years ago
The goal of this test case is to provide an easier way to add call sites
and check whether they are correctly classified. This is a first step; more
call sites should be added to the test case (see Bug 1532691).
Flow of the test case:
- set up the server (trackerFrame.sjs) with the expected number of requests it should receive
- load the test frame (trackerFrame.html) with cookie restriction off, to ensure all the tracking requests contain cookies
- after receiving all the requests, the server responds with the list of tracker requests that contained cookies
- the list should contain all the trackers in the test frame
- enable cookie restriction and load the test frame again
- after receiving all the requests, the server responds with the list of tracker requests that did not contain cookies
- the list should contain all the trackers in the test frame
Comment 8•6 years ago
This is just a note about how I verified that the new heuristic works, to make sure there is no exception we failed to consider.
I audited all the call sites where we create a channel and added "function name" + "file line number" to the LoadInfo.
This information, together with "IsTriggeringPrincipalSystemPrincipal", "IsLoadingPrincipalSystemPrincipal", "IsContentPolicyTypeDocument" and "IsOldLoadClassifyURIFlagSet", forms a signature.
At the places where we decide whether a channel should be classified, I recorded the channel's URL and the heuristic result for each signature.
Browsing around the web then produces data like this:
{.triggerIsSystem=false, .loadingIsSystem=false, .oldLoadFlags=set, .contentPolicy=other, .func="nsresult mozilla::dom::FetchDriver::HttpFetch:514", .url="https://syndication.twitter.com/settings", .heuristicResult=classify},
{.triggerIsSystem=true, .loadingIsSystem=true, .oldLoadFlags=set, .contentPolicy=document, .func="nsresult mozilla::dom::FetchDriver::HttpFetch:521", .url="https://googlechrome.github.io/samples/service-worker/basic/index.html", .heuristicResult=true},
Then I manually checked whether there were any suspicious results in the data I collected. By doing this, around 100 call sites were verified.
There are still call sites for which I could not collect data this way (I couldn't trigger the code path), so I examined the code of those (around 30 call sites) to confirm they work with the new approach.
A few call sites remain unchecked (pdf.js, webpayment, etc.), but I think the results collected so far already show that this is a robust approach.
Report:
https://docs.google.com/spreadsheets/d/1-sOQAgJqUaaeBww4kdlwkYuj-SRAsPN1C3rnQzK6u8U/edit?usp=sharing
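For reference, the per-call-site record described above could be modeled roughly like this; the struct and field names are hypothetical and only mirror the audit data, not code that landed:

```cpp
// Illustrative record: one entry per channel-creation call site audited.
struct ClassificationAuditEntry {
  bool triggerIsSystem;    // triggering principal is the system principal
  bool loadingIsSystem;    // loading principal is the system principal
  bool oldLoadFlagSet;     // the old LOAD_CLASSIFY_URI flag was set
  bool isDocumentLoad;     // content policy type is TYPE_DOCUMENT
  const char* callSite;    // "function name:line" stored in the LoadInfo
  const char* url;         // the channel's URL
  bool heuristicResult;    // whether the new heuristic would classify it
};
```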
Comment 10•6 years ago
Comment 11•6 years ago
Backed out 7 changesets (bug 1522412) for xpcshell failures at test_shouldclassify.js on a CLOSED TREE.
Backout link: https://hg.mozilla.org/integration/autoland/rev/747a5da93708d6ad12832272d794b21d825a4bb9
Push with failures: https://treeherder.mozilla.org/#/jobs?repo=autoland&resultStatus=testfailed%2Cbusted%2Cexception&revision=7c998456e1fd5aa0271b8508a25a5becec2768e0&selectedJob=235152734
Log link: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=235152734&repo=autoland&lineNumber=10326
Log snippet:
09:35:44 INFO - TEST-START | toolkit/components/url-classifier/tests/unit/test_shouldclassify.js
09:35:45 WARNING - TEST-UNEXPECTED-FAIL | toolkit/components/url-classifier/tests/unit/test_shouldclassify.js | xpcshell return code: 0
09:35:45 INFO - TEST-INFO took 468ms
09:35:45 INFO - >>>>>>>
09:35:45 INFO - PID 13864 | Unable to load \untrusted-startup-test-dll.dll; LoadLibraryW failed: 126Couldn't convert chrome URL: chrome://branding/locale/brand.properties
09:35:45 INFO - PID 13864 | [13864, Main Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80520012: file z:/build/build/src/netwerk/base/nsIOService.cpp, line 941
09:35:45 INFO - PID 13864 | [13864, Main Thread] WARNING: Failed to get directory to cache.: file z:/build/build/src/security/sandbox/win/src/sandboxbroker/sandboxBroker.cpp, line 81
09:35:45 INFO - PID 13864 | [13864, Main Thread] WARNING: Failed to get directory to cache.: file z:/build/build/src/security/sandbox/win/src/sandboxbroker/sandboxBroker.cpp, line 81
09:35:45 INFO - PID 13864 | [13864, Main Thread] WARNING: Failed to get directory to cache.: file z:/build/build/src/security/sandbox/win/src/sandboxbroker/sandboxBroker.cpp, line 81
09:35:45 INFO - PID 13864 | [13864, Main Thread] WARNING: Failed to get directory to cache.: file z:/build/build/src/security/sandbox/win/src/sandboxbroker/sandboxBroker.cpp, line 81
09:35:45 INFO - PID 13864 | [13864, Main Thread] WARNING: Failed to get directory to cache.: file z:/build/build/src/security/sandbox/win/src/sandboxbroker/sandboxBroker.cpp, line 81
09:35:45 INFO - PID 13864 | [13864, Main Thread] WARNING: Couldn't get the user appdata directory. Crash events may not be produced.: file z:/build/build/src/toolkit/crashreporter/nsExceptionHandler.cpp, line 2528
09:35:45 INFO - PID 13864 | [13864, Main Thread] WARNING: This method is lossy. Use GetCanonicalPath !: file z:/build/build/src/xpcom/io/nsLocalFileWin.cpp, line 3188
09:35:45 INFO - PID 13864 | [13864, Main Thread] WARNING: This method is lossy. Use GetCanonicalPath !: file z:/build/build/src/xpcom/io/nsLocalFileWin.cpp, line 3188
09:35:45 INFO - PID 13864 | [13864, Main Thread] WARNING: This method is lossy. Use GetCanonicalPath !: file z:/build/build/src/xpcom/io/nsLocalFileWin.cpp, line 3188
09:35:45 INFO - PID 13864 | [13864, Main Thread] WARNING: NS_ENSURE_SUCCESS(rv, kKnownEsrVersion) failed with result 0x80004002: file z:/build/build/src/toolkit/components/resistfingerprinting/nsRFPService.cpp, line 662
09:35:45 INFO - (xpcshell/head.js) | test MAIN run_test pending (1)
09:35:45 INFO - (xpcshell/head.js) | test run_next_test 0 pending (2)
09:35:45 INFO - (xpcshell/head.js) | test MAIN run_test finished (2)
09:35:45 INFO - running event loop
Comment 12•6 years ago
Comment 13•6 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/eb789cfccafc
https://hg.mozilla.org/mozilla-central/rev/393b5a62f630
https://hg.mozilla.org/mozilla-central/rev/64664ada91ee
https://hg.mozilla.org/mozilla-central/rev/9f3f62ed5b24
https://hg.mozilla.org/mozilla-central/rev/63bcf0242e87
https://hg.mozilla.org/mozilla-central/rev/0126685be740
https://hg.mozilla.org/mozilla-central/rev/deee64970b40