Closed Bug 1276826 Opened 9 years ago Closed 8 years ago

Implement Safe Browsing v4 hash completion request

Categories

(Toolkit :: Safe Browsing, defect, P2)

defect

Tracking

()

RESOLVED FIXED
mozilla53
Tracking Status
firefox53 --- fixed

People

(Reporter: hchang, Assigned: hchang)

References

Details

(Whiteboard: #sbv4-m5)

Attachments

(3 files)

No description provided.
Assignee: nobody → hchang
Assignee: hchang → dlee
Whiteboard: #
Whiteboard: # → #sbv4-v2
Whiteboard: #sbv4-v2 → #sbv4-m2
Status: NEW → ASSIGNED
No longer blocks: 1276827
Depends on: 1276827
Priority: -- → P2
Depends on: 1297962
Assignee: dlee → nobody
Status: ASSIGNED → NEW
Depends on: 1298321
Whiteboard: #sbv4-m2 → #sbv4-m3
Whiteboard: #sbv4-m3 → #sbv4-m4
Assignee: nobody → hchang
Depends on: 1319286
Comment on attachment 8814821 [details] Bug 1276826 - Part 1. Implement function to build FindFullHash request for v4. https://reviewboard.mozilla.org/r/95926/#review96956 ::: toolkit/components/url-classifier/nsIUrlClassifierUtils.idl:90 (Diff revision 1) > in uint32_t aCount); > + > + /** > + * Make "find full hash" request by for the given prefixes. > + * > + * @param aListNames An array of list name represented in string. "list names" (plural) ::: toolkit/components/url-classifier/nsIUrlClassifierUtils.idl:92 (Diff revision 1) > + /** > + * Make "find full hash" request by for the given prefixes. > + * > + * @param aListNames An array of list name represented in string. > + * @param aListStatesBase64 An array of list states represented in base64. > + * @param aPrefixes An array of prefixes we'd like to find full hashes. "An array of prefixes for which we'd like to find full hashes." ::: toolkit/components/url-classifier/nsUrlClassifierUtils.cpp:364 (Diff revision 1) > > return NS_OK; > } > > +NS_IMETHODIMP > +nsUrlClassifierUtils::MakeFindFullHashRequestV4(const char** aListNames, Is there a reason why we can't use `TArray`s and `nsCString`s? ::: toolkit/components/url-classifier/nsUrlClassifierUtils.cpp:371 (Diff revision 1) > + const char** aPrefixesBase64, > + uint32_t aListCount, > + uint32_t aPrefixCount, > + nsACString &aRequest) > +{ > + FindFullHashesRequest r; Should we check that `aListNames` and `aListStatesBase64` have the same number of elements and return with an error if they don't? ::: toolkit/components/url-classifier/nsUrlClassifierUtils.cpp:404 (Diff revision 1) > + // 3) Set threat entry type. > + threatInfo->add_threat_entry_types(URL); > + > + // 4) Set threat entries. > + for (uint32_t i = 0; i < aPrefixCount; i++) { > + nsCString p; May as well call this `prefixBinary` to mirror what you have above. ::: toolkit/components/url-classifier/tests/gtest/TestFindFullHash.cpp:76 (Diff revision 1) > + rv = Base64URLDecode(requestBase64, > + Base64URLDecodePaddingPolicy::Require, > + requestBinary); > + ASSERT_TRUE(NS_SUCCEEDED(rv)); > + > + // Parse the FindFullHash binary and comapre with the expected values. "compare"
Attachment #8814821 - Flags: review?(francois) → review+
Comment on attachment 8814822 [details] Bug 1276826 - Part 2. Send gethash request and handle gethash response for v4. https://reviewboard.mozilla.org/r/95928/#review96958 ::: toolkit/components/url-classifier/nsIUrlClassifierUtils.idl:16 (Diff revision 1) > > +/** > + * Interface for parseFindFullHashResponseV4 callback > + * > + * @param aCompleteHash A 32-byte complete hash string. > + * @param aTableNames The table names that this complete hash associated with. "that this complete hash is associated with." ::: toolkit/components/url-classifier/nsIUrlClassifierUtils.idl:17 (Diff revision 1) > +/** > + * Interface for parseFindFullHashResponseV4 callback > + * > + * @param aCompleteHash A 32-byte complete hash string. > + * @param aTableNames The table names that this complete hash associated with. > + * Since what server responded us is the threat type, I'm not sure exactly what you mean by this comment. Is it this: "Since the server responded with a threat type, multiple list names can be returned." ::: toolkit/components/url-classifier/nsIUrlClassifierUtils.idl:130 (Diff revision 1) > in uint32_t aPrefixCount); > + > + /** > + * Parse V4 FindFullHash response. > + * > + * @param aResponse Byte stream that server sent us. "Byte stream from the server." ::: toolkit/components/url-classifier/nsUrlClassifierHashCompleter.js:289 (Diff revision 1) > // Whether we have been informed of a shutdown by the xpcom-shutdown event. > this._shuttingDown = false; > this.gethashUrl = aGethashUrl; > + > + // Multiple partial hashes can be associated with the same tables > + // so we use a set here. A set or a map? ::: toolkit/components/url-classifier/nsUrlClassifierHashCompleter.js:313 (Diff revision 1) > + let isTableNameV4 = aTableName.endsWith('-proto'); > + if (0 === this.tableNames.size) { > + // Decide if this request is v4 by the first added partial hash. > + this.isV4 = isTableNameV4; > + } else if (this.isV4 !== isTableNameV4) { > + log('ERROR: Cannot mix "proto" tables with other types within ' + We print an error, but it seems like we just keep going as if nothing happened... ::: toolkit/components/url-classifier/nsUrlClassifierHashCompleter.js:354 (Diff revision 1) > } > > Services.obs.addObserver(this, "xpcom-shutdown", false); > > + // V4 requires table states to build the request so we need > + // a async call to retrieve the table states from disk. Do we have the in-memory caching yet? I don't think we should land this without this optimization because it could make things _very_ slow on old machines. ::: toolkit/components/url-classifier/nsUrlClassifierHashCompleter.js:393 (Diff revision 1) > + if (this.isV4) { > + // As per spec, we add the request payload to the gethash url. > + actualGethashUrl += "&$req=" + this.buildRequestV4(); > + } > + > + log("actualGethashUrl: " + actualGethashUrl); The API key is going to be filtered out of this, right? ::: toolkit/components/url-classifier/nsUrlClassifierHashCompleter.js:425 (Diff revision 1) > this.timer_.initWithCallback(this, timeout, this.timer_.TYPE_ONE_SHOT); > channel.asyncOpen2(this); > }, > > + buildRequestV4: function HCR_buildRequestV4() { > + // Conver the "name to state" mapping to two equal-length arrays. "Convert" ::: toolkit/components/url-classifier/nsUrlClassifierHashCompleter.js:543 (Diff revision 1) > + log("WARNING: Got complete hash which has ambigious threat type."); > + } > + > + this.handleItem(aCompleteHash, filteredTables[0], 0); > + > + // TODO: Make use of cache durations. Do we have a bug number we can link to? ::: toolkit/components/url-classifier/nsUrlClassifierUtils.cpp:438 (Diff revision 1) > +nsUrlClassifierUtils::ParseFindFullHashResponseV4(const nsACString& aResponse, > + nsIUrlClassifierParseFindFullHashCallback *aCallback) > +{ > + FindFullHashesResponse r; > + if (!r.ParseFromArray(aResponse.BeginReading(), aResponse.Length())) { > + NS_WARNING("Invalid response"); We should probably add a telemetry probe here.
Attachment #8814822 - Flags: review?(francois) → review-
Comment on attachment 8814822 [details] Bug 1276826 - Part 2. Send gethash request and handle gethash response for v4. https://reviewboard.mozilla.org/r/95928/#review96958 > I'm not sure exactly what you mean by this comment. > > Is it this: "Since the server responded with a threat type, multiple list names can be returned." Yes since the conversion from threat type to list name is one-to-many. > We print an error, but it seems like we just keep going as if nothing happened... Then we are gonna process those tables which have the same version as the first table. I would think it's better than doing thing. > Do we have the in-memory caching yet? > > I don't think we should land this without this optimization because it could make things _very_ slow on old machines. Yes it's been done in Bug 1319286. > The API key is going to be filtered out of this, right? Yes!
Comment on attachment 8814822 [details] Bug 1276826 - Part 2. Send gethash request and handle gethash response for v4. https://reviewboard.mozilla.org/r/95928/#review96958 > We should probably add a telemetry probe here. I think the number we want to collect here is: the ratio of gethash responses that cannot be parsed over the total ratio of gethash responses. So we could use an enumerated probe like `URLCLASSIFIER_UPDATE_ERROR` in bug 1311910. I suggest `URLCLASSIFIER_COMPLETION_ERROR` and it would have these values (for now): - `0 = success` - `1 = parsing failure` - `2 = unknown threat type` But let's use 10 buckets to account for future expansion.
Comment on attachment 8814822 [details] Bug 1276826 - Part 2. Send gethash request and handle gethash response for v4. https://reviewboard.mozilla.org/r/95928/#review98260 Can you make it so that V4 completions are not enabled by default yet? I would like to make sure we have a controlled roll-out of all of the parts of the V4 code. I would suggest making the `browser.safebrowsing.provider.google4.gethashURL` pref empty and testing that this doesn't cause any problems. ::: toolkit/components/url-classifier/tests/unit/head_urlclassifier.js:472 (Diff revision 2) > + > + return true; // break no matter whether the state is matching. > + }); > + > + if (!didCallback) { > + do_timeout(1000, waitUntilMetaDataSaved.bind(null, expectedState, Is this timeout going to be long enough on a slow Android build/test machine?
Attachment #8814822 - Flags: review?(francois) → review+
Whiteboard: #sbv4-m4 → #sbv4-m5
Comment on attachment 8814821 [details] Bug 1276826 - Part 1. Implement function to build FindFullHash request for v4. https://reviewboard.mozilla.org/r/95926/#review96956 > Is there a reason why we can't use `TArray`s and `nsCString`s? To the best of my knowledge for xpidl binding, the only alternative is nsIArray containing a XPCOM'ized nsCString (i.e. nsIStringStream), which is way too much for our use case. Therefore I simply use the most native and straightforwad "array" binding in the idl. > Should we check that `aListNames` and `aListStatesBase64` have the same number of elements and return with an error if they don't? Given that this function is the implementation of a xpidl interface ACString makeFindFullHashRequestV4([array, size_is(aListCount)] in string aListNames, [array, size_is(aListCount)] in string aListStatesBase64, [array, size_is(aPrefixCount)] in string aPrefixes, in uint32_t aListCount, in uint32_t aPrefixCount); The caller is reponsible for passing |aListNames| and |aListSttesBase64| with the same length and the callee has no way to check the sanity. This is the limitation of XPIDL binding while using [array, sizeof(...)] as the arguement. The only call site (in patch part 2) will ensure the arguments sanity.
Comment on attachment 8814822 [details] Bug 1276826 - Part 2. Send gethash request and handle gethash response for v4. https://reviewboard.mozilla.org/r/95928/#review98260 > Is this timeout going to be long enough on a slow Android build/test machine? |waitUntilMetaDataSaved| will be repetitively called until succeeded after one-second timeout. That said, the "overall timeout" might be an issue on a slow machine. However, this piece of code has been ran for a couple of days. It is just moved around to be re-used by the new test case for v4 gethash :)
Blocks: 1303393
Pushed by hchang@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/01ddaa738087 Part 1. Implement function to build FindFullHash request for v4. r=francois https://hg.mozilla.org/integration/autoland/rev/d35643646f94 Part 2. Send gethash request and handle gethash response for v4. r=francois
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla53
Depends on: 1323856
Depends on: 1323953
Depends on: 1329558
No longer depends on: 1297962
No longer depends on: 1298321
No longer depends on: 1323953
No longer depends on: 1329558
Depends on: 1297962
No longer depends on: 1297962
No longer blocks: 1303393
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: