Closed Bug 1358324 Opened 7 years ago Closed 7 years ago

The URLCLASSIFIER_MATCH_THREAT_TYPE_RESULT probe doesn't seem to be working correctly

Categories

(Toolkit :: Safe Browsing, defect, P2)

defect

Tracking

()

RESOLVED FIXED
mozilla55
Tracking Status
firefox55 --- fixed

People

(Reporter: francois, Assigned: dimi)

References

Details

(Whiteboard: #sbv4-m7)

Attachments

(1 file)

Looking at the data collected by this probe so far in Nightly, I'm not sure it's working correctly:

https://telemetry.mozilla.org/new-pipeline/dist.html#!cumulative=0&end_date=2017-04-18&keys=__none__!__none__!__none__&max_channel_version=nightly%252F55&measure=URLCLASSIFIER_MATCH_THREAT_TYPE_RESULT&min_channel_version=null&processType=*&product=Firefox&sanitize=1&sort_keys=submissions&start_date=2017-04-12&table=1&trim=1&use_submission_date=0

because it essentially only returns bit flags with V2 categories OR V4 categories, never both at the same time.

I would expect to always see a mix of V2 and V4 because the probe is supposed to note when a URL:

- matches in both versions of the protocol
- returns a different threat type in each of them
Assignee: nobody → dlee
Status: NEW → ASSIGNED
Comment on attachment 8860286 [details]
Bug 1358324 - The URLCLASSIFIER_MATCH_THREAT_TYPE_RESULT probe doesn't seem to be working correctly.

https://reviewboard.mozilla.org/r/132312/#review135420

The approach makes sense to me. I've suggested a small simplification.

::: commit-message-8b854:15
(Diff revision 2)
> +But if one of lookupResult is from cache, for example, completion is found in 'goog-malware-proto',
> +then we won't trigger a gethash request for it because it is in the cache.
> +In this scenario, mCacheResults will not have results from V4 so when we try to record
> +threat types we won't find V4 ones.
> +
> +I think what we want to record here is when gethash is sent for both V2 and V4, so in this patch

That limits the usefulness of that probe quite a bit, but I guess there's not much we can do about this. We shouldn't make the code less efficient just to be able to measure things better :)

::: toolkit/components/url-classifier/nsUrlClassifierDBService.cpp:1368
(Diff revision 2)
>    // Only record threat type telemetry when completion is found in V2 & V4.
>    if (matchResult == MatchResult::eAll && mCacheResults) {
>      MatchThreatType types = MatchThreatType::eIdentical;
>  
>      // Check all the results because there may be multiple matches being returned.
> +    uint8_t FIND_V2 = 0x01, FIND_V4 = 0x02, value = 0;

I think this would be clearer if we used two booleans just like we do in a few other places.

    bool foundV2Result = foundV4Result = false;

::: toolkit/components/url-classifier/nsUrlClassifierDBService.cpp:1372
(Diff revision 2)
>      // Check all the results because there may be multiple matches being returned.
> +    uint8_t FIND_V2 = 0x01, FIND_V4 = 0x02, value = 0;
>      for (uint32_t i = 0; i < mCacheResults->Length(); i++) {
>        CacheResult* c = mCacheResults->ElementAt(i).get();
> +      bool isV2 = CacheResult::V2 == c->Ver();
> +      value |= isV2 ? FIND_V2 : FIND_V4;

I's suggest:

    if (isV2) {
      foundV2Result = true;
    } else {
      foundV4Result = true;
    }

::: toolkit/components/url-classifier/nsUrlClassifierDBService.cpp:1386
(Diff revision 2)
>          }
>          break;
>        }
>      }
>  
> +    // We don't want to record telemetry when not all the results are from gethash

Assuming I understand the underlying problem correctly, this is how I would explain it:

"We don't want to record telemetry when one of the results is from cache because finding an unexpired cache entry will prevent us from doing gethash requests that would otherwise be required."

::: toolkit/components/url-classifier/nsUrlClassifierDBService.cpp:1388
(Diff revision 2)
>        }
>      }
>  
> +    // We don't want to record telemetry when not all the results are from gethash
> +    // requests, that is to say at least one of confirmed LookupResult is from cache.
> +    if (value == (FIND_V2 | FIND_V4)) {

This would become simply:

     if (foundV2Result && foundV4Result) {
Attachment #8860286 - Flags: review?(francois) → review-
Comment on attachment 8860286 [details]
Bug 1358324 - The URLCLASSIFIER_MATCH_THREAT_TYPE_RESULT probe doesn't seem to be working correctly.

https://reviewboard.mozilla.org/r/132312/#review137386
Attachment #8860286 - Flags: review?(francois) → review+
Keywords: checkin-needed
Pushed by cbook@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/fa96f047e722
The URLCLASSIFIER_MATCH_THREAT_TYPE_RESULT probe doesn't seem to be working correctly. r=francois
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/fa96f047e722
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: