Closed Bug 1333328 Opened 7 years ago Closed 7 years ago

Refactor cache miss handling mechanism for V2

Categories

(Toolkit :: Safe Browsing, defect, P2)

defect

Tracking

()

RESOLVED FIXED
mozilla55
Tracking Status
firefox55 --- fixed

People

(Reporter: dlee, Assigned: dlee)

References

Details

(Whiteboard: #sbv4-m7)

Attachments

(2 files, 3 obsolete files)

Right now safebrowsing uses mMissCache to record gethash request that doesn't return a completion. So next time when we trigger another gethash request, we will check if the prefix is in mMissCache first, if it exists, then gethash request for that prefix won't be sent.

The idea of this mechanism is the same as V4 negative caching, so we should have a better way to distinguish these two mechanism in code.
Depends on: 1311935
Whiteboard: #sbv4-m6
Depends on: 1338082
Ideally we should change the V2 code so that is uses a fixed negative cache duration and therefore can use the same caching code as V4.

I will check with Google, but if that strategy is approved by Google, then we can get rid of the old miss cache entirely.
Priority: -- → P2
Attached patch WIP patch (obsolete) — Splinter Review
This is a WIP patch based on bug 1311935.
Google is fine with us using a negative cache time of 15 minutes on V2.
Attachment #8855807 - Flags: review?(francois)
(In reply to François Marier [:francois] from comment #5)
> Google is fine with us using a negative cache time of 15 minutes on V2.

Ok, I will update the patch to use 15 minutes as the negative cache duration.
Whiteboard: #sbv4-m6 → #sbv4-m7
Attachment #8851843 - Attachment is obsolete: true
Attachment #8855806 - Attachment is obsolete: true
Attachment #8855807 - Attachment is obsolete: true
Comment on attachment 8855806 [details]
Bug 1333328 - Refactor cache miss handling mechanism for V2.

Cancel the review request because I think we should also consider positive cache case.

Hi Francois,
Hope you haven't reviewed the patch.

Right now the positive cache for v2 is done by checking table freshness and will be cleared when update. Do you think we should follow the same rule(15 mins cache duration) and get rid of table freshness?
Flags: needinfo?(francois)
Attachment #8855806 - Flags: review?(francois)
(In reply to Dimi Lee[:dimi][:dlee] from comment #8)
> Right now the positive cache for v2 is done by checking table freshness and
> will be cleared when update. Do you think we should follow the same rule(15
> mins cache duration) and get rid of table freshness?

Yes, I think we can simplify our caching code by assuming that every V2 prefix has a 15 min negative caching duration instead of keeping around a miss cache.
Flags: needinfo?(francois)
(In reply to François Marier [:francois] from comment #9)
> (In reply to Dimi Lee[:dimi][:dlee] from comment #8)
> > Right now the positive cache for v2 is done by checking table freshness and
> > will be cleared when update. Do you think we should follow the same rule(15
> > mins cache duration) and get rid of table freshness?
> 
> Yes, I think we can simplify our caching code by assuming that every V2
> prefix has a 15 min negative caching duration instead of keeping around a
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^
Do you mean positive caching duration? 
> miss cache.
Let's do 15 min for positive and negative cache duration.

We also need to make sure that tables that have completions disabled (e.g. tracking protection) never expire.
Regarding the positive cache duration, we also need to make sure that whenever we call gethash in V2, we update the positive cache duration. Otherwise we may end up calling gethash too often.

I imagine it already works like that in the V4 code.
See Also: → 1357898
Depends on: 1357898
See Also: → 1309161
Comment on attachment 8864406 [details]
Bug 1333328 - Remove mTableFreshness and gFreshnessGuarantee.

https://reviewboard.mozilla.org/r/136094/#review139428

There's also a `lastupdatetime` in https://searchfox.org/mozilla-central/rev/6580dd9a4eb0224773897388dba2ddf5ed7f4216/toolkit/components/url-classifier/SafeBrowsing.jsm#183

::: toolkit/components/url-classifier/content/listmanager.js
(Diff revision 1)
>          updateDelay = Math.min(maxDelayMs, Math.max(0, nextUpdate - Date.now()));
>          log("Next update at " + nextUpdate);
>        }
>        log("Next update " + updateDelay + "ms from now");
>  
> -      // Set the last update time to verify if data is still valid.

If we're removing these prefs, we'll need to also remove them from about:url-classifier.
Attachment #8864406 - Flags: review?(francois) → review-
(In reply to François Marier [:francois] from comment #16)
> Comment on attachment 8864406 [details]
> Bug 1333328 - P2. Remove mTableFreshness and gFreshnessGuarantee.
> 
> https://reviewboard.mozilla.org/r/136094/#review139428
> 
> There's also a `lastupdatetime` in

Yes, the reason I didn't remove it is because I think this value might still be worth keeping for information purpose.
I'll remove that if you think we don't really need that. Another approach is remove the "lastupdatetime" preference but keep a variable for it so about:url-classifier may still get the information.
(In reply to Dimi Lee[:dimi][:dlee] from comment #17)
> Yes, the reason I didn't remove it is because I think this value might still
> be worth keeping for information purpose.
> I'll remove that if you think we don't really need that. Another approach is
> remove the "lastupdatetime" preference but keep a variable for it so
> about:url-classifier may still get the information.

So the provider.*.lastupdatetime pref is still set even after your patch is applied?
(In reply to François Marier [:francois] from comment #18)
> 
> So the provider.*.lastupdatetime pref is still set even after your patch is
> applied?

Yes, I didn't remove the part we set preference.
Attachment #8859504 - Attachment is obsolete: true
Comment on attachment 8864406 [details]
Bug 1333328 - Remove mTableFreshness and gFreshnessGuarantee.

https://reviewboard.mozilla.org/r/136094/#review139444

(In reply to Dimi Lee[:dimi][:dlee] from comment #19)
> (In reply to François Marier [:francois] from comment #18)
> > 
> > So the provider.*.lastupdatetime pref is still set even after your patch is
> > applied?
> 
> Yes, I didn't remove the part we set preference.

Ok let's keep it for now then. I agree with you that it's a nice debugging help.

We can always take it out later if we have a good reason (e.g. if it gets out of sync and/or causes any problems).
Attachment #8864406 - Flags: review- → review+
Comment on attachment 8864406 [details]
Bug 1333328 - Remove mTableFreshness and gFreshnessGuarantee.

https://reviewboard.mozilla.org/r/136094/#review140868

::: commit-message-63e55:1
(Diff revision 1)
> +Bug 1333328 - P2. Remove mTableFreshness and gFreshnessGuarantee. r?francois

ni: you could remove "P2" from the commit message since the commits will already be in the right order
Comment on attachment 8855806 [details]
Bug 1333328 - Refactor cache miss handling mechanism for V2.

https://reviewboard.mozilla.org/r/127694/#review140838

::: commit-message-64714:1
(Diff revision 3)
> +Bug 1333328 - P1. Refactor cache miss handling mechanism for V2. r?francois

nit: I would omit "P1" here

::: commit-message-64714:11
(Diff revision 3)
> +
> +But Safebrowsing V2 doesn't contain negative/positive cache duration information in
> +gethash response. So we hard-code a fixed value, 15 minutes, as cache duration.
> +In this way, we can sync the mechanism we handle caching for V2 and V4.
> +
> +An extra effort for V2 here is that we need manually record prefixes can't hit

Do you mean this?

    "we need to manually record the prefixes misses"

::: commit-message-64714:12
(Diff revision 3)
> +But Safebrowsing V2 doesn't contain negative/positive cache duration information in
> +gethash response. So we hard-code a fixed value, 15 minutes, as cache duration.
> +In this way, we can sync the mechanism we handle caching for V2 and V4.
> +
> +An extra effort for V2 here is that we need manually record prefixes can't hit
> +because we won't get any response for those prefixes(Implement in

"... those prefixes (implemented in nsUrl..."

::: toolkit/components/url-classifier/LookupCache.h:127
(Diff revision 3)
> -           prefix == aOther.prefix &&
> -           completion == aOther.completion &&
> -           addChunk == aOther.addChunk;
> +        prefix != aOther.prefix ||
> +        miss != aOther.miss) {
> +      return false;
> +    }
> +
> +    return miss ? true :

I think this might be easier to read as:

    if (miss) {
        return true
    }
    return completion == aOther.completion && ...

::: toolkit/components/url-classifier/LookupCache.cpp:130
(Diff revision 3)
> +{
> +  uint32_t prefix = aCompletion.ToUint32();
> +
> +  CachedFullHashResponse* fullHashResponse = mCache.Get(prefix);
> +  if (!fullHashResponse) {
> +    return NS_OK;

Should we explicitly set `*aHas = *aConfirmed = false` before returning from this function?

It seems like the function relies on the caller initializing both output parameters to false.

::: toolkit/components/url-classifier/LookupCache.cpp:181
(Diff revision 3)
> +
> +  return NS_OK;
> +}
> +
> +// This function remove cache entries whose negative cache time is expired.
> +// It is possible that a cache entry whose positive cache time is not yet

"It is possible that a cache entry whose positive cache time has not passed will still be removed after calling this function. Right now we call this on every update."

Of course, this means that we could be getting rid of perfectly valid completions that have a long expiry time. Maybe we should leave a comment to say that maybe we should look for completions with unexpired positive cache times?

    // FIXME: look for unexpired completions before removing the prefix entirely

::: toolkit/components/url-classifier/LookupCache.cpp:529
(Diff revision 3)
>      *aHas = true;
>      *aMatchLength = COMPLETE_SIZE;
> +    *aConfirmed = true;
> +  }
>  
> -    int64_t ageSec; // in seconds
> +  LOG(("Probe in %s: %X, found %d", mTableName.get(), prefix, found));

It may be useful to show the contents of `aHas` and `aConfirmed` here instead of `found` since we are removing the `LOG(("Complete in %s))` line.

::: toolkit/components/url-classifier/LookupCache.cpp:586
(Diff revision 3)
>  
> -nsresult
> -LookupCacheV2::AddCompletionsToCache(AddCompleteArray& aAddCompletes)
> +void
> +LookupCacheV2::AddGethashResultToCache(AddCompleteArray& aAddCompletes,
> +                                       MissPrefixArray& aMissPrefixes)
>  {
> -  for (uint32_t i = 0; i < aAddCompletes.Length(); i++) {
> +  int64_t expirySec = PR_Now() / PR_USEC_PER_SEC + V2_CACHE_DURATION_SEC;

I would suggest `defaultExpirySec` to emphasize that it's a hard-coded value when we use it elsewhere in the function.

::: toolkit/components/url-classifier/LookupCache.cpp:593
(Diff revision 3)
> -      mGetHashCache.AppendElement(aAddCompletes[i].CompleteHash());
> +  for (const AddComplete& add : aAddCompletes) {
> +    nsDependentCSubstring fullhash(
> +      reinterpret_cast<const char*>(add.CompleteHash().buf), COMPLETE_SIZE);
> +
> +    CachedFullHashResponse* response = mCache.LookupOrAdd(add.ToUint32());
> +    // Set negative cache expired time to the same value as positive cache

"Set negative cache expiry"

::: toolkit/components/url-classifier/LookupCache.cpp:594
(Diff revision 3)
> +    nsDependentCSubstring fullhash(
> +      reinterpret_cast<const char*>(add.CompleteHash().buf), COMPLETE_SIZE);
> +
> +    CachedFullHashResponse* response = mCache.LookupOrAdd(add.ToUint32());
> +    // Set negative cache expired time to the same value as positive cache
> +    // expired time when gethash request returns completion.

"...as positive cache expiry when the gethash request returns a complete match."

::: toolkit/components/url-classifier/LookupCacheV4.cpp:113
(Diff revision 3)
>    // Check if fullhash match any prefix in the local database
>    if (!(*aHas)) {
>      return NS_OK;
>    }
>  
> -  // We always send 4-bytes for completion(Bug 1323953) so the prefix used to
> +  // Even though V4 supports variable-length prefix, we always send 4-bytes for

"Even though V4 supports variable-length prefixes, we always send 4-byte prefixes when requesting completions. This means that the cached prefix length is also 4 bytes."

I think we can leave out the "... same cache algorithm for V2 and V4..." since it's clearly in the base class.

::: toolkit/components/url-classifier/LookupCacheV4.cpp:114
(Diff revision 3)
>    if (!(*aHas)) {
>      return NS_OK;
>    }
>  
> -  // We always send 4-bytes for completion(Bug 1323953) so the prefix used to
> -  // lookup for cache should be 4-bytes(uint32_t) too.
> +  // Even though V4 supports variable-length prefix, we always send 4-bytes for
> +  // completion(Bug 1323953). This means cached prefix length is also 4-bytes,

nit: space before `(`

::: toolkit/components/url-classifier/nsUrlClassifierDBService.cpp:1371
(Diff revision 3)
>    }
>  
> -  // TODO: Bug 1333328, Refactor cache miss mechanism for v2.
>    // Some parts of this gethash request generated no hits at all.
> -  // Prefixes must have been removed from the database since our last update.
> -  // Save the prefixes we checked to prevent repeated requests
> +  // Save the prefixes we checked to prevent repeated requests.
> +  // Missed prefix

nit: the "Missed prefix" line is not particularly necessary

::: toolkit/components/url-classifier/tests/mochitest/cache.sjs:1
(Diff revision 3)
> +const CC = Components.Constructor;

Missing copyright header

::: toolkit/components/url-classifier/tests/mochitest/test_cachemiss.html:79
(Diff revision 3)
> +// manually reset DB to make sure next test won't be affected by cache.
> +function reset() {
> +  return classifierHelper.resetDB;
> +}
> +
> +// This test has to put before testPositiveCache to ensure gethash server doesn't

"has to come before"

::: toolkit/components/url-classifier/tests/mochitest/test_cachemiss.html:85
(Diff revision 3)
> +// contain completions.
> +function testNegativeCache() {
> +  shouldLoad = true;
> +
> +  function setup() {
> +		classifierHelper.allowCompletion([MALWARE_LIST, UNWANTED_LIST], GETHASH_URL);

indentation

::: toolkit/components/url-classifier/tests/mochitest/test_cachemiss.html:89
(Diff revision 3)
> +  function setup() {
> +		classifierHelper.allowCompletion([MALWARE_LIST, UNWANTED_LIST], GETHASH_URL);
> +
> +    // Only add prefix to database. not server, so gethash will not return
> +    // result.
> +		return Promise.all([

ditto

::: toolkit/components/url-classifier/tests/mochitest/test_cachemiss.html:111
(Diff revision 3)
> +
> +function testPositiveCache() {
> +  shouldLoad = false;
> +
> +  function setup() {
> +		classifierHelper.allowCompletion([MALWARE_LIST, UNWANTED_LIST], GETHASH_URL);

ditto

::: toolkit/components/url-classifier/tests/unit/test_partial.js
(Diff revision 3)
>  }
>  
>  function run_test()
>  {
>    runTests([
> -      testPartialAdds,

Did you mean to disable all of these tests?

They're essentially dead code, no?
Attachment #8855806 - Flags: review?(francois) → review-
Comment on attachment 8855806 [details]
Bug 1333328 - Refactor cache miss handling mechanism for V2.

https://reviewboard.mozilla.org/r/127694/#review140912

::: commit-message-64714:11
(Diff revision 3)
> +
> +But Safebrowsing V2 doesn't contain negative/positive cache duration information in
> +gethash response. So we hard-code a fixed value, 15 minutes, as cache duration.
> +In this way, we can sync the mechanism we handle caching for V2 and V4.
> +
> +An extra effort for V2 here is that we need manually record prefixes can't hit

Yes, i will correct the wording.

::: toolkit/components/url-classifier/LookupCache.cpp:529
(Diff revision 3)
>      *aHas = true;
>      *aMatchLength = COMPLETE_SIZE;
> +    *aConfirmed = true;
> +  }
>  
> -    int64_t ageSec; // in seconds
> +  LOG(("Probe in %s: %X, found %d", mTableName.get(), prefix, found));

Yes you are right, i should show these information in LOG.
I'll also use 'has' instead of 'found' because when a completion is matched in local database(mUpdateCompletions), show found = false, complete = true is somewhat strange to me

::: toolkit/components/url-classifier/tests/unit/test_partial.js
(Diff revision 3)
>  }
>  
>  function run_test()
>  {
>    runTests([
> -      testPartialAdds,

I am sorry, this is accidentally added to the patch... :(
Comment on attachment 8855806 [details]
Bug 1333328 - Refactor cache miss handling mechanism for V2.

https://reviewboard.mozilla.org/r/127694/#review140838

> Should we explicitly set `*aHas = *aConfirmed = false` before returning from this function?
> 
> It seems like the function relies on the caller initializing both output parameters to false.

Since *aHas indicates if we find the prefix in local database which shouldn't be false when call this function,
I'll add MOZ_ASSERT(*aHas) in the beginning

For *aConfirmed, I agree with you that it should be set to false in the function.

> "It is possible that a cache entry whose positive cache time has not passed will still be removed after calling this function. Right now we call this on every update."
> 
> Of course, this means that we could be getting rid of perfectly valid completions that have a long expiry time. Maybe we should leave a comment to say that maybe we should look for completions with unexpired positive cache times?
> 
>     // FIXME: look for unexpired completions before removing the prefix entirely

I thought this is discussed in tpe work week that just use a simpler cache eviction approach when update.
Otherwise if the cache is large then check every entries might be time consuming.
Comment on attachment 8855806 [details]
Bug 1333328 - Refactor cache miss handling mechanism for V2.

https://reviewboard.mozilla.org/r/127694/#review140838

> I thought this is discussed in tpe work week that just use a simpler cache eviction approach when update.
> Otherwise if the cache is large then check every entries might be time consuming.

You're right, we did decide to do this. I was thinking that we should document that decision in the code somehow.
Comment on attachment 8855806 [details]
Bug 1333328 - Refactor cache miss handling mechanism for V2.

https://reviewboard.mozilla.org/r/127694/#review141384

This is looking quite good.

Do you think we should add a small `TestCachingV2.cpp` gtest too?

::: toolkit/components/url-classifier/LookupCache.cpp:539
(Diff revisions 3 - 4)
> -
> +    rv = CheckCache(aCompletion, aHas, aConfirmed);
> -  if (!(*aHas) || (*aHas && *aConfirmed)) {
> -    return NS_OK;
>    }
>  
> -  return CheckCache(aCompletion, aHas, aConfirmed);
> +  LOG(("Probe in %s: %X, has %d, complete %d",

Maybe we should use "confirmed" instead of "complete" since that's what the variable is called.
Attachment #8855806 - Flags: review?(francois) → review-
(In reply to François Marier [:francois] from comment #28)
> 
> Do you think we should add a small `TestCachingV2.cpp` gtest too?
> 

Yes, that will be good, I will add it.
Comment on attachment 8855806 [details]
Bug 1333328 - Refactor cache miss handling mechanism for V2.

https://reviewboard.mozilla.org/r/127694/#review141874

::: toolkit/components/url-classifier/tests/gtest/TestCachingV4.cpp:9
(Diff revision 5)
>  
>  #include "Common.h"
>  
>  #define EXPIRED_TIME_SEC     (PR_Now() / PR_USEC_PER_SEC - 3600)
>  #define NOTEXPIRED_TIME_SEC  (PR_Now() / PR_USEC_PER_SEC + 3600)
>  

This file should be renamed to TestCaching.cpp, didn't do it now because I think it would be easier for reviewer to review.
I'll change file name once the review is finished

::: toolkit/components/url-classifier/tests/gtest/TestCachingV4.cpp:166
(Diff revision 5)
> +  TestCache<LookupCacheV4>(fullhash, false, false, true);
>  }
>  
>  // This testcase check the returned result of |Has| API if fullhash matches
>  // a cache entry in negative cache but that entry is expired.
>  TEST(CachingV4, InNegativeCacheExpired)

Should be renamed to Test(Caching, ...)
I just forgot, will fix this in next patch
Comment on attachment 8855806 [details]
Bug 1333328 - Refactor cache miss handling mechanism for V2.

https://reviewboard.mozilla.org/r/127694/#review142224

Looks great! I like how you generalized the V4 test and got V2 testing almost for free :)

::: toolkit/components/url-classifier/LookupCache.cpp:594
(Diff revisions 4 - 5)
>  void
>  LookupCacheV2::AddGethashResultToCache(AddCompleteArray& aAddCompletes,
> -                                       MissPrefixArray& aMissPrefixes)
> +                                       MissPrefixArray& aMissPrefixes,
> +                                       int64_t aExpirySec)
>  {
> -  int64_t defaultExpirySec = PR_Now() / PR_USEC_PER_SEC + V2_CACHE_DURATION_SEC;
> +  int64_t defaultExpirySec = aExpirySec == 0 ?

To make this easier to read, I'd suggest:

    int64_t defaultExpirySec = PR_Now() / PR_USEC_PER_SEC + V2_CACHE_DURATION_SEC;
    if (aExpirySec != 0) {
      defaultExpirySec = aExpirySec;
    }

It's a bunch of extra instructions for the test case, but that's ok since it's just for the unit test.
Attachment #8855806 - Flags: review?(francois) → review+
Depends on: 1359299
No longer depends on: 1359299
Blocks: 1359299
Blocks: 1360480
Keywords: checkin-needed
Pushed by cbook@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/455566089652
Refactor cache miss handling mechanism for V2. r=francois
https://hg.mozilla.org/integration/autoland/rev/4f9f6121c257
Remove mTableFreshness and gFreshnessGuarantee. r=francois
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/455566089652
https://hg.mozilla.org/mozilla-central/rev/4f9f6121c257
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
Blocks: 1365877
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: