Closed Bug 1296820 Opened 8 years ago Closed 8 years ago

Enabling Safe Browsing V4 updates breaks all list updates

Categories

(Toolkit :: Safe Browsing, defect, P1)

defect

Tracking

()

RESOLVED FIXED
mozilla51
Tracking Status
firefox51 --- fixed

People

(Reporter: francois, Assigned: hchang)

References

Details

(Whiteboard: #sbv4-m1)

Attachments

(1 file)

This is not enabled by default so it's not actually a regression, but if one enables the experimental V4 support for Safe Browsing, then Safe Browsing updates don't happen anymore.

Steps:

1. Create a new profile called "empty"
2. Go into about:config and set these prefs:

browser.safebrowsing.debug = true
browser.safebrowsing.provider.google.nextupdatetime = 1
browser.safebrowsing.provider.google4.nextupdatetime = 1
urlclassifier.phishTable = googpub-phish-proto,goog-phish-shavar,test-phish-simple
urlclassifier.malwareTable = goog-malware-proto,goog-malware-shavar,goog-unwanted-shavar,test-malware-simple,test-unwanted-simple

3. Shut down Firefox
4. Delete the ~/.cache/mozilla/firefox/?????.empty/safebrowsing/ directory
5. Start Firefox and watch the console to see something like:

listmanager: 18:07:35 GMT-0700 (PDT): Initializing update checker for https://safebrowsing.googleapis.com/v4/threatListUpdates:fetch?$ct=application/x-protobuf&key=[trimmed-google-api-key] provided by google4
listmanager: 18:07:35 GMT-0700 (PDT): Next update at 1
listmanager: 18:07:35 GMT-0700 (PDT): Next update 0ms from now
listmanager: 18:07:35 GMT-0700 (PDT): Initializing update checker for https://safebrowsing.google.com/safebrowsing/downloads?client=navclient-auto-ffox&appver=51.0a&pver=2.2&key=[trimmed-google-api-key] provided by google
listmanager: 18:07:35 GMT-0700 (PDT): Next update at 1
listmanager: 18:07:35 GMT-0700 (PDT): Next update 0ms from now
...
listmanager: 18:07:35 GMT-0700 (PDT): checkForUpdates with https://safebrowsing.googleapis.com/v4/threatListUpdates:fetch?$ct=application/x-protobuf&key=[trimmed-google-api-key]
listmanager: 18:07:35 GMT-0700 (PDT): checkForUpdates with https://safebrowsing.google.com/safebrowsing/downloads?client=navclient-auto-ffox&appver=51.0a&pver=2.2&key=[trimmed-google-api-key]
...
listmanager: 18:07:35 GMT-0700 (PDT): makeUpdateRequestForEntry_: requestPayload ChUKE25hdmNsaWVudC1hdXRvLWZmb3gaCggCEAIiAiABKAEaCggBEAIiAiABKAE= update: https://safebrowsing.googleapis.com/v4/threatListUpdates:fetch?$ct=application/x-protobuf&key=[trimmed-google-api-key] tablelist: googpub-phish-proto,goog-malware-proto
...
listmanager: 18:07:35 GMT-0700 (PDT): makeUpdateRequestForEntry_: requestPayload goog-phish-shavar;
goog-malware-shavar;
goog-unwanted-shavar;
goog-badbinurl-shavar;
 update: https://safebrowsing.google.com/safebrowsing/downloads?client=navclient-auto-ffox&appver=51.0a&pver=2.2&key=[trimmed-google-api-key] tablelist: goog-phish-shavar,goog-malware-shavar,goog-unwanted-shavar,goog-badbinurl-shavar

listmanager: 18:07:35 GMT-0700 (PDT): pending update, queued request until later
listmanager: 18:07:38 GMT-0700 (PDT): update error for googpub-phish-proto,goog-malware-proto from https://safebrowsing.googleapis.com/v4/threatListUpdates:fetch?$ct=application/x-protobuf&key=[trimmed-google-api-key]: 2147500037

6. Close Firefox
7. Look in the ~/.cache/mozilla/firefox/?????.empty/safebrowsing/ directory

That directory is empty for me. The V2 files for Google are missing.
I wasn't aware of this issue when I was testing V4 list updates but the relevant code might have changed as time went by. Note that "browser.safebrowsing.provider.google4.updateURL" needs to be properly configured with valid API key as well. I am not sure if it's appropriate to post my personal testing key here.

I will give it a try on top of m-c today. Thanks for you reporting!

[1] https://dxr.mozilla.org/mozilla-central/rev/f97a056ae6235de7855fd8aaa04fb1c8d183bd06/modules/libpref/init/all.js#5138
Assignee: nobody → hchang
Whiteboard: #sbv4-m1
Attachment #8783421 - Flags: review?(dlee)
Attachment #8783421 - Flags: review?(francois)
According to the following log:

[URL Classifier]: D/UrlClassifierDbService nsUrlClassifierDBServiceWorker::ApplyUpdate()
[URL Classifier]: D/UrlClassifierDbService Backup before update.
[URL Classifier]: D/UrlClassifierDbService Applying 1 table updates.
[URL Classifier]: D/UrlClassifierDbService Classifier::UpdateHashStore(googpub-phish-proto)
[URL Classifier]: D/UrlClassifierDbService GetPrefixes from empty LookupCache
[URL Classifier]: D/UrlClassifierDbService Notifying error: NS_ERROR_FAILURE (-2147467259)
[URL Classifier]: D/UrlClassifierDbService Spoiling table: googpub-phish-proto
[URL Classifier]: D/UrlClassifierDbService Reading Completions
[URL Classifier]: D/UrlClassifierDbService Loading PrefixSet
[URL Classifier]: D/UrlClassifierDbService no (usable) stored PrefixSet found
listmanager: 11:10:42 GMT+0800 (CST): update error for googpub-phish-proto from https://safebrowsing.googleapis.com/v4/threatListUpdates:fetch?$ct=application/x-protobuf&key=xxxxxxxxxxxxxxxxxxxxxxx: 2147500037

When failing to apply updates, the backed up databases were NOT restored.

The fundamental issue here is we are not handling the update error case very well.
For every single update error, we should restore the backed up databases.
However, it seems that the recovery would be done at some point [1].

I haven't looked into all the recovery things but we should probably prevent 
any "table apply" codes from being executed with TableUpdateV4 until we've done
Bug 1283009. As a result, I submit a patch to 

1) Before actually applying the updates, we check if it's a V4 updates and bail out
   if it's V4 updates.

2) We manually control the life cycle of nsTArray<TableUpdate*> for now. 
   Chances are we've ever tried to make them ref-counted (but failed) 
   so I wouldn't be doing that at the moment. Instead, we can mitigate 
   the memory leaking issue by adding an scoped-clearer around the TableUpdates 
   and remove all the explicit deletions [2][3][4].


[1] https://dxr.mozilla.org/mozilla-central/rev/f97a056ae6235de7855fd8aaa04fb1c8d183bd06/toolkit/components/url-classifier/Classifier.cpp#136
[2] https://dxr.mozilla.org/mozilla-central/rev/f97a056ae6235de7855fd8aaa04fb1c8d183bd06/toolkit/components/url-classifier/Classifier.cpp#359
[3] https://dxr.mozilla.org/mozilla-central/rev/f97a056ae6235de7855fd8aaa04fb1c8d183bd06/toolkit/components/url-﷒0﷓
[4] https://dxr.mozilla.org/mozilla-central/rev/f97a056ae6235de7855fd8aaa04fb1c8d183bd06/toolkit/components/url-classifier/Classifier.cpp#3=594
I actually have another approach (which is ugly but safer):

Leave all "delete"s there and let the scoped clearer be the "plan B" to delete the pointers that we forgot to delete. This requires all the

>> delete update;

is accompanied with

>> aUpdates->ElementAt(i) = nullptr
Comment on attachment 8783421 [details]
Bug 1296820 - Skip applying TableUpdateV4 to avoid premature update codes being run.

https://reviewboard.mozilla.org/r/73224/#review71264

This is a good approach, I like it!, thanks
Attachment #8783421 - Flags: review?(dlee) → review+
Comment on attachment 8783421 [details]
Bug 1296820 - Skip applying TableUpdateV4 to avoid premature update codes being run.

https://reviewboard.mozilla.org/r/73224/#review71580

If I understand this correctly, this patch disables all updates if you enable V4 updates. Is that right?

I was expecting us to be able to start testing V4 updates before landing the storage changes. In particular, this would allow us to find and fix issues like this one where enabling V4 updates break V2.

So, for the V4 update code that has landed, would you say it's incomplete or broken? In other words, was it supposed to work (even though we're not storing anything yet) or did we land something unfinished?
Attachment #8783421 - Flags: review?(francois) → review-
Hi Francois,

Sorry for the confusions I made to you but I'd like to sum up first:

this patch would "un-break" the V2 list updates while retaining the
V4 update process. To be even more clear, let's split the "V4 update process" 
to the following parts:

[U1] Sending update request
  - Input:  The table names (e.g. goog-phishing-proto) and states
  - Output: A HTTP request along with API key and the base64-encoded protobuf binaries.
  - Stats:  Done.

[U2] Parsing update response to TableUpdateV4
  - Input:  protobuf binaries from HTTP response
  - Output: TableUpdateV4
  - Status: Done

[U3] Generating in-memory prefix set
  - Input:  TableUpdateV4, resulted from [U2]
  - Output: in-memory prefix set to be looked up
  - Status: No V4 code landed at all.

[U4] Generating in-disk prefix set
  - Input: TableUpdateV4, resulted from [U2]
  - Output: in-disk databases
  - Status: No V4 code landed at all.

(In reply to François Marier [:francois] from comment #8)
> Comment on attachment 8783421 [details]
> Bug 1296820 - Skip applying TableUpdateV4 to avoid premature update codes
> being run.
> 
> https://reviewboard.mozilla.org/r/73224/#review71580
> 
> If I understand this correctly, this patch disables all updates if you
> enable V4 updates. Is that right?
> 

No. My intention is just to "not ever attempt to do V4 update".
As you can see, [U3] cannot handle TableUpdateV4 and an error
would just occur. The error is not handled very well so the safebrowsing
directory becomes empty. The fix this issue, we can either

1) Fix the broken error handling so that we can still propagate
   TableUpdateV4 to [U3][U4], get failed (as expected) and then
   everything would get restored.

or

2) Just don't propagate TableUpdateV4 to [U3][U4]. We just drop TableUpdateV4
   objects.

or

3) Get [3][4] finished.

I filed bug 1297330 to keep track of (1) and Dimi is working on (3).
Given that I have no idea how much I would spend fixing (1), I decided
to take (2) to not block V4 testing.

> I was expecting us to be able to start testing V4 updates before landing the
> storage changes. 

The only way we can test V4 updates is to check the logs (and states after
Bug 1287059 lands). We can see if V4 update is done on time, if the previous
state is sent along with the request, if the update response is parsed 
correctly, etc. Those are the only things we can test right now.
Comment on attachment 8783421 [details]
Bug 1296820 - Skip applying TableUpdateV4 to avoid premature update codes being run.

https://reviewboard.mozilla.org/r/73224/#review71596

(In reply to Henry Chang [:henry][:hchang] from comment #9)
> this patch would "un-break" the V2 list updates while retaining the
> V4 update process. To be even more clear, let's split the "V4 update
> process" 
> to the following parts:
> 
> [U1] Sending update request
>   - Input:  The table names (e.g. goog-phishing-proto) and states
>   - Output: A HTTP request along with API key and the base64-encoded
> protobuf binaries.
>   - Stats:  Done.
> 
> [U2] Parsing update response to TableUpdateV4
>   - Input:  protobuf binaries from HTTP response
>   - Output: TableUpdateV4
>   - Status: Done
> 
> [U3] Generating in-memory prefix set
>   - Input:  TableUpdateV4, resulted from [U2]
>   - Output: in-memory prefix set to be looked up
>   - Status: No V4 code landed at all.
> 
> [U4] Generating in-disk prefix set
>   - Input: TableUpdateV4, resulted from [U2]
>   - Output: in-disk databases
>   - Status: No V4 code landed at all.

My apologies, I completely misunderstood your patch!

We are on the same page and I agree with your approach.

I do however have two questions (inline) about where you chose to block the V4 tableupdates. I suspect I am misunderstading the logic of the existing code.

::: toolkit/components/url-classifier/Classifier.cpp:338
(Diff revision 3)
> +    // of the updates is for V4.
> +    for (auto update : *aUpdates) {
> +      if (update && TableUpdate::Cast<TableUpdateV4>(update)) {
> +        LOG(("V4 update is not supported yet."));
> +        // TODO: Bug 1283009 - Supports applying table udpate V4.
> +        return NS_ERROR_NOT_IMPLEMENTED;

If we return here, doesn't that mean that we are skipping all tableupdates (not just the V4 ones)?

::: toolkit/components/url-classifier/Classifier.cpp:352
(Diff revision 3)
> -  LOG(("Applying %d table updates.", aUpdates->Length()));
> +    LOG(("Applying %d table updates.", aUpdates->Length()));
>  
> -  for (uint32_t i = 0; i < aUpdates->Length(); i++) {
> +    for (uint32_t i = 0; i < aUpdates->Length(); i++) {
> -    // Previous UpdateHashStore() may have consumed this update..
> +      // Previous UpdateHashStore() may have consumed this update..
> -    if ((*aUpdates)[i]) {
> +      if ((*aUpdates)[i]) {
> -      // Run all updates for one table
> +        // Run all updates for one table

Why not skip the V4 tables here instead of aborting entirely?
Attachment #8783421 - Flags: review-
Comment on attachment 8783421 [details]
Bug 1296820 - Skip applying TableUpdateV4 to avoid premature update codes being run.

https://reviewboard.mozilla.org/r/73224/#review71596

I implicitly impose a precondition to |Classifier::ApplayUpdate|:

- |aUpdates| contains only TableUpdateV2 or TableUpdateV4. 

Getting rid of this contraint highly depends on how we implement the integrate V2 and V4 update code.
According to how Classifier::ApplayUpdate is currently used, the input would always meet this pre-condition.
Comment on attachment 8783421 [details]
Bug 1296820 - Skip applying TableUpdateV4 to avoid premature update codes being run.

https://reviewboard.mozilla.org/r/73224/#review71858

(In reply to Henry Chang [:henry][:hchang] from comment #11)
>
> I implicitly impose a precondition to |Classifier::ApplayUpdate|:
> 
> - |aUpdates| contains only TableUpdateV2 or TableUpdateV4. 

Ok, that all makes sense to me now. Sorry it took me a while and thanks for taking the time to walk me through it!

::: toolkit/components/url-classifier/Classifier.cpp:332
(Diff revision 3)
> +
> +  {
> +    ScopedUpdatesClearer scopedUpdatesClearer(aUpdates);
> +
> +    // In order to prevent any premature update codes from being run
> +    // against V4 updates, we bail out as early as possible if any

To clarify that `aUpdates` is either all V2 or all V4, let's rephrase this comment:

    In order to prevent any premature update code from being run against V4 updates, we bail out as early as possible if aUpdates is using V4.

The previous wording "if any of the updates" suggests that `aUpdates` could contain a mix of V2 and V4.
Attachment #8783421 - Flags: review+
(In reply to François Marier [:francois] from comment #12)
> Comment on attachment 8783421 [details]
> Bug 1296820 - Skip applying TableUpdateV4 to avoid premature update codes
> being run.
> 
> https://reviewboard.mozilla.org/r/73224/#review71858
> 
> (In reply to Henry Chang [:henry][:hchang] from comment #11)
> >
> > I implicitly impose a precondition to |Classifier::ApplayUpdate|:
> > 
> > - |aUpdates| contains only TableUpdateV2 or TableUpdateV4. 
> 
> Ok, that all makes sense to me now. Sorry it took me a while and thanks for
> taking the time to walk me through it!
> 
> ::: toolkit/components/url-classifier/Classifier.cpp:332
> (Diff revision 3)
> > +
> > +  {
> > +    ScopedUpdatesClearer scopedUpdatesClearer(aUpdates);
> > +
> > +    // In order to prevent any premature update codes from being run
> > +    // against V4 updates, we bail out as early as possible if any
> 
> To clarify that `aUpdates` is either all V2 or all V4, let's rephrase this
> comment:
> 
>     In order to prevent any premature update code from being run against V4
> updates, we bail out as early as possible if aUpdates is using V4.
> 
> The previous wording "if any of the updates" suggests that `aUpdates` could
> contain a mix of V2 and V4.

Hi Francois,

I just found the update error would stop doing the subsequent updates [1].
I made another change to nsUrlClassifierDBService.cpp in the patch to
treat NS_ERROR_NOT_IMPLEMENTED and mark the table spoiled. Would you like
to review again?

Thanks!


[1] https://dxr.mozilla.org/mozilla-central/rev/a551f534773cf2d6933f78ce7d82a7a33a99643e/toolkit/components/url-classifier/nsUrlClassifierStreamUpdater.cpp#466
Flags: needinfo?(francois)
(In reply to Henry Chang [:henry][:hchang] from comment #16)
> I just found the update error would stop doing the subsequent updates [1].
> I made another change to nsUrlClassifierDBService.cpp in the patch to
> treat NS_ERROR_NOT_IMPLEMENTED and mark the table spoiled. Would you like
> to review again?

Looks good to me!
Flags: needinfo?(francois)
Pushed by hchang@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/618ef1b4953f
Skip applying TableUpdateV4 to avoid premature update codes being run. r=dimi,francois
https://hg.mozilla.org/mozilla-central/rev/618ef1b4953f
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla51
Depends on: 1342900
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: