Closed Bug 1032414 Opened 7 years ago Closed 7 years ago

nsUrlClassifierStreamUpdater doesn't propagate error code in onStopRequest

Categories

(Core :: DOM: Security, defect)

x86_64
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla37

People

(Reporter: mmc, Assigned: mmc)

Details

Attachments

(1 file, 5 obsolete files)

https://mxr.mozilla.org/mozilla-central/source/toolkit/components/url-classifier/nsUrlClassifierStreamUpdater.cpp#497

  } else {
494     // The fetch failed, but we didn't start the stream (probably a
495     // server or connection error).  We can commit what we've applied
496     // so far, and request again later.
497     rv = mDBService->FinishUpdate();
498   }

All it does is throw away the error. So if someone registers an updateUrl that doesn't exist, onStopRequest gets onStopRequest(NS_ERROR_UNKNOWN_HOST) and then merrily calls UpdateSuccess from nsUrlClassifierDBServiceWorker::FinishUpdate.
Summary: nsUrlClassifierStreamUpdater doesn't propogate error code in onStopRequest → nsUrlClassifierStreamUpdater doesn't propagate error code in onStopRequest
This also happens if the request times out, which is worse than UNKNOWN_HOST because it means the backoff logic in the listmanager doesn't trigger.
Assignee: nobody → mmc
Status: NEW → ASSIGNED
Attachment #8517861 - Flags: review?(gpascutto)
Attachment #8517861 - Flags: review?(gpascutto) → review+
Well, this doesn't fix the problem either. We are notifying success from nsUrlClassifierDBService::ApplyUpdate.
We also never get OnStartRequest so this whole block never happens:

https://mxr.mozilla.org/mozilla-central/source/toolkit/components/url-classifier/nsUrlClassifierStreamUpdater.cpp#451

We just hit onStopRequest, apply an empty update (which succeeds) and notify success from within the DBservice.
Attachment #8517861 - Attachment is obsolete: true
Comment on attachment 8518285 [details] [diff] [review]
Always return failure in OnStopRequest on network error (

Review of attachment 8518285 [details] [diff] [review]:
-----------------------------------------------------------------

Verified manually by pointing updateURL at some non existent server. I am not a fan of the way that the observer needs 3 callbacks. listmanager never gets more than 1 callback per update because if there's a download error, the success callback gets null'ed out.
Attachment #8518285 - Flags: review?(gpascutto)
Comment on attachment 8518285 [details] [diff] [review]
Always return failure in OnStopRequest on network error (

Review of attachment 8518285 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good to me. The old code tried to recover partially succeeded downloads, but it's clearly too clever to be reliable.
Attachment #8518285 - Flags: review?(gpascutto) → review+
This got backed out for xpcshell failures that I can't reproduce locally. I'll have another look next week.

mchew-22096:mozilla-central mchew$ ./mach xpcshell-test toolkit/components/url-classifier/tests/unit
From _tests: Kept 20875 existing; Added/updated 0; Removed 3 files and 0 directories.
 0:01.32 LOG: MainThread INFO Using at most 16 threads.
 0:01.32 SUITE_START: MainThread 8
 0:01.37 TEST_START: Thread-5 /Users/mchew/mozilla-central/obj-x86_64-apple-darwin13.4.0/_tests/xpcshell/toolkit/components/url-classifier/tests/unit/test_hashcompleter.js
 0:01.37 TEST_START: Thread-4 /Users/mchew/mozilla-central/obj-x86_64-apple-darwin13.4.0/_tests/xpcshell/toolkit/components/url-classifier/tests/unit/test_digest256.js
 0:01.38 TEST_START: Thread-8 /Users/mchew/mozilla-central/obj-x86_64-apple-darwin13.4.0/_tests/xpcshell/toolkit/components/url-classifier/tests/unit/test_streamupdater.js
 0:01.39 TEST_START: Thread-6 /Users/mchew/mozilla-central/obj-x86_64-apple-darwin13.4.0/_tests/xpcshell/toolkit/components/url-classifier/tests/unit/test_partial.js
 0:01.40 TEST_START: Thread-2 /Users/mchew/mozilla-central/obj-x86_64-apple-darwin13.4.0/_tests/xpcshell/toolkit/components/url-classifier/tests/unit/test_backoff.js
 0:01.40 TEST_START: Thread-1 /Users/mchew/mozilla-central/obj-x86_64-apple-darwin13.4.0/_tests/xpcshell/toolkit/components/url-classifier/tests/unit/test_addsub.js
 0:01.41 TEST_START: Thread-3 /Users/mchew/mozilla-central/obj-x86_64-apple-darwin13.4.0/_tests/xpcshell/toolkit/components/url-classifier/tests/unit/test_dbservice.js
 0:01.42 TEST_START: Thread-7 /Users/mchew/mozilla-central/obj-x86_64-apple-darwin13.4.0/_tests/xpcshell/toolkit/components/url-classifier/tests/unit/test_prefixset.js
 0:05.42 TEST_END: Thread-8 PASS
 0:05.56 TEST_END: Thread-2 PASS
 0:05.86 TEST_END: Thread-1 PASS
 0:06.36 TEST_END: Thread-4 PASS
 0:08.06 TEST_END: Thread-5 PASS
 0:20.75 TEST_END: Thread-7 PASS
 0:20.77 TEST_END: Thread-3 PASS
 0:31.03 TEST_END: Thread-6 PASS
 0:31.05 LOG: MainThread INFO INFO | Result summary:
 0:31.05 LOG: MainThread INFO INFO | Passed: 8
 0:31.05 LOG: MainThread INFO INFO | Failed: 0
 0:31.05 LOG: MainThread INFO INFO | Todo: 0
 0:31.05 LOG: MainThread INFO INFO | Retried: 0
 0:31.05 SUITE_END: MainThread 
Summary
=======

Ran 8 tests
Expected results: 8
Unexpected results: 0
I just noticed this bug during testing. Full log is in bug 1059393. Juicy bits:

014-11-13 12:08:51.506239 UTC - -924907776[7f64cd48b860]: nsUrlClassifierDBServiceWorker::BeginStream
2014-11-13 12:08:51.506244 UTC - -257157312[7f64ef76d4a0]: OnDataAvailable (501 bytes)
2014-11-13 12:08:51.506318 UTC - -257157312[7f64ef76d4a0]: OnStopRequest (status 0)
2014-11-13 12:08:51.506377 UTC - -257157312[7f64ef76d4a0]: Queuing requested update from safebrowsing-cache.google.com/safebr
owsing/rd/ChVnb29nLWJhZGJpbnVybC1zaGF2YXI4AEACSgwIARCYnQIYm50CIAFKDAgAEIqDAhiNgwIgAQ
2014-11-13 12:08:51.506390 UTC - -257157312[7f64ef76d4a0]: Queuing requested update from safebrowsing-cache.google.com/safebr
owsing/rd/ChNnb29nLW1hbHdhcmUtc2hhdmFyOABAAkoMCAEQh_0IGIr9CCABSgwIABDtxQkY8MUJIAE
2014-11-13 12:08:51.506398 UTC - -257157312[7f64ef76d4a0]: Queuing requested update from safebrowsing-cache.google.com/safebrowsing/rd/ChFnb29nLXBoaXNoLXNoYXZhcjgAQAJKDAgBELXdCxi43QsgAUoMCAAQovkVGKX5FSAB
2014-11-13 12:08:51.506406 UTC - -257157312[7f64ef76d4a0]: nsUrlClassifierStreamUpdater::StreamFinished [0, 0]
2014-11-13 12:08:51.506468 UTC - -257157312[7f64ef76d4a0]: nsUrlClassifierStreamUpdater::Notify [7f64b70ebde0]
2014-11-13 12:08:51.506478 UTC - -257157312[7f64ef76d4a0]: Fetching update url: https://safebrowsing-cache.google.com/safebrowsing/rd/ChVnb29nLWJhZGJpbnVybC1zaGF2YXI4AEACSgwIARCYnQIYm50CIAFKDAgAEIqDAhiNgwIgAQ
2014-11-13 12:08:51.506482 UTC - -257157312[7f64ef76d4a0]: (pre) Fetching update from https://safebrowsing-cache.google.com/safebrowsing/rd/ChVnb29nLWJhZGJpbnVybC1zaGF2YXI4AEACSgwIARCYnQIYm50CIAFKDAgAEIqDAhiNgwIgAQ
2014-11-13 12:08:51.506506 UTC - -257157312[7f64ef76d4a0]: (post) Fetching update from https://safebrowsing-cache.google.com/safebrowsing/rd/ChVnb29nLWJhZGJpbnVybC1zaGF2YXI4AEACSgwIARCYnQIYm50CIAFKDAgAEIqDAhiNgwIgAQ
2014-11-13 12:08:51.506509 UTC - -257157312[7f64ef76d4a0]: Fetching update  from https://safebrowsing-cache.google.com/safebrowsing/rd/ChVnb29nLWJhZGJpbnVybC1zaGF2YXI4AEACSgwIARCYnQIYm50CIAFKDAgAEIqDAhiNgwIgAQ
2014-11-13 12:08:51.679967 UTC - -257157312[7f64ef76d4a0]: HTTP request returned failure code.
2014-11-13 12:08:51.679990 UTC - -257157312[7f64ef76d4a0]: HTTP request returned failure code: 400.
2014-11-13 12:08:51.680674 UTC - -257157312[7f64ef76d4a0]: OnStopRequest (status 80004004)
...
2014-11-13 12:08:51.841961 UTC - -924907776[7f64cd48b860]: Notifying success: 1908
2014-11-13 12:08:51.842031 UTC - -257157312[7f64ef76d4a0]: nsUrlClassifierStreamUpdater::UpdateSuccess [this=7f64b70ebde0]
2014-11-13 12:08:51.842087 UTC - -257157312[7f64ef76d4a0]: nsUrlClassifierStreamUpdater::DownloadDone [this=7f64b70ebde0]
Yeah, that's the NS_ERROR_ABORT error. For posterity, the backout was in https://hg.mozilla.org/integration/mozilla-inbound/rev/f2dd2c376a05.
The stream updater test was failing on linux because the update error was triggering backoffs (I think). I don't understand why it doesn't fail on mac.
Attachment #8518285 - Attachment is obsolete: true
Attachment #8523307 - Attachment is obsolete: true
Comment on attachment 8523335 [details] [diff] [review]
Always return failure in OnStopRequest on network error (

Review of attachment 8523335 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/url-classifier/nsUrlClassifierStreamUpdater.cpp
@@ +344,5 @@
>      return NS_OK;
>    }
>  
>    // Wait the requested amount of time before starting a new stream.
> +  // WTF, listmanager.js already does this!

This code has been there forever. So has the callback code in listmanager to reset the timer on downloadSuccess, though I did touch it earlier this year. I think this means we may be having multiple downloads for the same pending update, but I don't want to touch it in this bug.

::: toolkit/components/url-classifier/tests/unit/test_streamupdater.js
@@ +124,4 @@
>      "urlsExist" : add1Urls
>    };
>  
> +  doTest([update], assertions, true);

I think this behavior might be broken, but I don't know what the right solution is.

In this case we are updating the table with add1Urls before we hit the broken update, so they exist in test-phish-simple. However, since we fail on the update, dbService.getTables won't include "test-phish-simple". The fact that we can process these mid-stream updates at all is an artifact of the way the stream updater is written. Normally the update response is just a list of redirect URLs and doesn't include chunk data (right?), but the tests actually take advantage of this fact.

So in this test we are in the awkward position of having no chunk data but having flushed URLs already. There's not an easy way to fix this.
Attachment #8523335 - Flags: review?(gpascutto)
Comment on attachment 8523335 [details] [diff] [review]
Always return failure in OnStopRequest on network error (

Review of attachment 8523335 [details] [diff] [review]:
-----------------------------------------------------------------

This patch is now very confusing because there's a load of extra logging and commenting intertwined with the actual changes. Please clean it up before landing.

As for behavior on partial updates:
https://developers.google.com/safe-browsing/developers_guide_v3#Changes3.0

1) If a redirect request returns an error code, the client MUST perform backoff behavior as indicated in the Request Frequency section.
2) A client MUST perform a download request again if a redirect request returns an error.
3) The client SHOULD keep all data delivered prior to a bad request.
4) The client MUST refuse to use the whole response if any of the adddel and subdel metadata headers, or the encoded chunk data, cannot be parsed successfully.
5) Upon successful decoding of all the response and all the binary data, the client MUST update its lists in an atomic fashion.

I think the test is failing because (3) is no longer respected?

::: toolkit/components/url-classifier/nsUrlClassifierStreamUpdater.cpp
@@ +290,3 @@
>      // We can commit the urls that we've applied so far.  This is
>      // probably a transient server problem, so trigger backoff.
> +    // What if the whole chunk failed? What's the state of the tables?

I think it's a good point that if we only partially apply an update, but mark the chunk as received, we will be missing entries forever.

::: toolkit/components/url-classifier/tests/unit/test_streamupdater.js
@@ +99,5 @@
>      [{ "chunkNum" : 1,
>         "urls" : add1Urls }]);
>    update += "u:asdf://blah/blah\n";  // invalid URL scheme
>  
>    // The first part of the update should have succeeded.

I presume this comment is now no longer true.

@@ +117,5 @@
>      [{ "chunkNum" : 1,
>         "urls" : add1Urls }]);
>    update += "u:http://test.invalid/asdf/asdf\n";  // invalid URL scheme
>  
> +  // The first part of the update should have succeeded.

Same.
Attachment #8523335 - Flags: review?(gpascutto) → review-
(In reply to [:mmc] Monica Chew (please use needinfo) from comment #19)

> The fact
> that we can process these mid-stream updates at all is an artifact of the
> way the stream updater is written. Normally the update response is just a
> list of redirect URLs and doesn't include chunk data (right?), but the tests
> actually take advantage of this fact.

That's a tricky one. The spec, as far as I can tell, says that both are valid in its BNF protocol description, but explains in the text that "The response doesn't actually contain the data associated with the lists; instead, it tells you where to find the data via redirect URLs. These URLs should be visited in the order that they are given, and if an error is encountered fetching any of the URLs, then the client must NOT fetch any URL after that. Parallel fetching is NOT allowed."

I see this as explaining the current implementation rather clarifying the spec.

The test does:

  var update = buildPhishingUpdate(
    [{ "chunkNum" : 1,
       "urls" : add1Urls }]);
  update += "u:http://test.invalid/asdf/asdf\n";  // invalid URL scheme


>In this case we are updating the table with add1Urls before we hit the broken update, so they exist in 
>test-phish-simple. However, since we fail on the update, dbService.getTables won't include "test-phish-
>simple".

The latter is a bug.
(In reply to Gian-Carlo Pascutto [:gcp] from comment #21)
> (In reply to [:mmc] Monica Chew (please use needinfo) from comment #19)
> 
> > The fact
> > that we can process these mid-stream updates at all is an artifact of the
> > way the stream updater is written. Normally the update response is just a
> > list of redirect URLs and doesn't include chunk data (right?), but the tests
> > actually take advantage of this fact.
> 
> That's a tricky one. The spec, as far as I can tell, says that both are
> valid in its BNF protocol description, but explains in the text that "The
> response doesn't actually contain the data associated with the lists;
> instead, it tells you where to find the data via redirect URLs. These URLs
> should be visited in the order that they are given, and if an error is
> encountered fetching any of the URLs, then the client must NOT fetch any URL
> after that. Parallel fetching is NOT allowed."
> 
> I see this as explaining the current implementation rather clarifying the
> spec.
> 
> The test does:
> 
>   var update = buildPhishingUpdate(
>     [{ "chunkNum" : 1,
>        "urls" : add1Urls }]);
>   update += "u:http://test.invalid/asdf/asdf\n";  // invalid URL scheme
> 
> 
> >In this case we are updating the table with add1Urls before we hit the broken update, so they exist in 
> >test-phish-simple. However, since we fail on the update, dbService.getTables won't include "test-phish-
> >simple".
> 
> The latter is a bug.

We don't get all of the chunk data for chunk 1. So do I report incorrect chunk data, or blow away the entire chunk?
>We don't get all of the chunk data for chunk 1.

I don't understand why you say this.
You MUST refuse to use it (see rule 4) but that's not what that test is doing at all.
From gcp over irc, these tests should be getting urlsExist for add1Urls and tablesData="test-phish-simple", but the caller should be getting an error callback. I don't know how to make that happen yet, but hopefully I'll have some time this week.
Attached patch fix-onstop (obsolete) — Splinter Review
Per irc, I left the test the way it was.
<gcp> It looks to me whether the "should we apply this or not" call needs to be made by fixing ProtocolParser and then checking the result in nsUrlClassifierDBServiceWorker::FinishStream()
9:26 AM <gcp> I think that's probably the only way to get a consistent result out of the test
9:29 AM <gcp> actually even so
9:30 AM <gcp> I don't think our architecture supports this
9:30 AM <gcp> check ProtocolParser::AppendStream for example
9:31 AM <gcp> how can this ever communicate what tables the update should be applied to?
 <gcp> in any case if we're fixing this
9:38 AM <gcp> 3) The client SHOULD keep all data delivered prior to a bad request.
9:38 AM <gcp> that's a SHOULD, so "drop update on any error" is probably the safest way forward
9:38 AM <gcp> the others are MUST
9:39 AM <gcp> which means changing the tests as you were proposing is OK as a fallback
Attachment #8523335 - Attachment is obsolete: true
Attachment #8535068 - Flags: review?(gpascutto)
Attachment #8535068 - Attachment is patch: true
Comment on attachment 8535068 [details] [diff] [review]
fix-onstop

Review of attachment 8535068 [details] [diff] [review]:
-----------------------------------------------------------------

OK on the principles but please update the comments.

::: toolkit/components/url-classifier/nsUrlClassifierStreamUpdater.cpp
@@ +351,5 @@
>      return NS_OK;
>    }
>  
>    // Wait the requested amount of time before starting a new stream.
> +  // WTF, listmanager.js already does this!

Link the bug where this will get fixed instead of the expletive.

::: toolkit/components/url-classifier/tests/unit/test_streamupdater.js
@@ +7,5 @@
>    }
>  }
>  
> +// Never use the same URLs for multiple tests, because we aren't guaranteed
> +// to reset the database between tests.

How did this work before then? I'm guessing it did because they all replace the same chunk #1?

@@ +99,5 @@
>      [{ "chunkNum" : 1,
>         "urls" : add1Urls }]);
>    update += "u:asdf://blah/blah\n";  // invalid URL scheme
>  
>    // The first part of the update should have succeeded.

Well, no. Update the comment.

@@ +117,5 @@
>      [{ "chunkNum" : 1,
>         "urls" : add1Urls }]);
>    update += "u:http://test.invalid/asdf/asdf\n";  // invalid URL scheme
>  
> +  // The first part of the update should have succeeded.

Nope.
Attachment #8535068 - Flags: review?(gpascutto) → review+
Comment on attachment 8535068 [details] [diff] [review]
fix-onstop

Review of attachment 8535068 [details] [diff] [review]:
-----------------------------------------------------------------

Fixed and checking in. Thanks!

::: toolkit/components/url-classifier/nsUrlClassifierStreamUpdater.cpp
@@ +351,5 @@
>      return NS_OK;
>    }
>  
>    // Wait the requested amount of time before starting a new stream.
> +  // WTF, listmanager.js already does this!

This is bug 1110891, fixed.

::: toolkit/components/url-classifier/tests/unit/test_streamupdater.js
@@ +7,5 @@
>    }
>  }
>  
> +// Never use the same URLs for multiple tests, because we aren't guaranteed
> +// to reset the database between tests.

Yes.

@@ +99,5 @@
>      [{ "chunkNum" : 1,
>         "urls" : add1Urls }]);
>    update += "u:asdf://blah/blah\n";  // invalid URL scheme
>  
>    // The first part of the update should have succeeded.

add1Urls is present, but that's an artifact of the way that we do the test. I will just change the comment to that, because I'm not sure what you mean -- it is kind of a partially successful update.
Comment on attachment 8536880 [details] [diff] [review]
Always return failure in OnStopRequest on network error (

Carrying over
Attachment #8536880 - Flags: review+
Attachment #8535068 - Attachment is obsolete: true
https://hg.mozilla.org/mozilla-central/rev/89c76328ade6
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla37
You need to log in before you can comment on or make changes to this bug.