Closed Bug 1336904 Opened 4 years ago Closed 4 years ago

Add a timeout for Safe Browsing updates

Categories

(Toolkit :: Safe Browsing, defect, P2)


Tracking


RESOLVED FIXED
mozilla55
Tracking Status
firefox55 --- fixed

People

(Reporter: francois, Assigned: tnguyen)

References

Details

(Whiteboard: #sbv4-m7)

Attachments

(1 file, 6 obsolete files)

Right now, we don't have a specific timeout for Safe Browsing updates. While they happen in the background and mostly don't block anything, it's possible that they lock some files and do block certain lookups (see bug 1334616 for example).

This was done for gethash (5 seconds) in bug 1024555 and for download protection (10 seconds) in bug 1165816.

We should wait until bug 1336903 has landed, however, since the data collected in that probe will inform our choice of value for that timeout.

The timeout value we pick should probably not be any smaller than 60 seconds. It should be configurable via urlclassifier.update.timeout_ms.
We should also add a telemetry probe like URLCLASSIFIER_COMPLETE_TIMEOUT2 (http://searchfox.org/mozilla-central/rev/b1aadb3572eaf7d2c70e19a2ba5413809d9ac698/toolkit/components/telemetry/Histograms.json#4076) as part of this patch.
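As a rough sketch of the two-timeout policy discussed in this bug (the constants, enum, and function below are illustrative, not the actual nsUrlClassifierStreamUpdater code):

```cpp
#include <cstdint>

// Sketch of the two-timeout policy: values mirror the discussion
// (5 s server response, 60 s overall) but are illustrative.
constexpr uint32_t kResponseTimeoutMs = 5 * 1000;  // server must start responding
constexpr uint32_t kTimeoutMs = 60 * 1000;         // whole download must finish
static_assert(kTimeoutMs > kResponseTimeoutMs,
              "overall timeout must exceed the response timeout");

enum class UpdateResult {
  Ok,               // finished in time
  ResponseTimeout,  // server never responded: triggers backoff
  OverallTimeout,   // download too slow: no backoff, link may just be slow
};

// Classify an attempt from when the first byte arrived and when the
// download finished, both in milliseconds since the request started.
UpdateResult ClassifyUpdate(uint32_t firstByteMs, uint32_t finishedMs) {
  if (firstByteMs > kResponseTimeoutMs) return UpdateResult::ResponseTimeout;
  if (finishedMs > kTimeoutMs) return UpdateResult::OverallTimeout;
  return UpdateResult::Ok;
}
```

Note the asymmetry: a response timeout triggers backoff (the server looks unhealthy), while an overall timeout does not (the connection may just be slow).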
Whiteboard: #sbv4-m7
Assignee: nobody → tnguyen
Status: NEW → ASSIGNED
Attachment #8860864 - Attachment filename: Bug-1336904---Add-a-timeout-for-Safe-Browsing-upda.patch → WIP Bug-1336904---Add-a-timeout-for-Safe-Browsing-upda.patch
Attachment #8860864 - Attachment is obsolete: true
Attachment #8860864 - Attachment is obsolete: false
Attachment #8860864 - Attachment is obsolete: true
Attachment #8861322 - Attachment is obsolete: true
Attachment #8861323 - Attachment is obsolete: true
Attachment #8861333 - Attachment is obsolete: true
We may have two kinds of timeouts:
- Server response timeout, default value set to 5 seconds
- Download update timeout, default value 60 seconds

MozReview-Commit-ID: AJfR193cTf8
Attachment #8861337 - Attachment is obsolete: true
Attachment #8861338 - Flags: review?(francois)
Comment on attachment 8861338 [details] [diff] [review]
Add a timeout for Safe Browsing updates

Review of attachment 8861338 [details] [diff] [review]:
-----------------------------------------------------------------

::: modules/libpref/init/all.js
@@ +5202,5 @@
>  
>  // Gethash timeout for Safebrowsing.
>  pref("urlclassifier.gethash.timeout_ms", 5000);
> +// Update server response timeout for Safebrowsing.
> +pref("urlclassifier.update.server_response_timeout_ms", 5000);

nit: I would use just `response_timeout_ms` here.

@@ +5204,5 @@
>  pref("urlclassifier.gethash.timeout_ms", 5000);
> +// Update server response timeout for Safebrowsing.
> +pref("urlclassifier.update.server_response_timeout_ms", 5000);
> +// Download update timeout for Safebrowsing.
> +pref("urlclassifier.update.download_timeout_ms", 60000);

nit: `timeout_ms` since it's the overall timer

::: toolkit/components/url-classifier/nsUrlClassifierStreamUpdater.cpp
@@ +25,5 @@
>  #include "nsContentUtils.h"
>  #include "nsIURLFormatter.h"
>  #include "Classifier.h"
>  
> +#define DEFAULT_SERVER_RESPONSE_TIMEOUT 5 * 1000

`DEFAULT_RESPONSE_TIMEOUT_MS`

@@ +26,5 @@
>  #include "nsIURLFormatter.h"
>  #include "Classifier.h"
>  
> +#define DEFAULT_SERVER_RESPONSE_TIMEOUT 5 * 1000
> +#define MINIMUM_DOWNLOAD_UPDATE_TIMEOUT 60 * 1000

`DEFAULT_TIMEOUT_MS`

@@ +27,5 @@
>  #include "Classifier.h"
>  
> +#define DEFAULT_SERVER_RESPONSE_TIMEOUT 5 * 1000
> +#define MINIMUM_DOWNLOAD_UPDATE_TIMEOUT 60 * 1000
> +

Let's add a static assert here to ensure that `DEFAULT_TIMEOUT_MS > DEFAULT_RESPONSE_TIMEOUT_MS`.

@@ +32,3 @@
>  static const char* gQuitApplicationMessage = "quit-application";
>  
> +static uint32_t sServerResponseTimeoutMs = DEFAULT_SERVER_RESPONSE_TIMEOUT;

`sResponseTimeoutMs`

@@ +32,4 @@
>  static const char* gQuitApplicationMessage = "quit-application";
>  
> +static uint32_t sServerResponseTimeoutMs = DEFAULT_SERVER_RESPONSE_TIMEOUT;
> +static uint32_t sDownloadUpdateTimeoutMs = MINIMUM_DOWNLOAD_UPDATE_TIMEOUT;

`sTimeoutMs`

@@ +205,5 @@
> +                                          "urlclassifier.update.server_response_timeout_ms",
> +                                          DEFAULT_SERVER_RESPONSE_TIMEOUT);
> +    preferencesInitialized = true;
> +  }
> +

We should add a check to only create the timers if `sResponseTimeoutMs < sTimeoutMs`.
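A minimal sketch of such a guard, using the renamed statics suggested in this review (illustrative, not the actual patch):

```cpp
#include <cstdint>
#include <cstdio>

// Illustrative values standing in for the prefs read at startup.
static uint32_t sResponseTimeoutMs = 5000;
static uint32_t sTimeoutMs = 60000;

// Only arm the timers when the response timeout is strictly smaller than
// the overall timeout; otherwise the response timer could never fire first.
bool ShouldArmUpdateTimers() {
  if (sResponseTimeoutMs >= sTimeoutMs) {
    std::fprintf(stderr,
                 "Safe Browsing response timeout is greater than the "
                 "general timeout. Disabling these update timeouts.\n");
    return false;
  }
  return true;
}
```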

@@ +215,5 @@
> +  }
> +
> +  mDownloadUpdateTimeoutTimer = do_CreateInstance("@mozilla.org/timer;1", &rv);
> +
> +  // The timeout value we pick should probably not be any smaller than 60 seconds

I think I would remove this check. We should make 60 seconds the default (like you did) but I can see value in making this lower in our tests.

@@ +942,5 @@
> +      mDownloadUpdateTimeoutTimer->Cancel();
> +      mDownloadUpdateTimeoutTimer = nullptr;
> +    }
> +    // Trigger backoff
> +    mDownloadError = true;

I would move the comment `// triggers backoff` to the end of this line to make it clear that it's setting `mDownloadError` which triggers backoff.

@@ +948,5 @@
> +  }
> +
> +  if (timer == mDownloadUpdateTimeoutTimer) {
> +    mDownloadUpdateTimeoutTimer = nullptr;
> +    updateFailed = true;

I would add a comment above this line: // No backoff since the connection may just be temporarily slow.

@@ +953,5 @@
> +  }
> +
> +  if (updateFailed) {
> +    // We cancel channel and it will trigger OnStopRequest:
> +    // - CancelUpdate if server responsed but it takes too long to download

"responded"

"took too long"

@@ +960,5 @@
> +    // responsed
> +    mozilla::Unused << mChannel->Cancel(NS_ERROR_ABORT);
> +    mChannel = nullptr;
> +    mTelemetryClockStart = 0;
> +    return NS_OK;

Maybe we should cancel the other timers here (`mFetchNextRequestTimer` and `mFetchIndirectUpdatesTimer`)?

::: toolkit/components/url-classifier/nsUrlClassifierStreamUpdater.h
@@ +90,5 @@
>    // request issused outside of StreamUpdater. We have to fire a timer to
>    // retry on our own.
>    nsCOMPtr<nsITimer> mFetchNextRequestTimer;
>  
> +  // Timer to abort downloading if server takes too long to response

"Timer to abort the download if the server takes too long to respond."

@@ +91,5 @@
>    // retry on our own.
>    nsCOMPtr<nsITimer> mFetchNextRequestTimer;
>  
> +  // Timer to abort downloading if server takes too long to response
> +  nsCOMPtr<nsITimer> mServerResponseTimeoutTimer;

`mResponseTimeoutTimer`

@@ +93,5 @@
>  
> +  // Timer to abort downloading if server takes too long to response
> +  nsCOMPtr<nsITimer> mServerResponseTimeoutTimer;
> +
> +  // Timer to abort downloading if it takes too long download update

"Timer to abort the download if it takes too long."

@@ +94,5 @@
> +  // Timer to abort downloading if server takes too long to response
> +  nsCOMPtr<nsITimer> mServerResponseTimeoutTimer;
> +
> +  // Timer to abort downloading if it takes too long download update
> +  nsCOMPtr<nsITimer> mDownloadUpdateTimeoutTimer;

`mTimeoutTimer`
Attachment #8861338 - Flags: review?(francois) → review-
(In reply to François Marier [:francois] from comment #8)
> 
> @@ +960,5 @@
> > +    // responsed
> > +    mozilla::Unused << mChannel->Cancel(NS_ERROR_ABORT);
> > +    mChannel = nullptr;
> > +    mTelemetryClockStart = 0;
> > +    return NS_OK;
> 
> Maybe we should cancel the other timers here (`mFetchNextRequestTimer` and
> `mFetchIndirectUpdatesTimer`)?
> 
mFetchNextRequestTimer is fired before we start updating (in case we failed to update for any reason, this helps us retry).
mFetchIndirectUpdatesTimer is fired after we finish a stream (channel) and are going to FetchNext. (I was a bit surprised to find that we use a 0 delay, meaning the timer fires immediately.)
https://searchfox.org/mozilla-central/rev/876c7dd30586f9c6f9c99ef7444f2d73c7acfe7c/toolkit/components/url-classifier/nsUrlClassifierDBService.cpp#577
So mFetchNextRequestTimer and mFetchIndirectUpdatesTimer are "outside of downloading" timers and we probably don't have to cancel them.
Attachment #8861844 - Flags: review?(francois)
Attachment #8861338 - Attachment is obsolete: true
(In reply to Thomas Nguyen[:tnguyen] ni plz from comment #9)
> mFetchNextRequestTimer is fired before we start updating (in case we
> failed to update for any reason, this helps us retry).

It may help, but probably not, because it retries very shortly afterwards. If the network is temporarily bad, it's likely to still be bad a second later.

> mFetchIndirectUpdatesTimer is fired after we finish a stream (channel) and
> are going to FetchNext. (I was a bit surprised to find that we use a 0
> delay, meaning the timer fires immediately.)

This is for the "u:" redirections in V2. So I would say it's part of the overall timeout. If we're not done downloading the update and all of its dependent files, we should abort the whole process. Otherwise, we'll be in a weird state where we've aborted the first part of the update because it took too long, but then we're still fetching one of the dependent pieces.
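The "one overall deadline covers the update and all dependent fetches" idea could be sketched like this (illustrative names and std::chrono; the real code uses nsITimer):

```cpp
#include <chrono>

using Clock = std::chrono::steady_clock;

// One shared deadline covering the initial update and every dependent
// ("u:") fetch; each sequential fetch only gets whatever time is left.
class UpdateDeadline {
  Clock::time_point mDeadline;

 public:
  explicit UpdateDeadline(std::chrono::milliseconds total)
      : mDeadline(Clock::now() + total) {}

  // Remaining budget for the next fetch; zero means abort the whole update.
  std::chrono::milliseconds Remaining() const {
    auto left = std::chrono::duration_cast<std::chrono::milliseconds>(
        mDeadline - Clock::now());
    return left.count() > 0 ? left : std::chrono::milliseconds{0};
  }

  bool Expired() const { return Remaining().count() == 0; }
};
```

With a shared deadline, aborting the first part of the update necessarily aborts the dependent fetches too, avoiding the half-aborted state described above.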
Comment on attachment 8861844 [details]
Bug 1336904 - Add timeouts for Safe Browsing updates

https://reviewboard.mozilla.org/r/133824/#review137032

A few minor tweaks to comments and warnings, but I also think we need to cancel the other two timers as per comment 11.

::: commit-message-0f5ba:1
(Diff revision 1)
> +Bug 1336904 - Add a timeout for Safe Browsing updates

"Add timeouts for Safe Browsing updates"

::: commit-message-0f5ba:3
(Diff revision 1)
> +Bug 1336904 - Add a timeout for Safe Browsing updates
> +
> +We may have 2 kinds of time out

This paragraph is not needed because we already have good comments in the patch.

I would suggest instead that we add a sentence that explains the reasoning for these timeouts:

    These timeouts will ensure that we don't block the Safe Browsing update thread for too long when we encounter slow or bad network conditions.

::: toolkit/components/url-classifier/nsUrlClassifierStreamUpdater.cpp:214
(Diff revision 1)
> +                                          DEFAULT_RESPONSE_TIMEOUT);
> +    preferencesInitialized = true;
> +  }
> +
> +  if (sResponseTimeoutMs > sTimeoutMs) {
> +    NS_WARNING("Reponse timeout is greater than general timeout");

I would suggest including in the warning how we are dealing with this:

    "Safe Browsing response timeout is greater than the general timeout. Disabling these update timeouts."

::: toolkit/components/url-classifier/nsUrlClassifierStreamUpdater.cpp:961
(Diff revision 1)
> +    // No backoff since the connection may just be temporarily slow.
> +    updateFailed = true;
> +  }
> +
> +  if (updateFailed) {
> +    // We cancel channel and it will trigger OnStopRequest:

nit: I would simplify this comment down to just:

    // Cancelling the channel will trigger OnStopRequest.

Since you already have comments that talk about backoff above that block.
Attachment #8861844 - Flags: review?(francois) → review-
Thanks Francois for your review
Comment on attachment 8861844 [details]
Bug 1336904 - Add timeouts for Safe Browsing updates

https://reviewboard.mozilla.org/r/133824/#review137116

Thanks Thomas!
Attachment #8861844 - Flags: review?(francois) → review+
Comment on attachment 8861844 [details]
Bug 1336904 - Add timeouts for Safe Browsing updates

https://reviewboard.mozilla.org/r/133824/#review137366

Sorry Thomas, I forgot one thing: we need to add a telemetry probe to see how often the timeout is triggered.

Something like the `URLCLASSIFIER_COMPLETE_TIMEOUT2` you added in bug 1311926.
Attachment #8861844 - Flags: review+ → review-
Comment on attachment 8861844 [details]
Bug 1336904 - Add timeouts for Safe Browsing updates

https://reviewboard.mozilla.org/r/133824/#review137834

::: toolkit/components/url-classifier/nsUrlClassifierStreamUpdater.cpp:869
(Diff revisions 2 - 3)
>      mTimeoutTimer = nullptr;
>    }
>  
> +  // mResponseTimeoutTimer may be cleared in OnStartRequest, so we check mTimeoutTimer
> +  // to see whether the udpate was timed out
> +  if (mTimeoutTimer) {

This will never be true because if it were true on line 862, it will be set to `nullptr` on line 864.

::: toolkit/components/telemetry/Histograms.json:4221
(Diff revision 3)
> +    "expires_in_version": "61",
> +    "kind": "enumerated",
> +    "keyed": true,
> +    "n_values": 4,
> +    "bug_numbers": [1336904],
> +    "description": "This metric is recorded every time we send download update request to server (0 = no timeout, 1 = server respond timeout, 2 = Server did respond but it took long to download). Keyed by provider"

I would simplify this down to:

"Whether or not an update timed out (0 = no timeout, 1 = server response timeout, 2 = overall timeout). Keyed by provider"
Attachment #8861844 - Flags: review?(francois) → review-
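The bucket scheme suggested in this review could be sketched as follows (the enum and function names are assumptions; the actual patch records these values per provider via the Telemetry API):

```cpp
#include <cstdint>

// Hypothetical mapping from timeout outcome to the enumerated histogram
// buckets: 0 = no timeout, 1 = server response timeout, 2 = overall timeout.
enum UpdateTimeoutBucket : uint32_t {
  eNoTimeout = 0,
  eResponseTimeout = 1,
  eOverallTimeout = 2,
};

uint32_t TimeoutBucket(bool responseTimedOut, bool overallTimedOut) {
  if (responseTimedOut) return eResponseTimeout;
  if (overallTimedOut) return eOverallTimeout;
  return eNoTimeout;
}
```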
Comment on attachment 8861844 [details]
Bug 1336904 - Add timeouts for Safe Browsing updates

https://reviewboard.mozilla.org/r/133824/#review137834

> This will never be true because if it were true on line 862, it will be set to `nullptr` on line 864.

Oops, thank you.
That was my silly mistake. Could you take a look again?
Comment on attachment 8861844 [details]
Bug 1336904 - Add timeouts for Safe Browsing updates

https://reviewboard.mozilla.org/r/133824/#review137864

Just two small typos to fix before you land.

datareview+

::: toolkit/components/url-classifier/nsUrlClassifierStreamUpdater.cpp:863
(Diff revisions 3 - 4)
> -    mTimeoutTimer->Cancel();
> -    mTimeoutTimer = nullptr;
> -  }
> -
>    // mResponseTimeoutTimer may be cleared in OnStartRequest, so we check mTimeoutTimer
>    // to see whether the udpate was timed out

"update"

"has timed out"
Attachment #8861844 - Flags: review?(francois) → review+
Pushed by cbook@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/54ab48204b9e
Add timeouts for Safe Browsing updates r=francois
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/54ab48204b9e
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
Depends on: 1366394