Closed
Bug 1385609
Opened 7 years ago
Closed 7 years ago
Backoff seems to be interfering with updates
Categories
(Toolkit :: Safe Browsing, defect, P2)
Toolkit
Safe Browsing
Tracking
()
RESOLVED
FIXED
mozilla57
People
(Reporter: francois, Assigned: tnguyen)
Details
(Whiteboard: #sbv4-m10)
Attachments
(6 files)
From what I can see, the google4 provider is no longer updating even though both the backoff time and the nextupdatetime have passed. Maybe there's a logic error somewhere in our code?
Assignee | ||
Comment 1•7 years ago
|
||
I did not take a deep look at this, but I guess there is probably something wrong when we show backoff time and last update status. If the last update succeeded, we should show 0 back off time. We should fix it first to get a correct debug information.
Assignee | ||
Comment 2•7 years ago
|
||
And, there's one case I am thinking. That is, we click update > 4 times in a sequence (in a short period of time), then this will happen. We could not make any other request in a period of 60 mins. (restart would help) http://searchfox.org/mozilla-central/rev/09c065976fd4f18d4ad764d7cb4bbc684bf56714/toolkit/components/url-classifier/nsUrlClassifierLib.js#119 hmm, we should clear requestTimes_ when getting 200 from server, shouldn't we?
Reporter | ||
Comment 3•7 years ago
|
||
(In reply to Thomas Nguyen[:tnguyen] ni plz from comment #1) > I did not take a deep look at this, but I guess there is probably something > wrong when we show backoff time and last update status. If the last update > succeeded, we should show 0 back off time. Yes. As soon as we have a successful update we should be exiting the backoff mode: https://developers.google.com/safe-browsing/v4/request-frequency#back-off-mode That should be reflected on about:url-classifier since it's meant to help us detect when the internal state is wrong. > We should fix it first to get a correct debug information. Indeed, if there's a bug in about:url-classifier, we should fix it quickly so that we can detect any problems with the backoff algorithm as we roll this code out to more users.
Whiteboard: #sbv4-m8
Reporter | ||
Comment 4•7 years ago
|
||
(In reply to Thomas Nguyen[:tnguyen] ni plz from comment #2) > hmm, we should clear requestTimes_ when getting 200 from server, shouldn't > we? I don't think we need to. Looking at the code (http://searchfox.org/mozilla-central/rev/bbc1c59e460a27b20929b56489e2e55438de81fa/toolkit/components/url-classifier/nsUrlClassifierLib.js#119) if `requestTimes_` only contains old entries then they are ignored and the update is allowed to proceed. Also, it was added in bug 428548 to prevent a high ping frequency if anything goes wrong and so I would be wary of removing it. Especially since it would mean that a user could go into about:url-classifier and click the button repeatedly.
Assignee | ||
Comment 5•7 years ago
|
||
I could not reproduce the issue, because if the last update succeeds, the about page will be refreshed with 0 backoff time. It looks like there're some differences between your profile and my profile. Could you please send me your value of "urlclassifier" and "safebrowsing" preferences?
Flags: needinfo?(francois)
Reporter | ||
Comment 7•7 years ago
|
||
(In reply to Thomas Nguyen[:tnguyen] ni plz from comment #5) > I could not reproduce the issue, because if the last update succeeds, the > about page will be refreshed with 0 backoff time. > It looks like there're some differences between your profile and my profile. > Could you please send me your value of "urlclassifier" and "safebrowsing" > preferences? Yeah, let's give it a few days to see if you can reproduce it too. (emailed you the details)
Flags: needinfo?(francois)
Reporter | ||
Updated•7 years ago
|
Whiteboard: #sbv4-m8 → #sbv4-m9
Reporter | ||
Updated•7 years ago
|
Priority: P2 → P3
Reporter | ||
Comment 8•7 years ago
|
||
It happened again on Nightly 57.0a1 (2017-08-16) (64-bit).
Assignee | ||
Updated•7 years ago
|
Assignee: nobody → tnguyen
Assignee | ||
Comment 9•7 years ago
|
||
UI looks correct this time but seems update was no longer triggered after the last failure. I will try to see what happened
Status: NEW → ASSIGNED
Assignee | ||
Updated•7 years ago
|
Priority: P3 → P2
Assignee | ||
Comment 10•7 years ago
|
||
Some cases I would think about: - We did > 4 requests in 1 hour (server might return too small wait duration). Then the next scheduled checkupdate would be false, unfortunately, we did not reset the timer for the next update. Update would be stopped at that time. http://searchfox.org/mozilla-central/rev/e8c36327cd8c9432c69e5e1383156a74330f11f2/toolkit/components/url-classifier/nsUrlClassifierListManager.js#324 - Or, the update request was put in queue http://searchfox.org/mozilla-central/rev/e8c36327cd8c9432c69e5e1383156a74330f11f2/toolkit/components/url-classifier/nsUrlClassifierListManager.js#475 but the updater failed to notify callback (which will reset the next update timer). I think a full log would be useful to look at this issue. I am trying to enable log in about:url-classifier and leaving my computer running for a long time. If you could reproduce this issue, please set log enabled and log file so we could see what happened.
Reporter | ||
Comment 11•7 years ago
|
||
This one is strange: Baidu updates are stuck on an expired backoff, but the last update shows "N/A".
Assignee | ||
Comment 12•7 years ago
|
||
(In reply to François Marier [:francois] from comment #11) > Created attachment 8898835 [details] > Screenshot of stuck baidu update in about:url-classifier > > This one is strange: Baidu updates are stuck on an expired backoff, but the > last update shows "N/A". The only place that we show last update "N/A" is when "onload" event is fired Provider.init() -> Provider.render() http://searchfox.org/mozilla-central/rev/b258e6864ee3e809d40982bc5d0d5aff66a20780/toolkit/content/aboutUrlClassifier.js#137 So I think that the page was refreshed or window onload was fired.
Assignee | ||
Comment 13•7 years ago
|
||
hmm, finally I saw it has just occurred today but unfortunately no useful JS log in Nightly 57.0a1 . I got modules log of nsUrlClassifierStreamUpdater but it looks good [Main Thread]: D/UrlClassifierStreamUpdater nsUrlClassifierStreamUpdater::OnStartRequest (status=NS_OK, uri=https://cache-ams04.cdn.yandex.net/sba.cdn.yandex.net/chunks/ydx-fraud-phishing-shavar/.................., this=113bbc3a0) [Main Thread]: D/UrlClassifierStreamUpdater nsUrlClassifierStreamUpdater::OnStartRequest failed (404) [Main Thread]: D/UrlClassifierStreamUpdater nsUrlClassifierStreamUpdater::Download error [this=113bbc3a0] .............. [Main Thread]: D/UrlClassifierStreamUpdater nsUrlClassifierStreamUpdater::UpdateSuccess [this=113bbc3a0] [Main Thread]: D/UrlClassifierStreamUpdater nsUrlClassifierStreamUpdater::DownloadDone [this=113bbc3a0] [Main Thread]: D/UrlClassifierStreamUpdater Notify download error callback in UpdateSuccess [this=113bbc3a0] [Main Thread]: D/UrlClassifierStreamUpdater stream updater: calling into fetch next request Not sure what happened in listmanager.js. I will try to run in debug build to see what happened
Assignee | ||
Comment 14•7 years ago
|
||
I got one log listmanager: 14:42:11 GMT+0800 (CST): checkForUpdates with https://download.api.bsb.baidu.com/downloads?ver=2.2&key=ffD7Y9anV5dZVp8 listmanager: 14:42:11 GMT+0800 (CST): Can't make update request listmanager: 14:42:43 GMT+0800 (CST): checkForUpdates with https://safebrowsing.google.com/safebrowsing/downloads?client=navclient-auto-ffox&appver=57.0a&pver=2.2&key=[trimmed-google-api-key] For any reason, we still have to make sure there's a timer for the next update running
Assignee | ||
Comment 15•7 years ago
|
||
[Main Thread]: D/UrlClassifierStreamUpdater stream updater: calling into fetch next request [Main Thread]: D/UrlClassifierStreamUpdater No more requests, returning listmanager: 15:31:25 GMT+0800 (CST): checkForUpdates with https://download.api.bsb.baidu.com/downloads?ver=2.2&key=ffD7Y9anV5dZVp8 RequestBackoff: 15:31:25 GMT+0800 (CST): canMakeRequest : now : 1504078285494 requestTimes 1504063657792 length 4 nextRequestTime_ 1504078285611 errorTimeout_ 4241554 numErrors_ 3 The backoff time is shown in about:url-classifier is Wed Aug 30 2017 15:31:25 GMT+0800 (CST) I see that the canMakeRequest is bailed out at http://searchfox.org/mozilla-central/rev/51b3d67a5ec1758bd2fe7d7b6e75ad6b6b5da223/toolkit/components/url-classifier/nsUrlClassifierLib.js#115 Obviously, nsITimer is not abosultely accurate. The delay is approximate so we may have to give it a tolerance, 1 sec for example
Comment hidden (mozreview-request) |
Assignee | ||
Updated•7 years ago
|
Attachment #8902617 -
Flags: review?(francois)
Reporter | ||
Comment 17•7 years ago
|
||
mozreview-review |
Comment on attachment 8902617 [details] Bug 1385609 - Fix backoff issue that makes SB lists no longer update https://reviewboard.mozilla.org/r/174232/#review179792 Great investigation Thomas! ::: commit-message-d10c9:4 (Diff revision 1) > +Bug 1385609 - Fix backoff issue that makes SB lists no longer update > + > +The issue may occur when nsITimer is fired earlier than the request time > +elapses. We fix by: I would suggest the following to clarify the problem: ``` The issue occurs when nsITimer is fired earlier than the backoff time. In that case, the update doesn't proceed and we never make another attempt because the backoff update timer was oneshot. We fix the issue in two ways: - Add a tolerance of 1 second in case the timer fires too early. - Set another oneshot timer whenever we are prevented from updating due to backoff. ``` ::: toolkit/components/url-classifier/nsUrlClassifierLib.js:115 (Diff revision 1) > * Check to see if we can make a request. > */ > RequestBackoff.prototype.canMakeRequest = function() { > var now = Date.now(); > - if (now < this.nextRequestTime_) { > + // Note that nsITimer delay is approximate: the timer can be fired before the > + // requested time has elapsed. So, give it a tolerance 1 sec "tolerance of 1 second." ::: toolkit/components/url-classifier/nsUrlClassifierLib.js:116 (Diff revision 1) > */ > RequestBackoff.prototype.canMakeRequest = function() { > var now = Date.now(); > - if (now < this.nextRequestTime_) { > + // Note that nsITimer delay is approximate: the timer can be fired before the > + // requested time has elapsed. So, give it a tolerance 1 sec > + if (now + 1000 < this.nextRequestTime_) { If the timer fires before the update can proceed, then it's fine as long as it fires again. Is this timer never going to fire again? ::: toolkit/components/url-classifier/nsUrlClassifierListManager.js:222 (Diff revision 1) > this.updateCheckers_[updateUrl] = Cc["@mozilla.org/timer;1"] > .createInstance(Ci.nsITimer); > this.updateCheckers_[updateUrl].initWithCallback(() => { > this.updateCheckers_[updateUrl] = null; > - this.checkForUpdates(updateUrl); > + if (updateUrl && !this.checkForUpdates(updateUrl)) { > + this.setUpdateCheckTimer(updateUrl, this.updateInterval); Let's add a comment here: // Make another attempt later.
Attachment #8902617 -
Flags: review?(francois) → review+
Assignee | ||
Comment 18•7 years ago
|
||
mozreview-review-reply |
Comment on attachment 8902617 [details] Bug 1385609 - Fix backoff issue that makes SB lists no longer update https://reviewboard.mozilla.org/r/174232/#review179792 > If the timer fires before the update can proceed, then it's fine as long as it fires again. > > Is this timer never going to fire again? Yes, if the timer fires before the request time, it will be never going to fire again (that's how the issue occurs). It only occurs for backoff case. Another case, for test purpose, people click update >= 4 times update button. Then update timer will never fire again.
Comment hidden (mozreview-request) |
Assignee | ||
Comment 20•7 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=37402f64d0556e1b34a8b5794ba97ced8dedd957
Reporter | ||
Comment 21•7 years ago
|
||
Since this one is a little riskier than other changes, and we've had this bug forever, let's not uplift to Beta. We should get more testing on Nightly.
Whiteboard: #sbv4-m9 → #sbv4-m10
Comment hidden (mozreview-request) |
Assignee | ||
Comment 23•7 years ago
|
||
I add a slight change which added a parameter RequestBackoff.tolerance. So we can skip the tolerance in our test.
Assignee | ||
Comment 24•7 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=c5bd3517098a0c68f7eafe188c1f0d59e7e8a788
Reporter | ||
Comment 25•7 years ago
|
||
(In reply to Thomas Nguyen[:tnguyen] ni plz from comment #23) > I add a slight change which added a parameter RequestBackoff.tolerance. So > we can skip the tolerance in our test. Looks good to me.
Assignee | ||
Updated•7 years ago
|
Keywords: checkin-needed
Comment 26•7 years ago
|
||
Pushed by ryanvm@gmail.com: https://hg.mozilla.org/integration/autoland/rev/14b4af261515 Fix backoff issue that makes SB lists no longer update r=francois
Keywords: checkin-needed
Comment 27•7 years ago
|
||
Backed out for eslint failures in nsUrlClassifierLib.js: https://hg.mozilla.org/integration/autoland/rev/f08a3cf8fcb64c1e710e41e09733ef9094b1a080 Push with failure: https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=14b4af2615158fe83b174c6f1236668e13cd20ab&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=retry&filter-resultStatus=usercancel&filter-resultStatus=runnable Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=128584746&repo=autoland Two more spaces and you are done: TEST-UNEXPECTED-ERROR | /builds/worker/checkouts/gecko/toolkit/components/url-classifier/nsUrlClassifierLib.js:117:3 | Expected space or tab after '//' in comment. (spaced-comment) TEST-UNEXPECTED-ERROR | /builds/worker/checkouts/gecko/toolkit/components/url-classifier/nsUrlClassifierLib.js:118:3 | Expected space or tab after '//' in comment. (spaced-comment)
Flags: needinfo?(tnguyen)
Comment hidden (mozreview-request) |
Assignee | ||
Updated•7 years ago
|
Flags: needinfo?(tnguyen)
Assignee | ||
Updated•7 years ago
|
Keywords: checkin-needed
Comment 29•7 years ago
|
||
Pushed by ryanvm@gmail.com: https://hg.mozilla.org/integration/autoland/rev/fdbb8e7b65f2 Fix backoff issue that makes SB lists no longer update r=francois
Keywords: checkin-needed
Comment 30•7 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/fdbb8e7b65f2
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
status-firefox57:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla57
Reporter | ||
Comment 31•7 years ago
|
||
As per comment 21, we won't fix this in earlier versions and instead let it ride the train to get extra testing.
You need to log in
before you can comment on or make changes to this bug.
Description
•