Closed Bug 1021419 Opened 10 years ago Closed 10 years ago

make update and gethash urls for safebrowsing updates configurable per-table

Categories

(Core :: DOM: Security, defect)

x86_64
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla33

People

(Reporter: mmc, Assigned: mmc)

References

Details

Attachments

(1 file, 11 obsolete files)

86.92 KB, patch
gcp
: review+
Details | Diff | Splinter Review
We're encountering more situations where it would be nice to have a safebrowsing-like interface for various services that are not actually safebrowsing. Currently the listmanager and hash completer use a single url for fetching all updates and hash completions for all tables. We should be able to specify different URLs for different tables that want to use the same format.
Assignee: nobody → mmc
Status: NEW → ASSIGNED
Attachment #8435433 - Attachment is obsolete: true
Attachment #8436099 - Attachment is obsolete: true
Well the list registration, etc. in Safebrowsing.jsm and listmanager.js is relatively simple to clean up. The real problem is that nsUrlClassifierStreamUpdater is incapable of fetching updates from multiple urls, plus it clears all pending callbacks, etc. when the download is done.
Attachment #8437082 - Attachment is obsolete: true
Attachment #8437384 - Attachment is obsolete: true
Attachment #8437851 - Attachment is obsolete: true
Comment on attachment 8439337 [details] [diff] [review] Implement per-table update and gethash requests Review of attachment 8439337 [details] [diff] [review]: ----------------------------------------------------------------- The hash completer still needs work, but this is getting big and I'd like your feedback. Generally the changes are to parameterize gethash and downloadupdates wherever they occur, and queue pending updates rather than throwing them out (which probably makes more sense if there's only one update url).
Attachment #8439337 - Flags: review?(gpascutto)
Comment on attachment 8439337 [details] [diff] [review] Implement per-table update and gethash requests Review of attachment 8439337 [details] [diff] [review]: ----------------------------------------------------------------- Looks OK in terms of general approach. ::: toolkit/components/url-classifier/content/listmanager.js @@ +326,5 @@ > + log("existing chunks: " + tableData + "\n"); > + // A map of update urls to objects of the form: > + // { tableList: comma-separated list of tables to request, > + // tableNames: map of tables that need updating, > + // requests: existing chunk ranges for the table } Chunk ranges for which table? Looking down, it's really a list of tables and existing chunk ranges, formatted in the specific way the server needs. @@ +336,5 @@ > + } > + // Disallow blank updateUrls > + if (!updateUrl) { > + continue; > + } The above code can be simplified a bit. @@ +361,5 @@ > + // Find the entry in streamerMap associated with this table > + for (let url in streamerMap) { > + log("Checking " + url + "\n"); > + log(JSON.stringify(streamerMap[url], undefined, 2)); > + if (streamerMap[url].tableNames[fields[0]]) { let tableName = fields[0] ::: toolkit/components/url-classifier/nsUrlClassifierDBService.cpp @@ +831,5 @@ > + if (listManager) { > + listManager->GetGethashUrl(result.mTableName, gethashUrl); > + LOG(("Got gethash url: %s", gethashUrl.get())); > + } else { > + LOG(("No list manager")); Probably want to fail more spectacularly here, or is this also a viable case at startup? ::: toolkit/components/url-classifier/nsUrlClassifierHashCompleter.js @@ +84,5 @@ > + // TODO: Keep one request per URI. If aGethashUrl doesn't match the > + // currentRequest uri, that's not good. > + let uri = Services.io.newURI(aGethashUrl, null, null); > + this._currentRequest.setURI(uri); > + Services.tm.currentThread.dispatch(this, Ci.nsIThread.DISPATCH_NORMAL); Why does a JavaScript module that does async HTTP requests have the need to dispatch...wait I don't want to know. ::: toolkit/components/url-classifier/nsUrlClassifierStreamUpdater.cpp @@ +85,5 @@ > + { > + nsCString spec; > + aUpdateUrl->GetSpec(spec); > + LOG(("Fetching update %s from %s", aRequestBody.Data(), spec.get())); > + } Maybe #ifdef DEBUG this part @@ +169,5 @@ > + request->mRequest = aRequestBody; > + request->mUrl = aUpdateUrl; > + request->mSuccessCallback = aSuccessCallback; > + request->mUpdateErrorCallback = aUpdateErrorCallback; > + request->mDownloadErrorCallback = aDownloadErrorCallback; I guess the need to have these callbacks per request is that they should correspond to the right callbacks for the specific table+server-that-handles-the-table? How will you do this? Is the relevant logic in listmanager simply not written yet?
Attachment #8439337 - Flags: review?(gpascutto) → feedback+
Comment on attachment 8439337 [details] [diff] [review] Implement per-table update and gethash requests Review of attachment 8439337 [details] [diff] [review]: ----------------------------------------------------------------- Btw, the safebrowsing PM told me last week that they are revamping their protocol to use protocol buffers and want to transition off the old protocol within a year :p ::: toolkit/components/url-classifier/content/listmanager.js @@ +396,5 @@ > request, > + updateUrl, > + BindToObject(this.updateSuccess_, this, tableList), > + BindToObject(this.updateError_, this, tableList), > + BindToObject(this.downloadError_, this, tableList))) { These updates are still bound to existing callbacks in listmanager. @@ +413,5 @@ > var delay = parseInt(waitForUpdate, 10); > // As long as the delay is something sane (5 minutes or more), update > // our delay time for requesting updates > if (delay >= (5 * 60) && this.updateChecker_) > this.updateChecker_.setDelay(delay * 1000); And now the updateChecker won't be configured per update URL, so if one server sends back a different delay it'll stomp on the other one. You are right, I need to fix this logic to have one updateChecker_ per updateURL, and one requestBackoff_ per updateURL. ::: toolkit/components/url-classifier/nsUrlClassifierDBService.cpp @@ +831,5 @@ > + if (listManager) { > + listManager->GetGethashUrl(result.mTableName, gethashUrl); > + LOG(("Got gethash url: %s", gethashUrl.get())); > + } else { > + LOG(("No list manager")); Startup and also tests which don't have the gethash url because Safebrowsing.jsm doesn't run :( ::: toolkit/components/url-classifier/nsUrlClassifierStreamUpdater.cpp @@ +169,5 @@ > + request->mRequest = aRequestBody; > + request->mUrl = aUpdateUrl; > + request->mSuccessCallback = aSuccessCallback; > + request->mUpdateErrorCallback = aUpdateErrorCallback; > + request->mDownloadErrorCallback = aDownloadErrorCallback; The list manager actually doesn't do anything different with these callbacks -- just spits out an error in case the update fails, and backs off if the download fails. Now that I think about it, the list manager will now backoff on all updates if downloads for some tables fail but not others. That may be ok for now because the new list (and list service) will be turned off by default for some time while we figure out a rollout strategy -- but ultimately we don't want regular phishing/malware updates to block on failed downloads from a different server.
Attachment #8439337 - Attachment is obsolete: true
Attachment #8443181 - Attachment is obsolete: true
Attachment #8444875 - Attachment is obsolete: true
Attachment #8445395 - Attachment is obsolete: true
Comment on attachment 8445618 [details] [diff] [review] Implement per-table update and gethash requests Review of attachment 8445618 [details] [diff] [review]: ----------------------------------------------------------------- Major changes include: - Keeping per-url requests and RequestBackoffs in hash completer and list manager - Using RequestBackoff in nsUrlClassifierHashCompleter - Changing the listmanager to always use a repeating alarm (see bug 1029240) Tested manually, and by visiting phishtank urls, and by pushing bug 1024610 and seeing if downloads/gethash still work. Try: https://tbpl.mozilla.org/?tree=Try&rev=d2e41b4e17b8
Attachment #8445618 - Flags: review?(gpascutto)
(In reply to [:mmc] Monica Chew (please use needinfo) from comment #16) > - Using RequestBackoff in nsUrlClassifierHashCompleter Btw, I think this will help with the China GFW issue, because they'll no longer be able to issue more than max requests in the request backoff time period -- and it seems the non-RequestBackoff backoff code was pretty busted anyway.
Comment on attachment 8445618 [details] [diff] [review] Implement per-table update and gethash requests Review of attachment 8445618 [details] [diff] [review]: ----------------------------------------------------------------- Looks ok, I'd like to see the final version before checkin though. Needs some cleaning up of the debug/log stuff. We're significantly changing the requestBackoff stuff. You already noticed the existing tests don't seem to cover it well enough. Are we able to extend our testing coverage to this? Bugs in this code could run the risk of DDOSing both Google and us (I'm assuming we provide the new lists). ::: browser/app/profile/firefox.js @@ +922,5 @@ > #ifdef MOZ_SAFE_BROWSING > pref("browser.safebrowsing.enabled", true); > pref("browser.safebrowsing.malware.enabled", true); > +// TODO: revert before checkin > +pref("browser.safebrowsing.debug", true); *cough* ::: toolkit/components/url-classifier/content/listmanager.js @@ +18,5 @@ > + return; > + > + let msg = "listmanager: " + stuff.join(" "); > + dump(msg + "\n"); > +} *coughs again* @@ +42,4 @@ > this.tablesData = {}; > + // A map of updateUrls to maps of tables requiring updates, e.g. > + // { safebrowsing-update-url: { goog-phish-shavar: 1, goog-malware-shavar: 1 } > + this.updateUrls = {}; For naming consistency it should probably be updateUrls_ @@ +105,5 @@ > + backoffInterval /* backoff interval, 60 min */, > + 8*60*60*1000 /* max backoff, 8hr */); > + > + } > + this.updateUrls[updateUrl][tableName] = false; This really just means empty, or is it actually similar to needsUpdate? From the comment where updateUrls is defined I'd have expected "0" instead. From looking below, this is really needsUpdate, so assign that to make the meaning more clear? @@ +177,5 @@ > this.startingUpdate_ = false; > var initialUpdateDelay = 3000; > > + // Check if any table registered for updates has ever been downloaded. How > + // does this know that they're already updating? Maybe I didn't name this var very well. But It means "updating existing" vs "downloading because not existing yet". @@ +300,5 @@ > + } > + // An object of the form > + // { tableList: comma-separated list of tables to request, > + // tableNames: map of tables that need updating, > + // requests: list of tables and existing chunk ranges from tableData } There's also the needsUpdate boolean. @@ +325,5 @@ > var lines = tableData.split("\n"); > for (var i = 0; i < lines.length; i++) { > var fields = lines[i].split(";"); > + log(JSON.stringify(streamerMap, undefined, 2)); > + if (streamerMap.tableNames[fields[0]]) { Put fields[0] into a var for readability. ::: toolkit/components/url-classifier/nsUrlClassifierDBService.cpp @@ +828,5 @@ > + nsCString gethashUrl; > + nsCOMPtr<nsIUrlListManager> listManager = do_GetService( > + "@mozilla.org/url-classifier/listmanager;1"); > + if (listManager) { > + listManager->GetGethashUrl(result.mTableName, gethashUrl); getGethashUrl can return "" if not set. Do we handle this case? (Even if by asserting it can't ever happen in debug mode)
Attachment #8445618 - Flags: review?(gpascutto) → review-
Attachment #8445618 - Attachment is obsolete: true
Attachment #8446717 - Attachment is obsolete: true
Comment on attachment 8446724 [details] [diff] [review] Implement per-table update and gethash requests Review of attachment 8446724 [details] [diff] [review]: ----------------------------------------------------------------- Hey gcp, I ended up making significant changes to the hashcompleter when extending its test. The threading is used to batch multiple calls to complete in a single request, and it turned out to be easiest to just keep a current request and a stack of pending requests similar to listmanager. Unfortunately it made this patch even bigger, sorry about that. > *coughs again* This is flagged under the debug pref. I think I took care of everything else. Thanks, Monica ::: toolkit/components/url-classifier/tests/unit/xpcshell.ini @@ +9,5 @@ > [test_backoff.js] > [test_dbservice.js] > [test_hashcompleter.js] > # Bug 752243: Profile cleanup frequently fails > +#skip-if = os == "mac" || os == "linux" Not sure if this applies anymore, but I'll wait on a green try to see.
Attachment #8446724 - Flags: review?(gpascutto)
Comment on attachment 8446724 [details] [diff] [review] Implement per-table update and gethash requests Review of attachment 8446724 [details] [diff] [review]: ----------------------------------------------------------------- ::: toolkit/components/url-classifier/nsUrlClassifierHashCompleter.js @@ +116,4 @@ > > + if (this._currentRequest) { > + try { > + // Note the request before beginning. begin *must* call finishRequest. Will delete comment, we call noteRequest after asyncOpen succeeds and noteResponse on server response.
Comment on attachment 8446724 [details] [diff] [review] Implement per-table update and gethash requests Review of attachment 8446724 [details] [diff] [review]: ----------------------------------------------------------------- ::: toolkit/components/url-classifier/content/listmanager.js @@ +90,3 @@ > } > + this.tablesData[tableName] = {}; > + this.tablesData[tableName].needsUpdate = false; This looks redundant with needsUpdate_to me. Instead of indexing into this you can always look up this.needsUpdate_[tablesData[tableName].updateUrl][tableName]. @@ +301,5 @@ > + } > + // An object of the form > + // { tableList: comma-separated list of tables to request, > + // tableNames: map of tables that need updating, > + // requests: list of tables and existing chunk ranges from tableData nit: requests vs request ::: toolkit/components/url-classifier/nsUrlClassifierHashCompleter.js @@ +372,5 @@ > if (this._shuttingDown) { > throw Cr.NS_ERROR_ABORT; > } > > + let httpStatus = 503; Comment please.
Attachment #8446724 - Flags: review?(gpascutto) → review+
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla33
Depends on: 1032767
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: