Closed Bug 1305484 Opened 9 years ago Closed 9 years ago

Store state in the file instead of preference

Categories

(Toolkit :: Safe Browsing, defect, P2)

defect

Tracking

()

RESOLVED FIXED
mozilla52
Tracking Status
firefox52 --- fixed

People

(Reporter: dimi, Assigned: hchang)

References

Details

(Whiteboard: #sbv4-m2)

Attachments

(1 file, 4 obsolete files)

We should store v4 "state" of each table update in file to make recover, error handing ... etc easier.
Whiteboard: #sbv4-m1
Also allows us to remove the base64 decoding and enconding.
Assignee: nobody → hchang
Priority: -- → P2
Blocks: 1305780
Assignee: hchang → tnguyen
Depends on: 1305801
Blocks: 1305581
No longer blocks: safebrowsingv4
No longer blocks: 1305780
MozReview-Commit-ID: BXyRUVW3uHg
Attachment #8797425 - Attachment description: Store state in the file instead of preference → WIP Store state in the file instead of preference
MozReview-Commit-ID: BaY82Sc03cv
Attachment #8797425 - Attachment is obsolete: true
Attachment #8797914 - Flags: feedback?(dlee)
MozReview-Commit-ID: fZypi0r9H4
Attachment #8797914 - Attachment is obsolete: true
Attachment #8797914 - Flags: feedback?(dlee)
Attachment #8797928 - Flags: feedback?(dlee)
Comment on attachment 8797928 [details] [diff] [review] Store state in the file instead of preference Review of attachment 8797928 [details] [diff] [review]: ----------------------------------------------------------------- Strongly recommend to use nsIUrlClassifierDbService.getTables as a single entry. Besides, test_listmanager.js might require to remove [1]. [1] https://dxr.mozilla.org/mozilla-central/rev/42c95d88aaaa7c2eca1d278399421d437441ac4d/toolkit/components/url-classifier/tests/unit/test_listmanager.js#336 ::: toolkit/components/url-classifier/nsIUrlClassifierDBService.idl @@ +94,5 @@ > > /** > + * Lists the tables along with client state associated with each table. > + */ > + void getTableStates(in nsIUrlClassifierCallback c); I would suggest just having |getTables| return states for v4 lists (and chunks for v2 lists). Doing this will save us a bunch of dup works in the patch (e.g. classifierProxies, listmanager, ...)
Attachment #8797928 - Flags: feedback?(dlee)
MozReview-Commit-ID: 3K6QAsZXgfY
Attachment #8797928 - Attachment is obsolete: true
Henry would like to take it :)
Assignee: tnguyen → hchang
Attachment #8798321 - Flags: review?(dlee)
Comment on attachment 8798321 [details] Bug 1305484 - Save/load state and checksum to/from disk rather than prefs. https://reviewboard.mozilla.org/r/83838/#review82680 This looks really good, Thanks! And there is only one thing i think maybe worth doing in this patch or create a follow up bug. When we write metadata file, we can use NS_NewCheckSummedOutputStream to make sure the metadata file itself is not corrupt. See the HashStore::WriteFile as an example[1]. [1] https://dxr.mozilla.org/mozilla-central/source/toolkit/components/url-classifier/HashStore.cpp#923 ::: toolkit/components/url-classifier/Classifier.cpp:431 (Diff revision 2) > } > > aResult.Append('\n'); > } > + > + // Load meta data from *.metadata files in the root directory. I think we should add comments about the first part is for v2 tables and the second part is for v4 tables ::: toolkit/components/url-classifier/Classifier.cpp:434 (Diff revision 2) > } > + > + // Load meta data from *.metadata files in the root directory. > + nsCString metaData; > + nsresult rv = LoadMetaData(mRootStoreDirectory, metaData); > + NS_ENSURE_SUCCESS_VOID(rv); Almost every failure in LoadMetaData has already use NS_ENSURE_SUCESS which will show the warning message. We could just silently return here. And i was told that NS_ENSURE_SUCCESS is deprected recently(bug 672843) and there is a discussion here[1] [1] https://groups.google.com/forum/#!topic/mozilla.dev.platform/1clMLuuhtWQ ::: toolkit/components/url-classifier/Classifier.cpp:1186 (Diff revision 2) > + } > + > + // The state might include '\n' so that we have to encode. > + nsAutoCString stateBase64; > + nsresult rv = Base64Encode(state, stateBase64); > + NS_ENSURE_SUCCESS(rv, rv); We have already declared nsresult in the beginning
Attachment #8798321 - Flags: review?(dlee) → review+
Blocks: 1308384
Attachment #8797958 - Attachment is obsolete: true
Attachment #8798321 - Flags: review?(francois)
Whiteboard: #sbv4-m1 → #sbv4-m2
Comment on attachment 8798321 [details] Bug 1305484 - Save/load state and checksum to/from disk rather than prefs. https://reviewboard.mozilla.org/r/83838/#review83952 I like your approach. I've got a few suggestions below and I suspect we need to increase the test timeout to avoid oranges. ::: toolkit/components/url-classifier/Classifier.cpp:1014 (Diff revision 3) > NS_ENSURE_SUCCESS(rv, rv); > > input->Clear(); > } > > + // Keep track of the last applied udpate. typo: update ::: toolkit/components/url-classifier/Classifier.cpp:1138 (Diff revision 3) > } > > +nsresult > +Classifier::LoadMetaData(nsIFile* aDirectory, nsACString& aResult) > +{ > + nsCOMPtr<nsISimpleEnumerator> entries; We should probably NS_ENSURE that we don't get a NULL aDirectory. ::: toolkit/components/url-classifier/Classifier.cpp:1143 (Diff revision 3) > + nsCOMPtr<nsISimpleEnumerator> entries; > + nsresult rv = aDirectory->GetDirectoryEntries(getter_AddRefs(entries)); > + NS_ENSURE_SUCCESS(rv, rv); > + > + bool hasMore; > + while (NS_SUCCEEDED(rv = entries->HasMoreElements(&hasMore)) && hasMore) { Does looping through all of the files mean that we'll apply the state and metadata twice for each V4 table? (once for the .pset and once for the .cache file) ::: toolkit/components/url-classifier/Classifier.cpp:1160 (Diff revision 3) > + if (isDirectory) { > + LoadMetaData(file, aResult); > + continue; > + } > + > + // Truncate file extention to get the table name. typo: extension ::: toolkit/components/url-classifier/Classifier.cpp:1174 (Diff revision 3) > + tableName.Cut(dot, METADATA_SUFFIX.Length()); > + > + LookupCacheV4* lookupCache = > + LookupCache::Cast<LookupCacheV4>(GetLookupCache(tableName)); > + if (!lookupCache) { > + continue; It would be good to explain why we can ignore these. Maybe a comment like: // Ignore V2 tables since they don't have metadata. ::: toolkit/components/url-classifier/Classifier.cpp:1181 (Diff revision 3) > + > + nsCString state; > + nsCString checksum; > + rv = lookupCache->LoadMetaData(state, checksum); > + if (NS_FAILED(rv)) { > + LOG(("Failed to get meta data for table %s", tableName.get())); nit: "metadata" (no space) ::: toolkit/components/url-classifier/LookupCacheV4.cpp:319 (Diff revision 3) > + if (NS_FAILED(rv) || read != readLength) { > + LOG(("Failed to read the value.")); > + return NS_FAILED(rv) ? rv : NS_ERROR_FAILURE; > + } > + > + return NS_OK; For consistency, we should either `return rv;` here, just like in `WriteValue()` or `return NS_OK;` in both functions. ::: toolkit/components/url-classifier/LookupCacheV4.cpp:341 (Diff revision 3) > + > + nsCOMPtr<nsIOutputStream> outputStream; > + rv = NS_NewLocalFileOutputStream(getter_AddRefs(outputStream), metaFile, > + PR_WRONLY | PR_TRUNCATE | PR_CREATE_FILE); > + if (!NS_SUCCEEDED(rv)) { > + LOG(("Unable to create file to store meta data.")); nit: "metadata" ::: toolkit/components/url-classifier/LookupCacheV4.cpp:359 (Diff revision 3) > + if (NS_FAILED(rv)) { > + LOG(("Failed to write the list checksum.")); > + return rv; > + } > + > + NS_WARNING_ASSERTION(NS_SUCCEEDED(rv), "failed to store"); Given that you're already checking for `NS_FAILED(rv)` after writing the checksum and returning immediately, I don't think we'll ever hit this assertion. ::: toolkit/components/url-classifier/LookupCacheV4.cpp:361 (Diff revision 3) > + return rv; > + } > + > + NS_WARNING_ASSERTION(NS_SUCCEEDED(rv), "failed to store"); > + > + return NS_OK; Similarly, we can simply `return rv;` here since any failures will have already been handled above. ::: toolkit/components/url-classifier/LookupCacheV4.cpp:378 (Diff revision 3) > + > + nsCOMPtr<nsIInputStream> localInFile; > + rv = NS_NewLocalFileInputStream(getter_AddRefs(localInFile), metaFile, > + PR_RDONLY | nsIFile::OS_READAHEAD); > + if (NS_FAILED(rv)) { > + LOG(("Unable to open meta data file.")); nit: "metadata" ::: toolkit/components/url-classifier/LookupCacheV4.cpp:396 (Diff revision 3) > + if (NS_FAILED(rv)) { > + LOG(("Failed to read checksum.")); > + return rv; > + } > + > + return NS_OK; Again, `return rv;` should work here. ::: toolkit/components/url-classifier/ProtocolParser.cpp:853 (Diff revision 3) > + } > > PARSER_LOG(("==== Update for threat type '%d' ====", aResponse.threat_type())); > PARSER_LOG(("* listName: %s\n", listName.get())); > PARSER_LOG(("* newState: %s\n", aResponse.new_client_state().c_str())); > PARSER_LOG(("* isFullUpdate: %s\n", (isFullUpdate ? "yes" : "no"))); It might be nice to also add: PARSER_LOG(("* hasChecksum: %s\n", (aResponse.has_checksum() ? "yes" : "no"))); ::: toolkit/components/url-classifier/nsIUrlClassifierDBService.idl:86 (Diff revision 3) > in nsIUrlClassifierCallback c); > > /** > - * Lists the tables along with which chunks are available in each table. > - * This list is in the format of the request body: > - * tablename;chunkdata\n > + * Lists the tables along with their meta info in the following format: > + * > + * tablename;[meta data]\n nit: "metadata" ::: toolkit/components/url-classifier/nsIUrlClassifierDBService.idl:89 (Diff revision 3) > - * Lists the tables along with which chunks are available in each table. > - * This list is in the format of the request body: > - * tablename;chunkdata\n > - * tablename2;chunkdata2\n > - * > - * For example: > + * Lists the tables along with their meta info in the following format: > + * > + * tablename;[meta data]\n > + * tablename2;[meta data]\n > + * > + * For v2 tables, the meta data is the chunks info such as nit: "metadata" ::: toolkit/components/url-classifier/nsIUrlClassifierDBService.idl:95 (Diff revision 3) > - * goog-phish-regexp;a:10,14,30-40s:56,67 > - * goog-white-regexp;a:1-3,5 > + * > + * goog-phish-shavar;a:10,14,30-40s:56,67 > + * goog-unwanted-shavar;a:1-3,5 > + * > + * For v4 tables, base64 encoded state is currently the only info in the > + * meta data (can be extended whenever necessary). For exmaple, nit: "metadata" Also, we may want to say that once there is more info here, it will be colon-separated. ::: toolkit/components/url-classifier/tests/unit/test_listmanager.js:371 (Diff revision 3) > > - do_timeout(1000, waitUntilStateSavedToPref.bind(null, expectedState, callback)); > + return true; // break no matter whether the state is matching. > + }); > + > + if (!didCallback) { > + do_timeout(1000, waitUntilMetaDataSaved.bind(null, expectedState, 1 second seems like a very short timeout for disk I/O. We seem to run some of our tests on slow machines, so this will likely lead to intermittent failures.
Attachment #8798321 - Flags: review?(francois) → review-
Comment on attachment 8798321 [details] Bug 1305484 - Save/load state and checksum to/from disk rather than prefs. https://reviewboard.mozilla.org/r/83838/#review83952 > Does looping through all of the files mean that we'll apply the state and metadata twice for each V4 table? (once for the .pset and once for the .cache file) No because only files with ".metadata" will be taken into account: ``` int32_t dot = tableName.RFind(METADATA_SUFFIX, 0); if (dot == -1) { continue; } ``` > It would be good to explain why we can ignore these. Maybe a comment like: > > // Ignore V2 tables since they don't have metadata. We don't ignore v2 tables when "loading" metadata. This chunk of code recursively loops through all metadata file under safebrowsing directory. > 1 second seems like a very short timeout for disk I/O. > > We seem to run some of our tests on slow machines, so this will likely lead to intermittent failures. "1 second" is just how often we check the database repetitively until state and checksum is written.
Comment on attachment 8798321 [details] Bug 1305484 - Save/load state and checksum to/from disk rather than prefs. https://reviewboard.mozilla.org/r/83838/#review83952 > No because only files with ".metadata" will be taken into account: > > ``` > int32_t dot = tableName.RFind(METADATA_SUFFIX, 0); > if (dot == -1) { > continue; > } > ``` You're right. I missed that, sorry. > "1 second" is just how often we check the database repetitively until state and checksum is written. Ah ok. Sounds good then.
Comment on attachment 8798321 [details] Bug 1305484 - Save/load state and checksum to/from disk rather than prefs. https://reviewboard.mozilla.org/r/83838/#review84314 r+ with just two minor comments. ::: toolkit/components/url-classifier/Classifier.cpp:1141 (Diff revisions 3 - 4) > -Classifier::LoadMetaData(nsIFile* aDirectory, nsACString& aResult) > +Classifier::LoadMetadata(nsIFile* aDirectory, nsACString& aResult) > { > nsCOMPtr<nsISimpleEnumerator> entries; > nsresult rv = aDirectory->GetDirectoryEntries(getter_AddRefs(entries)); > NS_ENSURE_SUCCESS(rv, rv); > + NS_ENSURE_TRUE(entries, NS_ERROR_FAILURE); nit: NS_ENSURE_ARG_POINTER() might be a better fit ::: toolkit/components/url-classifier/nsIUrlClassifierDBService.idl:97 (Diff revisions 3 - 4) > * goog-unwanted-shavar;a:1-3,5 > * > * For v4 tables, base64 encoded state is currently the only info in the > - * meta data (can be extended whenever necessary). For exmaple, > + * metadata (can be extended whenever necessary). For exmaple, > * > - * goog-phish-proto;Cg0IARAGGAEiAzAwMTABEKqTARoCGAjT1gDD\n > + * goog-phish-proto;Cg0IARAGGAEiAzAwMTABEKqTARoCGAjT1gDD,oCGAjT1gDD\n Since the metadata is separated by colons, we should use colons (not commas) in the example too.
Attachment #8798321 - Flags: review?(francois) → review+
Pushed by hchang@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/599f3dd65b6a Save/load state and checksum to/from disk rather than prefs. r=dimi,francois
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla52
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: