Closed Bug 1310142 Opened 4 years ago Closed 4 years ago

Move backup databases and raw table update data to a "update wreck" directory


(Toolkit :: Safe Browsing, defect, P2)




Tracking Status
firefox52 --- fixed


(Reporter: hchang, Assigned: hchang)



(Whiteboard: #sbv4-m2)


(2 files)

No description provided.
Assignee: nobody → hchang
Inspired by Bug 1309770: It's almost impossible to debug "update apply" error systematically without the databases in disk and raw update response from server.

Provided a patch to save all databases and the raw update response from server (saved as tableupdates.bin) while update error occurs. To be honest, exposing raw update response from ProtocolParser to Classifier is not cool at all but I just can't figure out any better approach :( Thanks to the nsCString impl, there at least is no string copy overhead around the bad approach.
Comment on attachment 8801048 [details]
Bug 1310142 - Preserve backup databases and raw table updates for diagnostic.

This should be pretty useful. Before I r+ it though, we should discuss two things:

1. Should we only dump to the wreck directory in DEBUG builds? Or maybe on pre-release channels? I'm not sure we want to dump debug output on our release users.
2. Should we clean up the wreck directory at some point? e.g. after a successful update

::: toolkit/components/url-classifier/Classifier.h:114
(Diff revision 1)
>    nsresult RecoverBackups();
>    nsresult CleanToDelete();
>    nsresult BackupTables();
>    nsresult RemoveBackupTables();
>    nsresult RegenActiveTables();
> +  nsresult DumpUpdateWreck(const nsACString& aRawUpdates);

Maybe "FailedUpdate" instead of "UpdateWreck"?
Attachment #8801048 - Flags: review?(francois) → review-
I've hit another update error (after two days!) and this is the database in disk and raw update data (tableupdates.bin) for debugging!
Priority: -- → P2
Whiteboard: #sbv4-m2
Attachment #8801048 - Flags: review?(gpascutto)
Hi gcp,

I'd like to have your feedback about this patch. It's intended to dump the local databases and the raw update response to a "safebrowsing-failedupdate" directory when an update error occurs.
Directory "safebrowsing-failedupdate" will only store the previous failed update so that
it wouldn't occupy too much disk space.

Right now only when "browser.safebrowsing.debug" is on would we do the dump. What
do you think about this approach? Or would you feel it's also okay to do it even
"browser.safebrowsing.debug" is false?

Thanks :)
Flags: needinfo?(gpascutto)
Comment on attachment 8801048 [details]
Bug 1310142 - Preserve backup databases and raw table updates for diagnostic.

I think that you want to decouple this from the browser.safebrowsing.debug flag, as that one is quite spammy, while this code is silent and will only act when Things Have Gone Badly Wrong. You also can't really ask users to toggle that flag after they've detected a problem (not so easy in the first place) because the database will already have been wiped.

My suggestion would be to surround all this code with MOZ_SAFEBROWSING_DUMP_FAILED_UPDATES, and then set that define only for Nightly builds. 

I would think that once updates appear to be working and we consider rolling out to Aurora and further, we can just disable the dumping everywhere, and perhaps remove the code in the #ifdef's, or simply not bother for a while if it's not getting in our way.

On a related note, once you start verifying the checksum after updates, it should be possible to debug update failures by just knowing the checksum before and (what it was supposed to be) after the update. Then I think we can definitely get rid of this code and replace it by something simpler.

::: toolkit/components/url-classifier/Classifier.h:69
(Diff revision 2)
>    /**
>     * Apply the table updates in the array.  Takes ownership of
>     * the updates in the array and clears it.  Wacky!
>     */
> -  nsresult ApplyUpdates(nsTArray<TableUpdate*>* aUpdates);
> +  nsresult ApplyUpdates(nsTArray<TableUpdate*>* aUpdates,
> +                        const nsCString& aRawUpdates = EmptyCString());

I don't think setting a default argument here buys you anything as you want all callers to pass it.

::: toolkit/components/url-classifier/nsUrlClassifierDBService.cpp:628
(Diff revision 2)
>  nsresult
>  nsUrlClassifierDBServiceWorker::ApplyUpdate()
>  {
>    LOG(("nsUrlClassifierDBServiceWorker::ApplyUpdate()"));
> -  return mClassifier->ApplyUpdates(&mTableUpdates);
> +  nsresult rv = mClassifier->ApplyUpdates(&mTableUpdates, mRawTableUpdates);

I think you can get rid of the uglyness here by not passing mRawTableUpdates, but noticing when rv indicates failure, and then calling a new method in Classifier that adds the raw updates file to the (already backed up) old database. I think it's very borderline whether that makes the code overall cleaner (due to issues like having to match rv against OUT_OF_MEMORY vs other errors, and having to deal with existence of the failedupdate path in the second method), though it would make it clearer here in the caller that this extra argument is only there for debugging.
Attachment #8801048 - Flags: review?(gpascutto)
I addressed the needinfo in the review.
Flags: needinfo?(gpascutto)
Comment on attachment 8801048 [details]
Bug 1310142 - Preserve backup databases and raw table updates for diagnostic.

::: toolkit/components/url-classifier/
(Diff revision 3)
>  if CONFIG['GNU_CXX']:
>      CXXFLAGS += ['-Wno-error=shadow']
> +

I wonder whether we should enable this if it's either:

- Nightly, or
- a debug build

The big advantage is that we run our tests on both non-debug and debug builds and so if there's anything grossly wrong with our dumping code we'll see it fail in the tests.

I'm not sure we run anything on try with `NIGHTLY_BUILD` defined.
Attachment #8801048 - Flags: review?(francois) → review+
Pushed by
Preserve backup databases and raw table updates for diagnostic. r=francois
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla52
You need to log in before you can comment on or make changes to this bug.