Move backup databases and raw table update data to a "update wreck" directory

RESOLVED FIXED in Firefox 52

Status

()

defect
P2
normal
RESOLVED FIXED
3 years ago
3 years ago

People

(Reporter: hchang, Assigned: hchang)

Tracking

unspecified
mozilla52
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox52 fixed)

Details

(Whiteboard: #sbv4-m2)

Attachments

(2 attachments)

Assignee

Description

3 years ago
No description provided.
Assignee

Updated

3 years ago
Assignee: nobody → hchang
Comment hidden (mozreview-request)
Assignee

Comment 2

3 years ago
Inspired by Bug 1309770: It's almost impossible to debug "update apply" error systematically without the databases in disk and raw update response from server.

Provided a patch to save all databases and the raw update response from server (saved as tableupdates.bin) while update error occurs. To be honest, exposing raw update response from ProtocolParser to Classifier is not cool at all but I just can't figure out any better approach :( Thanks to the nsCString impl, there at least is no string copy overhead around the bad approach.

Comment 3

3 years ago
mozreview-review
Comment on attachment 8801048 [details]
Bug 1310142 - Preserve backup databases and raw table updates for diagnostic.

https://reviewboard.mozilla.org/r/85848/#review85674

This should be pretty useful. Before I r+ it though, we should discuss two things:

1. Should we only dump to the wreck directory in DEBUG builds? Or maybe on pre-release channels? I'm not sure we want to dump debug output on our release users.
2. Should we clean up the wreck directory at some point? e.g. after a successful update

::: toolkit/components/url-classifier/Classifier.h:114
(Diff revision 1)
>    nsresult RecoverBackups();
>    nsresult CleanToDelete();
>    nsresult BackupTables();
>    nsresult RemoveBackupTables();
>    nsresult RegenActiveTables();
> +  nsresult DumpUpdateWreck(const nsACString& aRawUpdates);

Maybe "FailedUpdate" instead of "UpdateWreck"?
Attachment #8801048 - Flags: review?(francois) → review-
Assignee

Comment 4

3 years ago
I've hit another update error (after two days!) and this is the database in disk and raw update data (tableupdates.bin) for debugging!
Status: NEW → ASSIGNED
Priority: -- → P2
Whiteboard: #sbv4-m2
Comment hidden (mozreview-request)
Assignee

Updated

3 years ago
Attachment #8801048 - Flags: review?(gpascutto)
Assignee

Comment 6

3 years ago
Hi gcp,

I'd like to have your feedback about this patch. It's intended to dump the local databases and the raw update response to a "safebrowsing-failedupdate" directory when an update error occurs.
Directory "safebrowsing-failedupdate" will only store the previous failed update so that
it wouldn't occupy too much disk space.

Right now only when "browser.safebrowsing.debug" is on would we do the dump. What
do you think about this approach? Or would you feel it's also okay to do it even
"browser.safebrowsing.debug" is false?

Thanks :)
Flags: needinfo?(gpascutto)

Comment 7

3 years ago
mozreview-review
Comment on attachment 8801048 [details]
Bug 1310142 - Preserve backup databases and raw table updates for diagnostic.

https://reviewboard.mozilla.org/r/85848/#review87572

I think that you want to decouple this from the browser.safebrowsing.debug flag, as that one is quite spammy, while this code is silent and will only act when Things Have Gone Badly Wrong. You also can't really ask users to toggle that flag after they've detected a problem (not so easy in the first place) because the database will already have been wiped.

My suggestion would be to surround all this code with MOZ_SAFEBROWSING_DUMP_FAILED_UPDATES, and then set that define only for Nightly builds. 

I would think that once updates appear to be working and we consider rolling out to Aurora and further, we can just disable the dumping everywhere, and perhaps remove the code in the #ifdef's, or simply not bother for a while if it's not getting in our way.

On a related note, once you start verifying the checksum after updates, it should be possible to debug update failures by just knowing the checksum before and (what it was supposed to be) after the update. Then I think we can definitely get rid of this code and replace it by something simpler.

::: toolkit/components/url-classifier/Classifier.h:69
(Diff revision 2)
>    /**
>     * Apply the table updates in the array.  Takes ownership of
>     * the updates in the array and clears it.  Wacky!
>     */
> -  nsresult ApplyUpdates(nsTArray<TableUpdate*>* aUpdates);
> +  nsresult ApplyUpdates(nsTArray<TableUpdate*>* aUpdates,
> +                        const nsCString& aRawUpdates = EmptyCString());

I don't think setting a default argument here buys you anything as you want all callers to pass it.

::: toolkit/components/url-classifier/nsUrlClassifierDBService.cpp:628
(Diff revision 2)
>  
>  nsresult
>  nsUrlClassifierDBServiceWorker::ApplyUpdate()
>  {
>    LOG(("nsUrlClassifierDBServiceWorker::ApplyUpdate()"));
> -  return mClassifier->ApplyUpdates(&mTableUpdates);
> +  nsresult rv = mClassifier->ApplyUpdates(&mTableUpdates, mRawTableUpdates);

I think you can get rid of the uglyness here by not passing mRawTableUpdates, but noticing when rv indicates failure, and then calling a new method in Classifier that adds the raw updates file to the (already backed up) old database. I think it's very borderline whether that makes the code overall cleaner (due to issues like having to match rv against OUT_OF_MEMORY vs other errors, and having to deal with existence of the failedupdate path in the second method), though it would make it clearer here in the caller that this extra argument is only there for debugging.
Attachment #8801048 - Flags: review?(gpascutto)
I addressed the needinfo in the review.
Flags: needinfo?(gpascutto)
Comment hidden (mozreview-request)

Comment 10

3 years ago
mozreview-review
Comment on attachment 8801048 [details]
Bug 1310142 - Preserve backup databases and raw table updates for diagnostic.

https://reviewboard.mozilla.org/r/85848/#review90272

::: toolkit/components/url-classifier/moz.build:85
(Diff revision 3)
>  CXXFLAGS += CONFIG['SQLITE_CFLAGS']
>  
>  if CONFIG['GNU_CXX']:
>      CXXFLAGS += ['-Wno-error=shadow']
> +
> +if CONFIG['NIGHTLY_BUILD']:

I wonder whether we should enable this if it's either:

- Nightly, or
- a debug build

The big advantage is that we run our tests on both non-debug and debug builds and so if there's anything grossly wrong with our dumping code we'll see it fail in the tests.

I'm not sure we run anything on try with `NIGHTLY_BUILD` defined.
Attachment #8801048 - Flags: review?(francois) → review+
Comment hidden (mozreview-request)

Comment 12

3 years ago
Pushed by hchang@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/7320cd3e1568
Preserve backup databases and raw table updates for diagnostic. r=francois

Comment 13

3 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/7320cd3e1568
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla52
You need to log in before you can comment on or make changes to this bug.