Places automatically keeps ten daily backups. This is less than we might want for recovery (lots of users treat bookmarks as "write often, retrieve rarely", which can lead to very late discovery of problems), but multiple copies is expensive. Also, it can lead to ten identical copies on disk, which does nobody any good. A better scheme might be: * Keep up to 30 bookmark snapshots (or fewer if they're large, or disk is small). Configurable. * Discard consecutive identical copies (same hash). * Compress bookmark backups on disk. JSON is verbose. * Try to never discard a snapshot whose immediate successor is smaller. This might avoid data loss.
(In reply to Richard Newman [:rnewman] from comment #0) > Places automatically keeps ten daily backups. Actually we keep 10 backups, they may not be daily, it mostly depend on the user's habits, for me it's more spread across 20 days. This is cause the backups are done on idle. > This is less than we might want for recovery (lots of users treat bookmarks > as "write often, retrieve rarely", which can lead to very late discovery of > problems), but multiple copies is expensive. I agree. > A better scheme might be: > > * Keep up to 30 bookmark snapshots (or fewer if they're large, or disk is > small). Configurable. so, I'm sure we can have multi-MB backups here (hey we could add telemetry!), let's say a 2MB avg, this would be 60MB, that is quite a bit. > * Discard consecutive identical copies (same hash). This is really a good idea, we should implement this first, then this would be 10 useful backups instead of 10 possibly-useful backups. that bey itself already increases the number of backups. We should file a dependent bug apart and that will be fixable even now. > * Compress bookmark backups on disk. JSON is verbose. We could, though this means writes and reads must be async (off-mainthread compression if possible), we still have to make backups async. Maybe the 2 things should be done at the same time. > * Try to never discard a snapshot whose immediate successor is smaller. This > might avoid data loss. We should predict which size difference may represent a dataloss, like 20%? And there are other facts to take into account: 1. Automatic restore always restore the newest backup, what if this backed up a dataloss? Should we ask the user which backup he wants to restore on startup? 2. May be worth adding the backup size to the Restore menu in the Library so one can take the larger backup 3. What to do when after doing a backup we figure its size is much smaller than the previous one? 4. if roots are gone we need a disaster recovery plan, otherwise the user can't even restore a backup
(In reply to Marco Bonardo [:mak] from comment #1) > 4. if roots are gone we need a disaster recovery plan, otherwise the user > can't even restore a backup And actually I thought we were already doing this by replacing the DB, could be there's a bug down there.
I added a bunch of dependencies, and morphed this into a meta bug. mak, please check if there are any details you want to add!
Looks good, easy targets are the hash check and file size on the menu. We may even increase number of backups to 15.
I've seen the importance of this on systems with slow storage backend, e.g. SD card. JSON files get overly big (same experience with sessionstore.js) so compression would be nice. Also consider storing /diff/ is changes are small, but that implies JSON output is human readable and depends on external tool. Subscribing to this meta-bug, thanks Marco for pointing this out!
I'm not sure where best to put this... I am not developer, so maybe this is a dumb suggestion. Would it be reasonable to "watch" to see if there are any changes to bookmarks (new, deletes, moves, etc). When a change is seen, save a flag and shut down the watcher. (Perhaps there's an easier way to detect changes.) This would prevent us from needlessly making a backup. This came to mind because of bug 824433 and I may go days without touching my bookmarks.
it's possible, but could not be that frequent since Sync from other devices may touch bookmarks. though it's definitely something we could look into.
(In reply to Marco Bonardo [:mak] from comment #7) > it's possible, but could not be that frequent since Sync from other devices > may touch bookmarks. though it's definitely something we could look into. Oh, that reminded me, the guys working on Picl (FF data cloud sync) demoed some code that tracked Places transactions and made a changelog: http://vimeo.com/60851334 Perhaps we can use that code or somehow utilize it once it lands (assuming it does).
There is nothing more planned to improve backups on the Places side. bug 973591 is more a platform bug and bug 983988 is an improvement we should apply to the new behavior. Even if important, they were not part of the project goal (make backups async and reduce their IO footprint), so I'm calling this done.