Closed Bug 817950 Opened 12 years ago Closed 10 years ago

Sync2.0 script to purge expired items from the database

Categories

(Cloud Services Graveyard :: Server: Sync, defect, P1)

x86_64
Linux
defect

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: rfkelly, Unassigned)

References

Details

(Whiteboard: [qa?])

Attachments

(2 files)

The Sync1.1 deployment currently has a script that runs periodically to purge items with expired TTLs from the database.  We need something similar for sync2.0.

As part of Bug 721530, I have a copy of a "purge_ttl.py" script from the sysadmins svn repo.  AFAICT it basically does the following:

  on each database host:
      for each database wbo0 ... wbo9
          find the oldest ttl in the database, and write this into memcache
          while there are items with expired ttls:
              delete the 1000 items with the oldest expired ttls
          find the oldest ttl in the database, and write this into memcache
          sleep for 1h to give it time to garbage collect

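The per-database purge loop above could be sketched as runnable Python, roughly like so (a hypothetical sketch: the table name `bso`, the `ttl` column, and the use of sqlite3 are assumptions for illustration; the real script targets the production MySQL shards and also writes the oldest ttl to memcache, which is omitted here):

```python
import sqlite3
import time

def purge_expired(conn, batch_size=1000, now=None):
    """Delete expired rows in small batches, oldest ttls first.

    Deleting in bounded batches (1000 rows at a time in the original
    script) keeps any single delete from holding locks for too long.
    Returns the total number of rows purged.
    """
    now = int(time.time()) if now is None else now
    total = 0
    while True:
        # Find the batch of items with the oldest expired ttls.
        rows = conn.execute(
            "SELECT rowid FROM bso WHERE ttl < ? ORDER BY ttl LIMIT ?",
            (now, batch_size),
        ).fetchall()
        if not rows:
            break
        conn.executemany("DELETE FROM bso WHERE rowid = ?", rows)
        conn.commit()
        total += len(rows)
    return total
```

The select-then-delete-by-rowid shape stands in for MySQL's `DELETE ... ORDER BY ttl LIMIT n`, which SQLite doesn't support by default.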
Are all of these steps required for sync2.0?  In particular, I've no idea what, if anything, reads the "oldest ttl" data that this script writes into memcache.  It's also not clear to me how, when, and where this script gets run.

Code for updated sync2.0 version coming soon, hopefully tomorrow.
This script runs as a daemon, with an N-minute sleep between runs to keep it from pummeling the database.

The "oldest ttl" written to memcache is collected for monitoring purposes.

see also: puppet/weave/modules/syncdb/manifests/prune.pp
Attaching a rework of the ttl-purging script for sync2.0.  It can be run either as a daemon or as a one-shot script, and it calls a provided API method purge_expired_items() to purge items without hardcoding details of the storage backend.
Attachment #688663 - Flags: feedback?(bobm)
And here's the rest of the code to support this script - adding a purge_expired_items() method and some tests for it.
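The daemon/one-shot driver described above could look something like this (a hypothetical sketch: `purge_expired_items()` is the method named in the patch, but its signature, return value, and the `sleep_seconds` parameter here are assumptions; the memcache monitoring hook is deliberately left out per the comment below):

```python
import time

def purge_loop(storage, oneshot=False, sleep_seconds=300):
    """Drive the backend-provided purge hook.

    `storage` is any object exposing purge_expired_items(), so the
    script never hardcodes details of the storage backend.  In
    one-shot mode it runs a single pass and returns the count;
    in daemon mode it sleeps between runs to avoid pummeling the DB.
    """
    while True:
        num_purged = storage.purge_expired_items()
        if oneshot:
            return num_purged
        time.sleep(sleep_seconds)
```

Keeping the delete logic behind the storage API is what lets the same script run unchanged against different backends.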

Bob and I should discuss the monitoring output via memcached offline, to get a better handle on it before adding it into the code.
Attachment #688664 - Flags: review?(telliott)
Here's my concern here, and why the old scripts are the way they are: Doing it this way introduces a dependency on syncstorage, which pulls in a bunch of other stuff. The old scripts generally had no dependencies on non-core-python libs and whichever storage engine you wanted.

So that's a core question that ops should weigh in on.
Fair point.  I personally prefer this approach because it's the nicest from a code perspective, and has the least chance of funny incompatibilities like the one that surfaced in Bug 756226.  But I'm not the one deploying this thing :-)

An alternative might be to suck all the logic out into the script itself, but leave it in the repo and leave the testing hooks in place to make sure nothing breaks it.
I'm less concerned about introducing the syncstorage dependency than about architecting the expiry of items from within the Sync project.  Allowing the maintenance utilities to grow alongside the project, and get some QA testing time, will make operations easier in the long run.

Things to consider:
Where should we be running this from?
How should the maintenance scripts be bundled?

If we're going to be running the scripts from an administrative host, the lighter weight single script makes the most sense.  For scripts intended to be run on Sync hosts themselves, using the syncstorage libs is probably the way to go.
(In reply to Bob Micheletto [:bobm] from comment #6)
> Things to consider:
> Where should we be running this from?

This particular script, IMHO, makes the most sense to run locally on each storage node.  This means installing syncstorage and its dependencies on a machine where they were previously not required, but that seems like an ok compromise to me.

> If we're going to be running the scripts from an administrative host, the
> lighter weight single script makes the most sense.  For scripts intended to
> be run on Sync hosts themselves, using the syncstorage libs is probably the
> way to go.

The mcclear and mcread scripts from Bug 756226 make sense to run from an admin host, and so might benefit from fewer dependencies.  The other scripts so far all make sense to run from actual syncstorage hosts of one kind or another.
Comment on attachment 688664 [details] [diff] [review]
support code patch for purgettls.py script

Consider parameterizing the max-records per loop. It'll let us run some experiments to see which is most efficient.

Systems that don't support it can simply ignore the parameter.
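One way to run the experiments suggested above would be a small timing harness (a hypothetical sketch: `run_purge_experiment` and the assumption that the purge function accepts a `max_records` keyword and returns a row count are mine, not from the patch):

```python
import time

def run_purge_experiment(purge_fn, batch_sizes):
    """Time one purge pass at each batch size.

    `purge_fn` is assumed to accept a max_records keyword and return
    the number of rows purged; backends that can't limit per-pass
    deletes can simply ignore the parameter.  Returns a dict of
    batch_size -> (rows_purged, elapsed_seconds).
    """
    results = {}
    for size in batch_sizes:
        start = time.monotonic()
        purged = purge_fn(max_records=size)
        results[size] = (purged, time.monotonic() - start)
    return results
```

Comparing elapsed time per row purged across batch sizes would show which setting is most efficient for a given backend.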
Attachment #688664 - Flags: review?(telliott) → review+
Comment on attachment 688663 [details]
script to purge ttls, using provided API function

Clearing old reviews.
Attachment #688663 - Flags: feedback?(bobm)
I'm going to revive this for sync1.5.  It makes even more sense there because we'll be running this script on the storage node, where it can manage its own DB.
Committed in https://github.com/mozilla-services/server-syncstorage/commit/2263ce539ff557ee95761f1494a17a526a6e77ca with minimal edits due to code divergence

Bob, we should hook this up to run on the all-in-one storage nodes you're building.
Whiteboard: [qa?]
:rfkelly and :bobm 
status?

Did we do this already in Sync 1.5 Stage? or do we still need to schedule it?
Priority: -- → P1
I believe :bobm said this was running in sync1.5 prod now, is that correct?
Flags: needinfo?(bobm)
(In reply to Ryan Kelly [:rfkelly] from comment #13)
> I believe :bobm said this was running in sync1.5 prod now, is that correct?

This is correct.
Flags: needinfo?(bobm)
Awesome, closing it out then
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Done.
Status: RESOLVED → VERIFIED
Product: Cloud Services → Cloud Services Graveyard