Closed Bug 770159 Opened 12 years ago Closed 12 years ago

Refactor SQLStorage backend for locking-based API

Categories

(Cloud Services Graveyard :: Server: Sync, defect)

x86
Linux
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: rfkelly, Assigned: rfkelly)

References

Details

(Whiteboard: [qa?])

Attachments

(1 file, 1 obsolete file)

Attached is my in-progress refactoring of the SQLStorage backend to use database transactions and implement the new locking API.  It's a serious change, so I'm attaching the code as a zipfile rather than a messy messy patch.

The backend now consists of three parts:

 1) The files queries_*.py contain the pre-built queries that can be executed by the backend.  They're almost free of sqlalchemy syntax, except where it's really necessary.

 2) The file dbconnect.py contains the low-level details of connecting to the database, starting a transaction etc.  It's a thin wrapper over the SQLAlchemy APIs for this, with some additional niceties.

 3) The main __init__.py file contains the high-level logic of the backend, translating SyncStorage API calls into database queries.

All operations are processed using a "SQLStorageSession" object, which encapsulates an open connection, an open transaction and some cached metadata about the state of the store.  When locking a collection, it takes a database-level lock and stores the active session in a threadlocal so that any subsequent API calls will use it and hence respect the lock.

In its current state this code passes all existing syncstorage tests, but I need to add some more tests to explicitly test the locking API before it is ready for commit.
Attachment #638330 - Flags: feedback?(telliott)
Whiteboard: [qa?]
OK, I have been putting this through its paces with both sqlite and mysql storage backends and it seems to be holding up well.  Posting updated code for final review.
Assignee: nobody → rfkelly
Attachment #638330 - Attachment is obsolete: true
Attachment #638330 - Flags: feedback?(telliott)
Attachment #640135 - Flags: review?(telliott)
Comment on attachment 640135 [details]
updated zipfile containing sqlstorage backend code

>            if not self.shard:
>                bso.create(self.engine, checkfirst=True)
>            else:
>                for idx in xrange(self.shardsize):

seems redundant to have both shard and shardsize. I think you could collapse it into one simple variable.

>               for row in res:
>                    yield row

Yay!

I think the naming is a little confusing here. When I go looking for code to do the sql, I don't look in dbconnect.py - that's where all the random connection pooling logic lives. It should probably just be called sql.py or something. 

Better yet, it seems a lot of this grab-a-query-and-run-it logic is just excellent utility to have around and could be abstracted out into a superclass that goes somewhere more core. Like 90% of the dbconnection object is just straight up utility sql functions (Heck, most of this is just making sqlalchemy usable :P ). Abstract out the sharding stuff and you have something you'll be able to reuse a bunch.

The final thing that looks like it's missing to me is the opening comment in each of the sql queries. Those got dropped somewhere along the way, but they used to be very useful for knowing which functions were getting called a lot. If we could find a way to get that data included, it'd be a bit of a win.
Attachment #640135 - Flags: review?(telliott) → review+
Per discussions on IRC, I have spun out your improvement suggestions into separate bugs (Bug 774976, Bug 774977, Bug 669805) and I'll commit this patch as-is.
https://github.com/mozilla-services/server-syncstorage/commit/a437a5d712955bc22ebb13c7d8b632aa7a632168
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Product: Cloud Services → Cloud Services Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: