Closed Bug 599018 Opened 14 years ago Closed 14 years ago

Add ability to read certain constants out of memcache

Categories

(Cloud Services Graveyard :: Server: Sync, defect)

Platform: x86 macOS
Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: telliott, Assigned: telliott)

References

Details

(Keywords: push-needed, Whiteboard: [qa-])

Attachments

(1 file)

We'll mostly just need this for server-downing and backing off, but the sync server should be able to read an arbitrary set of constants out of memcache.

Need to make sure this is compatible with systems not using memcache. That means having a constant in default_constants that defines the set of constants to be read out of memcache. (This constant would be off by default and therefore ignored.)
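A minimal sketch of that toggle, written in Python for illustration even though the current server is PHP; the names (MEMCACHE_CONSTANTS, apply_memcache_constants) are hypothetical, and the "constants:$node" cache key follows the scheme noted below:

  import socket

  # Off by default: systems without memcache leave this at None and
  # keep their file-based defaults untouched.
  MEMCACHE_CONSTANTS = None   # e.g. ('DOWN', 'BACKOFF') to opt in

  def apply_memcache_constants(mc_client, defaults):
      """Overlay the selected constants from memcache onto the defaults."""
      if not MEMCACHE_CONSTANTS or mc_client is None:
          return defaults
      values = dict(defaults)
      cached = mc_client.get('constants:%s' % socket.gethostname()) or {}
      for name in MEMCACHE_CONSTANTS:
          if name in cached:
              values[name] = cached[name]
      return values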
Blocks: 599075
Blocks: 599085
Sorry, that path should be constants:$node.
(In reply to comment #0)
> We'll mostly just need this for server-downing and backing off, but the sync
> server should be able to read an arbitrary set of constants out of memcache.
> 
> Need to make sure this is compatible with systems not using memcache. That
> means having a constant in default_constants that defines a set of constants to
> be read out of memcache. (This constant would be off by default, therefore
> ignored)

Sounds good. For the Python version I was thinking of a [constants] section in the config file, with a "provider" variable to define where the constants are loaded from:

- memcache: the values are loaded from the file, then updated with the mapping from memcached, if any.
- file: the values are loaded from the file only (the default if the "provider" variable is not defined).

Example:

  [constants]
  provider = memcache
  var1 = value1
  var2 = value2
  ...

The constants in memcached would be a mapping stored under the same cache key, "constants:host".
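A hedged sketch of that loader, assuming Python 2's ConfigParser (this bug's era) and a python-memcached-style client whose get() returns the cached mapping or None; load_constants is a hypothetical name, not the actual implementation:

  import socket
  from ConfigParser import ConfigParser

  def load_constants(config_path, mc_client=None):
      parser = ConfigParser()
      parser.read(config_path)
      constants = dict(parser.items('constants'))
      provider = constants.pop('provider', 'file')
      if provider == 'memcache' and mc_client is not None:
          key = 'constants:%s' % socket.gethostname()
          overrides = mc_client.get(key)  # a mapping, or None on a miss
          if overrides:
              constants.update(overrides)
      return constants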
OK, that works fine. I successfully tested it by disabling my local php/memcached server.

I have also realized that the PHP memcache lib uses a custom serializer to store arrays in memcache, which makes the stored data incompatible with Python. Since we will do a progressive upgrade of the nodes when switching to Python, we should make the memcached data interoperable.

I could write a custom deserializer to transform the PHP array into a Python mapping, but a cleaner solution would be to use JSON, or maybe a custom interoperable serialization format for mappings.

== JSON ==

JSON serialization seems to be built in with certain versions of PHP: http://stackoverflow.com/questions/1816128/change-serialization-functions-in-php-for-memcached

If we do this, we'll need to compare the speed to make sure we don't slow down lookups too much. We could also try Protocol Buffers.
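For illustration, one way the JSON route could look on the Python side: a thin wrapper that sends values across the PHP/Python boundary as JSON strings. JsonCache is a hypothetical helper, not an existing API, and it assumes a client with memcached-style get/set:

  import json

  class JsonCache(object):
      """Store values as JSON strings so both PHP and Python can read them."""

      def __init__(self, client):
          self._client = client

      def set(self, key, value, time=0):
          return self._client.set(key, json.dumps(value), time)

      def get(self, key):
          raw = self._client.get(key)
          return json.loads(raw) if raw is not None else None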

== Custom format ==

a "name:value,name2:value2,.." string
Attachment #478890 - Flags: review?(tarek) → review+
While this would be OK for a transition, I think it hurts us in the long run, and we should optimize for the long term rather than worry about the transition.

I expect the Python transition to be fairly dramatic, and we can announce actual downtime to do this. Then we can kill the memcache and start from scratch, bringing up one node at a time to prevent the DBs from being overwhelmed.

That will result in loss of tab data, so we should talk about the messaging there.
Maybe we should start a migration doc, then, to capture the broad outlines of the scenario.

I thought we wanted to move transparently, one node at a time, to Python to limit the risks if anything goes wrong. That's one of the reasons I am currently making the Python server compatible with the existing infrastructure/data. The cache was the last bit.
Hmm, good point. We would need an interim format if we wanted to go that way.

Handling serialization/deserialization ourselves isn't where we want to go in the long run. We need to think about the best way to handle this.
JSON seems fine to me. We just need to make sure the overhead is minimal by benchmarking the serialization on both sides. I doubt it will be that slow given the size of the data we store in memcached.
I think we may need to talk to ops about the migration process here - it's much deeper than just the constants - tabs and collection timestamps are also stored in incompatible formats.
Turns out JSON does add some overhead :/

Serializing 200 tabs with 500-char payloads using JSON instead of the binary serializer adds an overhead of 5 ms, which is quite a lot on requests that can last 20 ms.
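For reference, a rough way to reproduce that comparison (numbers will vary by machine, and cPickle stands in here for the PHP binary serializer, which isn't portable to Python):

  import json
  import timeit
  try:
      import cPickle as pickle  # Python 2, matching the era
  except ImportError:
      import pickle

  # 200 tabs with 500-char payloads, as in the measurement above
  tabs = [{'id': i, 'payload': 'x' * 500} for i in range(200)]

  for name, dump in (('json', json.dumps), ('pickle', pickle.dumps)):
      per_call = timeit.timeit(lambda: dump(tabs), number=100) / 100
      print('%s: %.2f ms per serialization' % (name, per_call * 1000))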

I'll start a bug to discuss the transition.
Transition: Bug 600482
Status: NEW → RESOLVED
Closed: 14 years ago
Keywords: push-needed
Resolution: --- → FIXED
Whiteboard: [qa-]
Product: Cloud Services → Cloud Services Graveyard