Closed Bug 664658 Opened 13 years ago Closed 12 years ago

tool to wipe all memcache data for given user(s)

Categories

(Cloud Services Graveyard :: Server: Sync, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: Atoll, Assigned: telliott)

Details

(Whiteboard: [qa+])

Attachments

(2 files, 1 obsolete file)

As part of solving the HMAC mismatch bug, it turns out that we're not wiping the user's data in memcache. So let's do that. We need a tool that can wipe out the data in memcache for a given list of userids on STDIN. Userids can be in any format, just let us know. (Username, numeric userid, LDAP DN, whatever.)

This is also known as the "HMAC mismatch bug prevention tool", decide ETA accordingly.
Background explanation: an HMAC mismatch occurs when a client downloads a record which was encrypted with a different key than the one they are holding for that collection.

In this case, it's possible for the following to happen:

* Client A uploads key K1.
* Client A uploads tab record K1(R1).
* Server is partially wiped: DB-stored data (K1) is eliminated.
* Client B generates and uploads new key K2.
* Client B downloads tab record K1(R1).
* HMAC mismatch.
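
(Illustrative aside: a minimal sketch of why the last step fails. The key names and payload below are made up, and real Sync records carry the HMAC alongside the ciphertext; the point is only that data signed with K1 can never verify against K2.)

import hmac, hashlib, os

k1 = os.urandom(32)                # key K1, what Client A used
k2 = os.urandom(32)                # key K2, what Client B now holds
record = b"ciphertext-of-R1"       # placeholder for the encrypted tab record

stored = hmac.new(k1, record, hashlib.sha256).hexdigest()    # uploaded with R1
computed = hmac.new(k2, record, hashlib.sha256).hexdigest()  # Client B's check

if computed != stored:
    print('HMAC mismatch')         # the failure described above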

Builds after Bug 650208 are no longer susceptible to this bug, because when a new key is uploaded, all server data is deleted. However, this won't land for another 3 or 4 months (Firefox 6), so it would be valuable to avoid the situation. The most direct way to do that is to kill the contents of memcache for a migrated user.

(A workaround for tab HMAC problems is to temporarily disable, sync, then re-enable tab sync.)
To clarify: "server is partially wiped" means that we do the wiping, due to a disk problem or migration?
yes
(oops.)

Yes, that was the known cause.  There could be other ways to trigger that scenario, but this tool was requested specifically to address the one scenario where we know we're causing the problem.
Group: services-infra
Assignee: nobody → telliott
cat data | memcache_clean.py server1 server2 server3 server4

Will output something like:

Cleaning id: 1 (OK)
Cleaning id: 2 (OK)
Cleaning id: 3 (OK)
...


It's an ops script, so I don't think it belongs in any of our hg repos. Just wherever you have the rest of this chain.
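
(For readers without access to the attachment, a rough sketch of what such a script could look like. This is not attachment 539934: the port, the delete-based approach, and the key suffixes are assumptions; the suffixes follow the corrected Python list given later in this bug.)

#!/usr/bin/env python
# Sketch only. Reads userids on STDIN, memcache hosts on argv.
import sys
import memcache  # python-memcached

# Per-user key suffixes to wipe (assumed; see the list later in this bug).
KEYS = ('tabs', 'meta:global', 'size', 'stamps')

def main():
    servers = ['%s:11211' % host for host in sys.argv[1:]]  # assumed port
    mc = memcache.Client(servers)
    for line in sys.stdin:
        uid = line.strip()
        if not uid:
            continue
        ok = all(mc.delete('%s:%s' % (uid, key)) for key in KEYS)
        print('Cleaning id: %s (%s)' % (uid, 'OK' if ok else 'FAIL'))

if __name__ == '__main__':
    main()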
Attachment #539934 - Flags: review?(rsoderberg)
Attachment #539934 - Flags: review?(rsoderberg) → review+
Curious if attachment 539934 [details] works for python and php, or just python.
It doesn't actually care about the data, so unless the keys are encoded differently (which would shock me) or the hashing algo used to determine a server is different (which wouldn't), it should work for either.
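
(Context for the hashing caveat: a memcache client maps each key to a server by hashing the key and taking it modulo the server count. Simplified sketch; the crc32 hash is an assumption about the client's default, and real clients also weight servers and route around dead ones.)

import binascii

def pick_server(key, servers):
    # simplified bucket selection: hash the key, mod the server count
    serverhash = binascii.crc32(key.encode('utf-8')) & 0xffffffff
    return servers[serverhash % len(servers)]

# e.g. pick_server('12345:tabs', ['server1', 'server2', 'server3'])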
Linkage: Bug 646269.

atoll, if this is r+, how close is this to RESOLVED FIXED?
OS: Mac OS X → All
Hardware: x86 → All
(In reply to comment #8)
> Linkage: Bug 646269.
> 
> atoll, if this is r+, how close is this to RESOLVED FIXED?

This bug appears to live outside of process (no hg, no developer r?, no qa), so there's no clear answer to your question.

I would expect it no sooner than when Python sync is deployed to production.
(In reply to comment #7)
> It doesn't actually care about the data, so unless the keys are encoded
> differently (which would shock me) or the hashing algo to determine a server
> is different (which wouldn't), should work for either.

"just python", then. Thanks!
(In reply to comment #9)

> This bug appears to live outside of process (no hg, no developer r?, no qa),
> so there's no clear answer to your question.

I assume you're going to take it and put it in the same location as the rest of the scripts in that chain. This is really just an ops script that happened to be written by a developer, so I'd put it through the same process you put the rest of that chain through.
Maybe https://hg.mozilla.org/services/admin-scripts/ could be the place where we collect all those scripts.

In the future, it could be packaged and contain more Python goodies to deal with the various servers, use the core library, etc.
Comment on attachment 539934 [details]
Script to clean out a user's memcache. Designed to work in the ops node purge chain

Since we did not retain the memcached Python/PHP compatibility, the keys are different in Python.

The keys to wipe are:

"UID:tabs"
"UID:meta:global"
"UID:size"
"UID:collections:stamps:NAME"
"UID:stamps"

with UID = user id, NAME = collection name
Scratch this one (typo): "UID:collections:stamps:NAME"
Attaching one with the new names. We should run this against an actual python install, though I don't know how we'd pick up ones that we missed.
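
(Re "how we'd pick up ones that we missed": one hypothetical spot-check is to dump the corrected key set for each uid and eyeball the output for leftovers. The server list below is a placeholder.)

import sys
import memcache

KEYS = ('tabs', 'meta:global', 'size', 'stamps')  # corrected list from above

mc = memcache.Client(['server1:11211'])           # placeholder server
for line in sys.stdin:
    uid = line.strip()
    for key in KEYS:
        full = '%s:%s' % (uid, key)
        print('%s = %r' % (full, mc.get(full)))   # None means already wiped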
Attachment #542199 - Attachment is obsolete: true
Attachment #542204 - Flags: review+
While testing to see if it can handle uidnumbers, I determined that it cannot:

[root@wp-web01.phx.weave petef]# echo 2222 | ./memcache_dump.py
Traceback (most recent call last):
  File "./memcache_dump.py", line 21, in <module>
    print "%s:%s\t%s" % (username, key, memc.get("%s:%s" % (id, key)))
  File "/usr/lib/python2.6/site-packages/memcache.py", line 793, in get
    return self._get('get', key)
  File "/usr/lib/python2.6/site-packages/memcache.py", line 762, in _get
    server, key = self._get_server(key)
  File "/usr/lib/python2.6/site-packages/memcache.py", line 296, in _get_server
    server = self.buckets[serverhash % len(self.buckets)]
ZeroDivisionError: integer division or modulo by zero
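
(For what it's worth, that ZeroDivisionError means len(self.buckets) == 0, i.e. python-memcached was constructed with an empty server list; presumably memcache_dump.py also expects hosts on argv. Minimal reproduction:)

import memcache

mc = memcache.Client([])   # no servers -> zero buckets
mc.get('2222:tabs')        # serverhash % len(self.buckets) raises ZeroDivisionError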
above comment in wrong bug, sorry
Script in use and seems happy.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
lives in sysadmins/svc/scripts/ now, which only ops can see.
Whiteboard: [qa+]
Status: RESOLVED → VERIFIED
Product: Cloud Services → Cloud Services Graveyard