Bug 995512 - Implement expiry of old blobs for some ceph buckets
Opened 10 years ago; closed 8 years ago.
Category: Socorro :: Backend (task)
Tracking: not tracked
Status: RESOLVED INVALID
Reporter: glandium; Assignee: nobody (unassigned)
The mozilla-releng-ceph-cache-scl3-try bucket needs an equivalent of an S3 expiry policy. Our current policy for the mozilla-releng-s3-cache-* buckets on S3 is to expire objects after 2 weeks. We'd need to scan the Ceph bucket on a regular basis and remove keys whose creation date is older than 2 weeks.
Updated•10 years ago
Assignee: server-ops-storage → relops
Component: Server Operations: Storage → RelOps
Product: mozilla.org → Infrastructure & Operations
QA Contact: dparsons → arich
Comment 1•10 years ago
Since Ceph doesn't support this natively, we need some kind of ad-hoc solution that runs periodically. Do you have that coded up already? Also, I think we want to use access date, not creation date, as the index.

And is this really necessary for a proof of concept? Could we do something much more quick-and-dirty, like just deleting and re-creating the bucket once a week or between experiments? Or is the idea here to prove that external object expiration is practical?
Comment 2•10 years ago
http://tracker.ceph.com/issues/4099 suggests storing an explicit expiration date in the object metadata. I don't know the details of the Ceph protocol, but that might be the best way to work around the lack of access times.
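A minimal sketch of what comment 2 suggests, assuming boto3 talking to the radosgw S3 endpoint; the metadata key name and endpoint are illustrative assumptions, not anything the Ceph bug specifies.

```python
# Sketch (not the actual fix): write an explicit expiry timestamp into
# S3 user metadata at upload time, then filter on it when sweeping.
# EXPIRY_KEY and the endpoint below are hypothetical choices.
import time

EXPIRY_KEY = "expires-at"  # hypothetical user-metadata key

def expiry_metadata(ttl_seconds, now=None):
    """Build the user-metadata dict to attach at upload time."""
    now = time.time() if now is None else now
    return {EXPIRY_KEY: str(int(now + ttl_seconds))}

def is_expired(metadata, now=None):
    """True if the object's stored expiry timestamp has passed."""
    now = time.time() if now is None else now
    stamp = metadata.get(EXPIRY_KEY)
    return stamp is not None and int(stamp) <= now

# The upload side (boto3 against a radosgw endpoint) would look like:
#   s3 = boto3.client("s3", endpoint_url="https://ceph.example.net")
#   s3.put_object(Bucket=bucket, Key=key, Body=data,
#                 Metadata=expiry_metadata(14 * 86400))
```

The sweeper would then HEAD each object and delete it when `is_expired` returns true, which trades the missing access times for a timestamp the client controls.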
Comment 3•10 years ago
Since there's no mechanism to do this in Ceph, what are the requirements, glandium/taras? E.g. can you code it in the metadata? Do you need to be able to change the window? Is it a static window for all data (i.e. always 2 weeks for everything)? Do you need it based on access time rather than creation time? Any other restrictions/requirements?
Comment 4•10 years ago
(In reply to Amy Rich [:arich] [:arr] from comment #3)
> Since there's no mechanism to do this in ceph, what are the requirements for
> this, glandium/taras? E.g. can you code it in the metadata? Do you need to
> be able to change the window? Is it a static window for all data (aka
> always 2 weeks for everything). Do you need it based on access time not
> creation time? Any other restrictions/requirements?

As far as I can tell, the easiest way to do this is via the S3 layer (e.g. via python/boto or similar): list all of the objects in the bucket, check their timestamps against the current time, and issue delete requests accordingly. Run that in a daily cronjob somewhere. We can test this against Amazon S3 for faster development.
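The list/compare/delete loop taras describes could be sketched as follows, assuming boto3 rather than classic boto; the bucket name and endpoint are placeholders, not deployment values.

```python
# Sketch of the daily expiry sweep: list every object, compare its
# timestamp against the 2-week window, delete anything older.
import datetime

MAX_AGE = datetime.timedelta(days=14)

def should_expire(last_modified, now, max_age=MAX_AGE):
    """True if an object's timestamp is older than the expiry window."""
    return now - last_modified > max_age

def expired_keys(objects, now, max_age=MAX_AGE):
    """Filter a listing (dicts with 'Key'/'LastModified') down to stale keys."""
    return [o["Key"] for o in objects
            if should_expire(o["LastModified"], now, max_age)]

def sweep(bucket, endpoint_url=None):
    """Walk the bucket page by page and delete stale objects."""
    import boto3
    s3 = boto3.client("s3", endpoint_url=endpoint_url)
    now = datetime.datetime.now(datetime.timezone.utc)
    for page in s3.get_paginator("list_objects_v2").paginate(Bucket=bucket):
        stale = expired_keys(page.get("Contents", []), now)
        for i in range(0, len(stale), 1000):  # DeleteObjects caps at 1000 keys
            s3.delete_objects(
                Bucket=bucket,
                Delete={"Objects": [{"Key": k} for k in stale[i:i + 1000]]},
            )

if __name__ == "__main__":
    # Hypothetical cron entry point; endpoint_url would point at radosgw.
    sweep("mozilla-releng-ceph-cache-scl3-try")
```

This is the shape of the eventual s3expire tool (comment 8), though that one was written in Node rather than Python.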
Reporter
Comment 5•10 years ago
What taras says. And while it would be theoretically better to use access time, s3 does it with creation time, which ceph keeps track of. Might as well do it like s3.
Comment 6•10 years ago
Selena, Lars, Gozer, and I are talking about this now, since Socorro needs object expiration as well. We found some good solutions to expiry for Socorro, but they're purpose-specific and not particularly generalizable.

I haven't seen an answer to my question in comment 1. If Ceph isn't sufficiently performant, or if memcached turns out to be a better solution, then this is not a good use of time. So maybe this is a good time to reiterate: this particular use case is a *far* better fit for memcached than for ceph.

Aside from that, it sounds like someone needs to spend a few hours with Python and then jam the crontask onto some host somewhere, since this is all just a POC. Not clear who that is.
Comment 7•10 years ago
Note, ceph is sufficiently performant in the POC. We got a 10% build speedup on Windows even with traffic going between datacenters.
Comment 8•10 years ago
Wrote an expiry tool: https://github.com/tarasglek/s3expire

Ceph deletions are a bit problematic: bulk deletes weren't implemented correctly. Instead of being faster, they are slower than doing the equivalent deletes in parallel.

Ceph perf: http://vps.glek.net/ceph.txt
AWS perf: http://vps.glek.net/aws.txt

Note I'm accessing AWS via the Mozilla VPN, to give Ceph every chance I can :)

Run `nodejs s3expire.js config.json bench` to reproduce.
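The parallel-singles fallback taras benchmarks against bulk delete could look roughly like this; a sketch only, the client construction and worker count are assumptions and not the benchmark code from s3expire.

```python
# Sketch: when radosgw's multi-object delete is slower than singles,
# fan out one DeleteObject call per key over a thread pool instead.
from concurrent.futures import ThreadPoolExecutor

def chunked(seq, size):
    """Split a list of keys into fixed-size batches."""
    return [seq[i:i + size] for i in range(0, len(seq), size)]

def parallel_delete(s3, bucket, keys, workers=16):
    """Issue one DeleteObject request per key, workers at a time."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(s3.delete_object, Bucket=bucket, Key=k)
                   for k in keys]
        for f in futures:
            f.result()  # re-raise any per-key error
```

`chunked` is only needed if you still want to group keys (e.g. to interleave listing and deleting); the thread pool itself bounds concurrency.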
Comment 9•10 years ago
Also, please deploy this expiry tool :)
Comment 10•10 years ago
(In reply to Taras Glek (:taras) from comment #9)
> Also, please deploy this expiry tool :)

Actually, on second thought, don't deploy it. I think there are enough issues here to warrant looking at alternatives.

Gozer, please tell us how much space is taken up by what is currently stored in s3. I don't have permission to list mozilla-releng-ceph-cache-scl3-try.
Comment 11•10 years ago
(In reply to Taras Glek (:taras) from comment #10)
> (In reply to Taras Glek (:taras) from comment #9)
> > Also, please deploy this expiry tool :)
>
> Actually, on second thought, don't deploy it. I think there are enough
> issues here, to warrant looking at alternatives.
>
> Gozer, please tell us how much space is taken up what is currently stored in
> s3. I don't have permissions to list mozilla-releng-ceph-cache-scl3-try
Flags: needinfo?(gozer)
Comment 12•10 years ago
Ian filed a Ceph bug for bulk deletions: http://tracker.ceph.com/issues/8210
Comment 13•10 years ago
(In reply to Taras Glek (:taras) from comment #10)
> Gozer, please tell us how much space is taken up what is currently stored in
> s3. I don't have permissions to list mozilla-releng-ceph-cache-scl3-try

{
  "bucket": "mozilla-releng-ceph-cache-scl3-try",
  "pool": ".rgw.buckets",
  "index_pool": ".rgw.buckets.index",
  "id": "default.4343.5",
  "marker": "default.4343.5",
  "owner": "glandium",
  "ver": 1363489,
  "master_ver": 0,
  "mtime": 1396943600,
  "max_marker": "",
  "usage": {
    "rgw.none": {
      "size_kb": 0,
      "size_kb_actual": 0,
      "num_objects": 1
    },
    "rgw.main": {
      "size_kb": 274942778,
      "size_kb_actual": 276300636,
      "num_objects": 674548
    }
  },
  "bucket_quota": {
    "enabled": false,
    "max_size_kb": -1,
    "max_objects": -1
  }
}
Flags: needinfo?(gozer)
Comment 14•10 years ago
Since we aren't going to use ceph, is this r/invalid?
Comment 15•10 years ago
To the best of my knowledge, Socorro is moving forward with Ceph as an HBase replacement. Since object expiry is important to us, this bug is not r/invalid.
Comment 16•10 years ago
Okay, so we should move this bug to the correct product, then, since ceph is not being used by releng (which is what this bug was for).
Updated•10 years ago
Assignee: relops → nobody
Component: RelOps → Backend
Product: Infrastructure & Operations → Socorro
QA Contact: arich
Comment 17•8 years ago
Socorro is no longer planning to use ceph, so I'm closing this out.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → INVALID