Closed Bug 1161618 Opened 10 years ago Closed 9 years ago

Defragment job_artifact & performance_artifact tables once the non-gzipped records expire

Categories

(Tree Management :: Treeherder: Infrastructure, defect, P3)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: emorley, Assigned: emorley)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

Bug 1142648 made us gzip new json blobs in the job_artifact & performance_artifact tables. However, old records were not compressed, so as they are purged (after 4 months on prod and 45 days on stage) the actual size of the tables will be significantly smaller, but we need to defragment to see the benefit on disk. (Doing so on only a handful of stage tables in bug 1142648 comment 5, after 15 days of gzipping, saved 44GB - so the wins will be large on prod after 4 months.) Bug 1142648 was deployed to stage on ~20th March and to prod on ~27th March, so the old records will have expired on stage as of ~today, and on prod by ~27th July.
I've run this on stage and the *_artifact table usage has dropped from ~115GB to ~29GB (and that's on top of the 44GB saving mentioned in comment 0). As a result, stage disk usage is now 113GB (of which 58GB is binlog). We won't be able to get away with running this online on prod - we'll need to fail over (sometime after ~27th July).
Depends on: 1170613
Depends on: 1169264
Depends on: 1176130
Blocks: 1176486
I did stage a while ago, but prod still needs doing (and sufficient time has now passed for the old rows to expire). I'll file a bug with Database Operations for them to run an offline defrag and then fail over.
Summary: Defragment job_artifact & performance_artifact tables once the non-gzipped records expire → Defragment job_artifact & performance_artifact tables on stage/prod once the non-gzipped records expire
Depends on: 1190361
Assignee: nobody → emorley
Status: NEW → ASSIGNED
Update: We're waiting on Pythian to get access to the load balancer in bug 1190361, so they can perform the failover.
(Making this about Heroku too)
Summary: Defragment job_artifact & performance_artifact tables on stage/prod once the non-gzipped records expire → Defragment job_artifact & performance_artifact tables once the non-gzipped records expire
This is done now, isn't it?
Flags: needinfo?(emorley)
No; we're still waiting on the failover for prod (see the dependent bug), and the same needs doing on Heroku (which I'll do at some point - it's less of a rush, and I need to figure out the best way to script it).
Flags: needinfo?(emorley)
Generated using: SELECT CONCAT('OPTIMIZE TABLE ',table_schema,'.',table_name,';') OptimizeTableSQL FROM information_schema.tables WHERE table_schema = 'treeherder' OR table_schema LIKE '%_jobs%';
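The query above emits one "OPTIMIZE TABLE schema.table;" statement per table in the treeherder schema or any per-repository *_jobs_* schema. As a rough illustration only (not from the original bug; the schema and table names below are made up), the same filtering and statement-building can be sketched in Python:

```python
def optimize_statements(tables):
    """Build OPTIMIZE TABLE statements, mirroring the intent of the
    SELECT CONCAT(...) query: keep tables whose schema is 'treeherder'
    or whose schema name contains '_jobs' (the SQL uses LIKE '%_jobs%')."""
    return [
        f"OPTIMIZE TABLE {schema}.{table};"
        for schema, table in tables
        if schema == "treeherder" or "_jobs" in schema
    ]

# Hypothetical (schema, table) rows, as information_schema.tables might list them:
rows = [
    ("treeherder", "job_artifact"),
    ("b2g_inbound_jobs_1", "performance_artifact"),
    ("unrelated_schema", "other_table"),
]
for stmt in optimize_statements(rows):
    print(stmt)
```

Note that in SQL, LIKE '%_jobs%' treats `_` as a single-character wildcard, so it matches slightly more broadly than a literal "_jobs" substring check; for this schema naming scheme the practical result is the same.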
I started running this against Heroku's RDS instance at 15:55 UTC+1; unsurprisingly it's taking a while (b2g-inbound.job_artifact alone took 20 minutes).
The Heroku RDS defrag finished sometime in the night; DB size dropped from 48GB to 40GB.
Bug 1190361 has completed the defrag on stage/prod, so we're all done here :-)
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED