Open
Bug 1078392
Opened 10 years ago
Updated 2 years ago
[Meta] Ideas to reduce Treeherder data usage
Categories
(Tree Management :: Treeherder: Infrastructure, defect, P3)
Tree Management
Treeherder: Infrastructure
Tracking
(Not tracked)
NEW
People
(Reporter: emorley, Unassigned)
References
Details
(Keywords: meta)
In 3.5 months, treeherder's DB has grown to ~250-300GB in size.
In addition, several changes that increased usage only landed towards the end of that 3.5 months (e.g. bug 1060339), so our forward-looking monthly usage is actually going to be higher than this.
On top of that, non-buildbot jobs are going to soon be submitting to treeherder, increasing growth rates further.
This meta bug is to track ways to reduce our data consumption.
Reporter
Updated•10 years ago
Priority: -- → P3
Reporter
Updated•10 years ago
No longer blocks: 1080757
Component: Treeherder → Treeherder: Data Ingestion
Reporter
Updated•10 years ago
Priority: P3 → P4
Reporter
Updated•7 years ago
Component: Treeherder: Data Ingestion → Treeherder: Infrastructure
Reporter
Updated•7 years ago
Priority: P4 → P3
Reporter
Comment 1•7 years ago
RDS is reporting prod disk usage as 776 GB out of the 1000 GB allocated.
Table size breakdown of prod (after an ANALYZE TABLE of all tables to refresh the stats):
SELECT table_name,
ROUND((data_length+index_length)/power(1024,3), 1) size_gb
FROM information_schema.tables
WHERE table_schema = 'treeherder'
ORDER BY size_gb DESC LIMIT 15;
+---------------------------+---------+
| table_name                | size_gb |
+---------------------------+---------+
| failure_line              |   318.6 |
| job_detail                |   233.1 |
| performance_datum         |    61.4 |
| job                       |    43.2 |
| job_log                   |    35.0 |
| text_log_step             |    13.1 |
| text_log_error            |     9.4 |
| taskcluster_metadata      |     2.3 |
| machine                   |     1.0 |
| text_log_error_metadata   |     1.0 |
| commit                    |     0.7 |
| reference_data_signatures |     0.4 |
| group_failure_lines       |     0.3 |
| push                      |     0.1 |
| performance_signature     |     0.1 |
+---------------------------+---------+
Note these are the rough calculated sizes of data+indexes, not the size on disk (which will be larger due to fragmentation).
The five most fragmented tables (presuming the stats are accurate, which they may not be):
SELECT table_name,
ROUND((data_length+index_length)/power(1024,3), 1) size_gb,
ROUND(data_free/power(1024,3), 1) data_free_gb
FROM information_schema.tables
WHERE table_schema = 'treeherder'
ORDER BY data_free DESC LIMIT 5;
+----------------+---------+--------------+
| table_name     | size_gb | data_free_gb |
+----------------+---------+--------------+
| text_log_step  |    13.1 |         15.6 |
| text_log_error |     9.4 |          9.1 |
| failure_match  |     0.0 |          0.0 |
| bugscache      |     0.0 |          0.0 |
| job            |    43.2 |          0.0 |
+----------------+---------+--------------+
Reporter
Comment 2•7 years ago
(In reply to Ed Morley [:emorley] from comment #1)
> RDS is reporting prod disk usage as 776 GB out of the 1000 GB allocated.
7.5 days later it's 809 GB used, or ~30GB/week - giving us at most ~4 weeks before we have to bump RDS quota or fix cycle_data and/or expiry of failure_line. (We should perhaps still wait until after the Firefox 57 release to be safe, given some of the changes will need manual DB operations).
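As a rough sanity check on that estimate, a sketch of the arithmetic (figures are the ones reported in this bug; the 8% free-storage alert threshold is the one RDS reports in a later comment, and `weeks_until_alert` is an illustrative helper, not Treeherder code):

```python
# Rough runway estimate: weeks until the RDS low-storage alert (8% free) fires,
# using the figures reported in this bug: 1000 GB provisioned, 809 GB used,
# ~30 GB/week growth. Illustrative helper only.
def weeks_until_alert(provisioned_gb, used_gb, growth_gb_per_week, alert_free_frac=0.08):
    # Usable headroom before the alert threshold, divided by weekly growth.
    headroom_gb = provisioned_gb * (1 - alert_free_frac) - used_gb
    return headroom_gb / growth_gb_per_week

print(round(weeks_until_alert(1000, 809, 30), 1))  # ~3.7, i.e. "at most ~4 weeks"
```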
Reporter
Comment 3•7 years ago
I'm now getting alerts for treeherder-{stage,dev} (which didn't have the recent manual stop-gap cleanup):
Event Source : db-instance
Identifier Link: https://console.aws.amazon.com/rds/home?region=us-east-1#dbinstance:id=treeherder-stage
SourceId: treeherder-stage
Notification time : 2017-12-06 03:45:11.349
Message : The free storage capacity for DB Instance: treeherder-stage is low at 8% of the provisioned storage [Provisioned Storage: 984.02 GB, Free Storage: 82.08 GB]. You may want to increase the provisioned storage to address this issue.
Prod is at 848GB (152GB free).
Reporter
Comment 4•7 years ago
I've performed an OPTIMIZE TABLE on prod, for all but the 5 largest tables [1] (which are too large to do midweek), which has increased free space from 152GB to 191GB. This should give us a bit more breathing room until bug 1346567 is resolved.
[1] failure_line, job_detail, performance_datum, job, job_log
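A minimal sketch of how such a selective pass could be scripted (the skip list is the five tables named above; `optimize_statements` is a hypothetical helper for illustration — in practice the statements would be run against MySQL directly):

```python
# Build OPTIMIZE TABLE statements for every table except the five largest,
# which are too large to rebuild midweek. Hypothetical helper, not Treeherder code.
SKIP = {"failure_line", "job_detail", "performance_datum", "job", "job_log"}

def optimize_statements(tables):
    """Return OPTIMIZE TABLE statements for all tables not in the skip list."""
    return ["OPTIMIZE TABLE %s;" % t for t in tables if t not in SKIP]

print(optimize_statements(["job", "machine", "text_log_step"]))
# ['OPTIMIZE TABLE machine;', 'OPTIMIZE TABLE text_log_step;']
```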