Analyse current DB size & growth rates

RESOLVED FIXED

Status

P2
normal
RESOLVED FIXED
4 years ago
4 years ago

People

(Reporter: emorley, Assigned: emorley)

Tracking

(Blocks: 1 bug)

Details

(Whiteboard: [data:consultative])

Attachments

(3 attachments)

In order to find ways to help with bug 1078392, it would be great to have DB usage broken down by:
* repo
* table
* for job artefacts, by artefact type

Once we have the first round of stats, we may need to drill deeper (eg log parser artefact for successful vs not successful jobs).
We'll likely also want to just look at the last 2 weeks worth of data to avoid the skew from past lower consumption - and also likely again after bug 1078415 lands, to see if that helped.
Summary: Analyse current DB usage → Analyse current DB storage usage
Summary: Analyse current DB storage usage → Analyse current DB size & growth rates
Blocks: 1080757
Priority: -- → P3
Priority: P3 → P2
Assignee: nobody → mdoglio
Created attachment 8525300 [details]
prod_db_size_20141119.csv

The attached csv file contains a list of all the databases on production with their respective size in GB.
New relic says that the disk on treeherder1.db.scl3.mozilla.com will be full in 16 days and and the one on treeherder2.db.scl3.mozilla.com will be full in 13 days. Raising the priority here to P1
Priority: P2 → P1
Whoops.  Can we prune a month of old data as an emergency measure to give us more space to fix this correctly?
Summing the table sizes in the attached CSV gives a total of 330GB - bug 1076740 supposedly gave us 700GB to play with. Why the substantial difference?
a) New Relic incorrectly reporting free space?
b) The 700GB is also used for something else? (stage?)
c) The increase to 700GB didn't stick?
d) The values in the CSV are incorrect (eg don't including indexes)?

That aside, perhaps we could dial down the 6 month expiry to something shorter for now in case the 16 days figure is true?

Some of the oldest data has less use at the moment, since (a) the trees weren't being sheriffed then, so things like the failure classifications are missing, (b) job name regexes were wrong etc. Also, by the time we have dashboards written to use Treeherder as a datasource (eg OrangeFactor v2), we'll have expired these older jobs even if we just left the timeframe at the current 6 months.

Bug 1078392 comment 0 implies as of today, we have ~5 months of data in treeherder - is this correct?

Reducing the 6 months to say 4, should give us some breathing room for now - plus also test our data expiration code.

We should also remove repos we've marked as "onhold" at some point, however looking at the CSV, the top 20 tables account for 97% of the data usage, and none of them are repos we've hidden/deprecated.
At the work week, Fubar showed us a new relic view which claimed we were using 400GB+ of db space.  Maybe he can give us current figures?
Flags: needinfo?(klibby)
Created attachment 8525452 [details]
prod_try_tables_size_20141119.csv

Size of each table in try_jobs_1
Created attachment 8525458 [details]
job_artifact_type_size_20141119.csv

This list tells us what's the impact of each artifact type on the size of ~25000 rows. Query used: 

select name, sum(char_length(`blob`)) /1024 / 1024 "size in MB" from job_artifact where id > <place a job_artifact id here> group by name;
:edmorley this is the query I used to obtain the total size of the databases
SELECT
  table_schema "Data Base Name",
  SUM( data_length + index_length ) / 1024 / 1024 / 1024 "Data Base Size in GB"
FROM
  information_schema.TABLES
GROUP BY
  table_schema ;
:jgriffin if cutting out a month of data is acceptable, that's the easiest and quickest solution. Otherwise we can ask to the DBAs if we can move try_jobs_1 to a separate VM, from the application point of view that's totally doable.
I think stripping the oldest month of data is very acceptable, for all the reasons that Ed stated in comment #5.  Let's hold off on moving try_jobs_1 to a separate VM until we've done some more work to analyze and reduce our DB growth rate.
I would suggest involving the DBAs if you have any questions at all about how much space is actually used, and by what. e.g. mysql binary logs are 1.1gb a pop, and there are 20+. also, your try table sizes above show low-60gb, but on disk I see:

69G     try_jobs_1
9.3G    try_objectstore_1


Current disk usage:
treeherder1.db.scl3# df -h /data
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb1       690G  400G  255G  62% /data
Flags: needinfo?(klibby)
:jgriffin we need to improve the data cycling routine before we run it to delete one month of data. I'll work on the required changes in bug 1102228
Depends on: 1102228
Depends on: 1102257
I'm running the improved data cycling routine on production. I'll start with a 150 days cycle.
No longer blocks: 1080757
Component: Treeherder → Treeherder: Data Ingestion
(In reply to Mauro Doglio [:mdoglio] from comment #14)
> I'm running the improved data cycling routine on production. I'll start with
> a 150 days cycle.

Looking at dev/stage/prod, it seems that stage/dev are still set to 6 months (rather than 5 of dev) - is this going to cause a problem, or were the time until disk full predictions completely off after all?
Flags: needinfo?(mdoglio)
I haven't set the new data cycling policy in the setting file yet.
Flags: needinfo?(mdoglio)
I think this is no longer a P1, now that we've reduced the retained data to 4 months.,
Priority: P1 → P2
Blocks: 1120019
No longer blocks: 1120019
Blocks: 1120019
Depends on: 1125569
Sheeri, does this look like it will be at least roughly accurate, in terms of calculating table sizes and row counts? (Or is an ANALYZE required to even get the row counts in the correct ballpark?)

SELECT table_schema AS db, table_name, ROUND(SUM((data_length+index_length)/power(1024,2)),1) size_mb, table_rows
FROM information_schema.tables
GROUP BY table_schema, table_name
ORDER BY size_mb DESC;
Flags: needinfo?(scabral)
Flags: needinfo?(scabral)
Whiteboard: [data:consultative]
Initial thoughts (see below for data):

* There are some strange discrepancies between the figures on treeherder1 and treeherder2 - different data_free amounts would make sense, since the fragmentation may be different on each host - however I thought the rest should be roughly the same?
* We can save ~80GB just by running OPTIMIZE on the job_artifact table of inbound/try/fx-team/central/b2g-inbound.
** That said, it's strange that job_artifact has such a high proportion of free space compared to other tables, given the data expiration should affect them all equally. I can only think this is because in the past we've had much larger artifacts than we have at the moment? (eg the thousand line Talos error summaries; bug 1078409).
* performance_artifact accounts for 40% of total usage, even though it is only populated on trees that run Talos, and aiui hasn't even been ingested for the full 4 months of our data lifecycle yet.
** There must be some wins in terms of optimising what we store.
** Reducing lifecycle for just the perf artifacts table will get us a big win without loosing history for everything else.
* We could prune objectstores sooner than 4 months (currently 25GB, not a huge disk win, but would help with objectstore perf of bug 1125410).
* It looks like job_eta isn't being expired properly (it still has size on now-inactive repos where all other tables have been purged), though across all databases it is only 2.5GB anyway.
* We should delete databases for repositories now marked as 'onhold' (bug 1035294), though surprisingly it looks like it won't win back much disk space, since an OPTIMIZE must have been run for them post data expiration (or else the data_free field is inaccurate).
* We should spot-check the oldest rows in several tables to make sure we don't have any orphaned rows left due to issues with expiration.

...can anyone see anything else?

I'm happy to drive this bug for the moment - and will file bugs for the above :-)

===

Largest 20 databases:

> SELECT ROUND(SUM((data_length+index_length)/power(1024,3)),1) size_gb, table_schema AS db
FROM information_schema.tables GROUP BY table_schema
ORDER BY size_gb DESC LIMIT 20

treeherder1:

+ ------------ + ------- +
| size_gb      | db      |
+ ------------ + ------- +
| 136.4        | mozilla_inbound_jobs_1 |
| 61.3         | try_jobs_1 |
| 38.3         | fx_team_jobs_1 |
| 23.0         | mozilla_central_jobs_1 |
| 18.5         | b2g_inbound_jobs_1 |
| 7.8          | mozilla_inbound_objectstore_1 |
| 7.5          | try_objectstore_1 |
| 7.4          | mozilla_beta_jobs_1 |
| 6.8          | mozilla_aurora_jobs_1 |
| 3.6          | mozilla_esr31_jobs_1 |
| 3.3          | gaia_try_jobs_1 |
| 3.2          | cedar_jobs_1 |
| 2.9          | fx_team_objectstore_1 |
| 2.8          | maple_jobs_1 |
| 2.5          | ash_jobs_1 |
| 2.0          | b2g_inbound_objectstore_1 |
| 2.0          | gum_jobs_1 |
| 1.6          | mozilla_central_objectstore_1 |
| 1.3          | mozilla_b2g34_v2_1_jobs_1 |
| 1.3          | mozilla_release_jobs_1 |
+ ------------ + ------- +

treeherder2:

+ ------------ + ------- +
| size_gb      | db      |
+ ------------ + ------- +
| 166.4        | mozilla_inbound_jobs_1 |
| 63.0         | fx_team_jobs_1 |
| 61.3         | try_jobs_1 |
| 33.4         | mozilla_central_jobs_1 |
| 28.7         | b2g_inbound_jobs_1 |
| 13.0         | mozilla_aurora_jobs_1 |
| 7.8          | mozilla_inbound_objectstore_1 |
| 7.5          | try_objectstore_1 |
| 7.4          | mozilla_beta_jobs_1 |
| 3.6          | mozilla_esr31_jobs_1 |
| 3.5          | gaia_try_jobs_1 |
| 3.2          | cedar_jobs_1 |
| 2.9          | fx_team_objectstore_1 |
| 2.8          | maple_jobs_1 |
| 2.5          | ash_jobs_1 |
| 2.0          | b2g_inbound_objectstore_1 |
| 2.0          | gum_jobs_1 |
| 1.6          | mozilla_central_objectstore_1 |
| 1.3          | mozilla_b2g34_v2_1_jobs_1 |
| 1.3          | mozilla_release_jobs_1 |
+ ------------ + ------- +


Top table sizes, summed across all databases:

> SELECT ROUND(SUM((data_length+index_length)/power(1024,3)),1) size_gb, table_name
FROM information_schema.tables GROUP BY table_name
ORDER BY size_gb DESC LIMIT 10;

treeherder1:

+ ------------ + --------------- +
| size_gb      | table_name      |
+ ------------ + --------------- +
| 239.2        | job_artifact    |
| 58.0         | performance_artifact |
| 25.5         | objectstore     |
| 10.2         | job             |
| 5.3          | performance_series |
| 2.9          | job_log_url     |
| 2.3          | job_eta         |
| 0.6          | revision        |
| 0.3          | series_signature |
| 0.1          | revision_map    |
+ ------------ + --------------- +

treeherder2:

+ ------------ + --------------- +
| size_gb      | table_name      |
+ ------------ + --------------- +
| 198.0        | job_artifact    |
| 181.7        | performance_artifact |
| 25.4         | objectstore     |
| 9.9          | job             |
| 5.3          | performance_series |
| 2.8          | job_log_url     |
| 2.5          | job_eta         |
| 0.4          | revision        |
| 0.2          | series_signature |
| 0.1          | revision_map    |
+ ------------ + --------------- +


Top 10 candidates for OPTIMIZE:

> SELECT ROUND(data_free/power(1024,3),1) data_free_gb,
ROUND((data_length+index_length)/power(1024,3),1) size_gb, table_schema, table_name
FROM information_schema.tables
ORDER BY data_free DESC LIMIT 10

treeherder1:

+ ----------------- + ------------ + ----------------- + --------------- +
| data_free_gb      | size_gb      | table_schema      | table_name      |
+ ----------------- + ------------ + ----------------- + --------------- +
| 53.4              | 106.8        | mozilla_inbound_jobs_1 | job_artifact    |
| 22.3              | 54.8         | try_jobs_1        | job_artifact    |
| 6.2               | 29.0         | fx_team_jobs_1    | job_artifact    |
| 2.5               | 13.4         | mozilla_central_jobs_1 | job_artifact    |
| 2.3               | 13.9         | b2g_inbound_jobs_1 | job_artifact    |
| 0.6               | 2.2          | mozilla_inbound_jobs_1 | performance_series |
| 0.4               | 2.5          | cedar_jobs_1      | job_artifact    |
| 0.4               | 1.8          | ash_jobs_1        | job_artifact    |
| 0.4               | 0.4          | elm_jobs_1        | job_artifact    |
| 0.4               | 1.6          | maple_jobs_1      | job_artifact    |
+ ----------------- + ------------ + ----------------- + --------------- +

treeherder2:

+ ----------------- + ------------ + ----------------- + --------------- +
| data_free_gb      | size_gb      | table_schema      | table_name      |
+ ----------------- + ------------ + ----------------- + --------------- +
| 52.9              | 76.9         | mozilla_inbound_jobs_1 | job_artifact    |
| 21.6              | 42.6         | try_jobs_1        | job_artifact    |
| 6.2               | 29.1         | fx_team_jobs_1    | job_artifact    |
| 2.5               | 13.3         | mozilla_central_jobs_1 | job_artifact    |
| 2.2               | 13.9         | b2g_inbound_jobs_1 | job_artifact    |
| 0.4               | 2.4          | cedar_jobs_1      | job_artifact    |
| 0.4               | 1.8          | ash_jobs_1        | job_artifact    |
| 0.4               | 1.6          | maple_jobs_1      | job_artifact    |
| 0.3               | 0.4          | jamun_jobs_1      | job_artifact    |
| 0.3               | 2.1          | mozilla_inbound_jobs_1 | performance_series |
+ ----------------- + ------------ + ----------------- + --------------- +

I've also exported the more detailed |GROUP BY table_schema, table_name| breakdown to Google docs (public link):
https://docs.google.com/spreadsheets/d/1gOIjXkGgp4oIxulTXVoinkE8khI5s_7h1QybK6uhEvQ/edit?usp=sharing
Assignee: mdoglio → emorley
Status: NEW → ASSIGNED
(In reply to Ed Morley [:edmorley] from comment #20)
> * There are some strange discrepancies between the figures on treeherder1
> and treeherder2 - different data_free amounts would make sense, since the
> fragmentation may be different on each host - however I thought the rest
> should be roughly the same?

Sheeri, do you have any ideas about this?
Flags: needinfo?(scabral)
No longer blocks: 1120019
Ed, 

data_free is not the amount of fragmentation in a table.

I defragmented tables on 1/19, as per https://bugzilla.mozilla.org/show_bug.cgi?id=1120019#c10

Here's what changed:
Specifically:
fx_team_jobs1.job_artifact shrank from 46G to 30G
b2g_inbound_jobs_1.job_artifact shrank from 22G to 14G
mozilla_central_jobs_1.job_artifact shrank from 22G to 14G
mozilla_inbound_objectstore_1.objectstore shrank from 11G to 7.6G

These tables also shrank, but were smaller tables and could be run in real-time:
cedar_jobs_1.job_artifact
mozilla_central_objectstore_1.objectstore
fx_team_jobs_1.job
try_jobs_1.job_log_url
mozilla_inbound_jobs_1.job_log_url
mozilla_aurora_objectstore_1.objectstore
b2g_inbound_objectstore_1.objectstore
fx_team_objectstore_1.objectstore
fx_team_jobs_1.performance_series
ash_jobs_1.job_artifact
b2g_inbound_jobs_1.job
fx_team_jobs_1.job
gaia_try_jobs_1.job_artifact
jamun_jobs_1.job_artifact
maple_jobs_1.job_artifact
mozilla_aurora_jobs_1.job_artifact
mozilla_beta_jobs_1.job_artifact
mozilla_inbound_jobs_1.job
mozilla_inbound_jobs_1.performance_series
try_jobs_1.job
Flags: needinfo?(scabral)
InnoDB uses approximate statistics, so even if you use ANALYZE you won't get exact numbers. To see about fragmentation, add up data_length and index_length and compare it to the size on disk, again keeping in mind that InnoDB stats are approximate, so give it a 10% margin of error.
(In reply to Sheeri Cabral [:sheeri] from comment #23)
> InnoDB uses approximate statistics, so even if you use ANALYZE you won't get
> exact numbers. To see about fragmentation, add up data_length and
> index_length and compare it to the size on disk, again keeping in mind that
> InnoDB stats are approximate, so give it a 10% margin of error.

Meant to say thank you for this and comment 22 :-)
Breakdown of stage's mozilla-inbound job_artifact table, by artefact type:

> SELECT name as `Artefact name`, count(*) as `Count`, ROUND(SUM(LENGTH(`blob`))/power(1024,2),1) AS `Size in MB` FROM mozilla_inbound_jobs_1.job_artifact WHERE id > 9700000 GROUP BY name ORDER BY `Size in MB` DESC

+ ------------------ + ---------- + --------------- +
| Artefact name      | Count      | Size in MB      |
+ ------------------ + ---------- + --------------- +
| Structured Log     | 15007      | 75.1            |
| buildapi_complete  | 12965      | 14.6            |
| Job Info           | 14832      | 10.3            |
| Bug suggestions    | 15008      | 8.5             |
| buildapi_running   | 13033      | 4.7             |
| buildapi           | 24730      | 2.4             |
| buildapi_pending   | 15735      | 1.8             |
+ ------------------ + ---------- + --------------- +
7 rows

...strange how many more buildapi artefacts there are compared to anything else? I get that not all jobs will have {Structured Log, buildapi_complete, Bug suggestions} since they may have been coalesced, but they should still have a comparable number of {Job Info, buildapi_pending} to buildapi, right? (buildapi being where we store the buildername).
(In reply to Ed Morley [:edmorley] from comment #25)
> Breakdown of stage's mozilla-inbound job_artifact table, by artefact type:

For the last ~100,000 rows in the jobs_artifact table that is, the full query would have taken too long.
Status: ASSIGNED → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.