Closed Bug 1682580 Opened 6 months ago Closed 5 months ago

Reclaim unused disk for operating system

Categories

(Tree Management :: Perfherder, task, P2)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: igoldan, Assigned: sclements)

References

Details

Thanks to new data expiring strategies like bug 1667051, we're able to delete more than 100 million (no longer needed) rows from the database.

However, the database doesn't seem to give that freed up space back to the operating system.

Other specialists complain about this on SO as well & suggest various techniques & utilities such as the OPTIMIZE statement.

We need to ask our DBAs to take care of this aspect.
Jira link: https://jira.mozilla.com/browse/FXP-1390

Sarah, who would be the best person to ask for this?

Flags: needinfo?(sclements)
Priority: -- → P2
Severity: -- → S3

https://aws.amazon.com/premiumsupport/knowledge-center/rds-mysql-storage-optimization/

The innodb_file_per_table is enabled so you should be able to execute OPTIMIZE TABLE treeherder; to clear up the space on disk.
I would do this on the dev and stage before running it on prod.

Ionut, to clarify is it just the performance_datum table you're seeing the issue or also performance_signature & performance_alert tables?

Flags: needinfo?(sclements)

(In reply to Sarah Clements [:sclements] from comment #3)

Ionut, to clarify is it just the performance_datum table you're seeing the issue or also performance_signature & performance_alert tables?

I noticed this for the performance_datum table, as that's the biggest one & it's more obvious.
But I believe performance_signature has this issue as well. This table almost hit 1 million rows at some point.

WRT performance_alert table: I think this capped at 100,000 rows at some point. If not that impactful; if we can reclaim space from it as well, good; if not, we should still be ok.

I tried optimizing the performance_datum table on prototype and it worked, but it took almost 2 hours to complete during which time I paused log parsing to prevent any inserts to that table. I'm running it on stage now but for production I'll need to schedule this for a weekend where down time will be acceptable to the sheriffs. I've also discovered the need for a few other large tables, like Job and FailureLine, that need this applied to them also so I'll try to do it all in one go.

Assignee: nobody → sclements
Status: NEW → ASSIGNED

This is really great news! Surely we'll get query time improvements across the whole app.

I optimized several tables on stage and production this past weekend. I also added a note on my calendar for 6 months from now to check if it needs to be done again.

Status: ASSIGNED → RESOLVED
Closed: 5 months ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.