Not cleaning up old machine data
Categories
(Tree Management :: Treeherder: Data Ingestion, defect, P3)
Tracking
(Not tracked)
People
(Reporter: aryx, Unassigned)
Details
It seems old data about machines doesn't get expunged.
select count(*)
from machine
returns 8905675.
select count(*)
from machine
left join job
on machine.id = job.machine_id
where job.id is NULL
returns 2112934
The documentation says cycle_data runs daily. Its filter for what to delete looks sane on first inspection: https://github.com/mozilla/treeherder/blob/a5df8a966b1202f3f80872a78f6093ea060cdb77/treeherder/model/management/commands/cycle_data.py#L61-L75
Any idea why there are more than 2 million machine rows with no associated jobs?
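For illustration, here is a minimal sketch of how such orphaned rows could be counted and removed in chunks. It is not how cycle_data actually works. It assumes a simplified two-table schema (machine, and job with a machine_id column) in an in-memory SQLite database; the function names, the chunk size, and the demo data are all hypothetical.

```python
import sqlite3


def count_orphaned_machines(conn):
    """Count machine rows that no job references (the anti-join from the report)."""
    row = conn.execute(
        "SELECT COUNT(*) FROM machine "
        "LEFT JOIN job ON machine.id = job.machine_id "
        "WHERE job.id IS NULL"
    ).fetchone()
    return row[0]


def delete_orphaned_machines(conn, chunk_size=500):
    """Delete unreferenced machine rows in chunks and return how many were removed.

    Chunking keeps each transaction small, which matters on a table with
    millions of rows. chunk_size is an arbitrary illustrative value.
    """
    total = 0
    while True:
        ids = [
            r[0]
            for r in conn.execute(
                "SELECT machine.id FROM machine "
                "LEFT JOIN job ON machine.id = job.machine_id "
                "WHERE job.id IS NULL LIMIT ?",
                (chunk_size,),
            )
        ]
        if not ids:
            return total
        placeholders = ",".join("?" * len(ids))
        conn.execute(f"DELETE FROM machine WHERE id IN ({placeholders})", ids)
        conn.commit()
        total += len(ids)


def build_demo_db():
    """Build a tiny in-memory database: 5 machines, jobs only on machines 1 and 3."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE machine (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE job (id INTEGER PRIMARY KEY, machine_id INTEGER);
        """
    )
    conn.executemany(
        "INSERT INTO machine (id, name) VALUES (?, ?)",
        [(i, f"machine-{i}") for i in range(1, 6)],
    )
    conn.executemany(
        "INSERT INTO job (machine_id) VALUES (?)", [(1,), (1,), (3,)]
    )
    conn.commit()
    return conn
```

On the demo data, machines 2, 4 and 5 have no jobs, so the count is 3 before the delete and 0 after it. Only unreferenced rows are removed; machines that still have jobs stay, so queries joining job to machine would keep working for them.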
Updated•5 years ago
Comment 1•5 years ago
This would be good to get to. In fact, I'm not sure it's still valuable to even keep machine data. Aren't they all virtual these days? We should probably just stop saving them. We don't download them with the jobs in the Treeherder view. So unless somebody fetches them (which I doubt), let's kill the table completely.
Reporter
Comment 2•5 years ago
Don't. We have several scripts on sql.telemetry.mozilla.org which aggregate machine health and rely on that.