Closed Bug 1423607 Opened 6 years ago Closed 5 years ago

Decide whether less Perfherder data needs to be retained indefinitely

Categories

(Tree Management :: Perfherder, task, P1)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: emorley, Unassigned)

References

(Blocks 1 open bug)

Details

Split out of bug 1346567 comment 14.

Currently Perfherder data for mozilla-{central,inbound,aurora,beta,release}, fx-team, servo and autoland is kept indefinitely. (Data for other repositories is expired after 4 months.)

To help with bug 1346567 / bug 1078392, it would be good to confirm:
* whether data older than 4 months is ever useful
* whether any repos in that list can be omitted (eg just keep central/beta/release and nothing else)
* whether "indefinitely" could just be changed to "last 12 months"

It's worth noting that for any results older than the standard job data expiry (ie 4 months), the jobs to which the results refer won't exist any more (just the push data). I'm not familiar with whether this impacts any use-cases/workflows and thus whether it reduces the benefit of keeping perfherder data indefinitely.

Joel, do you have any thoughts? :-)
Flags: needinfo?(jmaher)
12 months is enough. We change our infra and tests often enough that going back further isn't useful, but to confirm a trend in a data series we do look back over 12 months more often than we would like to admit.
Flags: needinfo?(jmaher)
And for?

(In reply to Ed Morley [:emorley] from comment #0)
> * whether any repos in that list can be omitted (eg just keep
Flags: needinfo?(jmaher)
for performance data we need repos:
mozilla-inbound
mozilla-central
mozilla-beta
autoland

Possibly this will change as we add more tools and benchmarks to talos/perfherder, but for what we have now, this will do us good!
Flags: needinfo?(jmaher)
Great - thank you :-)
Priority: -- → P1

:emorley is there anything you need here?

Flags: needinfo?(emorley)

Data cycling is currently disabled entirely for Perfherder, since it was causing the overall Treeherder cycle data task to time out. This will need to be resolved at some point to prevent a mad rush once the DB disk usage alert triggers.

See bug 1346567 comment 15 for more info.

After that, it would still be good for someone from the Perfherder team to make a final decision on the questions raised in comment 0.

Flags: needinfo?(emorley)

(In reply to Ed Morley [:emorley] from comment #6)
> After that, it would still be good for someone from the Perfherder team to make a final decision on the questions raised in comment 0.

I'm confirming Joel's opinions from comment 1 & comment 3.

(In reply to Ed Morley [:emorley] from comment #0)
> It's worth noting that for any results older than the standard job data
> expiry (ie 4 months), the jobs to which the results refer won't exist any
> more (just the push data). I'm not familiar with whether this impacts any
> use-cases/workflows and thus whether it reduces the benefit of keeping
> perfherder data indefinitely.

It may affect some edge cases, but none that are serious or unresolvable. I don't remember any situation
in which I looked over old Treeherder data. Push data is what interests me most.

Still, this doesn't much reduce the benefit of keeping 12 months of Perfherder data.
We need Perfherder data that old because there are occasional fixes that take several
months to complete.

Type: defect → task

This bug's purpose was just to decide on which repositories we wanted to keep data & for how long. It didn't imply any implementation aspects.
Conclusion here is to follow Joel's comment 3 when deploying bug 1346567.
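
For reference, the retention rule agreed in this bug could be sketched roughly as below. This is only an illustration, not Treeherder's actual cycle-data code: the names `LONG_RETENTION_REPOS`, `retention_for` and `is_expired` are hypothetical, "12 months" and "4 months" are approximated as 365 and 120 days, and it assumes the 12-month window from comment 1 applies to exactly the repositories listed in comment 3.

```python
from datetime import datetime, timedelta, timezone

# Assumption: repositories from comment 3 keep roughly 12 months of data (comment 1).
LONG_RETENTION_REPOS = frozenset(
    {"mozilla-inbound", "mozilla-central", "mozilla-beta", "autoland"}
)
LONG_RETENTION = timedelta(days=365)     # approximation of "12 months"
DEFAULT_RETENTION = timedelta(days=120)  # approximation of the 4-month expiry


def retention_for(repo: str) -> timedelta:
    """Return how long performance data for `repo` would be kept."""
    if repo in LONG_RETENTION_REPOS:
        return LONG_RETENTION
    return DEFAULT_RETENTION


def is_expired(repo: str, push_time: datetime, now: datetime) -> bool:
    """True if a result pushed at `push_time` is past its retention window."""
    return now - push_time > retention_for(repo)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    for repo in ("autoland", "try"):
        # Check a result that is 200 days old in each repository.
        print(repo, is_expired(repo, now - timedelta(days=200), now))
```

Under these assumptions, a 200-day-old result would be kept for autoland but expired for any repository outside the comment 3 list.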

Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED