Closed
Bug 1124723
Opened 10 years ago
Closed 10 years ago
Add the performance_artifact table to the list of data cycling targets
Categories
(Tree Management :: Treeherder: Data Ingestion, defect, P1)
Tree Management
Treeherder: Data Ingestion
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: mdoglio, Assigned: wlach)
References
(Blocks 1 open bug)
Details
Attachments
(2 files)
At the moment there's no data cycling setup for those.
Updated•10 years ago
|
Priority: -- → P1
Assignee | ||
Updated•10 years ago
|
Assignee: nobody → wlachance
Assignee | ||
Comment 1•10 years ago
|
||
Working on this.
Assignee | ||
Comment 2•10 years ago
|
||
Attachment #8553353 -
Flags: review?(mdoglio)
Reporter | ||
Updated•10 years ago
|
Attachment #8553353 -
Flags: review?(mdoglio) → review+
Reporter | ||
Comment 3•10 years ago
|
||
The patch looks good, it just needs to be rebased on top of master. The github ui says there are some conflicts.
Comment 4•10 years ago
|
||
Commit pushed to master at https://github.com/mozilla/treeherder-service
https://github.com/mozilla/treeherder-service/commit/1ae46a4cbd7336546a68360dd537826534985f14
Bug 1124723 - Add performance_artifact to list of tables to perform expiry on
Assignee | ||
Comment 5•10 years ago
|
||
Rebased and pushed.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Comment 6•10 years ago
|
||
I think this will break as-is - performance_artifact has a foreign key constraint on jobs.id:
https://github.com/mozilla/treeherder-service/blob/c1713cab92e353c12a0288be6b16dc2de141bebe/treeherder/model/sql/template_schema/project_jobs_1.sql.tmpl#L278
But the change here put the jobs.deletes.cycle_performance_artifact after the jobs delete:
https://github.com/mozilla/treeherder-service/blob/93ae4541feb7013bd7181f5a714644a51521e55b/treeherder/model/derived/jobs.py#L162
(See order dependency comment a few lines above that)
Also I don't see any handling for the performance_series and series_signature tables?
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee | ||
Comment 7•10 years ago
|
||
(In reply to Ed Morley [:edmorley] from comment #6)
> I think this will break as-is - performance_artifact has a foreign key
> constraint on jobs.id:
> https://github.com/mozilla/treeherder-service/blob/
> c1713cab92e353c12a0288be6b16dc2de141bebe/treeherder/model/sql/
> template_schema/project_jobs_1.sql.tmpl#L278
>
> But the change here put the jobs.deletes.cycle_performance_artifact after
> the jobs delete:
> https://github.com/mozilla/treeherder-service/blob/
> 93ae4541feb7013bd7181f5a714644a51521e55b/treeherder/model/derived/jobs.
> py#L162
> (See order dependency comment a few lines above that)
Hmm, this worked locally for me? In any case, can we fix this just by shuffling the performance artifact cycling above the jobs delete?
> Also I don't see any handling for the performance_series and
> series_signature tables?
Correct; we should do a followup for those. That said, the performance artifacts are taking up the vast majority of space.
Comment 8•10 years ago
|
||
(In reply to William Lachance (:wlach) from comment #7)
> Hmm, this worked locally for me? In any case, can we fix this just by
> shuffling the performance artifact cycling above the jobs delete?
Indeed :-)
> > Also I don't see any handling for the performance_series and
> > series_signature tables?
>
> Correct; we should do a followup for those. That said, the performance
> artifacts are taking up the vast majority of space.
My concern was more that once we purge performance_artifact we lose easier ways to rescue orphans (eg job_id < N) and have to resort to checking if every signature in the other tables is present in performance_artifact.
Comment 9•10 years ago
|
||
Though tbh we're going to have to mess around already with orphan rows, once the various issues are fixed, so it can probably be fixed in another bug.
Assignee | ||
Comment 10•10 years ago
|
||
(In reply to Ed Morley [:edmorley] from comment #8)
> (In reply to William Lachance (:wlach) from comment #7)
> > Hmm, this worked locally for me? In any case, can we fix this just by
> > shuffling the performance artifact cycling above the jobs delete?
>
> Indeed :-)
Ok, I'll look into this.
> > > Also I don't see any handling for the performance_series and
> > > series_signature tables?
> >
> > Correct; we should do a followup for those. That said, the performance
> > artifacts are taking up the vast majority of space.
>
> My concern was more that once we purge performance_artifact we lose easier
> ways to rescue orphans (eg job_id < N) and have to resort to checking if
> every signature in the other tables is present in performance_artifact.
Hmm, I don't think perfomance artifacts are really all that helpful in determining when series signatures and the series themselves should be expired. I think we're just going to plain have to dig into the contents of these tables to determine whether the data is still valid or not. In the case of the series themselves, we should expire tables which are empty. In the case of series signatures, we should expire entries that no longer have a series associated with them.
Comment 11•10 years ago
|
||
Ok, let's break them out :-)
I've filed bug 1126937 for performance_series and series_signature tables.
Morphing summary to what landed here (I only reopened because the summary and comment 0 implied perfherder tables in general).
Summary: Add the perfherder tables to the list of data cycling targets → Add the performance_artifact table to the list of data cycling targets
Assignee | ||
Comment 12•10 years ago
|
||
Attachment #8556005 -
Flags: review?(emorley)
Comment 13•10 years ago
|
||
Comment on attachment 8556005 [details] [review]
Cycle performance artifacts before jobs
Thanks :-)
(btw I think maybe the reason this didn't break is that InnoDB does cascade deletes by default? I could be wrong, just skimmed some google foo. Either way I guess we want to avoid the cascade deletes for perf reasons)
Attachment #8556005 -
Flags: review?(emorley) → review+
Comment 14•10 years ago
|
||
Commits pushed to master at https://github.com/mozilla/treeherder-service
https://github.com/mozilla/treeherder-service/commit/2627427f83e6651cf459b9748bc41c72b1d02cff
Bug 1124723 - Cycle performance artifacts before jobs
https://github.com/mozilla/treeherder-service/commit/4064b07b70bc286f6912e8ed9b40b6213479d0d3
Merge pull request #357 from wlach/1124723
Bug 1124723 - Cycle performance artifacts before jobs
Assignee | ||
Comment 15•10 years ago
|
||
Gonna resolve this for now, feel free to reopen if there are more issues. :)
Status: REOPENED → RESOLVED
Closed: 10 years ago → 10 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•