Closed
Bug 1181572
Opened 9 years ago
Closed 9 years ago
Stop calculating/storing pending_eta
Categories
(Tree Management :: Treeherder, defect, P3)
Tree Management
Treeherder
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: emorley, Assigned: emorley)
References
(Blocks 1 open bug)
Details
Attachments
(1 file, 1 obsolete file)
Since bug 1096605, the UI no longer uses pending_eta, since it's too hard to predict, and thus unreliable.
As such, we should remove it from the API response & stop calculating/storing it during ingestion.
Assignee | ||
Comment 1•9 years ago
|
||
Attachment #8631031 -
Flags: review?(mdoglio)
Assignee | ||
Comment 2•9 years ago
|
||
Comment on attachment 8631031 [details] [review]
Stop calculating/storing pending_eta
Cancelling in favour of the approach I proposed on a call with Mauro/Cameron on Wednesday (ie: rewriting job etas from scratch with a new table, therefore avoiding having to clean up rows in the old table, plus we can use a table in the main treeherder DB from the start).
Attachment #8631031 -
Flags: review?(mdoglio)
Assignee | ||
Comment 3•9 years ago
|
||
I'm unable to reopen #721, so opening a new PR for it. The conclusion previously was that we didn't want to land that PR since the job_eta bits were going to be rewritten anyway.
However:
1) parts of the PR would still need to land anyway (eg removing it from the API response + dropping the column from the jobs table - since the jobs table parts are separate from the job_eta table parts)
2) whilst we're waiting for the rewrite, it will improve perf, since it halves the number of rows being scanned in the get_last_eta_by_signatures query (and state has an index, unlike signature), plus halves the rate at which new rows are being added to the job_eta table, which buys us more time
3) it reduces the noise when looking at the existing code, to plan the rewrite
I've also split this up to make the changes clearer, and avoided some of the name-tweaking parts of #721 to make this easier to review (seeing as suboptimal function/stored proc names are going to go away as part of the job_eta rewrite anyway).
Attachment #8631031 -
Attachment is obsolete: true
Attachment #8650471 -
Flags: review?(mdoglio)
Comment 4•9 years ago
|
||
Comment on attachment 8650471 [details] [review]
Stop calculating/storing pending_eta
This is fantastic! Those patches where you delete a lot of code make my day.
Attachment #8650471 -
Flags: review?(mdoglio) → review+
Comment 5•9 years ago
|
||
Commits pushed to master at https://github.com/mozilla/treeherder
https://github.com/mozilla/treeherder/commit/82bf0378fb76a86bb57ad0d94d550e01be36f2f9
Bug 1181572 - No longer return pending_eta from the jobs endpoint
The UI does not use pending_eta as of bug 1096605, so there's no need
for it to be returned by the /jobs API endpoint.
https://github.com/mozilla/treeherder/commit/4f5d45532d77a1da518473b789f41bf3ed8cd35e
Bug 1181572 - Stop storing pending_eta in the jobs table
Now that the /jobs API no longer returns pending_eta, the column is
unused, so we can stop populating it on insert and drop it from the
table. The field is DEFAULT NULL, so landing this won't cause breakage
prior to the column being dropped on stage/prod.
https://github.com/mozilla/treeherder/commit/521883a9df59b46c7ba5b5fac009542948a3a637
Bug 1181572 - Stop retrieving rows of type 'pending' from job_eta table
Now that we no longer store pending_eta for each job in the jobs table,
there is no point fetching pending job related rows, when
get_last_eta_by_signatures is used as part of jobs ingestion. There is
much more simplification that could be performed (ie not returning as
nested an object in get_job_eta_times() now that we only return one
state "running") - but job_eta will be rewritten entirely soon, so it's
not worth it.
https://github.com/mozilla/treeherder/commit/29881f85d84dd07dd2b4af2a3cea8e326ad79ee3
Bug 1181572 - Stop inserting rows of type 'pending' in the job_eta table
Now that get_last_eta_by_signatures only fetches rows with state
"running", there's no point storing rows with state "pending". If we're
not storing these rows, then we also don't need to calculate their
contents, so get_eta_groups (which is used to fetch recent eta times,
during the calculate eta task) can also stop returning stats for
"pending" too.
Comment 6•9 years ago
|
||
Commit pushed to master at https://github.com/mozilla/treeherder
https://github.com/mozilla/treeherder/commit/3bd486709101948b0bc8206d20696a6aee958fb3
Revert "Bug 1181572 - Stop retrieving rows of type 'pending' from job_eta table"
Since for whatever reason this increased the time for the query rather
than reducing it, even though the EXPLAIN says half the number of rows
were being scanned with the change. Let's just revert this part and wait
for the job_eta rewrite in bug 1196764 rather than spend any more time
on query optimisation.
This reverts commit 521883a9df59b46c7ba5b5fac009542948a3a637.
Assignee | ||
Comment 7•9 years ago
|
||
The perf regression was:
https://rpm.newrelic.com/accounts/677903/applications/5585473/datastores?tw%5Bend%5D=1440170035&tw%5Bstart%5D=1440162489#/overview/MySQL/drilldown?metric=Datastore%252Fstatement%252FMySQL%252Fjob_eta%252Fselect&value=total_call_time_per_minute
We don't need the part that was reverted in a hurry, so can just wait for the job_eta rewrite in bug 1196764.
Calling this fixed since the other parts were landed.
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•