Closed Bug 1181572 Opened 9 years ago Closed 9 years ago

Stop calculating/storing pending_eta

Categories

(Tree Management :: Treeherder, defect, P3)

defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: emorley, Assigned: emorley)

References

(Blocks 1 open bug)

Details

Attachments

(1 file, 1 obsolete file)

Since bug 1096605, the UI no longer uses pending_eta, since it's too hard to predict, and thus unreliable. As such, we should remove it from the API response & stop calculating/storing it during ingestion.
Attached file Stop calculating/storing pending_eta (obsolete) —
Attachment #8631031 - Flags: review?(mdoglio)
Comment on attachment 8631031 [details] [review] Stop calculating/storing pending_eta Cancelling in favour of the approach I proposed on a call with Mauro/Cameron on Wednesday (ie: rewriting job etas from scratch with a new table, therefore avoiding having to clean up rows in the old table, plus we can use a table in the main treeherder DB from the start).
Attachment #8631031 - Flags: review?(mdoglio)
I'm unable to reopen #721, so opening a new PR for it. The conclusion previously was that we didn't want to land that PR since the job_eta bits were going to be rewritten anyway. However: 1) parts of the PR would still need to land anyway (eg removing it from the API response + dropping the column from the jobs table - since the jobs table parts are separate from the job_eta table parts) 2) whilst we're waiting for the rewrite, it will improve perf, since it halves the number of rows being scanned in the get_last_eta_by_signatures query (and state has an index, unlike signature), plus halves the rate at which new rows are being added to the job_eta table, which buys us more time 3) it reduces the noise when looking at the existing code, to plan the rewrite I've also split this up to make the changes clearer, and avoided some of the name-tweaking parts of #721 to make this easier to review (seeing as suboptimal function/stored proc names are going to go away as part of the job_eta rewrite anyway).
Attachment #8631031 - Attachment is obsolete: true
Attachment #8650471 - Flags: review?(mdoglio)
Blocks: 1196764
Comment on attachment 8650471 [details] [review] Stop calculating/storing pending_eta This is fantastic! Those patches where you delete a lot of code make my day.
Attachment #8650471 - Flags: review?(mdoglio) → review+
Commits pushed to master at https://github.com/mozilla/treeherder https://github.com/mozilla/treeherder/commit/82bf0378fb76a86bb57ad0d94d550e01be36f2f9 Bug 1181572 - No longer return pending_eta from the jobs endpoint The UI does not use pending_eta as of bug 1096605, so there's no need for it to be returned by the /jobs API endpoint. https://github.com/mozilla/treeherder/commit/4f5d45532d77a1da518473b789f41bf3ed8cd35e Bug 1181572 - Stop storing pending_eta in the jobs table Now that the /jobs API no longer returns pending_eta, the column is unused, so we can stop populating it on insert and drop it from the table. The field is DEFAULT NULL, so landing this won't cause breakage prior to the column being dropped on stage/prod. https://github.com/mozilla/treeherder/commit/521883a9df59b46c7ba5b5fac009542948a3a637 Bug 1181572 - Stop retrieving rows of type 'pending' from job_eta table Now that we no longer store pending_eta for each job in the jobs table, there is no point fetching pending job related rows, when get_last_eta_by_signatures is used as part of jobs ingestion. There is much more simplification that could be performed (ie not returning as nested an object in get_job_eta_times() now that we only return one state "running") - but job_eta will be rewritten entirely soon, so it's not worth it. https://github.com/mozilla/treeherder/commit/29881f85d84dd07dd2b4af2a3cea8e326ad79ee3 Bug 1181572 - Stop inserting rows of type 'pending' in the job_eta table Now that get_last_eta_by_signatures only fetches rows with state "running", there's no point storing rows with state "pending". If we're not storing these rows, then we also don't need to calculate their contents, so get_eta_groups (which is used to fetch recent eta times, during the calculate eta task) can also stop returning stats for "pending" too.
Commit pushed to master at https://github.com/mozilla/treeherder https://github.com/mozilla/treeherder/commit/3bd486709101948b0bc8206d20696a6aee958fb3 Revert "Bug 1181572 - Stop retrieving rows of type 'pending' from job_eta table" Since for whatever reason this increased the time for the query rather than reducing it, even though the EXPLAIN says half the number of rows were being scanned with the change. Let's just revert this part and wait for the job_eta rewrite in bug 1196764 rather than spend any more time on query optimisation. This reverts commit 521883a9df59b46c7ba5b5fac009542948a3a637.
The perf regression was: https://rpm.newrelic.com/accounts/677903/applications/5585473/datastores?tw%5Bend%5D=1440170035&tw%5Bstart%5D=1440162489#/overview/MySQL/drilldown?metric=Datastore%252Fstatement%252FMySQL%252Fjob_eta%252Fselect&value=total_call_time_per_minute We don't need the part that was reverted in a hurry, so can just wait for the job_eta rewrite in bug 1196764. Calling this fixed since the other parts were landed.
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: