Closed Bug 1276030 Opened 9 years ago Closed 9 years ago

Refresh everything if it's been too long since the last poll

Categories

(Tree Management :: Treeherder, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: wlach, Assigned: wlach)

Details

Attachments

(1 file, 1 obsolete file)

Looking on new relic, we've gotten some crazy requests for updates to jobs spanning 50+ result sets over 24+ hours. This takes forever and overly burdens the server. It's a bit of a hack, but I think it's best to just not poll for updates if it's been too long since it was last done-- chances are it's just a tab that needs to be closed.
Attachment #8757013 - Flags: review?(cdawson)
I think some of the sheriff's actually do this on purpose. I'm not certain, but I recall some mentioning that they will leave a tab open on Friday, let it collect all the new jobs and then on Monday let it load everything over the weekend and pick up where they left off. Not making a judgement on whether that's a good or bad workflow. But if they do this, then their page will look pretty static until they force a reload. This is probably the right thing to do anyway, but we should maybe give them a heads up on it. cc'ing a couple sheriff's for visibility.
If the patch were "don't poll for updates, and show a huge persistent message covering the entire window saying that the tab is broken and must be reloaded" then I'd be all for it, because that would be an improvement over the current state where it's a crapshoot whether I'll get nothing freshly loaded for the overnight results, or a random selection of pushes fully loaded and others not at all, or some pushes partly loaded, or everything perfectly loaded, so I generally try to remember to reload every tab first thing in the morning and after work but sometimes just merrily proceed as though everything was fine when it isn't. But silently breaking, making it absolutely certain that sleeping a laptop for 16 minutes rather than the usual 14 that it takes to go from one office to another will result in a completely and utterly broken treeherder display that shows no sign of being broken? That's awful. And don't just think about sheriffs' treeherder tabs with their knowledge of both what should be visible and how bad we are at waking up; think about developer use-cases, and how bad they are at interpreting what should be showing on their try push when they wake up. In particular, if you do go ahead with that approach, you would need to be absolutely certain that *nothing* would update in any way, that all jobs which were pending when they went to sleep were still shown as pending, that the percentage complete would still show a lie about only being partly done, so you don't give them a static non-updated display of zero unstarred failures together with a "- Complete -" when only the percent complete is accurate and you are hiding from them the fact that their push blew up on the jobs that finished after they went to sleep.
Yeah, you'll need to have some very persistent notification that updating is disabled until you refresh the page.
Comment on attachment 8757013 [details] [review] [treeherder] wlach:1276030 > mozilla:master Ok this sounds like a bad idea, based on what I'm hearing. A better solution is probably just to *reload* all the jobs for a result set if it's been long enough since it was last polled. That will save us the crazy expensive mysql queries (creating a temporary table to iterate through 80000 jobs in sorted order is *expensive*, way more so than just loading everything all over again) and is probably more robust to boot. I'll give that a try tomorrow.
Attachment #8757013 - Attachment is obsolete: true
Attachment #8757013 - Flags: review?(cdawson)
Assignee: nobody → wlachance
Summary: Limit polling interval to updates in the last 15 minutes → Refresh everything if it's been too long since the last poll
Comment on attachment 8758817 [details] [review] [treeherder] wlach:1276030 > mozilla:master Ok, I think this should be more reasonable. It just reloads everything if it's been too long since the last time we polled.
Attachment #8758817 - Flags: review?(cdawson)
Sorry for the delay on reviewing this. I've was trying to wrap something up today and just took too long. I'll review this tomorrow for sure.
Comment on attachment 8758817 [details] [review] [treeherder] wlach:1276030 > mozilla:master One little nit, but overall this looks great! I didn't actually test it, but I presume you did. :) Thanks for doing this!
Attachment #8758817 - Flags: review?(cdawson) → review+
Commits pushed to master at https://github.com/mozilla/treeherder https://github.com/mozilla/treeherder/commit/6d07bbb9899373de9aefa58eb1e9c1634260af6b Bug 1276030 - Fix job fixtures in ui unit tests We had a redundant number in the middle of the list, which meant that the last_modified and build_platform_id fields were invalid. https://github.com/mozilla/treeherder/commit/c4c6004ae8fbabea497a1bce413dd947e4704281 Bug 1276030 - Somewhat simplify the polling logic for new jobs This also fixes a minor bug where fetching older jobs could result in some updates being re-fetched needlessly. https://github.com/mozilla/treeherder/commit/edbf510f942e49a1813d0c422a230d758222b912 Bug 1276030 - Refetch jobs if it's been too long since we last polled If it's been a really long time (> 15 minutes) since the last update, the query to get a full set of "updates" can be extremely taxing on the server (not to mention slow). Better just to refetch everything in this case. https://github.com/mozilla/treeherder/commit/3c683a2fa75eb5a3994aff742c324865c778150c Merge pull request #1523 from wlach/1276030 Bug 1276030 - Refetch all job data after a long period of inactivity
Will, can this be marked as fixed? :-)
Yes, I believe so. I don't seem to be seeing those crazy queries in new relic anymore, which is a good sign. :)
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: