Closed
Bug 1164055
Opened 9 years ago
Closed 9 years ago
don't reclaim tasks so often in buildbot bridge
Categories
(Release Engineering :: General, defect)
Release Engineering
General
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: bhearsum, Assigned: bhearsum)
References
Details
(Whiteboard: [bbb])
Attachments
(1 file)
7.12 KB,
patch
|
jonasfj
:
review+
bhearsum
:
checked-in+
|
Details | Diff | Splinter Review |
Probably once every 5min or so is good.
Comment 1•9 years ago
|
||
The formal semantics are that you must reclaim before takenUntil expires. takenUntil is a timestamp. To allow for a little clock drift, network delays, it's often a good idea to reclaim ~5min before takenUntil expires. This is very different from "every 5 min or so". In particular when implemented with these semantics we can increase the duration of takenUntil server-side, if we want to reduce the overhead of reclaimTask calls. Hence use the condition: (takenUntil < now() - 5min) If your loop is particularly slow, you can use 10 min instead. See code sample, as suggested in review: > + if int(t.takenUntil) < (arrow.utcnow().timestamp - 5 * 60): > try: > result = self.tc_queue.reclaimTask(t.taskId, int(t.runId)) > ... Insert it right here: https://github.com/mozilla/buildbot-bridge/blob/62d122fff50eb4513fb89db89364c4dba1788b9d/bbb/services.py#L227-L228
Assignee | ||
Comment 2•9 years ago
|
||
Thanks for the clarifications around using 5min before expiry vs. every 5 min. There's three instances of the reflector running, so I _think_ we'll be able to keep up. We can keep an eye on this as we start moving schedulers over and adjust it if needed. I gave this a test and the log output has it working correctly. We have the original claim (done by the BuildbotListener): 2015-05-13 08:52:04,674 - bbb.services - Claiming fHD3e-RnSze9JOZoTSwxIg Which gets: "takenUntil": "2015-05-13T16:12:04.930Z" We get a bunch of "in progress" messages from the reflector: 2015-05-13 08:52:21,976 - bbb.services - BuildRequest 67 is in progress 2015-05-13 08:53:22,037 - bbb.services - BuildRequest 67 is in progress 2015-05-13 08:54:22,088 - bbb.services - BuildRequest 67 is in progress ... 2015-05-13 09:06:22,844 - bbb.services - BuildRequest 67 is in progress And eventually: 2015-05-13 09:07:22,910 - bbb.services - BuildRequest 67 is in progress 2015-05-13 09:07:22,910 - bbb.services - Claim for BuildRequest 67 is more than 5min old, reclaiming Which gets: "takenUntil": "2015-05-13T16:27:23.206Z" ...and the cycle repeats.
Assignee | ||
Updated•9 years ago
|
Attachment #8605331 -
Flags: review?(jopsen)
Comment 3•9 years ago
|
||
Comment on attachment 8605331 [details] [diff] [review] slower-reclaim-bbb.diff Review of attachment 8605331 [details] [diff] [review]: ----------------------------------------------------------------- So, this line looks very sketchy to me: > if arrow.now() > arrow.get(t.takenUntil).replace(minutes=-5): But checking the docs .replace(minutes=-5) is relative. While .replace(minute=5) is not... So definitely works... it just looks very wrong, he he :) (ie. if it replace minute with 5 we would have some very weird behaviour; for the record it does make a relative offset as we want)
Attachment #8605331 -
Flags: review?(jopsen) → review+
Assignee | ||
Comment 4•9 years ago
|
||
Comment on attachment 8605331 [details] [diff] [review] slower-reclaim-bbb.diff I pushed this and bumped the version number. I'll get it rolled out soon, hopefully.
Attachment #8605331 -
Flags: checked-in+
Assignee | ||
Updated•9 years ago
|
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Assignee | ||
Updated•9 years ago
|
Whiteboard: [bbb]
Updated•6 years ago
|
Component: General Automation → General
You need to log in
before you can comment on or make changes to this bug.
Description
•