do hg.m.o polling outside of the Buildbot master process

RESOLVED FIXED

Status

Release Engineering
General
P3
normal
RESOLVED FIXED
9 years ago
5 years ago

People

(Reporter: bhearsum, Unassigned)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [buildmasters][automation])

Attachments

(4 attachments, 4 obsolete attachments)

(Reporter)

Description

9 years ago
We need to pull out the polling of hg.m.o for a couple of reasons:
* Performance - the less work we have to do on the master, the better. Parsing so many pushlogs all the time is a suspected slow spot.
* Remove duplicate polling - since we separated the masters we poll mozilla-central and other repositories in multiple places. Once polling is centralized, away from the master we can remove that.

The idea is to poll every repository we need to, once, and then notify all appropriate masters of all changes. It would be good to only send wanted changes to each master but sending all changes to all masters would be fine too.
(Reporter)

Updated

9 years ago
Depends on: 513058
Is this worth doing compared to the concept of a shared pushlog db for all affected repos? If we'd get some momentum behind that, we might be off easier, and more robust.

Not that I know how you've planned to fix this one, and how much risk that is.
(Reporter)

Comment 2

9 years ago
I forgot to mention it in the summary, but it's something we need on the road to multiple master instances being fed jobs by the same scheduling system, too. Perf is the main concern at the moment, though.

I plan to get your review on the patches when they're ready, but I'm taking a simple approach, I think:
1) Factor out all polling code into base classes; have barebones classes that inherit them, and ChangeSource. These will be functionality equivalent to the current HgPoller and HgAllLocalesPoller classes.
2) Create subclasses of the new bases that will run outside of Buildbot and send the changes where we want them to goo.

Put simply:
The bases will be responsible for polling, the children will be responsible for passing the changes along.
(Reporter)

Comment 3

9 years ago
I had to add Properties support to the Buildbot Change objects to make it possible to make it possible for the locale polling to work outside of the Buildbot process. http://github.com/djmitche/buildbot/commit/bcd04d730f38cd9303e7e22efce7e21778b2c35b
No longer depends on: 513058
(Reporter)

Updated

9 years ago
Depends on: 513058
(Reporter)

Comment 4

9 years ago
Created attachment 397953 [details] [diff] [review]
[wip] buildbotcustom changes to support standalone hg polling

This patch passes the unit tests but is untested inside a master.

Needs a bit more work probably, but here's what's done so far:
* Separation of Buildbot specific code and general polling code. The latter is in the Base* classes, the former in Buildbot*
* removal changeHook() (because the Change object supports properties now)
* A few small things, including: remove unnecessary kwargs from HgLocalePoller, spacing fixes
* Update unit tests with new names

If anyone has time to look at this I'd appreciate it, but there's definitely no need for a real review yet.
(Reporter)

Comment 5

9 years ago
Created attachment 397955 [details] [diff] [review]
[wip] standalone hg pollers

Same thing here, WIP, here's what I've done:
* Create pollers that use Sendchange to forward to the master.
* Create minimal config (ini format)

I still need support for queuing up changes when a master goes down and I want to find a better way to deal with shutdown (this is hard). Other than that, I believe it's feature complete. Needs more testing before a real review.
(Reporter)

Updated

9 years ago
Depends on: 514242
(Reporter)

Comment 6

9 years ago
Created attachment 398199 [details] [diff] [review]
HgPoller refactor, buildbotcustom patch

This patch (and the incoming buildbot-configs one) is phase 1 of pulling out the pollers. At a high level, this patch moves all of the Buildbot specific code into subclasses, and leaves the HgPolling stuff in the bases. Renames have been made to most classes to match their new roles. The changeHook has disappeared in favour of the new Change Properties, which will also be used in the out of process pollers. Some minor changes have happened here too: spacing nits, removal of unnecessary constructor arguments, etc.
Attachment #397953 - Attachment is obsolete: true
Attachment #398199 - Flags: review?(ccooper)
Attachment #398199 - Flags: review?(catlee)
(Reporter)

Comment 7

9 years ago
Created attachment 398200 [details] [diff] [review]
master side patch for poller refactor

This one is pretty straightforward - just switches all the affected masters to using the new names for the pollers. Some unused imports were removed too.
Attachment #397955 - Attachment is obsolete: true
Attachment #398200 - Flags: review?(kairo)
Attachment #398200 - Flags: review?(gozer)
Attachment #398200 - Flags: review?(catlee)
(Reporter)

Comment 8

9 years ago
Created attachment 398730 [details] [diff] [review]
out of process HGPollers

Here's my attempt at out of process polling. The highlights:
* .ini config file that should get the script to do all of the polling we currently do, and report the changes to all the right places.
* BaseSendchangeHgPoller.submitChange is called when there's a change that needs sending. It handles sending to all of its masters and supports a variable number of retries and customizable interval between retrying. Defaults to an infinite amount of retries spaced at 1 minute apart. I think the other two pollers are self explanatory.
* Handles shutting down on SIGTERM, SIGQUIT, or SIGINT. I don't know of a better way than sleeping to try to shut down cleanly.
* Logging is handled by twisted, configurable on the command line.

This is currently running out of ~cltbld/bhearsumblah/tools/buildfarm/hg on staging-master. You can have a look at the logs there if you want to. I've been watching the waterfall and comparing it to pm and pm02 - we haven't missed any changes that they picked up.
Attachment #398730 - Flags: review?(catlee)
(Reporter)

Comment 9

9 years ago
Created attachment 398731 [details] [diff] [review]
patch to disable HgPoller's in mozilla2* and tryserver configs

This one is pretty straightforward. It probably seems silly to have this place the previous refactor patches but I don't want to have to back out *everything* if the oop polling script blows up.
Attachment #398731 - Flags: review?(catlee)

Updated

9 years ago
Attachment #398200 - Flags: review?(catlee) → review+
Comment on attachment 398731 [details] [diff] [review]
patch to disable HgPoller's in mozilla2* and tryserver configs

I like deleting code.
Attachment #398731 - Flags: review?(catlee) → review+
Comment on attachment 398730 [details] [diff] [review]
out of process HGPollers

Just a few nits...

>+        for s in self.senders:
>+            # The last argument here is the number of retries that have
>+            # happened so far. Needs to be tracked here in case we geadjustedChangeTime, properties        # overlapped failures

Something funky going on with the comment here.

>+    def failure(self, res, sender, change, adjustedChangeTime, properties,
>+                retries):
>+        if res.check(networkErrors) in networkErrors:

This is a bit awkward, it can probably be written as
    if res.check(networkErrors):
Attachment #398730 - Flags: review?(catlee) → review-
Comment on attachment 398200 [details] [diff] [review]
master side patch for poller refactor

Looking good from my end.
Attachment #398200 - Flags: review?(gozer) → review+
(Reporter)

Comment 13

9 years ago
Created attachment 399716 [details] [diff] [review]
out of process polling script, rev 2

I've addressed your review comments in this version, Chris.
Attachment #398730 - Attachment is obsolete: true
Attachment #399716 - Flags: review?(catlee)

Updated

9 years ago
Attachment #398200 - Flags: review?(kairo) → review+
As said on the buildbot mailing list, http://thread.gmane.org/gmane.comp.python.buildbot.devel/4950, I continue to think that this is a design bug in the making. The problem that we're not having a centralized change management on hg.m.o should be fixed there, and not hacked around by making things more complex.
(Reporter)

Comment 15

9 years ago
Created attachment 399745 [details] [diff] [review]
fix compare_attrs in BuildbotHgAllLocalesPoller
Attachment #398199 - Attachment is obsolete: true
Attachment #399745 - Flags: review?(catlee)
Attachment #398199 - Flags: review?(ccooper)
Attachment #398199 - Flags: review?(catlee)

Updated

9 years ago
Attachment #399745 - Flags: review?(catlee) → review+

Updated

9 years ago
Attachment #399716 - Flags: review?(catlee) → review+
(Reporter)

Updated

9 years ago
Depends on: 515686
(Reporter)

Comment 16

9 years ago
Comment on attachment 398200 [details] [diff] [review]
master side patch for poller refactor

changeset:   1497:16da32d6d680
Attachment #398200 - Flags: checked-in+
(Reporter)

Comment 17

9 years ago
Comment on attachment 399745 [details] [diff] [review]
fix compare_attrs in BuildbotHgAllLocalesPoller

changeset:   407:eaf5fa3e5740
Attachment #399745 - Flags: checked-in+
(Reporter)

Comment 18

9 years ago
Comment on attachment 399745 [details] [diff] [review]
fix compare_attrs in BuildbotHgAllLocalesPoller

Backed out because silly me forgot that these patches depend on Change Properties
Attachment #399745 - Flags: checked-in+ → checked-in-
(Reporter)

Updated

9 years ago
Attachment #398200 - Flags: checked-in+ → checked-in-
(Reporter)

Comment 19

9 years ago
Seems like this isn't going to happen soon.
Assignee: bhearsum → nobody
Component: Release Engineering → Release Engineering: Future

Updated

9 years ago
Status: ASSIGNED → NEW
Mass move of bugs from Release Engineering:Future -> Release Engineering. See
http://coop.deadsquid.com/2010/02/kiss-the-future-goodbye/ for more details.
Component: Release Engineering: Future → Release Engineering
Priority: -- → P3

Updated

8 years ago
Whiteboard: [buildmasters][automation]
Can we close this one, now that we do polling in the scheduler master?
(Reporter)

Comment 22

8 years ago
Yeah, having a separate scheduler master is functionally equivalent. RESO FIXED, in my opinion.
Status: NEW → RESOLVED
Last Resolved: 8 years ago
Resolution: --- → FIXED
(Assignee)

Updated

5 years ago
Product: mozilla.org → Release Engineering
You need to log in before you can comment on or make changes to this bug.