Closed Bug 1105485 Opened 10 years ago Closed 10 years ago

Fix up submission of release blobs so they can be used for beta builds

Categories

(Release Engineering :: Release Automation: Other, defect)

x86
All
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: nthomas, Assigned: bhearsum)

References

Details

Attachments

(10 files, 10 obsolete files)

24.71 KB, patch
rail
: review+
bhearsum
: checked-in+
Details | Diff | Splinter Review
1.05 KB, patch
rail
: review+
bhearsum
: checked-in+
Details | Diff | Splinter Review
2.46 KB, patch
nthomas
: review+
bhearsum
: checked-in+
Details | Diff | Splinter Review
109.69 KB, patch
nthomas
: review+
bhearsum
: checked-in+
Details | Diff | Splinter Review
23.45 KB, patch
nthomas
: review+
bhearsum
: checked-in+
Details | Diff | Splinter Review
46.47 KB, patch
nthomas
: review+
bhearsum
: checked-in+
Details | Diff | Splinter Review
1.90 KB, patch
rail
: review+
bhearsum
: checked-in+
Details | Diff | Splinter Review
676 bytes, patch
rail
: review+
bhearsum
: checked-in+
Details | Diff | Splinter Review
675 bytes, patch
rail
: review+
bhearsum
: checked-in+
Details | Diff | Splinter Review
676 bytes, patch
rail
: review+
bhearsum
: checked-in+
Details | Diff | Splinter Review
Basically automating to avoid bug 1105228. We did a lot of this in bug 1021026, but some new issues might have slipped in.
We were inconsistent with the test channel we used for 34.0 and 34.0.5. For 34.0, we used beta-localtest, for 34.0.5, we used beta-cdntest. I don't really care which we use, but let's make sure we stay consistent. I think we need to do the following to remove all manual tweaking: * Bouncer submitter: ** Create the complete MAR bouncer entries (eg, firefox-34.0build2-complete) which point at the candidates directory. ** Change the partial MAR bouncer entries (eg, firefox-34.0-partial-34.0b11) to point at the candidates directory - because they currently get created pointing at the releases directory. This part could be tricky, because we still want the non-beta partials to point at releases. * Balrog submitter: ** Create fileUrls for both beta and test channel that point at the bouncer entries from above. So, unless I'm missing something it looks like the complexity is more in bouncer entries than in the Balrog submitter.
(In reply to Ben Hearsum [:bhearsum] from comment #1) > We were inconsistent with the test channel we used for 34.0 and 34.0.5. For > 34.0, we used beta-localtest, for 34.0.5, we used beta-cdntest. I don't > really care which we use, but let's make sure we stay consistent. After a reread of this it seems pretty clear we should be using beta-cdntest, since we ship these MARs off of the cdns.
I'll be looking at this soon.
Assignee: nobody → bhearsum
Per our conversation yesterday, creating build1 entries for ALL partials (rather than just ones from betas) is the cleanest way of doing this. We couldn't figure out a way to only create them for betas that didn't involve hacks like "if isBeta(version)" somewhere. I tested this against bouncer stage, you can have a look at the 34.0.10 locations on http://bounceradmin.allizom.org/admin/mirror/location/ to see the results. "git diff -w upstream/master" might help in reviewing the partials section of this, since most of it is just indentation changes.
Attachment #8544635 - Flags: review?(rail)
AFAICT this is the only place we call it.
Attachment #8544637 - Flags: review?(rail)
Attachment #8544635 - Flags: review?(rail) → review+
Attachment #8544637 - Flags: review?(rail) → review+
Attachment #8544635 - Flags: checked-in+
Attachment #8544637 - Flags: checked-in+
This gets us fileUrls sections like: "fileUrls": { "beta": { "partials": { "Firefox-33.1-build3": "http://download.mozilla.org/?product=firefox-34.0build2-partial-33.1&os=%OS_BOUNCER%&lang=%LOCALE%" }, "completes": { "*": "http://download.mozilla.org/?product=firefox-34.0build2-complete&os=%OS_BOUNCER%&lang=%LOCALE%" } }, "*": { "partials": { "Firefox-33.1-build3": "http://download.mozilla.org/?product=firefox-34.0-partial-33.1&os=%OS_BOUNCER%&lang=%LOCALE%" }, "completes": { "*": "http://download.mozilla.org/?product=firefox-34.0-complete&os=%OS_BOUNCER%&lang=%LOCALE%" } }, "beta-cdntest": { "partials": { "Firefox-33.1-build3": "http://download.mozilla.org/?product=firefox-34.0build2-partial-33.1&os=%OS_BOUNCER%&lang=%LOCALE%" }, "completes": { "*": "http://download.mozilla.org/?product=firefox-34.0build2-complete&os=%OS_BOUNCER%&lang=%LOCALE%" } }, "betatest": { "partials": { "Firefox-33.1-build3": "http://dev-stage01.srv.releng.scl3.mozilla.com/pub/mozilla.org/firefox/candidates/34.0-candidates/build2/update/%OS_FTP%/%LOCALE%/firefox-33.1-34.0.partial.mar" }, "completes": { "*": "http://dev-stage01.srv.releng.scl3.mozilla.com/pub/mozilla.org/firefox/candidates/34.0-candidates/build2/update/%OS_FTP%/%LOCALE%/firefox-34.0.complete.mar" } } }, ...which look correct to me. You'll notice in this patch that there's some changes to the release configs implied. The pluralization of cdnTestChannels is to allow for pushing to release-cdntest + beta-cdntest through automation. otherChannels is to be able to specify "beta". The number of variables we have for update channels now is a bit disturbing but I don't want to over reach here. At some point it may be good to have a single data structure that includes all of the relevant channels + rule ids.
Attachment #8546149 - Flags: review?(nthomas)
Attached patch adjust release configs (obsolete) — Splinter Review
This adjusts the release configs to use the new variables, adjusts the ready for cdn testing mail to list multiple channels, and adds the beta-cdntest rule id to the test channel ids. That last part is potentially problematic if we're building a point release at the same time as a beta. I'm not quite sure how to work around that though...
Attachment #8546153 - Flags: review?(nthomas)
Attached patch buildbotcustom adjustment (obsolete) — Splinter Review
Attachment #8546154 - Flags: review?(nthomas)
There's some stuff that still needs to happen before we can call RCs -> beta 100% automated, but we should do that in other bugs: * Run update verify (somewhere, somehow) * Run bouncer submitter for ALL builds (because we now have build-specific bouncer entries)
Comment on attachment 8546154 [details] [diff] [review] buildbotcustom adjustment Seems to be unused, r+ to remove it instead.
Attachment #8546154 - Flags: review?(nthomas) → review+
Comment on attachment 8546149 [details] [diff] [review] set correct fileUrls for beta/beta-cdntest Review of attachment 8546149 [details] [diff] [review]: ----------------------------------------------------------------- ::: lib/python/balrog/submitter/cli.py @@ +168,5 @@ > > + # "*" is for the default set of fileUrls, which generally points at > + # bouncer. It's helpful to have this to reduce duplication between > + # the live channel and the cdntest channel (which eliminates the > + # possibility that those two channels serve different contents). The example blob has the same content for beta and beta-cdntest, so I wonder if we could use tuples as the dict keys, eg ('beta', 'beta-cdntest'). Not going to block on that given it needs a Balrog change to handle it.
Attachment #8546149 - Flags: review?(nthomas) → review+
Comment on attachment 8546153 [details] [diff] [review] adjust release configs Review of attachment 8546153 [details] [diff] [review]: ----------------------------------------------------------------- ::: mozilla/release-firefox-mozilla-release.py.template @@ +129,5 @@ > 'macosx64': 'xulrunner/config/mozconfigs/macosx-universal/xulrunner', > 'win32': 'xulrunner/config/mozconfigs/win32/xulrunner', > } > releaseConfig['releaseChannel'] = 'release' > releaseConfig['releaseChannelRuleIds'] = [] # Still on AUS3 Will releaseChannelRuleIds be updated after the cutover to Balrog ? ::: mozilla/release_templates/ready_for_releasetest_testing_success @@ +1,2 @@ > +%(releaseName)s: Updates available on %(cdnTestChannels)s > +Updates to %(releaseName)s are now ready for testing on the %(cdnTestChannels)s channel(s). FTR, we're only checking a subset of all the bouncer products we add (all the oldest stuff). And actually, won't this be a little confusing when we push to mirrors for a release channel build ? beta-cdntest will have been live a long time at that point. What if you put beta-cdntest in otherChannels instead of cdnTestChannels ? You could ... uh .. revert the pluralising then ;-) ::: mozilla/staging_release-firefox-mozilla-release.py.template @@ +136,1 @@ > releaseConfig['testChannelRuleIds'] = [19,20] Need to add 40 or 41 to testChannelRuleIds.
Attachment #8546153 - Flags: review?(nthomas) → review-
(In reply to Nick Thomas [:nthomas] from comment #14) > ::: mozilla/release-firefox-mozilla-release.py.template > > releaseConfig['releaseChannelRuleIds'] = [] # Still on AUS3 > Will releaseChannelRuleIds be updated after the cutover to Balrog ? n/m, I see that's on bug 986990.
(In reply to Nick Thomas [:nthomas] from comment #12) > Comment on attachment 8546154 [details] [diff] [review] > buildbotcustom adjustment > > Seems to be unused, r+ to remove it instead. It's actually used to fill in this template: https://github.com/mozilla/build-buildbot-configs/blob/master/mozilla/release_templates/ready_for_releasetest_testing_success. Pyflakes doesn't know about this because we use locals() when filling it in :(. (In reply to Nick Thomas [:nthomas] from comment #13) > Comment on attachment 8546149 [details] [diff] [review] > set correct fileUrls for beta/beta-cdntest > > Review of attachment 8546149 [details] [diff] [review]: > ----------------------------------------------------------------- > > ::: lib/python/balrog/submitter/cli.py > @@ +168,5 @@ > > > > + # "*" is for the default set of fileUrls, which generally points at > > + # bouncer. It's helpful to have this to reduce duplication between > > + # the live channel and the cdntest channel (which eliminates the > > + # possibility that those two channels serve different contents). > > The example blob has the same content for beta and beta-cdntest, so I wonder > if we could use tuples as the dict keys, eg ('beta', 'beta-cdntest'). Not > going to block on that given it needs a Balrog change to handle it. Looks like Python dicts support this, but not JSON: >>> {('foo', 'bar'): 2} {('foo', 'bar'): 2} >>> import json >>> json.dumps({('foo', 'bar'): 2}) Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/usr/lib/python2.7/json/__init__.py", line 243, in dumps return _default_encoder.encode(obj) File "/usr/lib/python2.7/json/encoder.py", line 207, in encode chunks = self.iterencode(o, _one_shot=True) File "/usr/lib/python2.7/json/encoder.py", line 270, in iterencode return _iterencode(o, 0) TypeError: keys must be a string I agree this isn't ideal though. Perhaps we could support wildcards in channel names? Eg, "beta*". I don't think that would make these too much more difficult for humans to interpret. I filed bug 1122557 for this.
(In reply to Nick Thomas [:nthomas] from comment #14) > Comment on attachment 8546153 [details] [diff] [review] > adjust release configs > > Review of attachment 8546153 [details] [diff] [review]: > ----------------------------------------------------------------- > > ::: mozilla/release-firefox-mozilla-release.py.template > @@ +129,5 @@ > > 'macosx64': 'xulrunner/config/mozconfigs/macosx-universal/xulrunner', > > 'win32': 'xulrunner/config/mozconfigs/win32/xulrunner', > > } > > releaseConfig['releaseChannel'] = 'release' > > releaseConfig['releaseChannelRuleIds'] = [] # Still on AUS3 > > Will releaseChannelRuleIds be updated after the cutover to Balrog ? Yeah. I left that out of here because I wasn't sure if this was going to land first or not. > ::: mozilla/release_templates/ready_for_releasetest_testing_success > @@ +1,2 @@ > > +%(releaseName)s: Updates available on %(cdnTestChannels)s > > +Updates to %(releaseName)s are now ready for testing on the %(cdnTestChannels)s channel(s). > > FTR, we're only checking a subset of all the bouncer products we add (all > the oldest stuff). And actually, won't this be a little confusing when we > push to mirrors for a release channel build ? beta-cdntest will have been > live a long time at that point. Hm, good point. My brain muddled all cdntest channels together, but that doesn't make sense. I'll definitely revert this. We'll need some other way to notify about beta-cdntest being available though...
Just to summarize the reviews (mostly for my own benefit), it looks like the main issue is with notification. My current patches will notify for beta-cdntest, but not until we push to mirrors - which is way too late. I need to fix this to notify around the same time release-localtest becomes available. Ideally we'll have another update % check for the buildN entries, but perhaps it's good enough just to assume that they're available when the "updates" builder finishes?
Attachment #8546153 - Attachment is obsolete: true
Attachment #8546154 - Attachment is obsolete: true
This isn't technically blocking the release channel switch over - updating to match reality. It will be fixed before the next RC though.
No longer blocks: balrog-release
I did some more thinking about this today. In order to get all the notifications and verifications correct, it seems to me that we need to duplicate the Updates builder and all of its downstreams (update verify, update monitoring, final verify, update shipping and all of the notification builders in there). It's more work than I bargained for but I _think_ it's the right way to go as it will also get rid of the need to manually tweak the mozBeta patcher config with every beta 1. I put this together as a potential change to the release configs to enable this: -releaseConfig['patcherConfig'] = 'mozRelease-branch-patcher2.cfg' -releaseConfig['verifyConfigs'] = { - 'linux': 'mozRelease-firefox-linux.cfg', - 'linux64': 'mozRelease-firefox-linux64.cfg', - 'macosx64': 'mozRelease-firefox-mac64.cfg', - 'win32': 'mozRelease-firefox-win32.cfg' -} -releaseConfig['releaseChannel'] = 'release' -releaseConfig['releaseChannelRuleIds'] = [31] -releaseConfig['localTestChannel'] = 'betatest' -releaseConfig['cdnTestChannel'] = 'releasetest' -releaseConfig['testChannelRuleIds'] = [19,20] +# need to move verify + patcher configs in here? +# need a flag to toggle waiting on push to mirrors +releaseConfig['updateChannels'] = { + "release": { + "ruleId": 31, + "patcherConfig": "mozRelease-branch-patcher2.cfg", + "verifyConfigs": { + "linux": "mozRelease-firefox-linux.cfg", + "linux64": "mozRelease-firefox-linux64.cfg", + "macosx64": "mozRelease-firefox-mac64.cfg", + "win32": "mozRelease-firefox-win32.cfg" + }, + "testChannels": { + "release-localtest": { + "ruleId": 19, + }, + "release-cdntest": { + "ruleId": 20, + }, + }, + }, + "beta": { + "ruleId": TODO, + "requiresMirrors": False, + "testChannels": { + "beta-cdntest": { + "ruleId": 41, + } + } + } +} Basically, all of the stuff that's tied to a specify update channel (rule ids, channel names, verify configs, patcher config) are moved into the updateChannels dict. This will require a fair amount of change to consumers like release.py and the Balrog submission scripts. The updates builder and its downstreams will end up with a channel name in them. The only ugliness I can see at this point is the need for the "requiresMirrors" flag. This will be necessary because we now have some things on the cdntest channel that require us to have pushed to mirrors (release-cdntest) and some that don't (beta-cdntest in the RC case). I'd love to get some feedback on this before going too far in because it's more than a trivial amount of work. I'm happy to videochat about this if you'd like.
Flags: needinfo?(nthomas)
Coincidentally I was heading in a similar direction a while back: http://hg.mozilla.org/users/nthomas_mozilla.com/buildbot-configs/file/2f551c9b7b2d/mozilla/staging_release-firefox-mozilla-release.py#l109 I missed the mirrors parameter, but added a toggle so we can turn off pushing to beta for x.0.y releases (see bug 1055858 for a ship-it patch to use this); and a regex on versions, so we can put all the partials we want in ship-it and automatically use them in the right places (eg bouncer, patcher configs).
Flags: needinfo?(nthomas)
Depends on: 984888
(In reply to Nick Thomas [:nthomas] from comment #21) > Coincidentally I was heading in a similar direction a while back: > http://hg.mozilla.org/users/nthomas_mozilla.com/buildbot-configs/file/ > 2f551c9b7b2d/mozilla/staging_release-firefox-mozilla-release.py#l109 > > I missed the mirrors parameter, but added a toggle so we can turn off > pushing to beta for x.0.y releases (see bug 1055858 for a ship-it patch to > use this); and a regex on versions, so we can put all the partials we want > in ship-it and automatically use them in the right places (eg bouncer, > patcher configs). It seems like a good idea to pick up the enabled/disabled (which will need some release runner help to get flipped, I think). Can you expand on why the regex is needed? My read is that it may not be necessary after we turn off snippet uploading. I'd love to have a brief chat about this to make sure we get the best of both worlds.
My staging runs for this are going pretty well. Quite a few issues have been caught and fixed. The only remaining one so far is figuring out how to cope with the now-common push races in the updates builder (bug 794333). Since the release and beta builders fire at the same time, one of them almost always loses a race. I still need to do a staging release that simulates a point release with no updates from Beta to make sure that configuration works, too.
Ben, I will mark this bug as blocking bug 735184 for the updated routing key names. It's not a hard blocker but I hope you are ok with it.
Blocks: 735184
This patch is probably going to be painful to review. Mostly what it does is wrap anything to do with updates in a "for c in updateChannels" loop, and add channel names builder/scheduler names. Comparing the before and after builder+scheduler graphs might be helpful in reviewing it: http://people.mozilla.org/~bhearsum/release-mozilla-release-firefox_reset_schedulers%20scheduler.png http://people.mozilla.org/~bhearsum/release-schedulers-multichannel.png I tested this and the associated patches extensively in staging. It's hard to be 100% confident because I hit so many staging specific issues, but I _think_ these will be fine in production. In any case, I'm on the hook for the 37 cycle, which means any fallout should land on me. The only thing I'm truly worried about is the potential race condition in the two updates builders for an RC. I added the extra "hg pull" and "hg up" steps to try to protect against it, but there's no guarantee there. Worst case scenario is one of them requires being rebuilt. If it plagues us, I can look for a better solution later.
Attachment #8546149 - Attachment is obsolete: true
Attachment #8565513 - Flags: review?(nthomas)
This is a big patch but there's just two things going on: 1) Adjust release configs and templates to new updateChannels format. I've triplechecked all of the rule ids, so they should all be correct. 2) Adjust a billion release mail templates to support new builder names. This part kindof sucks because the channel names are now part of the builder names. I couldn't figure out a way to have all of the new builders share mail templates because the function that generates the messages (createReleaseMessage) doesn't know what channel the builder is for. I considered parsing the builder name to try and figure it out, but that seemed worse than duplicating the contents. The big downside here is that we have to remember to create new mail templates when we create new branches (such as esr38).
Attachment #8565514 - Flags: review?(nthomas)
This patch builds on the previous tools repo patch. I have: * Fixed Balrog submission for all channels. Note the way we get beta/beta-cdntest entries for RCs is a bit hacky. Not sure what to do about that. * Removed betatest/esrtest support from Balrog submission (yay new channel names!) * Adjusted update verify and balrog submission scripts to require channel to be passed at runtime.
Attachment #8565516 - Flags: review?(nthomas)
These patches explicitly don't fix the unnecessary-bouncer-entries-added-for-point-releases, but I plan to address that in a follow-up.
Comment on attachment 8565513 [details] [diff] [review] make it possible to have multiple streams of update builders/schedulers Review of attachment 8565513 [details] [diff] [review]: ----------------------------------------------------------------- r+ with some minor changes. Thanks for the scheduler graphs, do they leave out connections via Triggers or something ? eg nothing upstream of final_verification. ::: process/factory.py @@ +4030,5 @@ > + # Before committing we pull and update in an effort to reduce the > + # potential of hitting a push race. > + self.addStep(ShellCommand( > + name='tag_configs', > + command=["hg", "pull"], Nit, better name here and for the hg up in the next step. ::: process/release.py @@ +1222,5 @@ > }) > > + for channel, updateConfig in updateChannels.iteritems(): > + if releaseConfig.get("skip_updates") or not updateConfig.get("enabled", True): > + break break is fine for 'skip_updates', but should be continue for updateConfig['enabled'] (eg if beta was listed before release in releaseConfig["updateChannels"]). @@ +1297,5 @@ > # Releases that aren't automatically pushed to mirrors have their > # updates tested on an internal channel first. For these, we need to > # send out mail to let people know that it's ready to test. > + # Update channels that require a push to mirrors to ship > + # are always considered unimportant. It would help to which builder the message comes from for the automatic pushes/release-on-beta cases.
Attachment #8565513 - Flags: review?(nthomas) → review+
Comment on attachment 8565514 [details] [diff] [review] convert release configs to new format and adjust mail templates Review of attachment 8565514 [details] [diff] [review]: ----------------------------------------------------------------- Looks pretty good after a reasonable set of checks, but I wouldn't be surprised if I missed something here. I ignored the non-template configs for production branches, and the ruleid's in staging (just did a prod vs staging comparision). r+ with a few fixes on landing. ::: mozilla/release-fennec-mozilla-beta.py.template @@ +111,5 @@ > + "beta-localtest": { > + "ruleId": 89, > + }, > + "beta-cdntest": { > + "ruleId": 90, Did you consider dropping testChannels and using something like this ? "localTestChannel": ("beta-localtest", 89), "cdnTestChannel": ("beta-cdntest", 90) Not worth reworking everything for the compactness though. ::: mozilla/release-firefox-mozilla-release.py.template @@ +125,5 @@ > } > +releaseConfig["releaseChannel"] = "release" > +releaseConfig['updateChannels'] = { > + "release": { > + "versionRegex": r"\d+\.\d+(\.\d+)?$", Any reason you use ^ in regex in other config files but not this one ? Not that it matters hugely using re.match() but consistency. OK OK, pedantry. @@ +148,5 @@ > + }, > + }, > + "beta": { > + "enabled": {{ betaChannelEnabled }}, > + "versionRegex": r"\d+\.\d+b\d+$", It would be great to handle RC2 or higher too, where a partial from the previous RC would be helpful. Possibly with this: "versionRegex": r"^(\d+\.\d+b\d+|%s)$" % releaseConfig['version'].replace('.','\.') @@ +158,5 @@ > + "verifyConfigs": { > + "linux": "mozBeta-firefox-linux.cfg", > + "linux64": "mozBeta-firefox-linux64.cfg", > + "macosx64": "mozBeta-firefox-mac64.cfg", > + "win32": "mozBeta-firefox-win32.cfg" Lets be careful when win64 is enabled that we add it here too. ::: mozilla/release_templates/firefox_release_ready_for_release_success @@ +1,1 @@ > +%(releaseName)s: ready for release on release Yay, overloaded words! Some alternatives: launch, deliver, publish, boldly send forth. whimsy++ ::: mozilla/staging_release-firefox-mozilla-esr31.py.template @@ +127,5 @@ > + "macosx64": "mozEsr31-firefox-mac64.cfg", > + "win32": "mozEsr31-firefox-win32.cfg", > + }, > + "testChannels": { > + "beta-cdntest": { s/beta/esr/ twice.
Attachment #8565514 - Flags: review?(nthomas) → review+
Comment on attachment 8565516 [details] [diff] [review] make balrog submission support RCs; update scripts support channel arguments Review of attachment 8565516 [details] [diff] [review]: ----------------------------------------------------------------- ::: lib/python/balrog/submitter/cli.py @@ +225,5 @@ > + # See comment above about these channels for explanation. > + if not requiresMirrors and channel in ("beta", "beta-cdntest"): > + bouncerProduct = "%s-%sbuild%s-partial-%s" % (productName.lower(), version, buildNumber, previousVersion) > + else: > + bouncerProduct = "%s-%s-partial-%s" % (productName.lower(), version, previousVersion) Some issues here for RC3 or higher and partials, eg a product name of firefox-37.0build3-partial-37.0, is that RC2 or RC1 for the origin ? A bit of an edgecase so we can handle it with an if, or figure out how to avoid a foot gun some other way. Possibly implications for the bouncer submitter too. ::: scripts/updates/balrog-release-pusher.py @@ +101,5 @@ > creator = ReleaseCreatorV4(options.api_root, auth) > # XXX: remove me later > partials = release_config['partialUpdates'].copy() > if release_config.get('extraPartials'): > partials.update(release_config['extraPartials']) Partials could also be filtered with the regex, which would avoid the Firefox-37.0-build2 blob having a 35.0.1 url for beta*, while release* and '*' include 36.0b7. I'd like this fixed before it lands. extraPartials could die at some point too.
Attachment #8565516 - Flags: review?(nthomas) → review-
(In reply to Nick Thomas [:nthomas] from comment #29) > Comment on attachment 8565513 [details] [diff] [review] > make it possible to have multiple streams of update builders/schedulers > > Review of attachment 8565513 [details] [diff] [review]: > ----------------------------------------------------------------- > > r+ with some minor changes. Thanks for the scheduler graphs, do they leave > out connections via Triggers or something ? eg nothing upstream of > final_verification. Ah, yeah - they do, sorry for not mentioning that! (In reply to Nick Thomas [:nthomas] from comment #30) > ::: mozilla/release-fennec-mozilla-beta.py.template > @@ +111,5 @@ > > + "beta-localtest": { > > + "ruleId": 89, > > + }, > > + "beta-cdntest": { > > + "ruleId": 90, > > Did you consider dropping testChannels and using something like this ? > "localTestChannel": ("beta-localtest", 89), > "cdnTestChannel": ("beta-cdntest", 90) I hadn't considered that. I'll keep that in mind for future work though. > ::: mozilla/release-firefox-mozilla-release.py.template > @@ +125,5 @@ > > } > > +releaseConfig["releaseChannel"] = "release" > > +releaseConfig['updateChannels'] = { > > + "release": { > > + "versionRegex": r"\d+\.\d+(\.\d+)?$", > > Any reason you use ^ in regex in other config files but not this one ? Not > that it matters hugely using re.match() but consistency. OK OK, pedantry. Likely an oversight. I'll fix this though - consistency is good! > @@ +148,5 @@ > > + }, > > + }, > > + "beta": { > > + "enabled": {{ betaChannelEnabled }}, > > + "versionRegex": r"\d+\.\d+b\d+$", > > It would be great to handle RC2 or higher too, where a partial from the > previous RC would be helpful. Possibly with this: > "versionRegex": r"^(\d+\.\d+b\d+|%s)$" % > releaseConfig['version'].replace('.','\.') ++, thanks! > ::: mozilla/release_templates/firefox_release_ready_for_release_success > @@ +1,1 @@ > > +%(releaseName)s: ready for release on release > > Yay, overloaded words! Some alternatives: launch, deliver, publish, boldly > send forth. whimsy++ I will do something here....."release on release" is pretty bad on a reread. (In reply to Nick Thomas [:nthomas] from comment #31) > Comment on attachment 8565516 [details] [diff] [review] > make balrog submission support RCs; update scripts support channel arguments > > Review of attachment 8565516 [details] [diff] [review]: > ----------------------------------------------------------------- > > ::: lib/python/balrog/submitter/cli.py > @@ +225,5 @@ > > + # See comment above about these channels for explanation. > > + if not requiresMirrors and channel in ("beta", "beta-cdntest"): > > + bouncerProduct = "%s-%sbuild%s-partial-%s" % (productName.lower(), version, buildNumber, previousVersion) > > + else: > > + bouncerProduct = "%s-%s-partial-%s" % (productName.lower(), version, previousVersion) > > Some issues here for RC3 or higher and partials, eg a product name of > firefox-37.0build3-partial-37.0, is that RC2 or RC1 for the origin ? A bit > of an edgecase so we can handle it with an if, or figure out how to avoid a > foot gun some other way. Possibly implications for the bouncer submitter too. Hm. Have we ever had to deal with this by hand? I'm curious what we did in a case like that. I think we should figure this out before this lands, no pointing in committing this footgun. > ::: scripts/updates/balrog-release-pusher.py > @@ +101,5 @@ > > creator = ReleaseCreatorV4(options.api_root, auth) > > # XXX: remove me later > > partials = release_config['partialUpdates'].copy() > > if release_config.get('extraPartials'): > > partials.update(release_config['extraPartials']) > > Partials could also be filtered with the regex, which would avoid the > Firefox-37.0-build2 blob having a 35.0.1 url for beta*, while release* and > '*' include 36.0b7. I'd like this fixed before it lands. Good idea about filtering with the regex, will do.
(In reply to Nick Thomas [:nthomas] from comment #31) > Comment on attachment 8565516 [details] [diff] [review] > make balrog submission support RCs; update scripts support channel arguments > > Review of attachment 8565516 [details] [diff] [review]: > ----------------------------------------------------------------- > > ::: lib/python/balrog/submitter/cli.py > @@ +225,5 @@ > > + # See comment above about these channels for explanation. > > + if not requiresMirrors and channel in ("beta", "beta-cdntest"): > > + bouncerProduct = "%s-%sbuild%s-partial-%s" % (productName.lower(), version, buildNumber, previousVersion) > > + else: > > + bouncerProduct = "%s-%s-partial-%s" % (productName.lower(), version, previousVersion) > > Some issues here for RC3 or higher and partials, eg a product name of > firefox-37.0build3-partial-37.0, is that RC2 or RC1 for the origin ? A bit > of an edgecase so we can handle it with an if, or figure out how to avoid a > foot gun some other way. Possibly implications for the bouncer submitter too. Per IRC, we should be able to use things like firefox-37.0build3-partial-37.0build2. I'll make this change in the bouncer and balrog submitters.
(In reply to Nick Thomas [:nthomas] from comment #29) > @@ +1297,5 @@ > > # Releases that aren't automatically pushed to mirrors have their > > # updates tested on an internal channel first. For these, we need to > > # send out mail to let people know that it's ready to test. > > + # Update channels that require a push to mirrors to ship > > + # are always considered unimportant. > > It would help to which builder the message comes from for the automatic > pushes/release-on-beta cases. Hmm, I'm not sure what you're getting at here. Can you expand?
Attachment #8565513 - Attachment is obsolete: true
Attachment #8565514 - Attachment is obsolete: true
Attachment #8565516 - Attachment is obsolete: true
Attachment #8568603 - Flags: review?(nthomas)
Diff of actual changes is: https://github.com/bhearsum/buildbotcustom/compare/40c8acf87a7098e2721ccdeda9c8f9dd9bebbf2f...balrog-rc The build number appending here is a bit weird. For the bouncer_submitter.py mozharness script I'm always passing the build number with each previous version. That script will split it into prev_version and prev_build_number (patch for that incoming), which get substituded properly in all bouncer submission cases. I couldn't figure out a better way to pass that along. For get_release_uptake, TriggerBouncerCheck pulls the release config at runtime so we do the build number appending there, when necessary. get_release_uptake isn't smart enough to do it itself.
Attachment #8568604 - Flags: review?(nthomas)
Attachment #8568603 - Flags: review?(nthomas) → review+
Comment on attachment 8568604 [details] [diff] [review] fix update skipping logic; candidates dir bouncer entries Review of attachment 8568604 [details] [diff] [review]: ----------------------------------------------------------------- Interdiff looks good, but I think you need just one fix which I missed earlier (see below). Then I wondered if we have two uptake checkers, one each for beta and another for release as they need to fire at different times, and it appears not. So either not check the beta products in the release uptake and rely on update verify to throw errors, or add a new checker (might be fun to schedule!). ::: scheduler.py @@ +300,5 @@ > self.release_config.get('productName').capitalize() > + version = self.release_config.get('version') > + partialVersions = [] > + if self.appendBuildNumber: > + version += '-%s' % self.release_config.get('buildNumber') Why not append build<buildNumber> here like the partialVersions does ?
Attachment #8568604 - Flags: review?(nthomas) → review-
Also can't find anything setting appendBuildNumber=True for the beta products.
Attachment #8568605 - Flags: review?(nthomas) → review+
Comment on attachment 8568606 [details] [diff] [review] kill extra partials; improve some versionRegex's, release mail; fix bug in esr configs Review of attachment 8568606 [details] [diff] [review]: ----------------------------------------------------------------- Interdiff looks good.
Attachment #8568606 - Flags: review?(nthomas) → review+
(In reply to Nick Thomas [:nthomas] from comment #39) > Comment on attachment 8568604 [details] [diff] [review] > fix update skipping logic; candidates dir bouncer entries > > Review of attachment 8568604 [details] [diff] [review]: > ----------------------------------------------------------------- > > Interdiff looks good, but I think you need just one fix which I missed > earlier (see below). Then I wondered if we have two uptake checkers, one > each for beta and another for release as they need to fire at different > times, and it appears not. So either not check the beta products in the > release uptake and rely on update verify to throw errors, or add a new > checker (might be fun to schedule!). Good catch, I'll fix this up in some way....probably be adding an extra uptake checker. > ::: scheduler.py > @@ +300,5 @@ > > self.release_config.get('productName').capitalize() > > + version = self.release_config.get('version') > > + partialVersions = [] > > + if self.appendBuildNumber: > > + version += '-%s' % self.release_config.get('buildNumber') > > Why not append build<buildNumber> here like the partialVersions does ? This is a bug, I'll fix it :). I think it actually needs to be "version += 'build%s' % buildNumber", to match what's going on in the bouncer submitter mozharness configs...
(In reply to Ben Hearsum [:bhearsum] from comment #42) > (In reply to Nick Thomas [:nthomas] from comment #39) > > Comment on attachment 8568604 [details] [diff] [review] > > fix update skipping logic; candidates dir bouncer entries > > > > Review of attachment 8568604 [details] [diff] [review]: > > ----------------------------------------------------------------- > > > > Interdiff looks good, but I think you need just one fix which I missed > > earlier (see below). Then I wondered if we have two uptake checkers, one > > each for beta and another for release as they need to fire at different > > times, and it appears not. So either not check the beta products in the > > release uptake and rely on update verify to throw errors, or add a new > > checker (might be fun to schedule!). > > Good catch, I'll fix this up in some way....probably be adding an extra > uptake checker. Turns out this was me missing the last couple of blocks that looked for verifyConfigs, oops. The fix for this part is pretty easy, and results in a more sensible scheduler graph AFAICT: http://people.mozilla.org/~bhearsum/release-schedulers-multichannel-with-uptake.png You can see the two uptake schedulers near the bottom right. Both are blocked on push_to_mirrors, and each is blocked on its own "updates" builder as well. Unfortunately, I don't have triggerable schedulers graphing correctly, so we have to trust that their downstreams are accurate based on the code... Sorry for the delay on getting back to this, I didn't mean to let it sit so long.
Attachment #8568604 - Attachment is obsolete: true
Attachment #8575487 - Flags: review?(nthomas)
Is https://github.com/bhearsum/buildbotcustom/compare/6d737af...34b2fa1 the right interdiff to look at ? That excludes the merge of latest changes but otherwise is the recent changes on your balrog-rc branch. I was surprised to see <blah>_beta_uptake_check scheduler have <blah2>_push_to_mirrors as an upstream. And no changes to which products are being checked in each uptake job. It may be necessary to split the beta bouncer locations out to another bouncer config file, and put the bouncer config in the releaseConfig["updateChannels"] dict. The bouncer checking has a bunch of hardcoded expectations, which might creep the scope further :-S Or I might be on some other planet altogether from what you're trying to do here.
(In reply to Nick Thomas [:nthomas] from comment #44) > Is https://github.com/bhearsum/buildbotcustom/compare/6d737af...34b2fa1 > the right interdiff to look at ? That excludes the merge of latest changes > but otherwise is the recent changes on your balrog-rc branch. Yeah, that's the right interdiff. > I was surprised to see <blah>_beta_uptake_check scheduler have > <blah2>_push_to_mirrors as an upstream. Good point...that logic needs to look at requireMirror and not add that upstream for beta. > And no changes to which products are > being checked in each uptake job. It may be necessary to split the beta > bouncer locations out to another bouncer config file, and put the bouncer > config in the releaseConfig["updateChannels"] dict. The bouncer checking has > a bunch of hardcoded expectations, which might creep the scope further :-S > Or I might be on some other planet altogether from what you're trying to do > here. I _think_ the Bouncer stuff is closer than you think, modulo the fact that we look for release partials in the candidates dir and beta partials in the release dir. My read of https://github.com/bhearsum/buildbotcustom/blob/34b2fa104d7fe35b1dfe46824698bd581389be69/scheduler.py#L292 has us passing, eg, "37.0build1" in the version field for the beta scheduler, and "37.0" for the release one. This ends up with products such as: Firefox-37.0build1-Complete Firefox-37.0build1-Partial-37.0b10build1 However, it looks like appendBuildNumber isn't getting passed for the beta version, which looks like another case of needing to pay attention to requireMirrors.
Attached patch additional scheduling fixes (obsolete) — Splinter Review
I think this addresses the comments...but I went a little further with it. Normally I'd avoid complicating a bug like this, but it's so hard to get into the headspace for this scheduler work that I figure it's worth it in this instance. It become clear to me while adjusting the update schedulers that "almost ready for release" only exists because we used to have different uptake requirements for final verify vs. release. This hasn't been the case for a long time, so I killed that builder+scheduler. I also added "antivirus" to the ready for release upstreams, which is something I've been meaning to do for awhile. I updated the scheduler graph: http://people.mozilla.org/~bhearsum/release-schedulers-multichannel-with-uptake.png And here's a new compare link: https://github.com/bhearsum/buildbotcustom/compare/6d737af...balrog-rc The graph is a bit of a mess, but you can see that push to mirrors no longer blocks anything in the beta update chain, and the "ready for release" upstreams have been adjusted as described above.
Attachment #8575487 - Attachment is obsolete: true
Attachment #8575487 - Flags: review?(nthomas)
Attachment #8576698 - Flags: review?(nthomas)
Comment on attachment 8576698 [details] [diff] [review] additional scheduling fixes The new graph and code looked fine, but I didn't quite wrap my head around post_antivirus_builders doing a mirror push in some circumstances, and the bouncer check depending on antivirus. I'm going to assume you thought about that and it's OK. The issue with the bouncer check actually checks remains though, since I didn't explain myself very well. I agree you're passing in different versions, but at http://hg.mozilla.org/build/tools/file/default/lib/python/util/tuxedo.py#l66 there is no way to turn off checking for installers. Hence the uptake check will never complete in the release on beta case.
Attachment #8576698 - Flags: review?(nthomas) → feedback+
(In reply to Nick Thomas [:nthomas] from comment #47) > Comment on attachment 8576698 [details] [diff] [review] > additional scheduling fixes > > The new graph and code looked fine, but I didn't quite wrap my head around > post_antivirus_builders doing a mirror push in some circumstances I don't think I changed anything around this. If you're talking about the context around https://github.com/bhearsum/buildbotcustom/compare/master...balrog-rc#diff-83e0ea542c6bd3b915e668b2b593aceeL1821, I think we're in the first block for Fennec (where we have no AV builder), but the second block for Firefox/Thunderbird. > and the > bouncer check depending on antivirus. I'm going to assume you thought about > that and it's OK. I think you're talking about this part here: https://github.com/bhearsum/buildbotcustom/compare/master...balrog-rc#diff-83e0ea542c6bd3b915e668b2b593aceeR1811 If so, that's the "ready_for_release" builder depending on antivirus, not the bouncer check: https://github.com/bhearsum/buildbotcustom/compare/master...balrog-rc#diff-83e0ea542c6bd3b915e668b2b593aceeR1836 > The issue with the bouncer check actually checks remains though, since I > didn't explain myself very well. I agree you're passing in different > versions, but at > http://hg.mozilla.org/build/tools/file/default/lib/python/util/tuxedo.py#l66 > there is no way to turn off checking for installers. Hence the uptake check > will never complete in the release on beta case. Aaaaah, okay. I get it now. Thanks for spelling it out for me. These next two patches should address it. Here's the new compare links of just these changes: https://github.com/bhearsum/buildbotcustom/compare/818d140...balrog-rc https://github.com/mozilla/build-tools/compare/d1e3639...bhearsum:balrog-rc?expand=1
Attachment #8568603 - Attachment is obsolete: true
Attachment #8576698 - Attachment is obsolete: true
Attachment #8577436 - Flags: review?(nthomas)
Comment on attachment 8577436 [details] [diff] [review] make installer checking optional in get_release_uptake The interdiffs look good to me, thanks for making that easy.
Attachment #8577436 - Flags: review?(nthomas) → review+
Attachment #8577437 - Flags: review?(nthomas) → review+
Attachment #8577437 - Flags: checked-in+
Attachment #8577436 - Flags: checked-in+
Attachment #8568606 - Flags: checked-in+
Attachment #8568605 - Flags: checked-in+
Turns out I didn't test my latest changes with Fennec enabled :S. It's currently in a state where it has updates being pushed, but no test channels, so there's no update verify configs. This should be a fine stopgap measure, which passes my local checkconfig now...
Attachment #8578027 - Flags: review?(rail)
Attachment #8578027 - Flags: review?(rail) → review+
Attachment #8578027 - Flags: checked-in+
I had a couple of follow-ups in buildbotcustom to fix up builder name shortening due to bug 1143683, but this is landed now, and a reconfig is in progress.
Things went generally well with yesterday's Beta that included these patches, except that "ready for beta-cdntest" and "final verify" didn't fire, because of a mismatch in naming between the new "ready for beta-cdntest" scheduler + builder. We can see the new scheduler being added to a trigger builder here: https://github.com/mozilla/build-buildbotcustom/commit/2d6690a1609bc3ab33ad081beb7b62087d7eecec#diff-83e0ea542c6bd3b915e668b2b593aceeR1577 ...as builderPrefix('ready-for-%s' % updateConfig["cdnTestChannel"]) But the scheduler is created later: https://github.com/mozilla/build-buildbotcustom/commit/2d6690a1609bc3ab33ad081beb7b62087d7eecec#diff-83e0ea542c6bd3b915e668b2b593aceeR1824 ...as name=builderPrefix('%s_ready-for-%s' % (channel, updateConfig["cdnTestChannel"])) The latter is what we need, because we want channel-unique schedulers. This patch should fix things up.
Attachment #8578617 - Flags: review?(rail)
Attachment #8578617 - Flags: review?(rail) → review+
Attachment #8578617 - Flags: checked-in+
Hit a bug starting 36.0.2: Traceback (most recent call last): File "release-runner.py", line 474, in <module> main(options) File "release-runner.py", line 396, in main ssh_username=hg_username, ssh_key=hg_ssh_key) File "/builds/releaserunner/tools/lib/python/util/hg.py", line 559, in apply_and_push changer(localrepo, 1) File "release-runner.py", line 374, in process_configs productionBranch=buildbot_configs_branch) File "release-runner.py", line 144, in bump_configs releaseConfig = substituteReleaseConfig(template, **subs) File "/builds/releaserunner/tools/lib/python/release/config.py", line 17, in substituteReleaseConfig appVersion=appVersion, **other) File "/builds/releaserunner/lib/python2.7/site-packages/jinja2/environment.py", line 894, in render return self.environment.handle_exception(exc_info, True) File "<template>", line 151, in top-level template code jinja2.exceptions.UndefinedError: 'betaChannelEnabled' is undefined notify_to: contains invalid character ':' I've hard disabled on the production branch with https://hg.mozilla.org/build/buildbot-configs/rev/cf2ce6e8fa20. NB: didn't land on default, so reconfig doing a merge from default and then a build2 could fail the same way.
(In reply to Nick Thomas [:nthomas] from comment #59) > Hit a bug starting 36.0.2: > > Traceback (most recent call last): > File "release-runner.py", line 474, in <module> > main(options) > File "release-runner.py", line 396, in main > ssh_username=hg_username, ssh_key=hg_ssh_key) > File "/builds/releaserunner/tools/lib/python/util/hg.py", line 559, in > apply_and_push > changer(localrepo, 1) > File "release-runner.py", line 374, in process_configs > productionBranch=buildbot_configs_branch) > File "release-runner.py", line 144, in bump_configs > releaseConfig = substituteReleaseConfig(template, **subs) > File "/builds/releaserunner/tools/lib/python/release/config.py", line 17, > in substituteReleaseConfig > appVersion=appVersion, **other) > File > "/builds/releaserunner/lib/python2.7/site-packages/jinja2/environment.py", > line 894, in render > return self.environment.handle_exception(exc_info, True) > File "<template>", line 151, in top-level template code > jinja2.exceptions.UndefinedError: 'betaChannelEnabled' is undefined Hm, the lines on this traceback don't match up to the current code. It looks like I didn't update and restart release runner after landing, boo :(. I've updated and restarted release runner for this. > notify_to: contains invalid character ':' This one is more confusing....and might not be relevant. That line looks like this in the config: notify_to: release-automation-notifications@mozilla.com Which doesn't have a ":". I'll look into this some more.
(In reply to Ben Hearsum [:bhearsum] from comment #60) > This one is more confusing....and might not be relevant. That line looks > like this in the config: > notify_to: release-automation-notifications@mozilla.com > > Which doesn't have a ":". I'll look into this some more. Looks like this is probably coming from sendmail: http://www.jethrocarr.com/2012/06/10/mailx-contains-invalid-character-2/. It looks like release-runner.sh doesn't handle e-mail addresses without <> in them: https://github.com/mozilla/build-tools/blob/master/buildfarm/release/release-runner.sh#L42. I filed bug 1146336 for that. With both of these errors fixed I backed out http://hg.mozilla.org/build/buildbot-configs/rev/cf2ce6e8fa20. RC is coming soon, so hopefully this will work there....
Things seem to have mostly worked for 37.0 RC1: all of the correct updates were published, update verify/final verify passes on beta-localtest/cdntest. However, it looks like I broke the TriggerBouncerCheck scheduler: 28310 2015-03-25 02:45:50-0700 [-] Unhandled error in Deferred: 28311 2015-03-25 02:45:50-0700 [-] Unhandled Error 28312 Traceback (most recent call last): 28313 File "/builds/buildbot/build1/lib/python2.7/site-packages/twisted/internet/base.py", line 1174, in mainLoop 28314 self.runUntilCurrent() 28315 File "/builds/buildbot/build1/lib/python2.7/site-packages/twisted/internet/base.py", line 796, in runUntilCurrent 28316 call.func(*call.args, **call.kw) 28317 File "/builds/buildbot/build1/lib/python2.7/site-packages/twisted/internet/task.py", line 163, in start 28318 self() 28319 File "/builds/buildbot/build1/lib/python2.7/site-packages/twisted/internet/task.py", line 194, in __call__ 28320 d = defer.maybeDeferred(self.f, *self.a, **self.kw) 28321 --- <exception caught here> --- 28322 File "/builds/buildbot/build1/lib/python2.7/site-packages/twisted/internet/defer.py", line 125, in maybeDeferred 28323 result = f(*args, **kw) 28324 File "/builds/buildbot/build1/lib/python2.7/site-packages/buildbotcustom/scheduler.py", line 306, in poll 28325 for partialVersion, info in self.releaseConfig.get("partialUpdates").iteritems(): 28326 exceptions.AttributeError: TriggerBouncerCheck instance has no attribute 'releaseConfig' 28327 I think this boils down to a stupid typo (releaseConfig vs. release_config)...
It would be good to try and get this in before the next release starts, AFAICT it is affecting all releases that have updates....
Attachment #8583096 - Flags: review?(rail)
Attachment #8583096 - Flags: review?(rail) → review+
Attachment #8583096 - Flags: checked-in+
TriggerBouncerCheck is still broken here =\: 23885 2015-03-27 01:15:32-0700 [-] Unhandled error in Deferred: 23886 2015-03-27 01:15:32-0700 [-] Unhandled Error 23887 Traceback (most recent call last): 23888 File "/builds/buildbot/build1/lib/python2.7/site-packages/twisted/internet/base.py", line 1174, in mainLoop 23889 self.runUntilCurrent() 23890 File "/builds/buildbot/build1/lib/python2.7/site-packages/twisted/internet/base.py", line 796, in runUntilCurrent 23891 call.func(*call.args, **call.kw) 23892 File "/builds/buildbot/build1/lib/python2.7/site-packages/twisted/internet/task.py", line 163, in start 23893 self() 23894 File "/builds/buildbot/build1/lib/python2.7/site-packages/twisted/internet/task.py", line 194, in __call__ 23895 d = defer.maybeDeferred(self.f, *self.a, **self.kw) 23896 --- <exception caught here> --- 23897 File "/builds/buildbot/build1/lib/python2.7/site-packages/twisted/internet/defer.py", line 125, in maybeDeferred 23898 result = f(*args, **kw) 23899 File "/builds/buildbot/build1/lib/python2.7/site-packages/buildbotcustom/scheduler.py", line 307, in poll 23900 partialVersions.append("%sbuild%s" % partialVersion, info["buildNumber"]) 23901 exceptions.TypeError: not enough arguments for format string 23902
Attachment #8584534 - Flags: review?(rail)
Attachment #8584534 - Flags: review?(rail) → review+
Attachment #8584534 - Flags: checked-in+
I'm calling this fixed now. It worked for 37.0build2 and 38.0b1 AFAICT. If any issues come up we can file follow-ups for them. Big thanks to Nick for all the reviews - they caught lots of issues that would've busted production.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
I filed bug 1149544 as a follow-up to fix the bouncer side of things to only submit/look for the necessary entries for each channel.
Depends on: 1201383
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: