flip the switch to move beta users over to balrog

RESOLVED FIXED

Status

Release Engineering
Release Automation
RESOLVED FIXED
3 years ago
3 years ago

People

(Reporter: bhearsum, Assigned: bhearsum)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(2 attachments, 1 obsolete attachment)

(Assignee)

Description

3 years ago
We'll need to adjust the conditionals that guard against accidental uplift to pass for Beta, but fail for Release and ESR:
http://hg.mozilla.org/mozilla-central/file/f78e532e8a10/browser/app/profile/firefox.js#l145
http://hg.mozilla.org/mozilla-central/file/f78e532e8a10/browser/app/profile/firefox.js#l188
http://hg.mozilla.org/mozilla-central/file/f78e532e8a10/browser/metro/profile/metro.js#l475
http://mxr.mozilla.org/comm-central/source/mail/app/profile/all-thunderbird.js#83
http://mxr.mozilla.org/comm-central/source/mail/app/profile/all-thunderbird.js#124

Nothing to do for Android -- we don't serve updates to Beta/Release there, so we did a straight switch to aus4.mozilla.org for all Android builds.
(Assignee)

Comment 1

3 years ago
There doesn't seem to be a great way to distinguish betas from releases in the build system. MOZ_UPDATE_CHANNEL might be the best way, especially because this is temporary.
(Assignee)

Comment 2

3 years ago
Created attachment 8456204 [details] [diff] [review]
change update server for everything except release and esr

Gavin, per the thread on r-d this is something I plan to land sometime early in the beta cycle (hopefully between beta 1 and 2). This new #if isn't ideal: it means that on-change builds for release/esr will have app.update.url set to aus4.m.o, because they don't have MOZ_UPDATE_CHANNEL set. I don't think that's harmful, because we don't serve updates to those users (there are no nightlies), but it's a bit weird. I couldn't find a better way to express "is not built from release or esr source code" than this, because we don't have RELEASE_BUILD or NIGHTLY_BUILD type vars for this situation.

I tested this manually:
➜  gecko  PYTHONPATH=python/mozbuild python -m mozbuild.action.preprocessor browser/app/profile/firefox.js -DAB_CD=foo  | grep app.update.url
pref("app.update.url", "https://aus4.mozilla.org/update/3/%PRODUCT%/%VERSION%/%BUILD_ID%/%BUILD_TARGET%/%LOCALE%/%CHANNEL%/%OS_VERSION%/%DISTRIBUTION%/%DISTRIBUTION_VERSION%/update.xml");

➜  gecko  PYTHONPATH=python/mozbuild python -m mozbuild.action.preprocessor browser/app/profile/firefox.js -DAB_CD=foo -DMOZ_UPDATE_CHANNEL=release | grep app.update.url
pref("app.update.url", "https://aus3.mozilla.org/update/3/%PRODUCT%/%VERSION%/%BUILD_ID%/%BUILD_TARGET%/%LOCALE%/%CHANNEL%/%OS_VERSION%/%DISTRIBUTION%/%DISTRIBUTION_VERSION%/update.xml");

➜  gecko  PYTHONPATH=python/mozbuild python -m mozbuild.action.preprocessor browser/app/profile/firefox.js -DAB_CD=foo -DMOZ_UPDATE_CHANNEL=esr | grep app.update.url
pref("app.update.url", "https://aus3.mozilla.org/update/3/%PRODUCT%/%VERSION%/%BUILD_ID%/%BUILD_TARGET%/%LOCALE%/%CHANNEL%/%OS_VERSION%/%DISTRIBUTION%/%DISTRIBUTION_VERSION%/update.xml");

➜  gecko  PYTHONPATH=python/mozbuild python -m mozbuild.action.preprocessor browser/app/profile/firefox.js -DAB_CD=foo -DMOZ_UPDATE_CHANNEL=beta | grep app.update.url
pref("app.update.url", "https://aus4.mozilla.org/update/3/%PRODUCT%/%VERSION%/%BUILD_ID%/%BUILD_TARGET%/%LOCALE%/%CHANNEL%/%OS_VERSION%/%DISTRIBUTION%/%DISTRIBUTION_VERSION%/update.xml");
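
The four runs above amount to one rule: only builds with MOZ_UPDATE_CHANNEL set to release or esr stay on aus3, and everything else (including builds with no channel defined) goes to aus4. A minimal Python sketch of that rule, for illustration only. The helper name and constants are hypothetical; the real check is a preprocessor #if in firefox.js, not Python.

```python
from typing import Optional

# Hostnames taken from the preprocessor output above.
AUS3_HOST = "aus3.mozilla.org"
AUS4_HOST = "aus4.mozilla.org"

# Channels that stay on the old server under this patch.
LEGACY_CHANNELS = {"release", "esr"}


def update_host(moz_update_channel: Optional[str]) -> str:
    """Return the update host a build would get (hypothetical helper)."""
    if moz_update_channel in LEGACY_CHANNELS:
        return AUS3_HOST
    # Covers beta, and builds with no MOZ_UPDATE_CHANNEL defined at all.
    return AUS4_HOST


if __name__ == "__main__":
    for channel in (None, "release", "esr", "beta"):
        print(channel, "->", update_host(channel))
```

Note that an undefined channel falls through to aus4, which is the "a bit weird" on-change build case described above.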
Assignee: nobody → bhearsum
Status: NEW → ASSIGNED
Attachment #8456204 - Flags: review?(gavin.sharp)
(Assignee)

Comment 3

3 years ago
Created attachment 8456248 [details] [diff] [review]
switch thunderbird betas to balrog, too
Attachment #8456248 - Flags: review?(standard8)
Comment on attachment 8456248 [details] [diff] [review]
switch thunderbird betas to balrog, too

Looks good (I'm going by the manual testing). a=me for landing on whichever comm-* branches are necessary when you're ready.
Attachment #8456248 - Flags: review?(standard8) → review+
(Assignee)

Comment 5

3 years ago
Note to self: backport the patches to mozilla/comm-esr31, so that nightlies on those branches will be using aus4.m.o instead of aus3.m.o.
Comment on attachment 8456204 [details] [diff] [review]
change update server for everything except release and esr

How about using EARLY_BETA_OR_EARLIER?

That won't be defined in the last few betas, but maybe that's OK? Might actually be a feature to have the switchover be a bit before release...
(Assignee)

Comment 7

3 years ago
(In reply to :Gavin Sharp [email: gavin@gavinsharp.com] from comment #6)
> Comment on attachment 8456204 [details] [diff] [review]
> change update server for everything except release and esr
> 
> How about using EARLY_BETA_OR_EARLIER?
> 
> That won't be defined in the last few betas, but maybe that's OK? Might
> actually be a feature to have the switchover be a bit before release...

Unfortunately, this means that users would switch back to aus3 at the start of the 33 Beta cycle, which we definitely don't want. (The switch to aus4 is expected to ride with 33 because we need it for openh264 plugin updates.)
(Assignee)

Comment 8

3 years ago
Comment on attachment 8456204 [details] [diff] [review]
change update server for everything except release and esr

Update from IRC conversation: My current patch won't work for RC builds, because they'll have MOZ_UPDATE_CHANNEL=release. Gavin suggested that we could embed two sets of prefs (one for aus3, one for aus4) and make the decision at runtime based on the update channel. It's pretty icky though. I'm going to talk to IT about the possibility of doing this only for beta users through Zeus first, but we can fall back to Gavin's idea if necessary.
Attachment #8456204 - Attachment is obsolete: true
Attachment #8456204 - Flags: review?(gavin.sharp)
(Assignee)

Updated

3 years ago
Blocks: 1039559
(Assignee)

Comment 9

3 years ago
Comment on attachment 8456248 [details] [diff] [review]
switch thunderbird betas to balrog, too

This patch for Thunderbird should work fine still since we don't ship RCs there. However, if we end up taking the IT solution for Firefox, it'll probably work for Thunderbird for free. I'll leave this patch for now, but it may not get used.
(Assignee)

Comment 10

3 years ago
I spoke with cturra about doing the redirection and he thinks it's doable. We're going to look into it more on Wednesday, after we're clear of the releases shipping this week.
Depends on: 1041745
(Assignee)

Updated

3 years ago
No longer blocks: 1039559
(Assignee)

Comment 11

3 years ago
Per bug 1041745, Chris has set up a test redirect for any requests with "/cturra/" or "/cturra-cdntest/" anywhere in the request path. With that, we can use the distribution or distribution version field of the URL to enable or disable redirection.
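
To make the matching concrete, here is a small Python sketch of the "marker anywhere in the request path" test. It only models the match; the actual redirect is configured on the load balancer and its rule isn't shown in this bug. The function name, build ID and other path values are made up for illustration.

```python
# Markers quoted from the test redirect description above.
REDIRECT_MARKERS = ("/cturra/", "/cturra-cdntest/")


def should_redirect(request_path: str) -> bool:
    """Return True if any marker appears anywhere in the path (hypothetical helper)."""
    return any(marker in request_path for marker in REDIRECT_MARKERS)


if __name__ == "__main__":
    base = "/update/3/Firefox/32.0/20140101000000/Linux_x86_64-gcc3/en-US/beta/Linux"
    # Marker in the distribution field.
    print(should_redirect(base + "/cturra/default/update.xml"))
    # Marker in the distribution version field.
    print(should_redirect(base + "/default/cturra-cdntest/update.xml"))
    # Ordinary request, no marker.
    print(should_redirect(base + "/default/default/update.xml"))
```

Because the marker has to sit between two slashes, putting it in either the distribution or the distribution version segment is enough to trigger the match.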

Separately, I've been putting together some stuff to help us verify that aus3 and aus4 will return the same results for any given beta channel request. We've already got a script that compares requests from two AUS servers (https://github.com/mozilla/balrog/blob/master/scripts/compare-update-requests.py), but we need some good URLs to feed it. I hacked up update verify to do this, and added in a bunch of old releases to update verify configs. I probably still need to do the same for the Thunderbird ones, but my WIP is here: https://github.com/bhearsum/tools/compare/update-verify-hacky-hacky?expand=1

It may be smart to consider dropping a bunch of locales from those tests, otherwise they may take days to run.
(Assignee)

Comment 12

3 years ago
Created attachment 8464029 [details]
beta urls with a limited locale set

This set of update URLs should cover all of the combinations we need to worry about. It covers every beta release of Firefox and Thunderbird going back to 10.0, all 4 platforms, a few locales, and two OS versions per platform. One OS version for each platform was blocked between 10.0 and now, which will help us verify bug 1021022. The other is currently unblocked.

All of the mac tests use Darwin_x86_64-gcc3-u-i386-x86_64 for the build target. I don't think there's ever been a meaningful difference between it and the 32-bit version. PPC support was dropped as of Darwin 10 (http://en.wikipedia.org/wiki/Darwin_%28operating_system%29#Release_history), and since we block everything prior to Darwin 10, I don't think we need any extra rules or tests here. Aka, any supported Darwin version can run the latest version of Firefox. Anyone on an unsupported Darwin version will get no update via the OS blocking rules.
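
For anyone wanting to regenerate a similar list, the combinations can be sketched roughly as below. This is not the script that produced the attachment. The path layout follows the app.update.url pattern quoted in comment 2, but every value (versions, build IDs, OS version strings, locales, the "default" distribution fields) is a sample placeholder, and only two platforms are shown.

```python
from itertools import product
from urllib.parse import quote

# Path layout follows the app.update.url pattern from comment 2.
PATH_TEMPLATE = (
    "/update/3/{product}/{version}/{build_id}/{build_target}/{locale}/"
    "{channel}/{os_version}/{distribution}/{distribution_version}/update.xml"
)

# Sample values only; the real list is in the attachment.
RELEASES = [
    ("Firefox", "10.0", "20120101000000"),
    ("Thunderbird", "10.0", "20120101000000"),
]
# Build target -> two OS versions each (placeholder strings).
PLATFORMS = {
    "Darwin_x86_64-gcc3-u-i386-x86_64": ["Darwin 9.8.0", "Darwin 13.3.0"],
    "Linux_x86_64-gcc3": ["Linux 2.6.32", "Linux 3.13.0"],
}
LOCALES = ["en-US", "de"]


def build_paths(channel="beta"):
    """Return one update path per release/locale/platform/OS combination."""
    paths = []
    for (name, version, build_id), locale in product(RELEASES, LOCALES):
        for build_target, os_versions in PLATFORMS.items():
            for os_version in os_versions:
                paths.append(PATH_TEMPLATE.format(
                    product=name,
                    version=version,
                    build_id=build_id,
                    build_target=build_target,
                    locale=locale,
                    channel=channel,
                    os_version=quote(os_version),  # spaces become %20
                    distribution="default",
                    distribution_version="default",
                ))
    return paths


if __name__ == "__main__":
    for path in build_paths():
        print(path)
```

The OS versions hang off the build target rather than being a flat list, since each platform has its own pair of blocked and unblocked versions.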

Currently, the snippet comparison tool gives the following results:
Tested 11488 paths.
Pass count: 4293
Fail count: 7195
Error count: 0
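
For tracking progress between runs, those counts reduce to a pass percentage (about 37% here). A tiny sketch; the helper is hypothetical and not part of the comparison tool.

```python
def pass_rate(passed: int, failed: int, errors: int = 0) -> float:
    """Return the percentage of tested paths that passed."""
    total = passed + failed + errors
    if total == 0:
        raise ValueError("no paths were tested")
    return 100.0 * passed / total


if __name__ == "__main__":
    # Counts from the run above.
    print(round(pass_rate(4293, 7195, 0), 1))
```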

After bug 1045583 is fixed, we'll have a better pass rate. We still need bug 1021022 + associated rules to land to hit 100% though. (Right now, all updates are pointing to 32.0b1, instead of the 29.0b8 watershed for older versions.)
(Assignee)

Comment 13

3 years ago
With bug 1041745 fixed, we've got a better pass rate:
Tested 11488 paths.
Pass count: 5073
Fail count: 6415
Error count: 0

I went over the remaining failures and found a few more things:
* Lack of watershed updates for Firefox - fixed by bug 1049550
* Thunderbird detailsURLs - won't be fixing this because it requires Balrog changes, and we'll want to change it back later anyways.
* Thunderbird candidates dir - this will get fixed when 33.0b1 runs
* SHA512 vs. sha512 in the snippets - I've simply modified the snippet comparison tool to make these consistent. aus3 uses uppercase and lowercase in different places, so it's hard to make balrog consistent here.

We're getting pretty close now!
(Assignee)

Updated

3 years ago
Depends on: 1065445
(Assignee)

Updated

3 years ago
Depends on: 1065639
(Assignee)

Updated

3 years ago
Depends on: 1068179
(Assignee)

Comment 14

3 years ago
(In reply to Ben Hearsum [:bhearsum] from comment #13)
> With bug 1041745 fixed, we've got a better pass rate:
> Tested 11488 paths.
> Pass count: 5073
> Fail count: 6415
> Error count: 0
> 
> I went over the remaining failures and found a few more things:
> * Lack of watershed updates for Firefox - fixed by bug 1049550
> * Thunderbird detailsURLs - won't be fixing this because it requires Balrog
> changes, and we'll want to change it back later anyways.
> * Thunderbird candidates dir - this will get fixed when 33.0b1 runs
> * SHA512 vs. sha512 is the snippets - I've simply modified the snippet
> comparison tool to make these consistent. aus3 uses uppercase and lowercase
> in different places, so it's hard to make balrog consistent here.
> 
> We're getting pretty close now!

We're now at:
Tested 11488 paths.
Pass count: 9371
Fail count: 2115

100% of the Firefox tests pass, and I _think_ the Thunderbird failures are just the ones noted above. I'll dive into that more tomorrow.
(Assignee)

Comment 15

3 years ago
(In reply to Ben Hearsum [:bhearsum] from comment #14)
> (In reply to Ben Hearsum [:bhearsum] from comment #13)
> > With bug 1041745 fixed, we've got a better pass rate:
> > Tested 11488 paths.
> > Pass count: 5073
> > Fail count: 6415
> > Error count: 0
> > 
> > I went over the remaining failures and found a few more things:
> > * Lack of watershed updates for Firefox - fixed by bug 1049550
> > * Thunderbird detailsURLs - won't be fixing this because it requires Balrog
> > changes, and we'll want to change it back later anyways.
> > * Thunderbird candidates dir - this will get fixed when 33.0b1 runs
> > * SHA512 vs. sha512 is the snippets - I've simply modified the snippet
> > comparison tool to make these consistent. aus3 uses uppercase and lowercase
> > in different places, so it's hard to make balrog consistent here.
> > 
> > We're getting pretty close now!
> 
> We're now at:
> Tested 11488 paths.
> Pass count: 9371
> Fail count: 2115
> 
> 100% of the Firefox tests pass, and I _think_ the Thunderbird failures are
> just the ones noted above. I'll dive into that more tomorrow.

We're now at 100% pass after I tweaked the comparison tool to ignore the Thunderbird differences:
Tested 11488 paths.
Pass count: 11488
Fail count: 0
Error count: 0

For posterity, these are the tweaks I made to ignore the sha512 + thunderbird differences:
diff --git a/auslib/util/testing.py b/auslib/util/testing.py
index a7b9036..b8b634c 100644
--- a/auslib/util/testing.py
+++ b/auslib/util/testing.py
@@ -10,11 +10,17 @@ def compare_snippets(url1, url2, retries=3, timeout=10, diff=True):
     xml1 = retry(requests.get, sleeptime=5, attempts=retries, args=(url1,),
                  retry_exceptions=(requests.HTTPError, requests.ConnectionError),
                  kwargs={'timeout': timeout, 'config': cfg})
-    xml1 = xml1.content.splitlines()
+    import re
+    xml1 = xml1.content.replace("SHA512", "sha512")
+    xml1 = re.sub("detailsURL=\"[^\"]*\"", "detailsURL=\"blah\"", xml1)
+    xml1 = xml1.splitlines()
     xml2 = retry(requests.get, sleeptime=5, attempts=retries, args=(url2,),
                  retry_exceptions=(requests.HTTPError, requests.ConnectionError),
                  kwargs={'timeout': timeout, 'config': cfg})
-    xml2 = xml2.content.splitlines()
+    xml2 = xml2.content.replace("SHA512", "sha512")
+    xml2 = xml2.replace("Thunderbird-32.0b1-Complete", "thunderbird-32.0b1-complete")
+    xml2 = re.sub("detailsURL=\"[^\"]*\"", "detailsURL=\"blah\"", xml2)
+    xml2 = xml2.splitlines()
     ret = [url1, xml1, url2, xml2]
     if xml1 != xml2:
         if diff:
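
Pulled out of the diff, the normalization idea looks roughly like the standalone sketch below. It assumes the snippets are already decoded to str (on Python 3, requests' response.content is bytes, so .text or an explicit decode would be needed first). The function names are hypothetical, and the one-off Thunderbird-32.0b1-Complete rename from the diff is left out.

```python
import re

# Matches any detailsURL attribute value so it can be masked out.
DETAILS_URL_RE = re.compile(r'detailsURL="[^"]*"')


def normalize_snippet(xml: str) -> list:
    """Normalize known-harmless differences, then split into lines."""
    # aus3 mixes SHA512 and sha512; treat them as the same.
    xml = xml.replace("SHA512", "sha512")
    # detailsURL differences are expected, so mask the value entirely.
    xml = DETAILS_URL_RE.sub('detailsURL="blah"', xml)
    return xml.splitlines()


def snippets_match(xml1: str, xml2: str) -> bool:
    """Compare two update snippets after normalization."""
    return normalize_snippet(xml1) == normalize_snippet(xml2)


if __name__ == "__main__":
    a = '<update detailsURL="https://a.example/notes">\n<patch hashFunction="SHA512"/>'
    b = '<update detailsURL="https://b.example/other">\n<patch hashFunction="sha512"/>'
    print(snippets_match(a, b))
```

Masking rather than deleting the detailsURL attribute means a snippet that is missing the attribute altogether still shows up as a difference.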
(Assignee)

Comment 16

3 years ago
We flipped the switch about 20 minutes ago. Load is looking fine on the backend nodes and QE's tests are looking good. Leaving this open until they're done and the dust settles.
(Assignee)

Comment 17

3 years ago
Things are still looking good. Catlee suggested looking at Sentry to see what Tracebacks came up, and I've filed a few bugs for them. They're not critical, but we should fix them before pushing to the release channel, I think:
https://bugzilla.mozilla.org/show_bug.cgi?id=1069454
https://bugzilla.mozilla.org/show_bug.cgi?id=1069508
https://bugzilla.mozilla.org/show_bug.cgi?id=1069512
(Assignee)

Comment 18

3 years ago
We flipped the switch live yesterday!
Status: ASSIGNED → RESOLVED
Last Resolved: 3 years ago
Resolution: --- → FIXED