Uploads are broken on try for desktop and mobile builds since S3 migration

Status

RESOLVED FIXED
3 years ago
3 years ago

People

(Reporter: nthomas, Unassigned)

Tracking

unspecified

Firefox Tracking Flags

(firefox44 fixed)

Details

Attachments

(1 attachment, 1 obsolete attachment)

(Reporter)

Description

3 years ago
Not all jobs show this, because some pushes are based on older code, but this is a fair example:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=dc9691794015

The busted buildbot builds for Firefox and Fennec hit this when starting 'make upload':

subprocess.CalledProcessError: Command '['ssh', '-o', 'IdentityFile=/home/mock_mozilla/.ssh/trybld_dsa', 'trybld@upload.ffxbld.productdelivery.prod.mozaws.net', 'mktemp -d']' returned non-zero exit status 255
(Reporter)

Comment 1

3 years ago
Created attachment 8676707 [details] [diff] [review]
[gecko] Workaround

The problem is that the upload host should be upload.trybld.productdelivery.prod.mozaws.net (instead of upload.ffxbld.pr...). This is specific to try with its level 1 access, and all the other branches use upload.ffxbld.

Attached is a work-around for developers to use until we fix this. It should not land in m-c or friends!
(Reporter)

Comment 2

3 years ago
I was hoping I could fix this with a patch like so:

diff --git a/testing/mozharness/configs/builds/branch_specifics.py b/testing/mozharness/configs/builds/branch_specifics.py
--- a/testing/mozharness/configs/builds/branch_specifics.py
+++ b/testing/mozharness/configs/builds/branch_specifics.py
@@ -66,20 +66,21 @@ config = {
     },
     'try': {
         'repo_path': 'try',
         'clone_by_revision': True,
         'clone_with_purge': True,
         'tinderbox_build_dir': '%(who)s-%(got_revision)s',
         'to_tinderbox_dated': False,
         'include_post_upload_builddir': True,
         'release_to_try_builds': True,
         'use_branch_in_symbols_extra_buildid': False,
+        'stage_server': 'upload.trybld.productdelivery.prod.mozaws.net',
         'stage_username': 'trybld',
         'stage_ssh_key': 'trybld_dsa',
         'branch_supports_uploadsymbols': False,
...

However, the precedence order at https://dxr.mozilla.org/mozilla-central/source/testing/mozharness/mozharness/mozilla/building/buildbase.py#225 means the production config will override the special branch value, so we still get upload.ffxbld. Making branch trump prod/staging would fix the issue here, but could bust other parameters that rely on the current order.
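
To illustrate the precedence problem, here is a minimal sketch that assumes the layers are merged like plain dicts, with later layers overriding earlier ones. It is not the actual buildbase.py logic; `merge_configs`, `branch_cfg` and `pool_cfg` are hypothetical names, and only the two hostnames and `stage_username` come from this bug.

```python
# Hypothetical, simplified model of layered config precedence.
# The real ordering lives in buildbase.py; names here are illustrative only.

def merge_configs(layers):
    """Merge dicts in order; a later layer overrides an earlier one."""
    merged = {}
    for layer in layers:
        merged.update(layer)
    return merged

# Assumed stand-ins for the branch-specific and production pool configs.
branch_cfg = {
    "stage_server": "upload.trybld.productdelivery.prod.mozaws.net",
    "stage_username": "trybld",
}
pool_cfg = {
    "stage_server": "upload.ffxbld.productdelivery.prod.mozaws.net",
}

# Branch applied first, pool last: the pool's stage_server wins.
current = merge_configs([branch_cfg, pool_cfg])

# Order swapped: the branch's stage_server wins, but so would any other
# key that a branch happens to share with the pool config.
swapped = merge_configs([pool_cfg, branch_cfg])
```

Under this model, `current["stage_server"]` is the ffxbld host even though the branch sets trybld, which matches the behaviour described above.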
(Reporter)

Comment 4

3 years ago
Created attachment 8676719 [details] [diff] [review]
[gecko] Hack to fix try

Here's a hack to fix this, landed as bustage fix at
  https://hg.mozilla.org/integration/mozilla-inbound/rev/0ee21e8d5ca6

Devs, push your changes on top of this (or a descendant, obvs).
Attachment #8676719 - Flags: checked-in+
(Reporter)

Comment 5

3 years ago
Jordan, how could we do this better?
Severity: critical → normal
Flags: needinfo?(jlund)
(Reporter)

Comment 6

3 years ago
Did a try push to verify this actually works - https://treeherder.mozilla.org/#/jobs?repo=try&revision=c7073e6ceb85 - and stage_server is set properly. Also checked that an inbound push is unchanged.
(Reporter)

Updated

3 years ago
Attachment #8676719 - Attachment is obsolete: true
(Reporter)

Updated

3 years ago
Attachment #8676719 - Attachment is obsolete: false
(Reporter)

Updated

3 years ago
Attachment #8676707 - Attachment is obsolete: true
I'm hitting conflicts trying to get this onto anything beta or older.
Flags: needinfo?(nthomas)
(In reply to Nick Thomas [:nthomas] from comment #5)
> Jordan, how could we do this better ?

hrm, how about a 'try' pool in build_pool_specifics that is a dupe of 'production' but has its own stage_server.

benefits
1) wouldn't require change in hierarchy configs
2) wouldn't require 'stage_server' explicitly set in each branch

downsides:
1) we would also have to patch all the script calls for our builds in Buildbot and Taskcluster for try specifically.
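
A rough sketch of that idea, assuming the pools are plain dicts keyed by pool name. The table contents and the `with_derived_pool` helper are hypothetical; only the two hostnames are taken from this bug, and the real build_pool_specifics may be shaped differently.

```python
import copy

# Hypothetical pool table. Only the two hostnames come from this bug;
# every other key and value is a placeholder.
POOLS = {
    "production": {
        "stage_server": "upload.ffxbld.productdelivery.prod.mozaws.net",
        "example_setting": "placeholder",
    },
}

def with_derived_pool(pools, base, name, overrides):
    """Return a copy of pools with `name` added as `base` plus overrides."""
    result = copy.deepcopy(pools)
    derived = copy.deepcopy(pools[base])
    derived.update(overrides)
    result[name] = derived
    return result

# A 'try' pool that duplicates 'production' except for stage_server.
pools_with_try = with_derived_pool(
    POOLS, "production", "try",
    {"stage_server": "upload.trybld.productdelivery.prod.mozaws.net"},
)
```

The 'production' entry is left untouched, so no branch needs its own 'stage_server'; the cost is that callers have to ask for the 'try' pool explicitly.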
Flags: needinfo?(jlund)
https://hg.mozilla.org/mozilla-central/rev/0ee21e8d5ca6
Status: NEW → RESOLVED
Last Resolved: 3 years ago
status-firefox44: --- → fixed
Resolution: --- → FIXED
(Reporter)

Comment 12

3 years ago
(In reply to Wes Kocher (:KWierso) from comment #9)
> I'm hitting conflicts trying to get this onto anything beta or older.

I actually need to merge attachment 8672532 [details] [diff] [review] (bug 1213721) and the patch here to beta/release/esr38. It's not super urgent because we're not using mozharness for the builds there; it only matters in case someone pushes to try.
Flags: needinfo?(nthomas)
(Reporter)

Comment 13

3 years ago
(In reply to Jordan Lund (:jlund) from comment #10)
> hrm, how about a 'try' pool in build_pool_specifics that is a dupe of
> 'production' but has its own stage_server.

I like it. Maybe go a step further to allow for staging try as well, i.e.
   staging, production, staging-try, production-try
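
One possible way to pick between those four pools, as a sketch only. The pool names are the ones listed above; the `pool_name` helper and its arguments are hypothetical and do not exist in mozharness.

```python
# Hypothetical helper: the four pool names come from the comment above,
# but the function and its arguments are illustrative assumptions.
def pool_name(environment, branch):
    """Map ('staging' or 'production', branch) to one of the four pool names."""
    if environment not in ("staging", "production"):
        raise ValueError("unknown environment: %r" % (environment,))
    if branch == "try":
        return environment + "-try"
    return environment
```

For example, `pool_name("production", "try")` selects the production-try pool, while any other branch stays on plain staging or production.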
(Reporter)

Comment 14

3 years ago
Spun that out to bug 1217271.