bouncer check (formerly known as "uptake monitoring") fails for en-US builds during 60.0b1

RESOLVED FIXED

Status

defect
RESOLVED FIXED

People

(Reporter: mtabara, Assigned: mtabara)

Tracking

unspecified

Firefox Tracking Flags

(firefox60 fixed)

Details

Attachments

(5 attachments, 2 obsolete attachments)

Seems like bouncer check is failing in the Devedition 60.0b1:

 [task 2018-03-05T11:00:07.395Z] 11:00:07     INFO - [mozharness: 2018-03-05 11:00:07.395767Z] Finished check-bouncer step (failed)
[task 2018-03-05T11:00:07.396Z] 11:00:07    FATAL - Uncaught exception: Traceback (most recent call last):
[task 2018-03-05T11:00:07.396Z] 11:00:07    FATAL -   File "/builds/worker/checkouts/gecko/testing/mozharness/mozharness/base/script.py", line 2059, in run
[task 2018-03-05T11:00:07.396Z] 11:00:07    FATAL -     self.run_action(action)
[task 2018-03-05T11:00:07.396Z] 11:00:07    FATAL -   File "/builds/worker/checkouts/gecko/testing/mozharness/mozharness/base/script.py", line 1998, in run_action
[task 2018-03-05T11:00:07.397Z] 11:00:07    FATAL -     self._possibly_run_method(method_name, error_if_missing=True)
[task 2018-03-05T11:00:07.397Z] 11:00:07    FATAL -   File "/builds/worker/checkouts/gecko/testing/mozharness/mozharness/base/script.py", line 1938, in _possibly_run_method
[task 2018-03-05T11:00:07.397Z] 11:00:07    FATAL -     return getattr(self, method_name)()
[task 2018-03-05T11:00:07.397Z] 11:00:07    FATAL -   File "testing/mozharness/scripts/release/bouncer_check.py", line 141, in check_bouncer
[task 2018-03-05T11:00:07.397Z] 11:00:07    FATAL -     f.result()
[task 2018-03-05T11:00:07.397Z] 11:00:07    FATAL -   File "/builds/worker/checkouts/gecko/build/venv/lib/python2.7/site-packages/concurrent/futures/_base.py", line 422, in result
[task 2018-03-05T11:00:07.397Z] 11:00:07    FATAL -     return self.__get_result()
[task 2018-03-05T11:00:07.397Z] 11:00:07    FATAL -   File "/builds/worker/checkouts/gecko/build/venv/lib/python2.7/site-packages/concurrent/futures/thread.py", line 62, in run
[task 2018-03-05T11:00:07.397Z] 11:00:07    FATAL -     result = self.fn(*self.args, **self.kwargs)
[task 2018-03-05T11:00:07.397Z] 11:00:07    FATAL -   File "testing/mozharness/scripts/release/bouncer_check.py", line 91, in check_url
[task 2018-03-05T11:00:07.397Z] 11:00:07    FATAL -     retry(do_check_url, sleeptime=3, max_sleeptime=10, attempts=3)
[task 2018-03-05T11:00:07.397Z] 11:00:07    FATAL -   File "/builds/worker/checkouts/gecko/build/venv/lib/python2.7/site-packages/redo/__init__.py", line 162, in retry
[task 2018-03-05T11:00:07.397Z] 11:00:07    FATAL -     return action(*args, **kwargs)
[task 2018-03-05T11:00:07.398Z] 11:00:07    FATAL -   File "testing/mozharness/scripts/release/bouncer_check.py", line 86, in do_check_url
[task 2018-03-05T11:00:07.398Z] 11:00:07    FATAL -     r.raise_for_status()
[task 2018-03-05T11:00:07.398Z] 11:00:07    FATAL -   File "/builds/worker/checkouts/gecko/build/venv/lib/python2.7/site-packages/requests/models.py", line 935, in raise_for_status
[task 2018-03-05T11:00:07.398Z] 11:00:07    FATAL -     raise HTTPError(http_error_msg, response=self)
[task 2018-03-05T11:00:07.398Z] 11:00:07    FATAL - HTTPError: 404 Client Error: Not Found for url: https://download.mozilla.org/?product=Devedition-60.0b1&os=win64&lang=en-US

This seems to be isolated to the en-US builds for some reason. The rest of the locales work as expected.
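
For reference, the failing request can be rebuilt outside the task. This is a minimal sketch assuming Python 3 and only the standard library; `bouncer_url` is a hypothetical helper, not part of bouncer_check.py, and the query parameters are the ones visible in the 404 above:

```python
from urllib.parse import urlencode

BOUNCER = "https://download.mozilla.org/"


def bouncer_url(product, os_name, lang):
    """Build the bouncer query URL for one product/platform/locale."""
    query = urlencode({"product": product, "os": os_name, "lang": lang})
    return BOUNCER + "?" + query


# The URL that returned 404 in the log above.
url = bouncer_url("Devedition-60.0b1", "win64", "en-US")
print(url)
```

Requesting that URL while following redirects (for example with `requests.head(url, allow_redirects=True)`) should show whether bouncer resolves the product for a given locale.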

Recent development in bouncer check was done in bug 1398796.
Investigating now ...
Also, an unrelated issue: I happened to glance at view-source:https://www.mozilla.org/en-US/firefox/new/?scene=2 and saw the following:

 <li><a href="https://download-sha1.allizom.org/?product=firefox-stub&amp;os=win&amp;lang=en-US"
               class="download-link button button-green"
               data-link-type="download"
               data-download-version="winsha1"


which looks a lot like we're serving content from a staging instance. Given that it's sha1 stuff on win32, this might actually be expected behavior, but I thought I'd ask to be sure.

@nthomas - before I escalate, any hints as to why we serve the sha1 from the allizom instance?
Flags: needinfo?(nthomas)
Seems like this is larger than just en-US. Others are failing as well.
Looking closer at bouncer ...

https://download.mozilla.org/?product=Devedition-60.0b1-Complete&os=win64&lang=en-US

redirects to 
https://download-installer.cdn.mozilla.net/pub/devedition/releases/60.0b1/updates/win64/ro/firefox-60.0b1.complete.mar

which is WRONG.
The directory in the URL should be "update", not "updates". The artifacts are present under http://archive.mozilla.org/pub/devedition/releases/60.0b1/update/win64/ro/firefox-60.0b1.complete.mar but obviously not under http://archive.mozilla.org/pub/devedition/releases/60.0b1/updates/win64/ro/firefox-60.0b1.complete.mar

I need to understand where that "updates" instead of "update" is coming from. Most likely some config.
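
A quick way to flag this kind of mismatch is to look at the directory segments of the redirect target. The sketch below is only an illustration (Python 3, standard library); `has_wrong_update_dir` is a made-up name and the two URLs are the ones quoted above:

```python
from urllib.parse import urlparse


def has_wrong_update_dir(url):
    """Return True if the URL path contains an 'updates' directory segment."""
    segments = urlparse(url).path.split("/")
    return "updates" in segments


redirect = ("https://download-installer.cdn.mozilla.net/pub/devedition/"
            "releases/60.0b1/updates/win64/ro/firefox-60.0b1.complete.mar")
archive = ("http://archive.mozilla.org/pub/devedition/"
           "releases/60.0b1/update/win64/ro/firefox-60.0b1.complete.mar")

print(has_wrong_update_dir(redirect))  # True
print(has_wrong_update_dir(archive))   # False
```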
Bouncer entries for products/locations are 

For 59.0b14 
/firefox/releases/59.0b13/update/linux-x86_64/:lang/firefox-59.0b13.complete.mar 

while the new 60.0b1 are
/devedition/releases/60.0b1/updates/linux-x86_64/:lang/firefox-60.0b1.complete.mar

Something is fishy, as I know :rail actually uplifted his patches around b11-b12 so these shouldn't be different.
I wonder if it could be related to bug 1437565.
URL: 1437565
See Also: → 1437565
Digging more, it seems that:

`Devedition-60.0b1` fails for en-US across all platforms but works for all other locales which makes no sense since bouncer product is configured to serve "/devedition/releases/60.0b1/${platform}/:lang/Firefox%2060.0b1.dmg"

`Devedition-60.0b1-SSL` the same as above with bouncer configs serving "/devedition/releases/60.0b1/${platform}/:lang/Firefox%2060.0b1.dmg"

`Devedition-60.0b1-Complete` fails for all platforms, all locales, bouncer configs being "/devedition/releases/60.0b1/updates/${platform}/:lang/firefox-60.0b1.complete.mar"

`Devedition-60.0b1-stub` fails for all platforms, all locales, bouncer config being "/devedition/releases/60.0b1/${platform}/:lang/Firefox%20Installer.exe" which seems wrong as the installers are called "Firefox Setup 60.0b1.exe"

No partials have been checked because script gave up.
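
To make the templates above concrete, here is a rough sketch of how one of them expands for a single platform and locale. It assumes Python 3; `expand_location` is hypothetical, `${platform}` is just the shorthand used in this comment, and `:lang` is the placeholder that appears in the bouncer entries. How bouncer itself performs the substitution is not shown in this bug:

```python
def expand_location(template, platform, lang):
    """Fill in a location template for one platform and one locale."""
    return template.replace("${platform}", platform).replace(":lang", lang)


# Template with the expected "update" directory (not "updates").
template = ("/devedition/releases/60.0b1/update/${platform}/:lang/"
            "firefox-60.0b1.complete.mar")

print(expand_location(template, "win64", "ro"))
# /devedition/releases/60.0b1/update/win64/ro/firefox-60.0b1.complete.mar
```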
I guess I was wrong about the "stub installer" above; that's sort of expected. Both win and win64 are supposed to serve the stub installer as "Firefox Installer.exe", while "Firefox Setup $version.exe" is the full installer, which is probably called from within the stub installer, as it's not served via bouncer anyway.

So far, except for the s/"updates"/"update" issue, nothing else seems to be the culprit.
Still digging.
Doh, underlying problem is actually under bouncer submission task. Multiple ones apparently, according to the logs[1]

a) we're using wrong calls to bouncer[2], they should be "$url/api" (specifically without the trailing slash)
b) s/"updates"/"update" within in-tree transforms [3]

[1]: https://taskcluster-artifacts.net/Y86FlD3cSxqo8HiRSvoUdQ/0/public/logs/live_backing.log
[2]: https://hg.mozilla.org/build/puppet/file/tip/modules/bouncer_scriptworker/manifests/settings.pp#l13
[3]: https://hg.mozilla.org/releases/mozilla-beta/file/default/taskcluster/taskgraph/transforms/bouncer_submission.py#l184
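
One defensive option for (a) is to normalise the API root before appending an endpoint, so a stray trailing slash cannot produce a double slash. This is only a sketch, assuming Python 3; the host and the `product_add` endpoint name are placeholders for illustration, not values taken from the puppet config or bouncerscript:

```python
def api_endpoint(api_root, name):
    """Join an API root and an endpoint name with exactly one slash."""
    return api_root.rstrip("/") + "/" + name.lstrip("/")


# Both spellings of the root end up at the same endpoint URL.
print(api_endpoint("https://bouncer.example.com/api/", "product_add"))
print(api_endpoint("https://bouncer.example.com/api", "product_add"))
```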
Question is though, how come we do have products available in bouncer, even though the api calls seem wrong?
See Also: → 1433459
Assignee: nobody → mtabara
Attachment #8956054 - Flags: review?(bugspam.Callek)
Attachment #8956054 - Flags: review?(bugspam.Callek) → review+
The previous in-tree patch is the real fix, but in order to avoid a build4 let's hack bouncerscript to do this change. We'll back it out after that.

You only need to review the last commit, e33d17f; the rest of the code is already in production but hasn't been merged yet.
Attachment #8956060 - Flags: review?(jlorenzo)
Attachment #8956060 - Flags: review?(bugspam.Callek)
(In reply to Mihai Tabara [:mtabara]⌚️GMT from comment #6)
> Doh, underlying problem is actually under bouncer submission task. Multiple
> ones apparently, according to the logs[1]
> 
> a) we're using wrong calls to bouncer[2], they should be "$url/api"
> (specifically without the trailing slash)

This is now fixed in puppet and deployed to the production nodes.

> b) s/"updates"/"update" within in-tree transforms [3]

real in-tree patch fix is  https://bugzilla.mozilla.org/attachment.cgi?id=8956059
temp hack to unblock 60.0b1 is a massage in bouncerscript - https://bugzilla.mozilla.org/attachment.cgi?id=8956060

After both are landed, we need to rerun the bouncer submission job + bouncer check after that.
The only problem left is what do I do with the existing products in bouncer. I'm leaning towards deleting them altogether

That's .. all "Devedition-60.0b1*" products from bouncer.
I'll double check with :rail that's a smart move to do :)
Attachment #8956060 - Flags: review?(bugspam.Callek) → review+
Comment on attachment 8956059 [details]
Bug 1443104 - fix update folder in bouncer submission release job.

https://reviewboard.mozilla.org/r/224998/#review230960
Attachment #8956059 - Flags: review?(bugspam.Callek) → review+
(In reply to Mihai Tabara [:mtabara]⌚️GMT from comment #12)
> After both are landed, we need to rerun the bouncer submission job + bouncer
> check after that.
> The only problem left is what do I do with the existing products in bouncer.
> I'm leaning towards deleting them altogether
> 
> That's .. all "Devedition-60.0b1*" products from bouncer.
> I'll double check with :rail that's a smart move to do :)


Rail confirmed to me on irccloud that we've done this before, so I'll go ahead with this.
Both fixes are now in production, so flushing the existing Devedition-60.0b1* products from bouncer is the only thing left before rerunning the tasks.
Just noticed another issue while inspecting the tasks. Apparently 'en-US' is not part of the locale list, which is why en-US has been failing across the board. See the locales definition in the task payload [1].

[1]: https://tools.taskcluster.net/groups/SdybP-zoSbaWp1x6FdHW1A/tasks/Y86FlD3cSxqo8HiRSvoUdQ/details
(In reply to Mihai Tabara [:mtabara]⌚️GMT from comment #15)
> Rail confirmed me in irccloud that we've done this before so I'll go on with
> this.
> Both fixes are now in production so flushing existing Devedition-60.0b1*
> products from bouncer is the only leftover before rerunning tasks.

This is now flushed away. Bouncer production knows nothing of Devedition-60.0b1.
Dropping an NI to :jlorenzo as we need to fix this for 60.0b2 - why is en-US not in the locales list? There are 98 locales but no sign of en-US. I assume it could get flushed away during transforms or something like that?
Flags: needinfo?(jlorenzo)
Comment on attachment 8956090 [details] [review]
Hack bouncerscript to add en-US to locales in payload for now to unblock 60.0b1

Approved by :jlorenzo in the Github PR.

https://hg.mozilla.org/build/puppet/rev/4c73675104b057aac13da51f60c91063d3740f29
https://hg.mozilla.org/build/puppet/rev/751a87d1a95c
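
As a rough idea of what such a stopgap amounts to (this is not the actual bouncerscript patch), the locale list from the payload can be forced to include en-US before submission. A sketch assuming Python 3; `with_en_us` and the sample list are hypothetical:

```python
def with_en_us(locales):
    """Return a sorted locale list that always contains en-US."""
    return sorted(set(locales) | {"en-US"})


# Hypothetical payload locales without en-US.
payload_locales = ["de", "it", "zh-TW"]
print(with_en_us(payload_locales))  # ['de', 'en-US', 'it', 'zh-TW']
```

Using a set keeps the operation idempotent, so applying it to a payload that already has en-US leaves the list unchanged.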
Attachment #8956090 - Flags: review?(jlorenzo)
Attachment #8956090 - Flags: review+
Attachment #8956090 - Flags: checked-in+
(In reply to Mihai Tabara [:mtabara]⌚️GMT from comment #19)
> dropping a NI to :jlorenzo as we need to fix this for 60.0b2 - why en-US is
> not among the locales list? There's 98 locales but no sign of en-US. I
> assume it could get flushed away during transforms or something like that?

Sounds like the issue is that we don't have en-US here[1]. The question then is, how did this work in previous bouncer submissions? I'll dig further, removing the NI from :jlorenzo.

[1]: https://hg.mozilla.org/releases/mozilla-beta/file/tip/browser/locales/l10n-changesets.json
Flags: needinfo?(jlorenzo)
mtabara> anyone have any idea why en-US is missing from https://hg.mozilla.org/releases/mozilla-beta/file/tip/browser/locales/l10n-changesets.json ?
15:45:25 <bhearsum> mtabara: AFAIK we have never put it there
15:45:31 <mtabara> hm
15:45:37 <bhearsum> that file only contains revisions used when cloning l10n repos
15:45:48 <bhearsum> https://hg.mozilla.org/releases/mozilla-beta/file/tip/browser/locales/shipped-locales might be what you're looking for?
15:46:02 <bhearsum> that contains all locales we ship, including en-US, and specifies platform exceptions

@jlorenzo: I wonder if we need to tweak https://hg.mozilla.org/releases/mozilla-beta/file/tip/taskcluster/ci/release-bouncer-sub/kind.yml#l34 to point to https://hg.mozilla.org/releases/mozilla-beta/file/tip/browser/locales/shipped-locales instead
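
To illustrate the difference bhearsum describes, here is a sketch of reading a shipped-locales style file. It assumes Python 3 and that each line holds a locale optionally followed by the platforms it is limited to; the sample text is made up for illustration rather than copied from the tree, and `parse_shipped_locales` is not the in-tree parser:

```python
def parse_shipped_locales(text):
    """Map each locale to the list of platforms named on its line.

    An empty list means the line carries no platform exceptions.
    """
    locales = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        locales[parts[0]] = parts[1:]
    return locales


sample = """\
de
en-US
ja linux win32
ja-JP-mac osx
"""

parsed = parse_shipped_locales(sample)
print(sorted(parsed))       # ['de', 'en-US', 'ja', 'ja-JP-mac']
print(parsed["ja-JP-mac"])  # ['osx']
```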
Flags: needinfo?(jlorenzo)
Pushed by mtabara@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/f04735dda824
fix update folder in bouncer submission release job. r=jlorenzo,Callek a=release CLOSED TREE
(In reply to Mihai Tabara [:mtabara]⌚️GMT from comment #22)
> mtabara> anyone have any idea why en-US is missing from https://hg.mozilla.org/releases/mozilla-beta/file/tip/browser/locales/l10n-changesets.json ?
> 15:45:25 <bhearsum> mtabara: AFAIK we have never put it there
> 15:45:31 <mtabara> hm
> 15:45:37 <bhearsum> that file only contains revisions used when cloning l10n repos
> 15:45:48 <bhearsum> https://hg.mozilla.org/releases/mozilla-beta/file/tip/browser/locales/shipped-locales might be what you're looking for?
> 15:46:02 <bhearsum> that contains all locales we ship, including en-US, and specifies platform exceptions
> 
> @jlorenzo: I wonder if we need to tweak https://hg.mozilla.org/releases/mozilla-beta/file/tip/taskcluster/ci/release-bouncer-sub/kind.yml#l34 to point to https://hg.mozilla.org/releases/mozilla-beta/file/tip/browser/locales/shipped-locales instead

bhearsum> for whatever it's worth, the old submitter used shipped-locales: https://github.com/mozilla/gecko-dev/blob/master/testing/mozharness/scripts/bouncer_submitter.py
15:52:06 <bhearsum> having extra locales in bouncer causes no harm AFAIK
15:52:28 <bhearsum> it's not ideal, but it's not going to cause anything to show up on websites or anything like that
Couldn't rerun the bouncer submission job as it passed its expiration date (1 day). For some reason I thought we had 5 days.

Edit + recreate: https://tools.taskcluster.net/groups/Sij5TFrLTM6_BzMTcI46oQ/tasks/Sij5TFrLTM6_BzMTcI46oQ/details
(In reply to Mihai Tabara [:mtabara]⌚️GMT from comment #26)
> Couldn't rerun the bouncer submission job as it passed its expiration date
> (1day). For some reason I thought we have 5 days.
> 
> Edit + recreate:
> https://tools.taskcluster.net/groups/Sij5TFrLTM6_BzMTcI46oQ/tasks/Sij5TFrLTM6_BzMTcI46oQ/details

Won't work: CoT errors because the task group is different. Editing amends some of the fields.
We need to copy-paste the task definition (possibly amending deps) into the task creator and make sure it runs under the same graph.

Follow-up task: https://tools.taskcluster.net/groups/SdybP-zoSbaWp1x6FdHW1A/tasks/Wl8a-hwNQNqa02o7oEtCng/details
Okay, moving forward with the errors!  Devedition-60.0b1, Devedition-60.0b1-SSL and Devedition-60.0b1-Complete worked like a charm but we're still failing on stub installers.

Failing in:

  WARNING - FAIL: https://download.mozilla.org/?product=Devedition-60.0b1-stub&os=win64&lang=en-US, status: 404
[task 2018-03-05T16:15:34.197Z] 16:15:34  WARNING - FAIL: https://download.mozilla.org/?product=Devedition-60.0b1-stub&os=win64&lang=it, status: 404
[task 2018-03-05T16:15:34.291Z] 16:15:34  WARNING - FAIL: https://download.mozilla.org/?product=Devedition-60.0b1-stub&os=win64&lang=zh-TW, status: 404
[task 2018-03-05T16:15:34.332Z] 16:15:34  WARNING - FAIL: https://download.mozilla.org/?product=Devedition-60.0b1-stub&os=win64&lang=de, status: 404

That redirects to https://download-installer.cdn.mozilla.net/pub/devedition/releases/60.0b1/win64/en-US/Firefox%20Installer.exe
Indeed we don't have that under http://archive.mozilla.org/pub/devedition/releases/60.0b1/win64/en-US/ but I don't think it's supposed to be there anyway.
Fixed the entry in production bouncer.

Follow-up in-tree fix is needed but for now we are unblocked, we have green bouncer-submission, bouncer-check and final verification.

To sum up:
a) locales are missing en-US - for now we have a temp hack in bouncerscript to append it
b) api_root was wrong in puppet - that is now fixed in production
c) s/updates/update in paths for each platform/path/product. In-tree patch is pushed to both beta/inbound and there's also a hack in bouncerscript to deal with that.
d) we're pushing the wrong path for win64 stub installer, they are supposed to point at the win32 ones instead. Manual fix was done here but we should push an in-tree fix

Unrelated:
* comment 1
* why do release graphs expire after 1 day instead of five?
(In reply to Mihai Tabara [:mtabara]⌚️GMT from comment #1)
> @nthomas - before I escalate, any hints of why we do serve the sh1 from
> allizom instance?

I don't know, but notice that download-sha1.mozilla.org doesn't resolve. pmac or oremj would be good people to ask.
Flags: needinfo?(nthomas)
(In reply to Nick Thomas [:nthomas] (UTC+13) from comment #30)
> (In reply to Mihai Tabara [:mtabara]⌚️GMT from comment #1)
> > @nthomas - before I escalate, any hints of why we do serve the sh1 from
> > allizom instance?
> 
> I don't know, but notice that download-sha1.mozilla.org doesn't resolve.
> pmac or oremj would be a good people to ask.

Okay, sounds good. I filed bug 1443303 to track this.
See Also: → 1443303
Comment on attachment 8956059 [details]
Bug 1443104 - fix update folder in bouncer submission release job.

https://reviewboard.mozilla.org/r/224998/#review231278
Attachment #8956059 - Flags: review?(jlorenzo) → review+
https://hg.mozilla.org/mozilla-central/rev/f04735dda824
Status: NEW → RESOLVED
Closed: Last year
Resolution: --- → FIXED
We still need to land a bunch of in-tree fixes here and on bouncerscript side. Reopening.
Removing :jlorenzo's NI as we talked about this over vidyo today.
Status: RESOLVED → REOPENED
Flags: needinfo?(jlorenzo)
Resolution: FIXED → ---
Attachment #8956837 - Attachment is obsolete: true
Attachment #8956837 - Flags: review?(jlorenzo)
Attachment #8956839 - Flags: review?(jlorenzo)
Comment on attachment 8956839 [details] [diff] [review]
Properly fix the missing en-US and win4 stub installer in-tree

Review of attachment 8956839 [details] [diff] [review]:
-----------------------------------------------------------------

I'm not sure I follow the filter, but otherwise, LGTM.

::: taskcluster/taskgraph/transforms/bouncer_submission.py
@@ +92,5 @@
>              job, 'scopes', item_name=job['name'], project=config.params['project']
>          )
>  
> +        # No need to filter out ja-JP-mac, we need to upload both; but we do
> +        # need to filter out the platforms they come with

I'm not sure I understand the change here. Why do we filter out linux and win32 but not linux64 and win64?

Nit: A list comprehension is generally considered clearer than filter(). I think that applies in this case.

``` py
all_locales = sorted([
    locale
    for locale in parse_locales_file(job['locales-file']).keys()
    if locale not in ('linux', 'win32', 'osx')
])
```

What do you think?

@@ +189,5 @@
>              file=file_name,
>          )
> +        # We currently have a sole win32 stub installer that is to be used
> +        # in both windows platforms to toggle between full installers
> +        if 'Installer.exe' in file_name and 'win64' in file_relative_location:

I wonder if it wouldn't be better to change `ftp_platform` before assigning `file_relative_location`. I won't block on this though.
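
Something along these lines is what I have in mind. It is only a sketch, assuming Python 3: `stub_ftp_platform` and `relative_location` are hypothetical names, and the path layout is borrowed from the bouncer entries quoted earlier in this bug, not from the transform itself:

```python
def stub_ftp_platform(file_name, ftp_platform):
    """Send win64 stub installers to the win32 directory; keep others as-is."""
    if "Installer.exe" in file_name and ftp_platform == "win64":
        return "win32"
    return ftp_platform


def relative_location(version, file_name, ftp_platform):
    """Build the location after the platform has been adjusted."""
    platform = stub_ftp_platform(file_name, ftp_platform)
    return "/devedition/releases/{}/{}/:lang/{}".format(
        version, platform, file_name)


print(relative_location("60.0b1", "Firefox%20Installer.exe", "win64"))
# /devedition/releases/60.0b1/win32/:lang/Firefox%20Installer.exe
print(relative_location("60.0b1", "Firefox%20Setup%2060.0b1.exe", "win64"))
# /devedition/releases/60.0b1/win64/:lang/Firefox%20Setup%2060.0b1.exe
```

Deciding the platform first means the location string is built once and never needs patching afterwards.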
Attachment #8956839 - Flags: review?(jlorenzo) → review+
(In reply to Johan Lorenzo [:jlorenzo] from comment #37)
> I'm not sure to understand the change here. Why do we filter out linux and
> win32 but not linux64 and win64?

Because of this https://hg.mozilla.org/releases/mozilla-beta/file/tip/browser/locales/shipped-locales#l52

> Nit: A list comprehension is generally admitted to be clearer than filter().

Sure, I can refactor, thanks.

> I wonder if it wouldn't better to change `ftp_platform` before assigning
> `file_relative_location`. I won't block on this though.

Makes more sense that way, indeed. Let me refactor and throw it again to review.
Thanks!
Comment on attachment 8957167 [details] [diff] [review]
Properly fix the missing en-US and win4 stub installer in-tree

Review of attachment 8957167 [details] [diff] [review]:
-----------------------------------------------------------------

Makes sense. Thank you for the explanations!
Attachment #8957167 - Flags: review?(jlorenzo) → review+
Pushed by mtabara@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/4c664530b7e3
Properly fix the missing en-US and win4 stub installer for release-bouncer-submission. r=jlorenzo. a=release DONTBUILD
Comment on attachment 8957167 [details] [diff] [review]
Properly fix the missing en-US and win4 stub installer in-tree

https://hg.mozilla.org/integration/mozilla-inbound/rev/4c664530b7e36e2abdb73e1accded6c9df9da2eb

Will get to beta by Monday via the ongoing merges from central to beta.
Attachment #8957167 - Flags: checked-in+
See Also: → 1444131
Pushed by mtabara@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/5b2d608c1924
add shipped-locales in taskcluster taskgraph sparse profiles. r=me,bustage CLOSED TREE
See Also: → 1444159
(In reply to Mihai Tabara [:mtabara]⌚️GMT from comment #29)
> Fixed the bouncer entry in the production bouncer entry. 
> 
> Follow-up in-tree fix is needed but for now we are unblocked, we have green
> bouncer-submission, bouncer-check and final verification.
> 
> To sum up:
> a) locales are missing en-US - for now we have a temp hack in bouncerscript
> to append it

fixed with https://hg.mozilla.org/integration/mozilla-inbound/rev/4c664530b7e36e2abdb73e1accded6c9df9da2eb, to land in central and get uplifted to beta by Monday

> b) api_root was wrong in puppet - that is now fixed in production

fixed already in puppet production

> c) s/updates/update in paths for each platform/path/product. In-tree patch

fixed and landed in beta since last week

> d) we're pushing the wrong path for win64 stub installer, they are supposed

fixed with https://hg.mozilla.org/integration/mozilla-inbound/rev/4c664530b7e36e2abdb73e1accded6c9df9da2eb, to land in central and get uplifted to beta by Monday

> Unrelated:
> * comment 1

tracked and solved under 1443303

> * why release graphs expiry after 1 day instead of five?

tracked under 1444159

Once the last two patches make it to central, we can safely close this.
The beetmover hacks are to be removed along with the bouncerscript changes I'm pushing under bug 1443305.
See Also: → 1443305
https://hg.mozilla.org/mozilla-central/rev/4c664530b7e3
https://hg.mozilla.org/mozilla-central/rev/5b2d608c1924
Status: REOPENED → RESOLVED
Closed: Last year
Resolution: --- → FIXED