Closed Bug 1631243 Opened 4 years ago Closed 4 years ago

cron tasks for autoland, mozilla-esr68, comm-* intermittently fail due to: abort: unable to apply stream clone: unsupported format: sparserevlog

Categories

(Firefox Build System :: Task Configuration, defect)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: aryx, Assigned: nthomas)

References

(Regression)

Details

(Keywords: regression)

Attachments

(4 files)

comm-beta cron tasks suddenly started failing: https://firefox-ci-tc.services.mozilla.com/tasks/Qdzuk3hSQk20KlyChNAD-A/runs/0/logs/https%3A%2F%2Ffirefox-ci-tc.services.mozilla.com%2Fapi%2Fqueue%2Fv1%2Ftask%2FQdzuk3hSQk20KlyChNAD-A%2Fruns%2F0%2Fartifacts%2Fpublic%2Flogs%2Flive.log

[vcs 2020-04-19T11:15:55.960Z] applying clone bundle from https://s3-us-west-2.amazonaws.com/moz-hg-bundles-us-west-2/mozilla-unified/ae4f8f7008e9d990179ec9388d69422482ebb6b8.stream-v2.hg
[vcs 2020-04-19T11:16:31.851Z] PERFHERDER_DATA: {"framework": {"name": "vcs"}, "suites": [{"extraOptions": ["c5.xlarge"], "lowerIsBetter": true, "name": "clone_errored", "shouldAlert": false, "subtests": [], "value": 36.67322516441345}, {"extraOptions": ["c5.xlarge"], "lowerIsBetter": true, "name": "overall", "shouldAlert": false, "subtests": [], "value": 37.87255096435547}]}
[vcs 2020-04-19T11:16:31.851Z] abort: unable to apply stream clone: unsupported format: sparserevlog

First failure was at 09:31 UTC: https://firefox-ci-tc.services.mozilla.com/tasks/D9a2ksALS96mugnivSh1ow
It also failed at :46 and :16 but there is no failure email for :01.

Flags: needinfo?(sheehan)

09:31 was comm-central which failed only once. comm-beta started at 09:46.

The fix here is to update the version of Mercurial used to clone, preferably to the latest version (5.3.2), but any version after 4.7 will work. Upgrading Mercurial should always be a safe (i.e. backwards-compatible) operation, so it should just be a simple version bump in a requirements file somewhere.
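A minimal sketch of the version check implied by the comment above, assuming the usual first line of `hg --version` output ("Mercurial Distributed SCM (version X.Y.Z)"). The function names and the `MIN_SPARSEREVLOG` constant are hypothetical, and treating 4.7 itself as sufficient is an assumption, since the comment only says "any after 4.7".

```python
import re

# Assumption: threshold taken from the comment above ("any after 4.7 will work").
MIN_SPARSEREVLOG = (4, 7)


def parse_hg_version(text):
    """Extract a numeric version tuple from `hg --version` style output."""
    match = re.search(r"version (\d+(?:\.\d+)*)", text)
    if match is None:
        raise ValueError("no version number found in: %r" % text)
    return tuple(int(part) for part in match.group(1).split("."))


def can_apply_sparserevlog(version, minimum=MIN_SPARSEREVLOG):
    """Return True if this client version meets the assumed minimum."""
    return tuple(version) >= tuple(minimum)


if __name__ == "__main__":
    # Versions mentioned in this bug: 4.5.2 (old image), 4.8.1, 5.3.2.
    for banner in (
        "Mercurial Distributed SCM (version 4.5.2)",
        "Mercurial Distributed SCM (version 4.8.1)",
        "Mercurial Distributed SCM (version 5.3.2)",
    ):
        version = parse_hg_version(banner)
        print(version, can_apply_sparserevlog(version))
```

In practice the banner would come from running `hg --version` inside the worker image; the strings here are hard-coded so the sketch runs without Mercurial installed.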

As a stop-gap I have rolled back to yesterday's clonebundle manifest for mozilla-unified, which will allow CI to continue cloning from a non-sparserevlog bundle. This will work for the next 7 days, after which the bundle will expire and only the sparserevlog bundles will remain.

Flags: needinfo?(sheehan)

Just to clarify - this will work until this evening, when new bundles will be generated. We'll need to roll back the bundles manifest nightly until this is fixed, to keep serving the non-sparserevlog bundle. At the end of the week the non-sparse bundle will expire and it will be sparse-only from here on out.
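To illustrate what "serving the non-sparserevlog bundle" means at the manifest level, here is a rough sketch that lists which clonebundle manifest entries do not require `sparserevlog`. It assumes the documented manifest layout (one URL per line followed by whitespace-separated, URI-encoded `key=value` attributes, with requirements carried in `BUNDLESPEC`). The sample manifest lines and URLs are made up for illustration, the helper names are hypothetical, and this is not Mercurial's own bundle selection logic.

```python
from urllib.parse import unquote


def parse_manifest_line(line):
    """Split one manifest line into (url, attributes)."""
    parts = line.split()
    attrs = {}
    for pair in parts[1:]:
        key, _, value = pair.partition("=")
        attrs[unquote(key)] = unquote(value)
    return parts[0], attrs


def bundle_requirements(attrs):
    """Return the set of repository requirements named in BUNDLESPEC."""
    for param in attrs.get("BUNDLESPEC", "").split(";"):
        name, _, value = param.partition("=")
        if name == "requirements":
            return set(value.split(","))
    return set()


def entries_without_sparserevlog(manifest_text):
    """URLs of entries that a client lacking sparserevlog support could use."""
    urls = []
    for line in manifest_text.splitlines():
        if not line.strip():
            continue
        url, attrs = parse_manifest_line(line)
        if "sparserevlog" not in bundle_requirements(attrs):
            urls.append(url)
    return urls


# Hypothetical manifest contents, not copied from hg.mozilla.org.
SAMPLE = """\
https://example.invalid/new.stream-v2.hg BUNDLESPEC=none-v2;stream=v2;requirements%3Dgeneraldelta%2Crevlogv1%2Csparserevlog%2Cstore
https://example.invalid/old.stream-v2.hg BUNDLESPEC=none-v2;stream=v2;requirements%3Dgeneraldelta%2Crevlogv1%2Cstore
"""

if __name__ == "__main__":
    print(entries_without_sparserevlog(SAMPLE))
```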

Someone (?) should update hg on the affected service ASAP.

Summary: comm-beta cron task fails due to: abort: unable to apply stream clone: unsupported format: sparserevlog → cron tasks for autoland, mozilla-esr68, comm-* intermittently fail due to: abort: unable to apply stream clone: unsupported format: sparserevlog

Tom, Justin, can you upgrade in these worker images, please? It's spamming sheriffs and ciduty and the consequences are unknown to me.

Flags: needinfo?(mozilla)
Flags: needinfo?(bugspam.Callek)

We need to redo bug 1502976, as the cron jobs are still using a payload.image of taskcluster/decision:2.1.0@sha256:6db3..., which ships the old hg version 4.5.2. Other tasks using taskcluster/decision:2.2.0 have hg v4.8.1 and are not having issues cloning.

Assignee: nobody → nthomas
Status: NEW → ASSIGNED

r+ on Matrix from Callek, landed at https://hg.mozilla.org/ci/ci-configuration/rev/5236d0a649cbe91e3c7fce1f1a1280faaf8d1d9d

# At about Mon 20 Apr 2020 02:07:00 UTC
$ ci-admin apply --environment=firefoxci  --grep 'Hook.*cron'
Updating Hook=project-releng/cron-task-releases-mozilla-beta
Updating Hook=project-releng/cron-task-comm-central
Updating Hook=project-releng/cron-task-projects-jamun
Updating Hook=project-releng/cron-task-projects-ash
Updating Hook=project-releng/cron-task-releases-mozilla-esr68
Updating Hook=project-releng/cron-task-releases-mozilla-beta/ship-geckoview
Updating Hook=project-releng/cron-task-integration-autoland
Updating Hook=project-releng/cron-task-projects-birch/nightly-desktop
Updating Hook=project-releng/cron-task-mozilla-central/nightly-desktop
Updating Hook=project-releng/cron-task-projects-maple
Updating Hook=project-releng/cron-task-releases-mozilla-release
Updating Hook=project-releng/cron-task-mozilla-central/ship-geckoview
Updating Hook=project-releng/cron-task-mozilla-central
Updating Hook=project-releng/cron-task-releases-comm-esr68
Updating Hook=project-releng/cron-task-mozilla-central/nightly-desktop-win32
Updating Hook=project-releng/cron-task-releases-comm-beta
Updating Hook=project-releng/cron-task-releases-mozilla-beta/daily-releases
Updating Hook=project-releng/cron-task-mozilla-central/nightly-desktop-osx
Updating Hook=project-releng/cron-task-releases-mozilla-beta/l10n-bumper
Updating Hook=project-releng/cron-task-integration-mozilla-inbound
Updating Hook=project-releng/cron-task-mozilla-central/nightly-desktop-win64
Updating Hook=project-releng/cron-task-projects-pine
Updating Hook=project-releng/cron-task-releases-mozilla-esr68/l10n-bumper
Updating Hook=project-releng/cron-task-mozilla-central/nightly-desktop-win64-aarch64
Updating Hook=project-releng/cron-task-comm-central/nightly-desktop
Updating Hook=project-releng/cron-task-projects-birch
Updating Hook=project-releng/cron-task-mozilla-central/l10n-bumper
Updating Hook=project-releng/cron-task-releases-mozilla-release/ship-geckoview
Updating Hook=project-releng/cron-task-releases-mozilla-esr68/daily-releases

Those two patches are to deal with a new error, eg https://firefoxci.taskcluster-artifacts.net/fkaD-d-hRiO1qGkh4vQydg/0/public/logs/live_backing.log

Status: Downloaded newer image for taskcluster/decision@sha256:cbeadf57300de60408bf1337e723f0cb1f0200f559799cb54deb9535d1e03b4a
[taskcluster 2020-04-20 02:15:41.106Z] === Task Starting ===
[setup 2020-04-20T02:15:41.503Z] run-task started in /
usage: run-task [-h] [--user USER] [--group GROUP]
                [--gecko-checkout GECKO_CHECKOUT]
                [--gecko-sparse-profile GECKO_SPARSE_PROFILE]
                [--comm-checkout COMM_CHECKOUT]
                [--comm-sparse-profile COMM_SPARSE_PROFILE]
                [--fetch-hgfingerprint]
run-task: error: unrecognized arguments: --vcs-checkout=/builds/worker/checkouts/gecko --sparse-profile=build/sparse-profiles/taskgraph

run-task was also changed between 2.1.0 and 2.2.0 of taskcluster/decision.
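The mismatch can be reproduced in isolation with a small argparse sketch. It assumes run-task uses Python's `argparse` (the usage and error text in the log have that shape) and declares only the options visible in the usage text above; the real script may accept more. `build_old_parser` and `unrecognized` are hypothetical helper names.

```python
import argparse


def build_old_parser():
    """Parser with only the options shown in the usage text from the log."""
    parser = argparse.ArgumentParser(prog="run-task")
    parser.add_argument("--user")
    parser.add_argument("--group")
    parser.add_argument("--gecko-checkout")
    parser.add_argument("--gecko-sparse-profile")
    parser.add_argument("--comm-checkout")
    parser.add_argument("--comm-sparse-profile")
    parser.add_argument("--fetch-hgfingerprint", action="store_true")
    return parser


def unrecognized(argv):
    """Return the arguments this parser does not know about."""
    _, unknown = build_old_parser().parse_known_args(argv)
    return unknown


if __name__ == "__main__":
    # Arguments taken from the error message in the log above.
    new_style = [
        "--vcs-checkout=/builds/worker/checkouts/gecko",
        "--sparse-profile=build/sparse-profiles/taskgraph",
    ]
    print(unrecognized(new_style))
```

With `parse_args` instead of `parse_known_args`, the same input would make the parser exit with an "unrecognized arguments" error like the one in the log, which is why the hook payload and the image version have to be updated together.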

Attached file ci-admin output
Changes landed and applied.

Needed to back out some of the ci-configuration change in https://hg.mozilla.org/ci/ci-configuration/rev/fa0352303e2e5df90f21c5687da32cdde8004960 for breaking mobile hooks.

date; ci-admin apply  --environment=firefoxci  --grep 'Hook.*cron' ;date
Mon 20 Apr 2020 16:54:13 NZST
Updating Hook=project-releng/cron-task-mozilla-mobile-fenix/bump-android-components
Updating Hook=project-releng/cron-task-mozilla-mobile-fenix/nightly
Updating Hook=project-releng/cron-task-mozilla-mobile-android-components/snapshot
Updating Hook=project-releng/cron-task-mozilla-mobile-fenix
Updating Hook=project-releng/cron-task-mozilla-mobile-fenix/raptor
Updating Hook=project-releng/cron-task-mozilla-mobile-reference-browser/bump-android-comp
Updating Hook=project-releng/cron-task-mozilla-extensions-xpi-manifest
Updating Hook=project-releng/cron-task-mozilla-l10n-android-l10n-tooling
Updating Hook=project-releng/cron-task-mozilla-mobile-android-components/nightly
Updating Hook=project-releng/cron-task-mozilla-mobile-android-components
Updating Hook=project-releng/cron-task-JohanLorenzo-fenix/nightly
Updating Hook=project-releng/cron-task-mozilla-mobile-reference-browser/nightly
Updating Hook=project-releng/cron-task-mozilla-application-services
Updating Hook=project-releng/cron-task-mozilla-l10n-android-l10n-tooling/update-l10n
Updating Hook=project-releng/cron-task-mozilla-mobile-fenix/browsertime
Updating Hook=project-releng/cron-task-mozilla-l10n-android-l10n-tooling/update-projects
Updating Hook=project-releng/cron-task-JohanLorenzo-fenix
Updating Hook=project-releng/cron-task-escapewindow-test-xpi-manifest
Updating Hook=project-releng/cron-task-mozilla-mobile-focus-android
Updating Hook=project-releng/cron-task-mozilla-mobile-reference-browser
Updating Hook=project-releng/cron-task-mozilla-mobile-fenix/nightly-on-google-play
Mon 20 Apr 2020 16:54:46 NZST

Component: Mercurial: hg.mozilla.org → Task Configuration
Product: Developer Services → Firefox Build System
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Flags: needinfo?(mozilla)
Flags: needinfo?(bugspam.Callek)
Resolution: --- → FIXED

Sheriffs are still getting errors, eg
https://firefox-ci-tc.services.mozilla.com/tasks/Z_y_JFZRR4ej7YdNv_6KVQ
https://firefox-ci-tc.services.mozilla.com/tasks/J6vNOni_RpCV43C4G7Md0Q

Basically run-task: error: unrecognized arguments: --vcs-checkout=/builds/worker/checkouts/gecko with the v2.2.1 decision task but old run-task arguments.

Status: RESOLVED → REOPENED
Resolution: FIXED → ---

I'm not sure how we got into a bad state but it should be better now.

$ ci-admin apply --environment=firefoxci --grep 'Hook='
Updating Hook=project-releng/cron-task-releases-comm-beta
Updating Hook=project-releng/cron-task-releases-mozilla-esr68
Updating Hook=project-releng/cron-task-mozilla-central
Updating Hook=project-releng/cron-task-projects-birch/nightly-desktop
Updating Hook=project-releng/cron-task-projects-ash
Updating Hook=project-releng/cron-task-releases-mozilla-beta/ship-geckoview
Updating Hook=project-releng/cron-task-integration-mozilla-inbound
Updating Hook=project-releng/cron-task-mozilla-central/nightly-desktop-osx
Updating Hook=project-releng/cron-task-mozilla-central/ship-geckoview
Updating Hook=project-releng/cron-task-projects-jamun
Updating Hook=project-releng/cron-task-mozilla-central/l10n-bumper
Updating Hook=project-releng/cron-task-releases-comm-esr68
Updating Hook=project-releng/cron-task-releases-mozilla-esr68/l10n-bumper
Updating Hook=project-releng/cron-task-releases-mozilla-beta/l10n-bumper
Updating Hook=project-releng/cron-task-releases-mozilla-release/ship-geckoview
Updating Hook=project-releng/cron-task-projects-maple
Updating Hook=project-releng/cron-task-releases-mozilla-release
Updating Hook=project-releng/cron-task-projects-birch
Updating Hook=project-releng/cron-task-projects-pine
Updating Hook=project-releng/cron-task-mozilla-central/nightly-desktop-win32
Updating Hook=project-releng/cron-task-mozilla-central/nightly-desktop-win64-aarch64
Updating Hook=project-releng/cron-task-releases-mozilla-esr68/daily-releases
Updating Hook=project-releng/cron-task-mozilla-central/nightly-desktop
Updating Hook=project-releng/cron-task-mozilla-central/nightly-desktop-win64
Updating Hook=project-releng/cron-task-comm-central/nightly-desktop
Updating Hook=project-releng/cron-task-integration-autoland
Updating Hook=project-releng/cron-task-comm-central
Updating Hook=project-releng/cron-task-releases-mozilla-beta
Updating Hook=project-releng/cron-task-releases-mozilla-beta/daily-releases
$ date
Mon 20 Apr 2020 21:55:21 NZST

Status: REOPENED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Has Regression Range: --- → yes
Keywords: regression