Closed Bug 1837440 Opened 2 years ago Closed 18 days ago

verify mar integrity for nightly builds

Categories

(Release Engineering :: General, enhancement)

enhancement

Tracking

(firefox150 fixed)

RESOLVED FIXED
Tracking Status
firefox150 --- fixed

People

(Reporter: bhearsum, Assigned: bhearsum)

References

(Depends on 7 open bugs, )

Details

(Whiteboard: [updates])

Attachments

(11 files)

48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review

This was originally https://bugzilla.mozilla.org/show_bug.cgi?id=588396, but that bug was closed without the issue actually being fixed.

Essentially, we should run our existing update-verify tests on Nightly. These tests are responsible for verifying that installers and updates get a user to the same place, and typically catch issues where an installer or MAR is missing a file or has one that it shouldn't.

This came up recently after a change to macOS notarization was made which caused us to drop a file accidentally. (See https://github.com/mozilla-releng/scriptworker-scripts/pull/761.)

These tests don't necessarily have to block shipping each Nightly update. Even if they ran after the fact it would be a big benefit, and prevent the issue above from hitting Beta or Release.

As the original bug notes, we could (and maybe should) run these tests in a way that bypasses the need for Balrog. This isn't strictly necessary if we accept that Nightly will ship before update verify runs, though.

jcristau suggested that we add a new task downstream of repackage (or maybe repackage-signing) that could do the "compare MAR to installer" part of update verify, which is the main part we're concerned about. This would essentially function like a unit test in that it only requires data from the upstream task to run, making it a very tractable solution.

QA Contact: jlorenzo

:jlorenzo adding a note, in the Fx136 cycle we hit issues with UV tests.

  • 136.0b1 required 4 respins before the tests were fixed.
  • 136.0 RC1 was followed by a quick fix in 136.0 RC2.
    This was mentioned in the Fx136 retro. Wondering if there is any priority on this bug?
    Catching failures in Nightly would have saved a lot of back-and-forth.
Flags: needinfo?(jlorenzo)

Hey :dmeehan! I totally agree. I'm getting the leadership approval to get this prioritized in FFXP-3289.

Flags: needinfo?(jlorenzo)
Assignee: nobody → ahal
Status: NEW → ASSIGNED

I've been working on getting something up and running that does the mar integrity part of these checks, which (as noted in comment #1) is the main part we care about for Nightly.

I should have a patch stack to share soon; I plan to land it as a fairly minimal MVP and iterate from there, so this may branch out into a number of other bugs.

Assignee: ahal → bhearsum

Updating the title to reflect what's actually going to happen.

Summary: run update verify on nightlies → verify mar integrity for nightly builds

This is a bit of a weird assumption to make, and while it works for the current update verify tests, it restricts our ability to call this function from elsewhere.

Notably without:

  • A config file
  • Downloading anything as part of the test

This means the caller is responsible for figuring out and acquiring the files and information necessary to run the test, which allows for more flexibility. ie: we can start putting that logic in a caller other than verify.sh.

This allows callers to better understand failures.

This doesn't matter for verify.sh, where we always have build targets available, but build targets aren't strictly the only way to identify platforms, and it would be nice to have to use them when running check_updates.sh in a context where they aren't otherwise needed.

While it would be strictly ideal to do it, requiring both MARs to be signed with the same certificate means that partials tasks cannot work on Try without 2 try runs (one to generate completes signed with the dep cert, another that generates partials). Disabling this allows us to generate partials off of production releases, which speeds up testing cycles.

This is necessary for diff analysis of a production nightly with a MAR from Try applied to vs. a Try build to pass.

Further down this stack I'm adding a new Rust tool that lives in the tree and is built by a toolchain task. Running the tests in the toolchain task is not strictly the ideal, but I'm not aware of any solution we have for running tests in a downstream task. (An alternative might be to add a separate task that does the build and runs the tests, but that seems even less ideal.)

This is about as much of an MVP as we can have, while still being able to run it against Linux x86-64 en-US nightlies on mozilla-central. It may need additional work to be able to test other platforms (including Linux aarch64) and locales. I think that work is rightly done in follow-up patches, as is any work to get rid of the dependencies on existing shell and python scripts.

I'm not 100% certain this is the best way to do this, but it seems roughly inline with how we deal with other Rust based tools.

This is useful when grouping by multiple upstream kinds, only some of which contain l10n tasks.

This explicitly only supports linux x86-64 en-US nightly builds at the moment. It will need some enhancements to support other platforms as well as l10n builds - but it's a good starting point.

Attachment #9552936 - Attachment description: Bug 1837440: don't assume packages to unpack are in the parent directory r?#releng-reviewers → Bug 1837440: don't assume packages to unpack are in the parent directory r?#releng-reviewers!
Attachment #9552937 - Attachment description: Bug 1837440: make it possible to run check_updates.sh as a standalone script r?#releng-reviewers → Bug 1837440: make it possible to run check_updates.sh as a standalone script r?#releng-reviewers!
Attachment #9552938 - Attachment description: Bug 1837440: adjust check_updates.sh return codes to be distinct r?#releng-reviewers → Bug 1837440: adjust check_updates.sh return codes to be distinct r?#releng-reviewers!
Attachment #9552939 - Attachment description: Bug 1837440: make simple platform strings accepted by update verify helper functions r?#releng-reviewers → Bug 1837440: make simple platform strings accepted by update verify helper functions r?#releng-reviewers!
Attachment #9552941 - Attachment description: Bug 1837440: don't do mar cert validation in zucchini partials on try r?hneiva → Bug 1837440: don't do mar cert validation in zucchini partials on try r?hneiva!
Attachment #9552942 - Attachment description: Bug 1837440: add compare directories entry for update verify of nightly builds on try r?#releng-reviewers → Bug 1837440: add compare directories entry for update verify of nightly builds on try r?#releng-reviewers!
Attachment #9552943 - Attachment description: Bug 1837440: add support for running tests to rust based toolchain script r?#firefox-build-system-reviewers → Bug 1837440: add support for running tests to rust based toolchain script r?#firefox-build-system-reviewers!
Attachment #9552944 - Attachment description: Bug 1837440: add a new tool that can verify mar integrity outside of update-verify r?#releng-reviewers → Bug 1837440: add a new tool that can verify mar integrity outside of update-verify r?#releng-reviewers!
Attachment #9552945 - Attachment description: Bug 1837440: build new update integrity testing tool as a toolchain task r?#firefox-build-system-reviewers → Bug 1837440: build new update integrity testing tool as a toolchain task r?#firefox-build-system-reviewers!
Attachment #9552946 - Attachment description: Bug 1837440: add ability to group tasks by platform, ignoring l10n r?#taskgraph-reviewers → Bug 1837440: add ability to group tasks by platform, ignoring l10n r?#taskgraph-reviewers!
Attachment #9552947 - Attachment description: Bug 1837440: add update integrity tests for linux x86-64 en-US nightly builds r?#taskgraph-reviewers → Bug 1837440: add update integrity tests for linux x86-64 en-US nightly builds r?#taskgraph-reviewers!

This initial work will soon land. I'll be filing separate bugs for a bunch of follow-on work (running tests on different platforms, for localized builds, and some other things).

Pushed by bhearsum@mozilla.com: https://github.com/mozilla-firefox/firefox/commit/d467a6de6653 https://hg.mozilla.org/integration/autoland/rev/6a7dbf5ce87e don't assume packages to unpack are in the parent directory r=releng-reviewers,jcristau https://github.com/mozilla-firefox/firefox/commit/f3e9aaa98d54 https://hg.mozilla.org/integration/autoland/rev/6b77574aba7d make it possible to run check_updates.sh as a standalone script r=releng-reviewers,jcristau https://github.com/mozilla-firefox/firefox/commit/99cd22007940 https://hg.mozilla.org/integration/autoland/rev/678f0228dd5a adjust check_updates.sh return codes to be distinct r=releng-reviewers,jcristau https://github.com/mozilla-firefox/firefox/commit/5b6f2f934f56 https://hg.mozilla.org/integration/autoland/rev/709218bee6bf make simple platform strings accepted by update verify helper functions r=releng-reviewers,jcristau https://github.com/mozilla-firefox/firefox/commit/358ab31decda https://hg.mozilla.org/integration/autoland/rev/7f39be4fe754 don't do mar cert validation in zucchini partials on try r=taskgraph-reviewers,releng-reviewers,ahal https://github.com/mozilla-firefox/firefox/commit/c9f4ccf86889 https://hg.mozilla.org/integration/autoland/rev/37d4bbab1313 add compare directories entry for update verify of nightly builds on try r=releng-reviewers,jcristau https://github.com/mozilla-firefox/firefox/commit/ed24d8c5f38a https://hg.mozilla.org/integration/autoland/rev/ba2e512cef09 add support for running tests to rust based toolchain script r=firefox-build-system-reviewers,ahochheiden https://github.com/mozilla-firefox/firefox/commit/eebd408bb828 https://hg.mozilla.org/integration/autoland/rev/203a39b4d3d6 add a new tool that can verify mar integrity outside of update-verify r=releng-reviewers,ahal https://github.com/mozilla-firefox/firefox/commit/8e99c4405a88 https://hg.mozilla.org/integration/autoland/rev/1ebf699b3858 build new update integrity testing tool as a toolchain task r=firefox-build-system-reviewers,ahal,ahochheiden https://github.com/mozilla-firefox/firefox/commit/8bd08284c71e https://hg.mozilla.org/integration/autoland/rev/d36d68b908db add ability to group tasks by platform, ignoring l10n r=taskgraph-reviewers,jcristau https://github.com/mozilla-firefox/firefox/commit/19e2166742a6 https://hg.mozilla.org/integration/autoland/rev/411e41a338ff add update integrity tests for linux x86-64 en-US nightly builds r=taskgraph-reviewers,jcristau
Depends on: 2025211
Depends on: 2025213
Depends on: 2025214
Depends on: 2025215
Depends on: 2025216
Depends on: 2025217
Depends on: 2025303
Depends on: 2025304
Depends on: 2025305
Depends on: 2025307
Depends on: 2025308
Depends on: 2025310
Depends on: 2025478
Depends on: 2028905
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: