Open Bug 1656830 Opened 7 months ago Updated 2 months ago

Deploy the latest ffmpeg + ImageMagick

Categories

(Testing :: mozperftest, enhancement, P2)

Tracking

(Not tracked)

People

(Reporter: tarek, Unassigned)

References

(Depends on 1 open bug)

Details

For visualmetrics (for http3 testing), we need ffmpeg + ImageMagick installed on the system.

It's quite a pain to manage this system install properly from mach on all platforms, so we want to make it a manual prerequisite in all environments, including Talos.

This is what would be required for each platform:

Flags: needinfo?(dhouse)
Flags: needinfo?(aerickson)
Depends on: 1656944
Depends on: 1656946
Blocks: 1656235
Flags: needinfo?(aerickson)
Depends on: 1656963
Priority: P1 → P2

(In reply to Tarek Ziadé (:tarek) from comment #0)

It's quite a pain to manage this system install properly from mach on all platforms, so we want to make it a manual prerequisite in all environments, including Talos.

:tarek, we're putting these on the hardware workers. Do you need them installed in the cloud workers also? (docker-worker for linux, and the windows images, and others?)

Flags: needinfo?(tarek)

Yes that would be great, so all environments have them

Thanks

Flags: needinfo?(tarek)

:markco, could you set up the Windows workers with ffmpeg and ImageMagick, or discuss it with Tarek? (Tarek notes the static installs in comment #0)

Andy has added them on mobile, I'm working through linux+macos, and I'll ask coop/Taskcluster about linux cloud workers (docker images).

Flags: needinfo?(dhouse) → needinfo?(mcornmesser)

:coop, could you coordinate adding ffmpeg and imagemagick on the docker-worker test task image(s), or tell me who I need to talk to for getting them added? I don't know what teams work on those images or need to be involved.

Flags: needinfo?(coop)

(In reply to Dave House [:dhouse] from comment #4)

:coop, could you coordinate adding ffmpeg and imagemagick on the docker-worker test task image(s), or tell me who I need to talk to for getting them added? I don't know what teams work on those images or need to be involved.

It's unclear to me from comment #0 whether this means changing existing images or creating new ones.

Tarek: are these new tests that are being added or existing tests that are being updated?

New tests should get their own image. For existing tests, it depends on whether the current image is shared with other tests. Adding packages to an existing, shared image may influence other test results.

Flags: needinfo?(coop) → needinfo?(tarek)

Dave, thanks for the coordination,

Tarek: are these new tests that are being added or existing tests that are being updated?

Chris, this impacts any test that calls browsertime and browsertime's visualmetrics script. In the long term it's raptor/browsertime/perftest tests.
I was assuming it was best to have them deployed on all CI workers, but from your feedback I guess it's better if we just do Talos+Bitbar.

Thanks

Flags: needinfo?(tarek)

(In reply to Tarek Ziadé (:tarek) from comment #6)

Tarek: are these new tests that are being added or existing tests that are being updated?

Chris, this impacts any test that calls browsertime and browsertime's visualmetrics script. In the long term it's raptor/browsertime/perftest tests.
I was assuming it was best to have them deployed on all CI workers, but from your feedback I guess it's better if we just do Talos+Bitbar.

I ask this mostly so that "we" (in the collective Mozilla sense) know which tests also run on these images. That way, we can be proactive about running those tests in staging and warning perf sheriffs if performance for existing tests is going to change.

In general, we should have a better process around making changes to existing images. The current process is so manual and fraught with peril that we avoid it or even talk ourselves out of it, and that's not a great competitive place to be in. The Taskcluster team has at least one sprint planned over the next few months that will start to address this.

In the absence of a better process, for now we should figure out specifically which worker types and underlying images need to be updated. AFAICT the worker types that currently run Raptor/browsertime/perftest tests are the following:

  • gecko-t/t-linux-xlarge
  • gecko-t/t-linux-xlarge-source
  • proj-autophone/gecko-t-bitbar-gw-perf-g5
  • proj-autophone/gecko-t-bitbar-gw-perf-p2
  • releng-hardware/gecko-t-linux-talos
  • releng-hardware/gecko-t-osx-1014
  • releng-hardware/gecko-t-win10-64-hw

If this is a general capability that we want to enable for the future, we should look at adding these libs to everything under the gecko-t/, releng-hardware/, and proj-autophone/ provisioner umbrellas.

Tarek, another question for you (sorry): do you have an idea of the cadence with which we'll want to update these libraries on the test systems? Once a month? Once a year? This will influence how much short-term effort we put into making this process repeatable.

Flags: needinfo?(tarek)

For ffmpeg, is there a specific directory we need it to live in? The package is a zip package that doesn't contain an installer, so will the exe files be called directly, or do we need to append the directory to the Windows environment path?

Tarek, another question for you (sorry): do you have an idea of the cadence with which we'll want to update these libraries on the test systems?
Once a month? Once a year? This will influence how much short-term effort we put into making this process repeatable.

once a year

Flags: needinfo?(tarek)

(In reply to Mark Cornmesser [:markco] from comment #8)

For ffmpeg, is there a specific directory we need it to live in? The package is a zip package that doesn't contain an installer, so will the exe files be called directly, or do we need to append the directory to the Windows environment path?

As long as it's in the path, any place works.
Thanks

You can see how the path is set for Windows here: https://searchfox.org/mozilla-central/source/tools/browsertime/mach_commands.py#327
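
The "anywhere on the path" requirement can be checked with a short sketch like the one below. It assumes the executables are named `ffmpeg` and `convert`; the ImageMagick name is an assumption (ImageMagick 7 installs may expose `magick` instead), and `missing_tools` is a hypothetical helper, not something in mach or browsertime.

```python
import shutil


def missing_tools(tools=("ffmpeg", "convert"), path=None):
    """Return the names in `tools` that cannot be resolved on the search path.

    `path` is an optional os.pathsep-separated string of directories;
    None means "use the current PATH environment variable".
    """
    # shutil.which returns None when no matching executable is found.
    return [name for name in tools if shutil.which(name, path=path) is None]
```

Calling `missing_tools()` on a worker returns an empty list when both tools resolve, and otherwise the names that still need to be installed or added to the path.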

Baking these tools into the base images or having them pre-installed across the workers is an anti-pattern for test prerequisites. Best practice dictates that we couple a test's prerequisites with the task's setup harness rather than build them directly into the underlying system, which does not change and iterate as fast as tests and other parts 'in-tree'.

Tarek, sorry to push back here but I'd like to work out why this is 'quite a pain to manage this system install properly from mach on all platforms' and figure out a more sustainable solution. Could you elaborate on this?

If there is a decision to try to get these to install from mach, I can be available to collaborate with someone to get it working on Windows.

:sparky do you know if the ffmpeg+imagemagick setup can wait until tarek is back from pto 8.24? :dividehex asked if we can revisit installing from mach/tooltool instead of doing the system-base installs, and so I wanted to NI tarek before we do more work on it

Flags: needinfo?(gmierz2)
Flags: needinfo?(mcornmesser)

:tarek, re: https://bugzilla.mozilla.org/show_bug.cgi?id=1656830#c12 can you write what the problems were or if we could help with the setup of these under mozharness/tooltool? re: :dividehex's point that managing them on the systems can be problematic because it is not in-tree and infrequently updated.

Flags: needinfo?(gmierz2) → needinfo?(tarek)

(In reply to Jake Watkins [:dividehex] from comment #12)

Baking these tools into the base images or having them pre-installed across the workers is an anti-pattern for test prerequisites.

Thanks for the feedback and help.

Why is that an anti-pattern? You have Python installed, for instance.

Best practice dictates that we couple a test's prerequisites with the task's setup harness rather than build them directly into the underlying system, which does not change and iterate as fast as tests and other parts 'in-tree'.

ffmpeg is a very common system package that is often pre-installed in some distributions. It's a system-level package. Once it's added, it's just a binary that's available, like "iproute" or "nmap". I don't really get the fear here.

Tarek, sorry to push back here but I'd like to work out why this is 'quite a pain to manage this system install properly from mach on all platforms' and figure out a more sustainable solution. Could you elaborate on this?

Sure, we tried in the past and failed.

If we are installing them through mach in a way that works both locally and in the CI, we are facing numerous combinations of systems and environments. We tried that in "mach browsertime" and had many issues: sometimes because ffmpeg was already there, sometimes because the way we would install it was not the right way for the given environment.

If we do this, we are also going to be dependent on worker changes, and will have to maintain the mach command whenever the system is modified (for example, if the base distro changes in the future).

This package should be considered part of the system, and therefore be installed with the system. It should be done with other packages you deploy (via apt-get or something else) and be part of a whole.

Asking devs to install it themselves, and having it properly installed in the CI, removes all the issues and avoids maintaining a mach command that will be half broken most of the time.

I am curious to understand why having it added in workers is a problem. Not sure what "not in-tree" means in this context.

Flags: needinfo?(tarek) → needinfo?(jwatkins)

Jake talked with me about this more. We'll go ahead with ffmpeg and imagemagick. Windows deployment may be delayed because of circumstances (:markco?). I think mobile is updated already, Linux hardware has a test pool up, and I'll get the mac minis installed and create a test pool.

Flags: needinfo?(jwatkins) → needinfo?(mcornmesser)

Looking at the support visual metrics diff, https://hg.mozilla.org/mozilla-central/rev/ccb7fbc48547#l4.25
The static ffmpeg was coming from https://github.com/ncalexan/geckodriver/releases/tag/v0.24.0-android for the previous dev work as a workaround.

Eventually, we might look at setting up with tooltool for caching, but without mozharness. Then we could more easily update or pin to versions for tasks and platforms.

I will work on getting a Windows test pool up early next week. Leaving the NI in place as a reminder.

Flags: needinfo?(mcornmesser)
Flags: needinfo?(mcornmesser)

An update: I have a couple of higher-priority items in front of this one. So far I have ffmpeg installing and adding its bin folder to the path through Puppet. I am still working on getting ImageMagick to install and work through Puppet.

Flags: needinfo?(mcornmesser)
Flags: needinfo?(mcornmesser)

Within the hour there will be a test pool set up with these 2 items. The test pool will have worker type gecko-t-win10-64-ht. Could someone run some of the applicable tests on this pool?

Flags: needinfo?(mcornmesser)

thanks a lot Mark, we'll try asap

Flags: needinfo?(tarek)
Depends on: 1656965

I did an initial run at https://treeherder.mozilla.org/#/jobs?repo=try&tier=1%2C2%2C3&revision=9bba7af5c31bc36ea576e174f34a32920e919580&selectedTaskRun=YcEkax8rQuGo3Xu_Samy9g.0

Mark, is this the right pool?

Looks like it can't find ffmpeg in the path. Could you tell me its full path so I can investigate? thx

Flags: needinfo?(tarek) → needinfo?(mcornmesser)

Mark, is this the right pool?

It looks like it went to the t-win10-64-hw pool: https://firefox-ci-tc.services.mozilla.com/tasks/YcEkax8rQuGo3Xu_Samy9g
The mach flag would probably be something like --worker-override t-win10-64-hw=gecko-t/win10-64-ht

The path will be C:\ffmpeg-20200809-6e951d0-win64-static\bin.
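
For investigating locally, a minimal sketch of putting that directory first on the search path for a child process is shown below. `env_with_tool_dir` is a hypothetical helper; it assumes the environment mapping uses the key `PATH` (as `os.environ` does, including on Windows, where keys are upper-cased) and it does not change the system-wide path.

```python
import os


def env_with_tool_dir(tool_dir, base_env=None):
    """Return a copy of the environment with `tool_dir` first on PATH.

    `base_env` defaults to the current process environment; the mapping
    passed in is never modified.
    """
    env = dict(os.environ if base_env is None else base_env)
    current = env.get("PATH", "")
    # Prepend so this directory wins over any other ffmpeg on the path.
    env["PATH"] = tool_dir + os.pathsep + current if current else tool_dir
    return env


# Directory reported in this comment for the Windows test pool.
FFMPEG_BIN = r"C:\ffmpeg-20200809-6e951d0-win64-static\bin"
```

The result can be passed as `env=env_with_tool_dir(FFMPEG_BIN)` to `subprocess.run` when reproducing the lookup failure by hand.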

Flags: needinfo?(mcornmesser)

It looks like it is looking for win10-64-ht when it should be gecko-t-win10-64-ht, so this seems to be a matter of the flag syntax. I am not sure what the correct syntax would be if the one mentioned above is incorrect.
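
To avoid hand-typing the value, a small sketch that assembles it in the `alias=provisioner/worker-type` shape used in the comment above is given here. `worker_override` is a hypothetical helper, and which provisioner actually hosts gecko-t-win10-64-ht is not confirmed in this thread, so it is left as a parameter.

```python
def worker_override(alias, provisioner, worker_type):
    """Format one override value as `alias=provisioner/worker-type`.

    Raises ValueError if any part is empty or contains a separator,
    which would make the resulting string ambiguous.
    """
    parts = (("alias", alias), ("provisioner", provisioner),
             ("worker_type", worker_type))
    for label, value in parts:
        if not value or "/" in value or "=" in value:
            raise ValueError(f"invalid {label}: {value!r}")
    return f"{alias}={provisioner}/{worker_type}"
```

Passing the full worker type name, for example `worker_override("t-win10-64-hw", provisioner, "gecko-t-win10-64-ht")`, keeps the `gecko-t-` prefix that was missing in the earlier attempt.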

Flags: needinfo?(mcornmesser)
Component: Talos → mozperftest