Open Bug 1606297 Opened 4 years ago Updated 2 years ago

Improve docs for building release binaries of geckodriver

Categories

(Testing :: geckodriver, enhancement, P2)

Version 3
enhancement

Tracking

(Not tracked)

People

(Reporter: whimboo, Unassigned)

Details

We no longer sync the source of geckodriver to the GitHub repository, so people can't build geckodriver directly from that repository. See https://github.com/mozilla/geckodriver/issues/1635.

The current build instructions rely on a check-out of the HG repository, which is large and puts an extra burden on the maintainers of geckodriver packages such as brew.

We should improve their workflow by making it fast and easy. Here are some ideas:

  1. A shallow clone with only the latest changeset (like git clone --depth=1) doesn't seem to be possible with HG. With --rev X, changeset X and all of its ancestors are still downloaded. Maybe we can get them to just clone https://github.com/mozilla/gecko-dev, which downloads about 500MB:
Cloning into 'gecko-dev'...
remote: Enumerating objects: 282408, done.
remote: Counting objects: 100% (282408/282408), done.
remote: Compressing objects: 100% (207408/207408), done.
remote: Total 282408 (delta 99705), reused 150769 (delta 67869), pack-reused 0
Receiving objects: 100% (282408/282408), 484.81 MiB | 4.48 MiB/s, done.
Resolving deltas: 100% (99705/99705), done.
Checking out files: 100% (285291/285291), done.

Use a Mercurial bundle for the download. The one for mozilla-central in zstd format is currently ~1.3GB in size. This would still involve more steps (download, extract, init) than a single clone command with git, and the download is nearly 3 times as large.
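
The two approaches above (git mirror and Mercurial bundle) could be scripted roughly as follows. This is only a sketch: the run helper, the DRY_RUN variable and the function names are made up for illustration, the git variant assumes the mirror is cloned with --depth=1, and the bundle variant assumes the bundle file has already been downloaded to a local path (its URL is not given here).

```shell
#!/bin/sh
# Sketch only: helper and function names are hypothetical.

# Print each command; execute it unless DRY_RUN=1 is set.
run() {
    printf '%s\n' "+ $*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@"
    fi
}

# Variant A: shallow clone of the git mirror (latest commit only).
# $1 = target directory (defaults to gecko-dev)
clone_git_mirror() {
    run git clone --depth=1 https://github.com/mozilla/gecko-dev "${1:-gecko-dev}"
}

# Variant B: create a repository from an already downloaded Mercurial bundle.
# $1 = path to the local bundle file, $2 = target directory
clone_from_bundle() {
    run hg init "$2" &&
    run hg -R "$2" unbundle "$1" &&
    run hg -R "$2" update
}
```

Running a function with DRY_RUN=1 set, for example DRY_RUN=1 clone_git_mirror, only prints the commands, which makes it easy to compare the number of steps without downloading anything.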

  2. Fetch only the directories needed to build geckodriver from a given revision, bypassing the repository entirely:

This results in just 341 kB of data to download:

-rw-r--r--  1 henrik  admin   184393 Dec 30 10:45 Cargo.lock
-rw-r--r--  1 henrik  admin    92144 Dec 30 10:28 geckodriver.zip
-rw-r--r--  1 henrik  admin    33268 Dec 30 10:32 mozbase.zip
-rw-r--r--  1 henrik  admin    31136 Dec 30 10:32 webdriver.zip

The global Cargo.toml cannot be used because it references all the other projects in mozilla-central, but sticking the Cargo.lock into testing/geckodriver seems to do the trick. Here are the builds created from this changeset, both with just cargo build --release:

mozilla-central: MD5 (target/release/geckodriver) = 4757a16bc499c0a1bc94fce9a00f33fd
download: MD5 (target/release/geckodriver) = eebbbcea545b477c06c98906069f084b

We have differences here, which are most likely based on different build settings. Andreas, do you know how cargo build works from inside testing/geckodriver?

Also we would need the following settings:

export RUSTFLAGS="-Ctarget-feature=+crt-static" (Windows)

I would much prefer option 2) with a small build script added to the geckodriver repository.
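
A small build script of the kind preferred above might look like the following sketch. It assumes the needed directories have already been downloaded and unpacked so that testing/geckodriver exists under a source root, and that a Cargo.lock from the same changeset is available; the download step is left out on purpose. The function names and the uname-based Windows check are assumptions, not an existing script.

```shell
#!/bin/sh
# Sketch only: function names and directory layout are assumptions.

# Print the RUSTFLAGS to use for a given `uname -s` value.
# Only Windows-like shell environments get the static CRT flag.
rustflags_for_os() {
    case "$1" in
        MINGW*|MSYS*|CYGWIN*) printf '%s\n' "-Ctarget-feature=+crt-static" ;;
        *) : ;;
    esac
}

# $1 = source root containing testing/geckodriver, $2 = path to Cargo.lock
build_geckodriver() {
    crate_dir="$1/testing/geckodriver"
    # Placing Cargo.lock next to the crate's Cargo.toml pins dependency versions.
    cp "$2" "$crate_dir/Cargo.lock" || return 1
    flags=$(rustflags_for_os "$(uname -s)")
    if [ -n "$flags" ]; then
        export RUSTFLAGS="$flags"
    fi
    (cd "$crate_dir" && cargo build --release)
}
```

A call such as build_geckodriver ./src ./Cargo.lock would then run the same cargo build --release as above, with the flag set only on Windows.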

Flags: needinfo?(ato)

I don’t think this is particularly important to optimise for. There is a documented way to make reproducible builds of geckodriver and Homebrew was able to adapt to this with very few modifications.

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+1] from comment #0)

We have differences here, which are most likely based on different build settings. Andreas, do you know how cargo build works from inside testing/geckodriver?

It depends what you mean by “works”!

Cargo always traverses up the directory hierarchy to see if what is built is part of a workspace. This means if there is a Cargo.toml with [workspace] it will be taken into account and the produced artifacts will be put in the workspace target/ folder.

It will however not use the dependencies from third_party/rust/ unless you invoke ./mach build testing/geckodriver, and instead pick them up from cargo’s default crate registry.

This means, of course, that we need to use the top-level Cargo.toml to get an accurate, reproducible build.
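
To illustrate the lookup described above, here is a simplified sketch that walks up the directory tree until it finds a Cargo.toml with a [workspace] section. It is not Cargo's actual algorithm, which also takes workspace membership and exclusions into account, and the function name is made up. Cargo itself can report the result with cargo locate-project --workspace.

```shell
#!/bin/sh
# Sketch only: a simplified stand-in for Cargo's workspace discovery.
# $1 = directory to start from; prints the first ancestor directory
# (including $1 itself) whose Cargo.toml contains a [workspace] section.
find_workspace_root() {
    dir=$(cd "$1" && pwd) || return 1
    while :; do
        if [ -f "$dir/Cargo.toml" ] &&
           grep -q '^[[:space:]]*\[workspace]' "$dir/Cargo.toml"; then
            printf '%s\n' "$dir"
            return 0
        fi
        # Stop at the filesystem root: no workspace was found.
        [ "$dir" = "/" ] && return 1
        dir=$(dirname "$dir")
    done
}
```

Called from testing/geckodriver inside a full check-out, this would be expected to print the top-level directory, whereas in a stand-alone copy of the crate it finds nothing and returns a non-zero status.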

Also we would need the following settings:

export RUSTFLAGS="-Ctarget-feature=+crt-static" (Windows)

This can be set in testing/geckodriver/.cargo/config, but you should check that this file is still observed now that cargo in central uses a worktree.
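
A per-directory Cargo configuration carrying that flag could be generated like this. The sketch uses target-specific rustflags for the two MSVC triples, which is an assumption about the Windows targets that matter; whether Cargo still picks the file up in the current setup is exactly the open question. The function name is hypothetical.

```shell
#!/bin/sh
# Sketch only: writes a Cargo config that enables the static CRT for
# Windows MSVC targets. $1 = crate directory (e.g. testing/geckodriver).
write_cargo_config() {
    mkdir -p "$1/.cargo" || return 1
    cat > "$1/.cargo/config" <<'EOF'
[target.x86_64-pc-windows-msvc]
rustflags = ["-C", "target-feature=+crt-static"]

[target.i686-pc-windows-msvc]
rustflags = ["-C", "target-feature=+crt-static"]
EOF
}
```

Because the flags are scoped to Windows targets, builds on other platforms would be unaffected and no export RUSTFLAGS step would be needed.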

Flags: needinfo?(ato)

(In reply to Andreas Tolfsen ❲:ato❳ from comment #1)

I don’t think this is particularly important to optimise for. There is a documented way to make reproducible builds of geckodriver and Homebrew was able to adapt to this with very few modifications.

What are you referring to here? IMO it is a huge burden for people to have to download several gigabytes of data before being able to compile geckodriver, which has less than 1MB of sources.

Cargo always traverses up the directory hierarchy to see if what is built is part of a workspace. This means if there is a Cargo.toml with [workspace] it will be taken into account and the produced artifacts will be put in the workspace target/ folder.

It will however not use the dependencies from third_party/rust/ unless you invoke ./mach build testing/geckodriver, and instead pick them up from cargo’s default crate registry.

This means, of course, that we need to use the top-level Cargo.toml to get an accurate, reproducible build.

We would only need this global Cargo.toml for in-tree builds. But our own Cargo.toml in testing/geckodriver has everything we need.

Also we would need the following settings:

export RUSTFLAGS="-Ctarget-feature=+crt-static" (Windows)

This can be set in testing/geckodriver/.cargo/config, but you should check that this file is still observed now that cargo in central uses a worktree.

Nathan, do you know that?

Flags: needinfo?(nfroyd)

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+1] from comment #2)

(In reply to Andreas Tolfsen ❲:ato❳ from comment #1)

Also we would need the following settings:

export RUSTFLAGS="-Ctarget-feature=+crt-static" (Windows)

This can be set in testing/geckodriver/.cargo/config, but you should check that this file is still observed now that cargo in central uses a worktree.

Nathan, do you know that?

I don't know offhand. I would expect that it Just Works: I'd think you wouldn't want making a package part of a workspace to subtly change how the package got compiled like that.

Flags: needinfo?(nfroyd)

Hi,

In the issue linked by Henrik, I provided a script that creates a source tarball containing geckodriver and a minimal set of internal dependencies. It does this by parsing your Cargo.toml and copying in the crates referenced by relative paths. It creates tarballs on the order of kilobytes.

Please feel free to use and modify the script as you see fit.
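
The idea described above (find the dependencies that Cargo.toml references by relative path, then copy those crates) can be approximated in a few lines. This is not the script from the linked issue, only a sketch of the first step; the function name is made up, and the sed pattern handles only simple one-line path = "..." entries.

```shell
#!/bin/sh
# Sketch only: print the relative paths of path dependencies in a Cargo.toml.
# $1 = path to Cargo.toml
# Limitation: any one-line `path = "..."` entry matches, including ones in
# [lib] or [[bin]] sections, and multi-line tables are not handled.
list_path_deps() {
    sed -n 's/.*path[[:space:]]*=[[:space:]]*"\([^"]*\)".*/\1/p' "$1"
}
```

Each printed path is relative to the directory holding the Cargo.toml, so the output could be passed on to cp -R or tar to assemble the tarball.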

We would only need this global Cargo.toml for in-tree builds. But our own Cargo.toml in testing/geckodriver has everything we need.

I can confirm this. I was able to do stand-alone builds using the tarball generated from my script. Cargo should ensure the build is reproducible.

All I'd like is a small source tarball which I can use for packaging purposes. As it stands, I have to clone your humongous repo, which isn't user or packager friendly. The repo is so big, I am unable to clone it on my home network without the connection timing out.

By the way, it would also be good if you could tag your geckodriver versions in the mono-repo so that we don't have to look up the commit id.

Thanks

Severity: normal → S3