Deploy hg bundles to Azure Blob Storage, and serve to Azure hosted infra
Categories
(Developer Services :: Mercurial: hg.mozilla.org, task, P3)
Tracking
(Not tracked)
People
(Reporter: glob, Assigned: sheehan)
References
Details
(Keywords: leave-open)
Attachments
(11 files, 6 obsolete files)
Deploy hg bundles to Azure Blob Storage, and serve to Azure hosted infra.
Comment 1•5 years ago
|
||
WIP DO NOT LAND
This revision adds support for uploading to Azure Blob Storage using their
Python SDK. Some assumptions are made at the moment regarding the method of
authentication (using SAS tokens), however this may change in the future.
Azure Blob Storage requires an alphanumeric name for containers that does not
contain spaces or dashes. The region is specified on the account level, and
thus all containers within an account belong to the same region. The
convention taken at this time is similar to that with S3 and GCP buckets,
and that is to include the region name in the container name.
Various pending TODOs.
Comment 2•5 years ago
|
||
WIP DO NOT LAND
Fetch an authentication token from the Microsoft Identity Platform. Use this
token to authenticate against the Azure REST API which is used to fetch the
service tags.
For more info see:
- https://docs.microsoft.com/en-us/rest/api/azure/
- https://docs.microsoft.com/en-us/azure/active-directory/develop/
Required environment variables:
- AZURE_SUBSCRIPTION_ID
- AZURE_APP_CLIENT_SECRET
- AZURE_APP_CLIENT_ID
- AZURE_APP_TENANT_ID
Various pending TODOs
Depends on D72750
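For readers following along, here is a minimal sketch of the token request described above, assuming the OAuth 2.0 client credentials flow against the Microsoft identity platform v2.0 endpoint. The function name build_token_request is hypothetical and is not taken from the patch; the environment variable names are the ones listed in this comment. The request is only built, not sent.

```python
import os
from urllib.parse import urlencode
from urllib.request import Request

# Microsoft identity platform v2.0 token endpoint (client credentials flow).
TOKEN_URL = "https://login.microsoftonline.com/{tenant}/oauth2/v2.0/token"
# Scope that requests a token for the Azure Resource Manager REST API.
ARM_SCOPE = "https://management.azure.com/.default"


def build_token_request(env=os.environ):
    """Build (but do not send) the POST request for an access token.

    Hypothetical helper: the name and shape are illustrative only.
    """
    body = urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": env["AZURE_APP_CLIENT_ID"],
            "client_secret": env["AZURE_APP_CLIENT_SECRET"],
            "scope": ARM_SCOPE,
        }
    ).encode("ascii")
    url = TOKEN_URL.format(tenant=env["AZURE_APP_TENANT_ID"])
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
```

The JSON response to such a request carries an access_token field, which would then be sent as a Bearer token on the REST call that fetches the service tags for AZURE_SUBSCRIPTION_ID.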
Comment 3•5 years ago
|
||
Depends on D73387
Updated•5 years ago
|
Comment 4•5 years ago
|
||
WIP DO NOT LAND
Add support for serving clone bundles from Azure, by checking incoming IP
against the IP prefixes file fetched from Azure. Also added some test data.
TODO: add test, refactor some of the code.
Depends on D73387
Comment 5•5 years ago
|
||
WIP DO NOT LAND
Implement some Terraform code to provision a resource group, storage account,
and container for the bundles.
Depends on D73470
Comment 6•5 years ago
|
||
scripts: instead of manually calling the Azure auth API and fetching an auth
token, do this via the ClientSecretCredential object which is provided by the
azure-identity package.
hgmo: instead of using a SAS token to authenticate against Blob Storage,
use the ClientSecretCredential object directly with the storage account.
Depends on D74132
Comment 7•5 years ago
|
||
Abandoning this until further notice.
Comment 8•3 years ago
•
|
||
I think we should consider looking into this again. Now we have many level 3 Azure tasks, including some on the release pipeline. This came up because an Azure task in the release pipeline timed out after cloning for 1.5 hours and delayed the release (though it's unclear if it was actually taking that long to clone, or just got stuck somehow).
But either way, some Windows builds are now in Azure too, so this will also increase developer productivity.
Updated•3 years ago
|
| Assignee | ||
Updated•1 year ago
|
| Assignee | ||
Updated•1 year ago
|
Comment 10•1 year ago
|
||
Microsoft doesn't publish a JSON file in the same way Google or Amazon do. The only way to get it is to scrape the download page at https://www.microsoft.com/en-us/download/confirmation.aspx?id=56519, but I could not figure out how to do that with Python. Instead, I generated the JSON file, which will be stored at this URL. We intend to update it regularly through some external automation.
Comment 11•1 year ago
|
||
Comment 12•1 year ago
|
||
Comment 13•1 year ago
|
||
Updated•1 year ago
|
Updated•1 year ago
|
Comment 14•1 year ago
|
||
Pushed by cosheehan@mozilla.com:
https://hg.mozilla.org/hgcustom/version-control-tools/rev/6d40b75f9527
azure: ingest azure ip json r=sheehan
| Assignee | ||
Updated•1 year ago
|
| Assignee | ||
Updated•1 year ago
|
| Assignee | ||
Comment 15•1 year ago
|
||
This commit adds bundle upload to Azure Cloud storage for each
Azure region where we run Firefox CI. We add the AZURE_REGIONS
variable with details on each region URL, bucket and region name.
In the same fashion as GCP, we upload only stream clone bundles to
each region. Azure credentials are pulled from the environment
with the EnvironmentCredential object, which is passed to a
BlobClient for each bundle. If the bundle already exists in
the bucket, reset the object expiration time to 7 days. If the
bundle doesn't exist, upload it to Azure. For each uploaded
bundle, create an entry in the manifest file with the region
tagged using the azureregion= parameter. A later commit will
filter on this parameter to serve an appropriate bundle during
clonebundles manifest filtering.
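As a rough illustration of the manifest tagging described above, the sketch below builds one manifest line per region. The shape of the AZURE_REGIONS entries, the account and container names, the helper name azure_manifest_entry, and the BUNDLESPEC value are all assumptions made for the example; only the AZURE_REGIONS variable name and the azureregion= parameter come from the commit message.

```python
# Hypothetical region table: the real AZURE_REGIONS layout may differ.
AZURE_REGIONS = [
    {
        "name": "eastus",
        "container": "examplecontainer",
        "url": "https://exampleaccount.blob.core.windows.net",
    },
]


def azure_manifest_entry(region, bundle_path, bundlespec):
    """Return one clonebundles manifest line tagged with azureregion=.

    A manifest line is a URL followed by space-separated KEY=VALUE attributes.
    """
    url = "{}/{}/{}".format(
        region["url"].rstrip("/"), region["container"], bundle_path
    )
    return "{} BUNDLESPEC={} azureregion={}".format(
        url, bundlespec, region["name"]
    )


# Illustrative usage: one entry per configured region for a stream bundle.
entries = [
    azure_manifest_entry(
        region, "mozilla-central/abc.stream-v2.hg", "none-v2;stream=v2"
    )
    for region in AZURE_REGIONS
]
```

Keeping the region as a plain attribute on the line lets the server-side filter select bundles by region without parsing the URL.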
| Assignee | ||
Comment 16•1 year ago
|
||
Add clonebundles manifest filtering for Azure regions in the same fashion
as the GCP and AWS regions. Add a new hgmo.azureippath config option
which holds a path to the Azure IP blocks file. When the path is defined
and the azureregion= string can be found in the manifest, parse the
file and compare the addressPrefixes property for each service with
the source IP address of the request, skipping all IPv6 networks. If
the source IP is within any of the specified networks, return the
region property and filter the manifest to only include bundles from
that region.
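The lookup described above can be sketched roughly as follows. This is not the code from the patch: the function names are hypothetical, and the JSON layout (a top-level "values" list whose items carry "properties" with "region" and "addressPrefixes") is assumed from the service tags format. Skipping entries with an empty region is also an assumption of this sketch.

```python
import ipaddress


def azure_region_for_ip(source_ip, service_tags):
    """Return the Azure region whose IPv4 prefixes contain source_ip, or None."""
    ip = ipaddress.ip_address(source_ip)
    for service in service_tags.get("values", []):
        props = service.get("properties", {})
        region = props.get("region")
        if not region:
            # Assumption: entries without a region cannot select a bundle.
            continue
        for prefix in props.get("addressPrefixes", []):
            network = ipaddress.ip_network(prefix, strict=False)
            if network.version != 4:
                # Skip all IPv6 networks, as described in the commit message.
                continue
            if ip in network:
                return region
    return None


def filter_manifest(lines, region):
    """Keep only the manifest lines tagged with the given azureregion."""
    wanted = "azureregion=" + region
    return [line for line in lines if wanted in line.split()]
```

A request from an address outside every listed prefix yields None, in which case the manifest would presumably be left for the other filters (AWS, GCP, CDN) to handle.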
| Assignee | ||
Comment 17•1 year ago
|
||
Add a systemd unit to run the Azure IP scraper on a weekly
basis, in the same fashion as the GCP and AWS scrapers.
| Assignee | ||
Comment 18•1 year ago
|
||
jmoss, we are almost ready to deploy and test the bundle upload to Azure. The standard setup for the bundles in AWS/GCP is that they are available to download publicly and each bundle has a 7-day expiration date. As part of the upload process, if the repo hasn't changed since the last run of bundle generation we reset the blob's expiration date to 7 days from the current run time to avoid re-uploading. Before we proceed any further with testing, can you confirm the blob containers have similar access/expiration policies? :)
Comment 19•1 year ago
•
|
||
(In reply to Connor Sheehan [:sheehan] from comment #18)
> jmoss, we are almost ready to deploy and test the bundle upload to Azure. The standard setup for the bundles in AWS/GCP is that they are available to download publicly and each bundle has a 7-day expiration date. As part of the upload process, if the repo hasn't changed since the last run of bundle generation we reset the blob's expiration date to 7 days from the current run time to avoid re-uploading. Before we proceed any further with testing, can you confirm the blob containers have similar access/expiration policies? :)
Yup, we updated the storage accounts to automatically delete any file within the container older than 7 days.
Comment 20•1 year ago
|
||
Pushed by cosheehan@mozilla.com:
https://hg.mozilla.org/hgcustom/version-control-tools/rev/849c42a380c7
bundles: upload bundles to Azure Cloud Storage r=jcristau,jmoss
| Assignee | ||
Updated•1 year ago
|
Comment 21•1 year ago
|
||
Pushed by cosheehan@mozilla.com:
https://hg.mozilla.org/hgcustom/version-control-tools/rev/df0292218d45
ansible: add a systemd unit to run Azure IP scraper r=jcristau
| Assignee | ||
Updated•1 year ago
|
| Assignee | ||
Comment 22•1 year ago
|
||
Switch to a ClientSecretCredential with values loaded from a
credentials file on disk.
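A minimal sketch of loading such a credentials file, assuming it is JSON with tenant_id, client_id and client_secret keys. The file format, the key names, the AzureCredentials class and the load_azure_credentials helper are all hypothetical; the AZURE_CREDENTIALS_PATH variable name is the one that appears in the traceback in comment 30.

```python
import json
import os
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class AzureCredentials:
    """Hypothetical container for the three values a client secret login needs."""

    tenant_id: str
    client_id: str
    client_secret: str


def load_azure_credentials(env=os.environ):
    """Read credentials from the JSON file named by AZURE_CREDENTIALS_PATH.

    Raises KeyError if the variable is unset or a key is missing in the file.
    """
    path = Path(env["AZURE_CREDENTIALS_PATH"])
    data = json.loads(path.read_text())
    return AzureCredentials(
        tenant_id=data["tenant_id"],
        client_id=data["client_id"],
        client_secret=data["client_secret"],
    )

# The loaded values could then be handed to azure-identity, for example:
#   ClientSecretCredential(tenant_id=c.tenant_id, client_id=c.client_id,
#                          client_secret=c.client_secret)
```

Reading the path from the environment keeps the secret out of the unit file, but it also means any caller without the variable set fails with a KeyError.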
Comment 23•1 year ago
|
||
Pushed by cosheehan@mozilla.com:
https://hg.mozilla.org/hgcustom/version-control-tools/rev/42d74d16db65
bundles: switch to a ClientSecretCredential for Azure uploads r=jcristau
| Assignee | ||
Updated•1 year ago
|
Comment 24•1 year ago
|
||
| Assignee | ||
Comment 25•1 year ago
|
||
I deployed the Azure IP scraper and Azure bundle upload patches, and both appear to be working as expected. Tomorrow I'll deploy the clonebundles manifest filtering, i.e. the patch which actually enables clients to use the new Azure-hosted clonebundles.
Comment 26•1 year ago
|
||
| Assignee | ||
Comment 27•1 year ago
|
||
This makes it easier to determine which uploads are
in progress and how long they took to complete.
Comment 28•1 year ago
|
||
| Assignee | ||
Comment 29•1 year ago
|
||
We are now uploading clonebundles to Azure blob storage and serving them to workers hosted in the same region.
An initial scan of some logs shows clone times are comparable between the Azure bundles and the CDN hosted bundles, which is a little surprising. Next week the RelSRE folks will take a closer look to see if there are further optimizations we can perform.
Comment 30•1 year ago
|
||
Fixes test failure:
--- /app/vct/hgserver/tests/test-clonebundles.t
+++ /app/vct/hgserver/tests/test-clonebundles.t.err
@@ -40,7 +40,18 @@
uploading moz-hg-bundles-us-west-2:mozilla-central/77538e1ce4bec5f7aac58a7ceca2da0e38e90a72.gzip-v2.hg from /repo/hg/bundles/mozilla-central/77538e1ce4bec5f7aac58a7ceca2da0e38e90a72.gzip-v2.hg
uploading moz-hg-bundles-us-west-2:mozilla-central/77538e1ce4bec5f7aac58a7ceca2da0e38e90a72.zstd.hg from /repo/hg/bundles/mozilla-central/77538e1ce4bec5f7aac58a7ceca2da0e38e90a72.zstd.hg
uploading moz-hg-bundles-us-west-2:mozilla-central/77538e1ce4bec5f7aac58a7ceca2da0e38e90a72.stream-v2.hg from /repo/hg/bundles/mozilla-central/77538e1ce4bec5f7aac58a7ceca2da0e38e90a72.stream-v2.hg
-  NoCredentialsError: Unable to locate credentials
+  Traceback (most recent call last):
+    File "/var/hg/venv_bundles/bin/generate-hg-s3-bundles", line 33, in <module>
+      sys.exit(load_entry_point('hgmolib==0.0', 'console_scripts', 'generate-hg-s3-bundles')())
+    File "/var/hg/venv_bundles/lib64/python3.9/site-packages/hgmolib/generate_hg_s3_bundles.py", line 731, in main
+      paths[repo] = generate_bundles(repo, upload=upload, **opts)
+    File "/var/hg/venv_bundles/lib64/python3.9/site-packages/hgmolib/generate_hg_s3_bundles.py", line 518, in generate_bundles
+      azure_credentials = get_azure_credentials()
+    File "/var/hg/venv_bundles/lib64/python3.9/site-packages/hgmolib/generate_hg_s3_bundles.py", line 248, in get_azure_credentials
+      credentials_path = Path(os.environ["AZURE_CREDENTIALS_PATH"])
+    File "/usr/lib64/python3.9/os.py", line 679, in __getitem__
+      raise KeyError(key) from None
+  KeyError: 'AZURE_CREDENTIALS_PATH'
[1]
Comment 31•1 year ago
|
||
Comment 32•1 year ago
|
||