Closed Bug 1918194 Opened 3 months ago Closed 26 days ago

Add request header identifying background (update) task to allow Balrog to measure background updates

Categories

(Toolkit :: Application Update, enhancement, P3)

enhancement

Tracking

()

RESOLVED FIXED
134 Branch
Tracking Status
firefox134 --- fixed

People

(Reporter: nalexander, Assigned: nalexander)

References

Details

Attachments

(4 files)

In Bug 1687781, we added an XML parameter to Balrog responses allowing to disable background updates. Thankfully, we've never had to use it. To understand the operational cost of background updates, we'd like to add something letting Balrog know the source of MAR download traffic -- either a query parameter or a request header. This will let us examine logs and "allocate" MAR downloads to in-browser updates vs background update task updates.

I believe a request header introduces less overall change to the system since the update URL is quite complicated already.

There are lots of ways to achieve this for tests; this one is easier
to understand than mocking parts deep inside of
UpdateService.sys.mjs, i.e., within the lazy container object.

Assignee: nobody → nalexander
Status: NEW → ASSIGNED

This extracts a simple existing function for setting headers and then
makes it available to nsIIncrementalDownload. The existing function
is not the most pleasant API, but it exists and is much easier than
working through the details of storing header name-value pairs until
they are ready to be used. It's also the same API exposed by Windows
BITS, so there's no additional manipulation required to support BITS.

There are a few subtle points here:

  1. Multiple logically independent tests in the same test file require
    some manipulations, none terrible but all annoying. For example:

  2. Avoiding nsIIncrementalDownload caching between sub-tests
    requires using "slow" downloads; but:

  3. nsIBITS callbacks don't fire at the same time as
    nsIIncrementalDownload callbacks, requiring not using "slow"
    downloads for BITS.

jbuck: This is follow-up to https://phabricator.services.mozilla.com/D221860#7621275. In the current commit sequence, I have implemented adding headers

X-BackgroundTaskMode: {0,1}
X-BackgroundTaskName: {backgroundupdate,...}

but as you informed us, custom headers aren't easy to log from our CDN. I propose to also add query parameters (to the MAR fetch only) like:

?backgroundTaskMode={0,1}&backgroundTaskName={backgroundupdate,...}

The space of task names is very small: it should be only "backgroundupdate" in the wild.

Could you confirm that adding query parameters to MAR fetches will:

  1. not impact CDN requests, i.e., will be ignored without additional configuration
  2. not adversely impact caching rates, either due to being ignored or due to the small expected state-space
  3. be observable in the CDN logging that we use to determine bandwidth costs related to delivering updates?

Thanks!

Flags: needinfo?(jbuckley)

bytesized: bhearsum: while I'm here, do you expect it to be a problem if we add query parameters to arbitrary update URLs? Should I be guarding this to only Balrog/AUS update URLs so that we don't cause problems for enterprise constellations that serve their own updates?

Flags: needinfo?(bytesized)
Flags: needinfo?(bhearsum)

(In reply to Nick Alexander :nalexander [he/him] from comment #6)

do you expect it to be a problem if we add query parameters to arbitrary update URLs?

I can't think of any reason there would be any problems from the client-end. I guess we might want to make sure we aren't clobbering existing query params. I don't know if Balrog ever specifies any.

Should I be guarding this to only Balrog/AUS update URLs so that we don't cause problems for enterprise constellations that serve their own updates?

Yeah, I think I'd prefer that we not add it to URLs that might not expect it. Maybe make it conditional on a check like (the inverse of) this one?

Flags: needinfo?(bytesized)

(In reply to Nick Alexander :nalexander [he/him] from comment #6)

bytesized: bhearsum: while I'm here, do you expect it to be a problem if we add query parameters to arbitrary update URLs? Should I be guarding this to only Balrog/AUS update URLs so that we don't cause problems for enterprise constellations that serve their own updates?

There's no reason we can't do it, but Balrog will need to add support for it first (it throws errors if it receives unknown query args, eg: https://aus5.mozilla.org/update/6/Firefox/118.0a1/20230717111326/Linux_x86_64-gcc3/en-US/default/Linux%25206.4.11-200.fc38.x86_64%2520(GTK%25203.24.38%252Clibpulse%252016.1.0)/ISET%3ASSE4_2%2CMEM%3A32010/default/default/update.xml?force=1&background=1). That's no big deal though.

As for Enterprise installs, I'm afraid I'm not very knowledgable about how they serve their own updates. mkaply - that's probably a question for you.

Flags: needinfo?(bhearsum) → needinfo?(mozilla)

From earlier conversations I thought we were talking about adding query parameters to the MAR fetches, not to the update checks that talk to to balrog/AUS?
The MAR fetches typically go to archive.mozilla.org (e.g. https://archive.mozilla.org/pub/firefox/nightly/2024/09/2024-09-25-15-28-40-mozilla-central/firefox-132.0a1.en-US.win64.complete.mar) and/or download.mozilla.org (e.g. https://download.mozilla.org/?product=firefox-131.0-complete&os=win64&lang=en-US).
So besides the CDN (for archive.m.o) we'd presumably also need bouncer (download.m.o) to do something with the query params?

We don't really document how to do your own update server well, so anyone doing that is kind of on their own.

But I would doubt they would fail because of unknown parameters.

Flags: needinfo?(mozilla)

(In reply to Julien Cristau [:jcristau] from comment #9)

From earlier conversations I thought we were talking about adding query parameters to the MAR fetches, not to the update checks that talk to to balrog/AUS?

Correct: only for the MAR fetches. So I think that we won't need to update Balrog, obviating https://bugzilla.mozilla.org/show_bug.cgi?id=1918194#c8.

The MAR fetches typically go to archive.mozilla.org (e.g. https://archive.mozilla.org/pub/firefox/nightly/2024/09/2024-09-25-15-28-40-mozilla-central/firefox-132.0a1.en-US.win64.complete.mar) and/or download.mozilla.org (e.g. https://download.mozilla.org/?product=firefox-131.0-complete&os=win64&lang=en-US).
So besides the CDN (for archive.m.o) we'd presumably also need bouncer (download.m.o) to do something with the query params?

Really, Balrog redirects to the bouncer in some circumstances? TIL.

To whom might I delegate the changes for the CDN and the bouncer ignoring the query parameters backgroundTaskMode and backgroundTaskName?

Flags: needinfo?(jcristau)

It looks like they're being ignored already, e.g. https://download.mozilla.org/?product=firefox-131.0-complete&os=win64&lang=en-US&backgroundTaskMode=0 and https://archive.mozilla.org/pub/firefox/releases/130.0/update/win64/en-US/firefox-130.0.complete.mar?backgroundTaskMode=0 both appear to work (and the latter looks like it returns a cached response), so I think we're good there?
SRE might have to do some work to get the params logged and exported somewhere though, I'm not sure. And we might want bouncer to replicate those params in the redirect URL, to only have to worry about logging in one place? A bug against Webtools::Bouncer is probably a good start if we do need changes there.

(The bouncer case also shows that the mar URLs in update.xml can contain existing query params, so that partially answers comment 7.)

Flags: needinfo?(jcristau)

Tested these bouncer query parameters with curl, I was able to query them no problem % curl -i 'https://download.mozilla.org/?product=firefox-131.0-complete&os=win64&lang=en-US&backgroundTaskMode=1&backgroundTaskName=jbuck-testing

In https://console.cloud.google.com/bigquery?project=moz-fx-bouncer-prod-bd81

SELECT
  jsonPayload.path AS path,
  REGEXP_EXTRACT(jsonPayload.path, r'product=([^&]+)') AS product,
  REGEXP_EXTRACT(jsonPayload.path, r'os=([^&]+)') AS os,
  REGEXP_EXTRACT(jsonPayload.path, r'lang=([^&]+)') AS lang,
  REGEXP_EXTRACT(jsonPayload.path, r'backgroundTaskMode=([^&]+)') AS backgroundTaskMode,
  REGEXP_EXTRACT(jsonPayload.path, r'backgroundTaskName=([^&]+)') AS backgroundTaskName
FROM
  `moz-fx-bouncer-prod-bd81.bouncer_aws.nginx_access_20241008`
WHERE
  REGEXP_EXTRACT(jsonPayload.path, r'backgroundTaskMode=([^&]+)') IS NOT NULL
  OR REGEXP_EXTRACT(jsonPayload.path, r'backgroundTaskName=([^&]+)') IS NOT NULL;

There is no CDN on download.mozilla.org, so I think you're clear implementing this feature there. Aside from perhaps having Bouncer mirror these query parameters into redirects to archive.mozilla.org.

--

As for archive.mozilla.org, querying also works with CDN caching:

https://console.cloud.google.com/bigquery?project=moz-fx-productdelivery-pr-38b5

SELECT
  httpRequest.requestUrl,
  REGEXP_EXTRACT(httpRequest.requestUrl, r'backgroundTaskMode=([^&]+)') AS backgroundTaskMode,
  REGEXP_EXTRACT(httpRequest.requestUrl, r'backgroundTaskName=([^&]+)') AS backgroundTaskName
FROM
  `moz-fx-productdelivery-pr-38b5.log_storage_prod.requests`
WHERE
  TIMESTAMP_TRUNC(timestamp, DAY) = TIMESTAMP("2024-10-08")
  AND httpRequest.requestUrl LIKE '%?%';
Flags: needinfo?(jbuckley)
Priority: -- → P3
Attachment #9424216 - Attachment description: Bug 1918194 - Part 1: Add `X-BackgroundTask{Mode,Name}` headers to `nsIIncrementalDownload` MAR GETs. r?bytesized!,bhearsum!,emilio! → Bug 1918194 - Part 1: Add background task extras to `nsIIncrementalDownload` MAR GETs. r?bytesized!,bhearsum!,emilio!
Attachment #9424217 - Attachment description: Bug 1918194 - Part 2: Add `X-BackgroundTask{Mode,Name}` headers to `nsIBits` MAR GETs. r?bytesized! → Bug 1918194 - Part 2: Add background task extras to `nsIBits` MAR GETs. r?bytesized!
Pushed by nalexander@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/4c7879a248a4 Pre: Add pref to allow downloading without BITS in background update task. r=bytesized,application-update-reviewers https://hg.mozilla.org/integration/autoland/rev/126f522ee37a Part 1: Add background task extras to `nsIIncrementalDownload` MAR GETs. r=bytesized,bhearsum,necko-reviewers,application-update-reviewers,csadilek,jesup https://hg.mozilla.org/integration/autoland/rev/4a5dc4f5efbd Part 2: Add background task extras to `nsIBits` MAR GETs. r=bytesized,application-update-reviewers https://hg.mozilla.org/integration/autoland/rev/7072a9d9c00a Part 3: Tests. r=bytesized,application-update-reviewers
See Also: → 1928636
Regressions: 1929581
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: