Add request header identifying background (update) task to allow Balrog to measure background updates
Categories
(Toolkit :: Application Update, enhancement, P3)
Tracking
()
Tracking | Status | |
---|---|---|
firefox134 | --- | fixed |
People
(Reporter: nalexander, Assigned: nalexander)
References
Details
Attachments
(4 files)
Bug 1918194 - Pre: Add pref to allow downloading without BITS in background update task. r?bytesized
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review |
In Bug 1687781, we added an XML parameter to Balrog responses allowing to disable background updates. Thankfully, we've never had to use it. To understand the operational cost of background updates, we'd like to add something letting Balrog know the source of MAR download traffic -- either a query parameter or a request header. This will let us examine logs and "allocate" MAR downloads to in-browser updates vs background update task updates.
I believe a request header introduces less overall change to the system since the update URL is quite complicated already.
Assignee | ||
Comment 1•3 months ago
|
||
There are lots of ways to achieve this for tests; this one is easier
to understand than mocking parts deep inside of
UpdateService.sys.mjs
, i.e., within the lazy
container object.
Updated•3 months ago
|
Assignee | ||
Comment 2•3 months ago
|
||
This extracts a simple existing function for setting headers and then
makes it available to nsIIncrementalDownload
. The existing function
is not the most pleasant API, but it exists and is much easier than
working through the details of storing header name-value pairs until
they are ready to be used. It's also the same API exposed by Windows
BITS, so there's no additional manipulation required to support BITS.
Assignee | ||
Comment 3•3 months ago
|
||
Assignee | ||
Comment 4•3 months ago
|
||
There are a few subtle points here:
-
Multiple logically independent tests in the same test file require
some manipulations, none terrible but all annoying. For example: -
Avoiding
nsIIncrementalDownload
caching between sub-tests
requires using "slow" downloads; but: -
nsIBITS
callbacks don't fire at the same time as
nsIIncrementalDownload
callbacks, requiring not using "slow"
downloads for BITS.
Assignee | ||
Comment 5•2 months ago
|
||
jbuck: This is follow-up to https://phabricator.services.mozilla.com/D221860#7621275. In the current commit sequence, I have implemented adding headers
X-BackgroundTaskMode: {0,1}
X-BackgroundTaskName: {backgroundupdate,...}
but as you informed us, custom headers aren't easy to log from our CDN. I propose to also add query parameters (to the MAR fetch only) like:
?backgroundTaskMode={0,1}&backgroundTaskName={backgroundupdate,...}
The space of task names is very small: it should be only "backgroundupdate" in the wild.
Could you confirm that adding query parameters to MAR fetches will:
- not impact CDN requests, i.e., will be ignored without additional configuration
- not adversely impact caching rates, either due to being ignored or due to the small expected state-space
- be observable in the CDN logging that we use to determine bandwidth costs related to delivering updates?
Thanks!
Assignee | ||
Comment 6•2 months ago
|
||
bytesized: bhearsum: while I'm here, do you expect it to be a problem if we add query parameters to arbitrary update URLs? Should I be guarding this to only Balrog/AUS update URLs so that we don't cause problems for enterprise constellations that serve their own updates?
Comment 7•2 months ago
|
||
(In reply to Nick Alexander :nalexander [he/him] from comment #6)
do you expect it to be a problem if we add query parameters to arbitrary update URLs?
I can't think of any reason there would be any problems from the client-end. I guess we might want to make sure we aren't clobbering existing query params. I don't know if Balrog ever specifies any.
Should I be guarding this to only Balrog/AUS update URLs so that we don't cause problems for enterprise constellations that serve their own updates?
Yeah, I think I'd prefer that we not add it to URLs that might not expect it. Maybe make it conditional on a check like (the inverse of) this one?
Comment 8•2 months ago
|
||
(In reply to Nick Alexander :nalexander [he/him] from comment #6)
bytesized: bhearsum: while I'm here, do you expect it to be a problem if we add query parameters to arbitrary update URLs? Should I be guarding this to only Balrog/AUS update URLs so that we don't cause problems for enterprise constellations that serve their own updates?
There's no reason we can't do it, but Balrog will need to add support for it first (it throws errors if it receives unknown query args, eg: https://aus5.mozilla.org/update/6/Firefox/118.0a1/20230717111326/Linux_x86_64-gcc3/en-US/default/Linux%25206.4.11-200.fc38.x86_64%2520(GTK%25203.24.38%252Clibpulse%252016.1.0)/ISET%3ASSE4_2%2CMEM%3A32010/default/default/update.xml?force=1&background=1). That's no big deal though.
As for Enterprise installs, I'm afraid I'm not very knowledgable about how they serve their own updates. mkaply - that's probably a question for you.
Comment 9•2 months ago
|
||
From earlier conversations I thought we were talking about adding query parameters to the MAR fetches, not to the update checks that talk to to balrog/AUS?
The MAR fetches typically go to archive.mozilla.org (e.g. https://archive.mozilla.org/pub/firefox/nightly/2024/09/2024-09-25-15-28-40-mozilla-central/firefox-132.0a1.en-US.win64.complete.mar
) and/or download.mozilla.org (e.g. https://download.mozilla.org/?product=firefox-131.0-complete&os=win64&lang=en-US
).
So besides the CDN (for archive.m.o) we'd presumably also need bouncer (download.m.o) to do something with the query params?
Comment 10•2 months ago
|
||
We don't really document how to do your own update server well, so anyone doing that is kind of on their own.
But I would doubt they would fail because of unknown parameters.
Assignee | ||
Comment 11•2 months ago
|
||
(In reply to Julien Cristau [:jcristau] from comment #9)
From earlier conversations I thought we were talking about adding query parameters to the MAR fetches, not to the update checks that talk to to balrog/AUS?
Correct: only for the MAR fetches. So I think that we won't need to update Balrog, obviating https://bugzilla.mozilla.org/show_bug.cgi?id=1918194#c8.
The MAR fetches typically go to archive.mozilla.org (e.g.
https://archive.mozilla.org/pub/firefox/nightly/2024/09/2024-09-25-15-28-40-mozilla-central/firefox-132.0a1.en-US.win64.complete.mar
) and/or download.mozilla.org (e.g.https://download.mozilla.org/?product=firefox-131.0-complete&os=win64&lang=en-US
).
So besides the CDN (for archive.m.o) we'd presumably also need bouncer (download.m.o) to do something with the query params?
Really, Balrog redirects to the bouncer in some circumstances? TIL.
To whom might I delegate the changes for the CDN and the bouncer ignoring the query parameters backgroundTaskMode
and backgroundTaskName
?
Comment 12•2 months ago
|
||
It looks like they're being ignored already, e.g. https://download.mozilla.org/?product=firefox-131.0-complete&os=win64&lang=en-US&backgroundTaskMode=0 and https://archive.mozilla.org/pub/firefox/releases/130.0/update/win64/en-US/firefox-130.0.complete.mar?backgroundTaskMode=0 both appear to work (and the latter looks like it returns a cached response), so I think we're good there?
SRE might have to do some work to get the params logged and exported somewhere though, I'm not sure. And we might want bouncer to replicate those params in the redirect URL, to only have to worry about logging in one place? A bug against Webtools::Bouncer is probably a good start if we do need changes there.
(The bouncer case also shows that the mar URLs in update.xml can contain existing query params, so that partially answers comment 7.)
Comment 13•2 months ago
|
||
Tested these bouncer query parameters with curl, I was able to query them no problem % curl -i 'https://download.mozilla.org/?product=firefox-131.0-complete&os=win64&lang=en-US&backgroundTaskMode=1&backgroundTaskName=jbuck-testing
In https://console.cloud.google.com/bigquery?project=moz-fx-bouncer-prod-bd81
SELECT
jsonPayload.path AS path,
REGEXP_EXTRACT(jsonPayload.path, r'product=([^&]+)') AS product,
REGEXP_EXTRACT(jsonPayload.path, r'os=([^&]+)') AS os,
REGEXP_EXTRACT(jsonPayload.path, r'lang=([^&]+)') AS lang,
REGEXP_EXTRACT(jsonPayload.path, r'backgroundTaskMode=([^&]+)') AS backgroundTaskMode,
REGEXP_EXTRACT(jsonPayload.path, r'backgroundTaskName=([^&]+)') AS backgroundTaskName
FROM
`moz-fx-bouncer-prod-bd81.bouncer_aws.nginx_access_20241008`
WHERE
REGEXP_EXTRACT(jsonPayload.path, r'backgroundTaskMode=([^&]+)') IS NOT NULL
OR REGEXP_EXTRACT(jsonPayload.path, r'backgroundTaskName=([^&]+)') IS NOT NULL;
There is no CDN on download.mozilla.org, so I think you're clear implementing this feature there. Aside from perhaps having Bouncer mirror these query parameters into redirects to archive.mozilla.org.
--
As for archive.mozilla.org, querying also works with CDN caching:
https://console.cloud.google.com/bigquery?project=moz-fx-productdelivery-pr-38b5
SELECT
httpRequest.requestUrl,
REGEXP_EXTRACT(httpRequest.requestUrl, r'backgroundTaskMode=([^&]+)') AS backgroundTaskMode,
REGEXP_EXTRACT(httpRequest.requestUrl, r'backgroundTaskName=([^&]+)') AS backgroundTaskName
FROM
`moz-fx-productdelivery-pr-38b5.log_storage_prod.requests`
WHERE
TIMESTAMP_TRUNC(timestamp, DAY) = TIMESTAMP("2024-10-08")
AND httpRequest.requestUrl LIKE '%?%';
Updated•2 months ago
|
Updated•1 month ago
|
Updated•1 month ago
|
Comment 14•26 days ago
|
||
Comment 15•26 days ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/4c7879a248a4
https://hg.mozilla.org/mozilla-central/rev/126f522ee37a
https://hg.mozilla.org/mozilla-central/rev/4a5dc4f5efbd
https://hg.mozilla.org/mozilla-central/rev/7072a9d9c00a
Description
•