handle channel changes for cached macOS attribution data
Categories
(Firefox :: Installer, enhancement)
Tracking
()
Tracking | Status | |
---|---|---|
firefox123 | --- | fixed |
People
(Reporter: bhearsum, Assigned: bhearsum)
References
Details
(Whiteboard: [fidedi-ope])
Attachments
(3 files)
macOS attribution data is meant to follow a particular install for its entire lifetime. We cached this data in directories that contain the full path to the install, and in filenames that contain the channel. (The channel is included because Release, Beta, and ESR all use Firefox.app
as their application name - so the install directory alone is not enough distinguish them if a paveover occurs.)
Because the channel is included, there is a risk that this data may be lost when the channel changes. There's at least a few scenarios here to consider:
- A paveover occurs, replacing, eg: a Release install with a Beta install.
- The user overrides their channel.
- An update changes the user's channel. This typically happens when we desupport an OS version, and migrate users on it to the ESR channel for extended support
In the paveover scenario we do not want to preserve existing attribution data - we should use whatever new attribution data is present, if it exists.
In the other scenarios we should preserve attribution data if possible. We already have some code that runs code that runs around update time when the user changes their channel - although I'm not certain if it's suited for this purpose. Ideally, we'd detect the channel change at the next startup and deal with it then.
At the moment, I'm not certain how to detect the third scenario. Robin, I took a look around https://searchfox.org/mozilla-central/source/toolkit/mozapps/update/UpdateService.sys.mjs and https://searchfox.org/mozilla-central/source/toolkit/mozapps/update/AppUpdater.sys.mjs and couldn't find anything that might help - do you have any pointers you could give me? (Any other thoughts on this would be appreciated as well.)
Updated•1 year ago
|
Updated•1 year ago
|
Updated•1 year ago
|
Comment 1•1 year ago
|
||
I'm afraid that I'm not especially familiar with the channel change process, but I'll answer this the best I can.
First, let me address the code that you linked. I believe that it successfully works in a very specific case: Firefox starts downloading an update and exits before completing it, then the channel is changed and Firefox is restarted. Then it would look at the in-progress update, realize that it's for the wrong channel, and discard it. If, however, there is a pending/staged update when the channel is changed, I believe that, when launched, Firefox will install it, restart, and then activate this code to incorrectly set update.state = STATE_FAILED;
when it actually succeeded. If there were no update in-progress when the channel changes, I believe that this code would never run at all. The next time that the update process runs, it will set the update object's channel to match UpdateUtils.UpdateChannel
, and we will avoid this code path when _postUpdateProcessing
runs next.
I also can't seem to figure out how that code doesn't get consistently activated in your third scenario. I don't currently have the time to spin up a VM to investigate this, but it seems like when this runs, update.channel
would be set to release
, the update would be installed and would change the channel to esr
, and then that same _postUpdateProcessing
code would run and set update.state = STATE_FAILED;
. I kinda feel like someone would probably have noticed by now if that were happening, but I can't seem figure out why it wouldn't without spinning up a VM and investigating.
I don't know of any especially good way to detect the second and third scenarios. The best I can think of at the moment is to add an idle task that duplicates the update channel into some other pref and then to move the attribution data around when the duplicated version doesn't match the current channel. Possibly the install path needs to be recorded somewhere as part of this to ensure that a profile doesn't get used by multiple installations and get confused? Of course, this would try to move the attribution data around once per profile, not once per installation. I'm not sure if that's a problem. It also could mean that the code might be run on a substantial delay, perhaps even after the channel had changed multiple times, or changed and changed back. Again, I'm not sure how much of a problem this is.
Potentially maybe you could avoid some (maybe even all, actually) of these problems by writing the duplicated pref into update-config.json
instead of into the profile.
I would be happy to look into any of this further, if you'd like, but it might be a little while before I have the capacity to do so. Just let me know.
Comment 2•11 months ago
|
||
(In reply to Robin Steuber (they/them) [:bytesized] from comment #1)
If, however, there is a pending/staged update when the channel is changed, I believe that, when launched, Firefox will install it, restart, and then activate this code to incorrectly set
update.state = STATE_FAILED;
when it actually succeeded.
Oh, I just stumbled across some code that I forgot existed that probably prevents this from happening in most cases. When the user changed their channel, if they didn't also update update-settings.ini
, this check would fail during the update process, causing the update to fail.
But it probably still isn't great that we unconditionally set update.state = STATE_FAILED;
without checking what actually happened.
Assignee | ||
Comment 3•11 months ago
|
||
Robin and I had a good chat about this the other day and started to wonder whether or not we even need a cache file for mac attribution. As I understand it, the reason we had it was because fishing information out of the quarantine database in our original implementation was either expensive, or would not stick around forever. Our new implementation relies on xattrs on the .app
directory, which I suspect would take a similar amount of I/O to read and parse compared with reading and parsing data from the cache file. If this is true, and it is also true that attribution data persists after an update, removing the cache file altogether seems like it would be the most robust solution here.
These are both assumptions for now which I'll follow up on soon.
Comment 4•11 months ago
|
||
(In reply to bhearsum@mozilla.com (:bhearsum) from comment #3)
Robin and I had a good chat about this the other day and started to wonder whether or not we even need a cache file for mac attribution. As I understand it, the reason we had it was because fishing information out of the quarantine database in our original implementation was either expensive, or would not stick around forever. Our new implementation relies on xattrs on the
.app
directory, which I suspect would take a similar amount of I/O to read and parse compared with reading and parsing data from the cache file. If this is true, and it is also true that attribution data persists after an update, removing the cache file altogether seems like it would be the most robust solution here.
Re: your understanding -- spot on. It's not clear to me that the xattr
would persist; this seems a detail of how update stage and apply works that, IIUC, is an atomic swap that doesn't seem likely to retain xattr
. Easily tested and could be implemented, of course.
These are both assumptions for now which I'll follow up on soon.
Sounds great!
Assignee | ||
Comment 5•11 months ago
|
||
(In reply to Nick Alexander :nalexander [he/him] from comment #4)
(In reply to bhearsum@mozilla.com (:bhearsum) from comment #3)
Robin and I had a good chat about this the other day and started to wonder whether or not we even need a cache file for mac attribution. As I understand it, the reason we had it was because fishing information out of the quarantine database in our original implementation was either expensive, or would not stick around forever. Our new implementation relies on xattrs on the
.app
directory, which I suspect would take a similar amount of I/O to read and parse compared with reading and parsing data from the cache file. If this is true, and it is also true that attribution data persists after an update, removing the cache file altogether seems like it would be the most robust solution here.Re: your understanding -- spot on. It's not clear to me that the
xattr
would persist; this seems a detail of how update stage and apply works that, IIUC, is an atomic swap that doesn't seem likely to retainxattr
. Easily tested and could be implemented, of course.
As it turns out, the atomic replace happens on the Contents directory, not the .app
directory - so, luckily, the xattr on the .app
directory is already preserved across updates.
As for the I/O, I was able to confirm with iosnoop
that reading the xattr causes 3 reads to a resource fork totaling to 20480 bytes:
STRTIME DEVICE MAJ MIN UID PID D BLOCK SIZE PATHNAME ARGS
2023 Dec 13 15:47:48 ?? 1 11 502 788 R 8477725 4096 ??/xattr/xattr/..namedfork/rsrc bash\0
2023 Dec 13 15:47:48 ?? 1 11 502 788 R 8477726 8192 ??/xattr/xattr/..namedfork/rsrc bash\0
2023 Dec 13 15:47:48 ?? 1 11 502 788 R 8477728 8192 ??/xattr/xattr/..namedfork/rsrc bash\0
Reading the cache file, on the other hand, is 4096 bytes:
TIME STIME DELTA DEVICE INS MAJ MIN UID PID PPID D BLOCK SIZE MOUNT FILE PATH COMM ARGS
4000333 4000123 209 ?? 5 1 5 502 688 1 R 30395385 4096 Data macAttributionDataCache-nightly-try ??/Firefox Nightly/macAttributionDataCache-nightly-try firefox firefox\0
That's a 5x increase - which sounds like a lot at first - but I believe this is out of the critical path of start-up, and possibly even off main thread - so reading 16 extra kilobytes feels like it's probably not a big deal for the simplicity it brings.
Comment 6•11 months ago
|
||
(In reply to bhearsum@mozilla.com (:bhearsum) from comment #5)
(In reply to Nick Alexander :nalexander [he/him] from comment #4)
(In reply to bhearsum@mozilla.com (:bhearsum) from comment #3)
Robin and I had a good chat about this the other day and started to wonder whether or not we even need a cache file for mac attribution. As I understand it, the reason we had it was because fishing information out of the quarantine database in our original implementation was either expensive, or would not stick around forever. Our new implementation relies on xattrs on the
.app
directory, which I suspect would take a similar amount of I/O to read and parse compared with reading and parsing data from the cache file. If this is true, and it is also true that attribution data persists after an update, removing the cache file altogether seems like it would be the most robust solution here.Re: your understanding -- spot on. It's not clear to me that the
xattr
would persist; this seems a detail of how update stage and apply works that, IIUC, is an atomic swap that doesn't seem likely to retainxattr
. Easily tested and could be implemented, of course.As it turns out, the atomic replace happens on the Contents directory, not the
.app
directory - so, luckily, the xattr on the.app
directory is already preserved across updates.
Excellent news! This'll want a macOS-specific updater test eventually, since we don't want the behaviour to change.
As for the I/O, I was able to confirm with
iosnoop
that reading the xattr causes 3 reads to a resource fork totaling to 20480 bytes:STRTIME DEVICE MAJ MIN UID PID D BLOCK SIZE PATHNAME ARGS 2023 Dec 13 15:47:48 ?? 1 11 502 788 R 8477725 4096 ??/xattr/xattr/..namedfork/rsrc bash\0 2023 Dec 13 15:47:48 ?? 1 11 502 788 R 8477726 8192 ??/xattr/xattr/..namedfork/rsrc bash\0 2023 Dec 13 15:47:48 ?? 1 11 502 788 R 8477728 8192 ??/xattr/xattr/..namedfork/rsrc bash\0
Reading the cache file, on the other hand, is 4096 bytes:
TIME STIME DELTA DEVICE INS MAJ MIN UID PID PPID D BLOCK SIZE MOUNT FILE PATH COMM ARGS 4000333 4000123 209 ?? 5 1 5 502 688 1 R 30395385 4096 Data macAttributionDataCache-nightly-try ??/Firefox Nightly/macAttributionDataCache-nightly-try firefox firefox\0
That's a 5x increase - which sounds like a lot at first - but I believe this is out of the critical path of start-up, and possibly even off main thread - so reading 16 extra kilobytes feels like it's probably not a big deal for the simplicity it brings.
I 100% agree that this is not likely to have noticeable impact.
Assignee | ||
Comment 7•11 months ago
|
||
We will be relying the com.apple.application-instance
xattr being preserved for attribution data to be persisted across updates and channel changes. This change adds an extra check to most of the updater tests to ensure that this is the case.
I don't think we need to do anything special here around verifying that it happens for channel changes since those are just regular MARs that happen to include a bit that changes the channel - but I'm happy to add something if it's useful.
Assignee | ||
Comment 8•11 months ago
|
||
This function has been difficult to read for awhile. This patch cleans it up in preparation to adjust it for the removal of the cache file on macOS.
Depends on D197195
Assignee | ||
Comment 9•11 months ago
|
||
This is mostly removing code and tests related to reading/writing the cache file on macOS and updating of tests. Aside from that, the most notable part is the change to setAttributionString
to automatically prepend __MOZCUSTOM__
when writing an attribution code. This is mostly done to make things simpler and cleaner in the majority of the tests, but seeing as getAttributionString
is also aware of it, it seems like a generally nicer way to do this.
Depends on D197203
Comment 10•10 months ago
|
||
There are some r+ patches which didn't land and no activity in this bug for 2 weeks.
:bhearsum, could you have a look please?
If you still have some work to do, you can add an action "Plan Changes" in Phabricator.
For more information, please visit BugBot documentation.
Comment 12•10 months ago
|
||
Comment 13•10 months ago
|
||
Backed out for bc failures on browser_AttributionCode_telemetry.js
Backout link: https://hg.mozilla.org/integration/autoland/rev/ed8c40b1e938528fc2dc355d7c2b0c21ba54e4e2
Log link: https://treeherder.mozilla.org/logviewer?job_id=442725775&repo=autoland&lineNumber=4923
There were also xpcshell failures on test_AttributionCode.js: https://treeherder.mozilla.org/logviewer?job_id=442718176&repo=autoland&lineNumber=5350
Comment 14•10 months ago
|
||
Comment 15•10 months ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/93b5332d9da6
https://hg.mozilla.org/mozilla-central/rev/4ac74de4ecd5
https://hg.mozilla.org/mozilla-central/rev/aa6b62d272a2
Assignee | ||
Updated•10 months ago
|
Description
•