Avoid fetching the device list on every sync, to reduce FxA server load, part deux
Categories
(Firefox :: Sync, enhancement, P2)
Tracking
()
Tracking | Status | |
---|---|---|
firefox74 | --- | fixed |
People
(Reporter: markh, Assigned: markh)
References
Details
Attachments
(1 file)
So over in bug 1597868, we thought we were avoiding a fetch of all devices on every sync. It turns out there were multiple places trying to force this, and (I assume) our 1 minute threshold for actually making a new request was saving us from making 2. So best I can tell, we now have 1 location left where we are doing it each sync.
It's at https://searchfox.org/mozilla-central/rev/c61720a7d0c094d772059f9d6a7844eb7619f107/browser/base/content/browser-sync.js#247 - each sync, updateAllUI is called, which calls this. This was added in bug 1583413, so any fix here needs to ensure we don't regress what we fixed there. In particular, there are some confusing references to fxAccounts.device.recentDeviceList
in UIState.jsm which we should try and get our heads around.
Assignee | ||
Comment 1•5 years ago
|
||
Updated•5 years ago
|
Comment 3•5 years ago
|
||
bugherder |
Assignee | ||
Comment 4•5 years ago
|
||
Ryan, can you please check whether this patch appears to reduce the traffic as expected?
Comment 5•5 years ago
|
||
Early days, but it's looking promising! I made a little chart here:
https://docs.google.com/spreadsheets/d/1vlVdQmnRw62jfcCKZ-7Jd5_OypkdoNrCaH0RPUdUKVw/
You can see that the orange line for fx74 has a pronounced downward trend over the last few days, which is not present in the previous release.
To check that this effect is due to the fix here rather than a more general change in request volume, I made a version that looks only at traffic to /cert/sign and /account/devices. Over the last few days we see a steady rate of requests to /cert/sign but a definite drop (like, easily more than half) in the volume of /account/devices requests.
Assuming no UX fallout, this looks like a significant win for FxA server health and we should consider uplifting if possible. Thanks all!
Comment 6•5 years ago
|
||
Early days, but it's looking promising!
This has been in nightly for a few more days, and the effect seems to be real and stable. Overall FxA request volume from Firefox 74 has been reduced by about half, which is really quite significant!
ni? :markh to consider whether we can uplift this etc.
Comment 7•5 years ago
|
||
A quick comment on possible UX affects of this change.
In testing the Fennec->Fenix Migration :grisha ran into some unexpected behaviour with send-tab. After migrating from Fennec to Fenix, he was unable to send tabs to the new Fenix device:
- It showed up in the device list on desktop as "Firefox Sync" rather than with the custom device name
- When sending a tab to it from Desktop, the browser console revealed that Desktop was falling back to sending via old-send-tab in sync
- On the Fenix device, it was finding the "displayURI" command in its sync client record, logging an error about an unrecognized command and ignoring it.
The resulting user experience was essentially "send-tab stopped working after updating from Fennec to Fenix".
We convinced ourselves that Fenix wasn't doing anything wrong here, and this was just an artifact of Desktop not refreshing its device list. Restarting the desktop browser fixed the problem. I believe the sequence of events to have been as follows:
- Fenix migrates FxA state from Fennec and mints itself some new auth tokens.
- The FxA server sees a new client that is capable of accessing sync, and creates a new placeholder device record for it, sending a push notification to other connected devices.
- For reasons, it gives the device a default name of "Firefox Sync" (ref Bug 1602006).
- Desktop receives the push notification, shows a popup about "This device is now syncing with Firefox Sync", and refreshes its device list.
- Fenix completes its migration process and uploads a full device record, including custom device name and the metadata needed to receive tabs via new-send-tab.
- This does not trigger any additional notifications or behaviour from other devices.
- The user tries to send a tab from Desktop to Fenix.
- As far as Desktop knows, the target device is still called "Firefox Sync" and is not capable of receiving tabs via new-send-tab so it tries to send via old-send tab.
- Fenix syncs, sees a sent tab in its sync client data, but doesn't know what to do with it because it doesn't support old-send-tab. So it logs an error and continues on its way.
I think this is edge-casey enough that we don't need to roll back the patch in this bug, but it's something to keep in mind. It might also contribute to experiences like Bug 1601446 in cases where the user connects many devices in quick succession.
The right long-term solution here seems like a "device list changed" push notification of some kind, so that browsers can get a fresh list in a timely manner without having to poll for it.
Assignee | ||
Comment 8•5 years ago
|
||
Oops - sorry I missed this comment
(In reply to Ryan Kelly [:rfkelly] from comment #7)
A quick comment on possible UX affects of this change.
That's bad - but I don't understand how this patch made that any worse.
Note also that desktop does refresh its device list as the list of devices is shown - so the worst-case scenario there should be that the device list is incomplete for a second or 2, then updates with the result of the refresh. Further, this refresh bypasses the "don't refresh for a minute" throttling mechanism - a request is always made even if the previous was just seconds ago.
So I really can't see how this patch explains that behaviour given that device list would have been shown many times during this scenario, and that given we sync every 10 minutes, there's a good possibility that a sync might not have happened while experiencing this, in which case the behaviour before and after this patch should be identical.
The right long-term solution here seems like a "device list changed" push notification of some kind, so that browsers can get a fresh list in a timely manner without having to poll for it.
Right - there's a "device added" notification - https://searchfox.org/mozilla-central/source/services/fxaccounts/FxAccountsPush.jsm#196 - it's probably OK to re-send this when the device changes?
If this patch did make things worse, ISTM we probably don't want an uplift request. WDYT?
Assignee | ||
Comment 9•5 years ago
|
||
I also meant to say:
(In reply to Ryan Kelly [:rfkelly] from comment #7)
- Fenix syncs, sees a sent tab in its sync client data, but doesn't know what to do with it because it doesn't support old-send-tab. So it logs an error and continues on its way.
Yeah, this sucks and is something we really should address. I opened bug 1611058
Comment 10•5 years ago
|
||
We convinced ourselves that Fenix wasn't doing anything wrong here, and this was just an artifact of Desktop not refreshing its device list.
That's bad - but I don't understand how this patch made that any worse.
Indeed, perhaps it wasn't the result of this patch in particular, just an adjacent issue with having a fresh device list.
As a quick test, I changed the name of my Fenix device and them went to send it a tab in Desktop. In the devices list from the account menu I saw a brief flash of the old device name, followed quickly by it changing to the new name.
Interestingly, when I tried the same thing using right-click -> send page to device, Desktop does not appear to update its device list automatically, and I saw the old device name in the list.
So it sounds like what we were seeing here may have been a case of desktop failing to do the update-when-showing-the-device-list behaviour in some cases, rather than fallout from this bug in particular.
If this patch did make things worse, ISTM we probably don't want an uplift request. WDYT?
Given the above, I'm not too concerned about this making an edge-case like the above even edge-casier, and think uplift would still be fine.
Comment 11•5 years ago
|
||
From slack discussion: I'm mainly interested in having this in Release ahead of upcoming Fenix migration, to buy a bit of headroom on the FxA servers. The current nightly is scheduled to go to beta on 2020-02-10 and to release on 2020-03-10, which is comfortably ahead of any potential date for the full Fenix migration.
So, given the above discussions about possible edge-cases, I think we can afford to let this ride the trains.
Assignee | ||
Comment 12•5 years ago
|
||
Comment on attachment 9119307 [details]
Bug 1604699 - avoid refreshing the FxA device list every sync.
Beta/Release Uplift Approval Request
- User impact if declined: None - this is to mitigate the load on the Firefox Accounts servers in anticipation of additional load being caused by the Fennec -> Fenix migration
- Is this code covered by automated tests?: Yes
- Has the fix been verified in Nightly?: Yes
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: None
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): Risk is limited to the list of devices shown in the "Send Tab to Device" UI. There is a minor regression in edge-cases around this UI (specifically, for brand-new devices connected to the account, and when send-tab via the context-menu is used), but we are comfortable with this tradeoff (ie, would rather live with this regression than have the FxA servers go down due to the Fenix migration). See comment 10 and comment 11 for more.
- String changes made/needed: None
Comment 13•5 years ago
|
||
I'm confused, comment 11 says this can just ride the trains?
Assignee | ||
Comment 14•5 years ago
|
||
Comment on attachment 9119307 [details]
Bug 1604699 - avoid refreshing the FxA device list every sync.
Sorry, my mistake, I misread that comment.
Description
•