Closed Bug 1443115 Opened 7 years ago Closed 7 years ago

Desktop ignores its own pushEndpointExpired flag

Categories

(Firefox :: Firefox Accounts, defect, P1)

defect

Tracking

()

RESOLVED FIXED
Firefox 61
Tracking Status
firefox61 --- fixed

People

(Reporter: markh, Assigned: eoger)

References

Details

(Whiteboard: [fxsync])

Attachments

(1 file)

via bug 1359279 and https://github.com/mozilla/fxa-auth-server/issues/1873, FxA records have a pushEndpointExpired record. It seems possible desktop could find itself in this state but it makes no attempt to handle it.
Phil, is there an easy way to find out if there are many desktop device records with the pushEndpointExpired flag set?
Flags: needinfo?(pbooth)
> Phil, is there an easy way to find out if there are many desktop device records with the pushEndpointExpired flag set? Apologies for the late reply to this, I just got back from hols. Either :jrgm or :jbuck should be able to query the MySQL read replica for it... :jbuck or :jrgm, could one of you find the result of the the following query for us in prod? SELECT COUNT(*) FROM devices WHERE callbackIsExpired = 1;
Flags: needinfo?(pbooth)
Flags: needinfo?(jrgm)
Flags: needinfo?(jbuckley)
MySQL [fxa]> SELECT COUNT(*) FROM devices WHERE callbackIsExpired = 1; +----------+ | COUNT(*) | +----------+ | 4457721 | +----------+ 1 row in set (27 min 56.77 sec)
Flags: needinfo?(jrgm)
Flags: needinfo?(jbuckley)
(In reply to Jon Buckley [:jbuck] from comment #3) > MySQL [fxa]> SELECT COUNT(*) FROM devices WHERE callbackIsExpired = 1; > +----------+ > | COUNT(*) | > +----------+ > | 4457721 | > +----------+ Given Android is supposed to notice this and reregister, I guess we can assume many of these are probably desktop devices. However, it seems a little strange to me that desktop's FxA integration should need to handle this - presumably it will be an issue for other users of push in the browser, and AFAIK, our FxA code correctly listens to subscription-changed events. IOW, for desktop devices to be in this state, it seems to mean that FxA failed to send a push, but the desktop push service never told us about a subsequent subscription change. Kit, does that sound correct, and is this something we should try and dig deeper on, or should we just try and re-force the subscription in FxA when we notice the flag?
Flags: needinfo?(kit)
I guess another possibility is that we don't handle the subscription-changed notification robustly enough - eg, if we were unlucky enough to get that notification as we were shutting down or just as the network vanished I doubt we'd do the right thing later.
Ouch, that seems remarkably high to be fully explained by coincidental shutdown or going offline, but it's possible! We typically fire `push-subscription-change` events on startup, or on reconnect after waking from sleep; quota and permissions don't apply to FxA. Both are fairly busy, and it's possible even something as simple as opening and closing your laptop lid is enough to interrupt us as we're resubscribing. *hand wave* I wonder if we should try setting a pref, and unsetting it once we've successfully resubscribed and told FxA about the new endpoint. That might give us a better indication of whether we're interrupted, or aren't told about it at all. Folks have pointed out this is an issue for web content, too, in https://github.com/w3c/push-api/issues/292. So maybe a better long-term strategy would be to teach the push service about refreshing subscriptions, and then both the (hypothetical) DOM API and chrome code could use that.
Flags: needinfo?(kit)
> I guess we can assume many of these are probably desktop devices. Let's find out for certain. :jbuck, some follow-up queries to run here when you have a moment... 1. Group the counts by device type (if type is NULL here it means desktop btw): SELECT type, COUNT(*) AS count FROM devices WHERE callbackIsExpired = 1 GROUP BY 1 ORDER BY 2 DESC; 2. A long time ago it used to be possible for devices to get into a zombie state where they were left without an associated session token. We fixed that, but maybe there's still some hanging around so let's rule them out: SELECT COUNT(*) FROM devices d INNER JOIN sessionTokens t ON d.sessionTokenId = t.tokenId WHERE d.callbackIsExpired = 1; 3. Not sure if it's helpful, but we can also group counts by the OS that we parsed from the user-agent string: SELECT t.uaOS AS os, COUNT(*) AS count FROM devices d INNER JOIN sessionTokens t ON d.sessionTokenId = t.tokenId WHERE d.callbackIsExpired = 1 GROUP BY 1 ORDER BY 2 DESC;
Flags: needinfo?(jbuckley)
MySQL [fxa]> SELECT type, COUNT(*) AS count -> FROM devices -> WHERE callbackIsExpired = 1 -> GROUP BY 1 -> ORDER BY 2 DESC; +---------+---------+ | type | count | +---------+---------+ | desktop | 4077154 | | mobile | 434926 | | NULL | 568 | +---------+---------+ 3 rows in set (27 min 30.72 sec) MySQL [fxa]> SELECT COUNT(*) -> FROM devices d INNER JOIN sessionTokens t -> ON d.sessionTokenId = t.tokenId -> WHERE d.callbackIsExpired = 1; +----------+ | COUNT(*) | +----------+ | 4513921 | +----------+ 1 row in set (38 min 40.35 sec) MySQL [fxa]> SELECT t.uaOS AS os, COUNT(*) AS count -> FROM devices d INNER JOIN sessionTokens t -> ON d.sessionTokenId = t.tokenId -> WHERE d.callbackIsExpired = 1 -> GROUP BY 1 -> ORDER BY 2 DESC; +----------------+---------+ | os | count | +----------------+---------+ | Windows 10 | 1895664 | | Windows 7 | 945812 | | Android | 434941 | | Ubuntu | 404033 | | Mac OS X | 205313 | | Linux | 201418 | | Windows 8.1 | 194675 | | Windows XP | 109558 | | Windows Vista | 37703 | | Fedora | 35790 | | Windows 8 | 32890 | | Windows | 16308 | | FreeBSD | 2040 | | iOS | 975 | | NULL | 624 | | OpenBSD | 451 | | Windows 2000 | 151 | | Windows NT 4.0 | 117 | | NetBSD | 90 | | Solaris | 58 | | Firefox OS | 14 | | Windows 95 | 11 | | Chrome OS | 10 | | Symbian OS | 7 | | Windows Phone | 5 | | Windows 98 | 2 | | Tizen | 2 | | Kubuntu | 1 | | BlackBerry OS | 1 | | openSUSE | 1 | | Windows CE | 1 | | Red Hat | 1 | | Debian | 1 | | Slackware | 1 | +----------------+---------+ 34 rows in set (29 min 24.51 sec)
Flags: needinfo?(jbuckley)
Whiteboard: [fxsync]
Priority: -- → P2
Priority: P2 → P1
Assignee: nobody → eoger
Comment on attachment 8966715 [details] Bug 1443115 - Handle FxA pushEndpointExpired flag. https://reviewboard.mozilla.org/r/235420/#review241628 Looks great, thanks
Attachment #8966715 - Flags: review?(markh) → review+
Pushed by eoger@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/425022df8add Handle FxA pushEndpointExpired flag. r=markh
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 61
Blocks: 1453855
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: