Closed Bug 1289629 Opened 8 years ago Closed 6 years ago

Clients with GCM senderid 829133274407 need to disable WebPush (or use new SenderID)

Categories

(Firefox for Android Graveyard :: General, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: jrconlin, Unassigned)

Details

Due to https://bugzilla.mozilla.org/show_bug.cgi?id=1279306, older installs which used a now invalid GCM SenderID will no longer receive updates. These clients will continue to allow users to request new push endpoints, but these will never work.

Application needs to either change it's default SenderID to match the new SenderID for production, or must disable WebPush so as not to give a very poor user experience. 

Note: the server can report to the client a broken SenderID on subsequent REST endpoint registration calls, however that would be additional work as well.

For now, the server will mark any UAID record with a GCM delivery failure as invalid.
What's the action item here? As far as I can tell, Bug 1279306 is in 48+, and does indeed change the default pref.
Flags: needinfo?(jrconlin)
Normally, SenderIDs are static and do not change. Unfortunately, because of 1279306, we have a population of existing users which are using a now invalid SenderID. The SenderID value was set at application install time, and it is my understanding, that the only way to fix this is to completely remove then reinstall the application with the corrected SenderID. The normal update process will not resolve this. 

Because these applications have an incorrect GCM SenderID, they will not get push notifications. Unfortunately, the application will continue to offer Push notifications to the user as if it could. I am updating the Push Server to return an error when it has detected that the UAID associated with the application is unable to receive push notifications, however it would be best if the application were to check the value stored in it's own manifest, and if it does not match the currently valid SenderID, it were to not allow Push notifications to be created at all. 

That, or if there is a way for the application to reliably change it's SenderID, it should do so.
Flags: needinfo?(jrconlin)
Over to Sebastian to dig deeper.
Flags: needinfo?(s.kaspari)
IIUC, the consequences of not fixing this are that pre-release versions of Firefox that were installed prior to the landing of bug 1279306 will not receive Push Notifications?
Flags: needinfo?(jrconlin)
ckarlof: yes.
Flags: needinfo?(jrconlin)
(In reply to JR Conlin [:jrconlin,:jconlin] from comment #2)
> The SenderID value was set at application install time,
> and it is my understanding, that the only way to fix this is to completely
> remove then reinstall the application with the corrected SenderID. The
> normal update process will not resolve this. 

I don't understand this. The sender ID is used to register for GCM and receive a GCM token.
This token is passed to a server in order to allow it to send GCM push messages.

The sender ID is a constant in our app and it will change with an app update. However we might
still have a GCM token for the old registration saved and we would need to re-register and send
the new GCM token to our server. Is this actually the problem here?

I need to read more of our GCM code: Do we ever re-register or expire tokens ever? In the past the
GCM docs stated that you should refresh tokens regularly (e.g. after app update). But this is
now only mentioned in the deprecated legacy documentation[1].

According to the new docs [2] the GCM service now initiates refreshs periodically. And it looks
like we handle this [3]. This refresh should use the new sender ID after an app update. However
this is not in our control and we do not know when this will be triggered.

We could look into triggering a token refresh via app updates / switchboard but until this rides
the trains the token refresh might already have happened anyways (and this is a one-time thing to
fix here).
Or we could save the GCM sender ID together with the token and compare: If we have a new/different
sender ID now then do not re-use the existing GCM token: Register with the new sender id.

Does this make sense or am I missing something?

[1] https://developers.google.com/cloud-messaging/android/legacy-regid
[2] https://developers.google.com/instance-id/guides/android-implementation#refresh_tokens
[3] https://dxr.mozilla.org/mozilla-central/source/mobile/android/base/java/org/mozilla/gecko/gcm/GcmInstanceIDListenerService.java#27
Flags: needinfo?(s.kaspari)
JR, can you please respond to Sebastian's questions?
Flags: needinfo?(jrconlin)
I believe there may be some confusion based on vague definitions. 
There are two components to the GCM system, the "SenderID" (which is a public value identifying all instances of a given application across android), and a GCM "Token" (which is specific to a given instance of an app associated with a user). The "SenderID" is used to fetch the "Token". 

As I understand the function: https://dxr.mozilla.org/mozilla-central/source/mobile/android/base/java/org/mozilla/gecko/gcm/GcmInstanceIDListenerService.java#19, this is triggered when the GCM "Token" changes. I understand that the GCM token can change for any reason. 

The SenderID normally does not change. In our case, the SenderID has changed, and this is the root of the problem. 

I believe that the registration occurs at https://dxr.mozilla.org/mozilla-central/source/mobile/android/base/java/org/mozilla/gecko/gcm/GcmTokenClient.java#87 I don't know if the caching that is mentioned is something that could be cleared before the request if the SenderID has changed, but that seems like a likely path. I know from personal experience that updating nightly (which had the new ID) did not resolve the problem, however uninstalling, and then reinstalling nightly did.
Flags: needinfo?(jrconlin)
(In reply to JR Conlin [:jrconlin,:jconlin] from comment #8)
> I know from personal experience that updating nightly (which had the new ID) did not
> resolve the problem, however uninstalling, and then reinstalling nightly did.

The local cashing of the token should be the reason for that. I don't see us caching/storing the sender ID. In your test case clearing the app data might be enough: The cached token is gone and we use the new sender ID to receive a new token.

For the users with a token for the old sender ID: We could either wait for the refresh (unpredictable) or uplift a patch that basically forces a refresh (and stores the sender ID like mentioned above). Looking at bug 1279306 it seems like we shipped the wrong sender ID up to beta. Uplifting such a patch would refresh the tokens of all those users - is this a problem for us?
(In reply to Sebastian Kaspari (:sebastian) from comment #9)
> (In reply to JR Conlin [:jrconlin,:jconlin] from comment #8)
> > I know from personal experience that updating nightly (which had the new ID) did not
> > resolve the problem, however uninstalling, and then reinstalling nightly did.
> 
> The local cashing of the token should be the reason for that. I don't see us
> caching/storing the sender ID. In your test case clearing the app data might
> be enough: The cached token is gone and we use the new sender ID to receive
> a new token.
> 
> For the users with a token for the old sender ID: We could either wait for
> the refresh (unpredictable) or uplift a patch that basically forces a
> refresh (and stores the sender ID like mentioned above). Looking at bug
> 1279306 it seems like we shipped the wrong sender ID up to beta. Uplifting
> such a patch would refresh the tokens of all those users - is this a problem
> for us?

Folks, I am late to this party, but reading this and https://bugzilla.mozilla.org/show_bug.cgi?id=1279306 I can't understand why we require a new sender ID.  That should never need to change (assuming Mozilla never loses control of the sender ID); in fact, sender ID is now called something like "application ID" in other parts of Google's system, partially to reflect the fact that it's static.  Certainly changing the URL of the Push server may require changing the sender ID; but it just as well "requires" the new Push server to have credentials for the old sender ID.  Why is that not the solution here?

As for refreshing tokens, etc: the existing code tries to keep the UAID registration fresh, in part to allow ourselves to dig clients out of holes like this one.  If you start digging at https://dxr.mozilla.org/mozilla-central/source/mobile/android/base/java/org/mozilla/gecko/push/PushManager.java#33, you'll find that we eventually handle errors bluntly around https://dxr.mozilla.org/mozilla-central/source/mobile/android/base/java/org/mozilla/gecko/push/PushManager.java#332.  If the autopush endpoint returns a non-transient error (see https://dxr.mozilla.org/mozilla-central/source/mobile/android/services/src/main/java/org/mozilla/gecko/push/autopush/AutopushClientException.java#69), we will remove the UAID registration entirely.  This is our out.

Of course, we need to support the Sender ID we have in the wild -- which returns to my initial question.  Why are we not doing that?
Flags: needinfo?(jrconlin)
Kit could you perhaps help shed some light on why the sender ID was changed (based on https://bugzilla.mozilla.org/show_bug.cgi?id=1279306#c2)?
Flags: needinfo?(kcambridge)
My recollection from #push is the old sender ID only worked for stage, and we needed a separate one for production. https://updates-autopush.stage.mozaws.net/v1/gcm/829133274407/... worked; https://updates.push.services.mozilla.com/v1/gcm/829133274407/... returned 301s (which indicated the ID was invalid).
Flags: needinfo?(kcambridge)
Added a patch to the server to allow multiple senderIDs via https://github.com/mozilla-services/autopush/pull/577

This will be on the train for the next deployment.
Flags: needinfo?(jrconlin)
Maybe we can close this now?
Flags: needinfo?(jrconlin)
Yes.
Status: NEW → RESOLVED
Closed: 6 years ago
Flags: needinfo?(jrconlin)
Resolution: --- → FIXED
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in before you can comment on or make changes to this bug.