Closed Bug 1236130 Opened 4 years ago Closed 4 years ago

Network change notification issue causes a caching problem, (some) sites do not update anymore

Categories

(Firefox for Android :: General, defect)

46 Branch
Unspecified
Android
defect
Not set

Tracking

()

RESOLVED FIXED
Firefox 49
Tracking Status
firefox45 --- disabled
firefox46 + disabled
firefox47 + disabled
firefox49 --- fixed
fennec 47+ ---

People

(Reporter: jan.manthay, Assigned: Grisha)

References

Details

Attachments

(2 files)

On my Asus Memopad HD7 with Android 4.2.2 Firefox 46 (Nightly) has a cache problem for a few days now.

Sites do not update anymore, and always show the same content, even after a forced reload.
After a forced quit of the app then the new content is loaded.

Not all sites are affected, and it looks like not all affected sites are affected always. But I am not yet 100% sure on this.
Time ist too short so far to find a systematic.

Sites that were affectet:
tagesschau.de
spiegel.de
golem.de
heise.de
wordpress site
blogger sites

Can’t say exactly since when this happens, like about a week.
We landed code (bug 935190) to monitor the network connection, and load a cached version of the page if you have no network connection.

If this happens, you should see a "snackbar" UI across the bottom of the page telling you the page is being loaded from the cache.

You can disable this by setting "browser.tabs.useCache" to false. See if that affects the issue.
Also note, we changed the "reload" action to not be a forced reload anymore. You need to long press on the "reload" button to get a forced reload.
OK, this still happens, sites are not updated. There is no information that the chache is in use. An internet connection is active.
A long press on the reload button loads the actual content.

For my tablet, it is a wifi only version.
When the tablet is in standby for a longer time, wifi is deactivatet to save battery. After waking up the device, wifi is activated again.
So maybe Fx has problems to realise that, and still thinks there is no connection.


"You can disable this by setting "browser.tabs.useCache" to false. See if that affects the issue."
I will try this now.
tracking-fennec: --- → ?
(In reply to Jan Manthay from comment #3)

> "You can disable this by setting "browser.tabs.useCache" to false. See if
> that affects the issue."
> I will try this now.

Have you managed to try this yet?
tracking-fennec: ? → 46+
Flags: needinfo?(jan.manthay)
Sorry for the lack of updates on this case, but nighty has a crash problem on my side at the moment, which makes it difficult to check out this bug here.

But I didn’t see this problem for a few days now, so maybe something fixed it.

I did set "browser.tabs.useCache" to false, and indeed this solved the problem. After setting it to true again, the problem came back, but one or two days later I was not able to see it any more.

So at the moment this looks like WORKSFORME, but of course I will still have a look.
Flags: needinfo?(jan.manthay)
Blocks: 1236616
Thanks for the report. I'll close this for now, but please let us know if you see problems again.
Status: UNCONFIRMED → RESOLVED
Closed: 4 years ago
Resolution: --- → WORKSFORME
I'm reopening this because I see the same problem. Before reading this bug, I had a similar theory as comment 3, namely that Firefox restores the session faster than the Android device (Nexus 7, Nexus 9) can reestablish the Wifi connection.
Status: RESOLVED → REOPENED
Ever confirmed: true
Resolution: WORKSFORME → ---
Yes, sorry to say it still is a problem. Its not always easy to see on first sight.
[Tracking Requested - why for this release]: unacceptable behavior

Had this happen to me. Looked into about:cache for the page. Expire date was 1969-12-31 16:00 in Pacific time. Which is 1970-01-01 00:00 UTC. So it looks like cache entries with a Unix time of 0 are not being invalidated (or maybe the negative Unix time due to timezones is the problem).
Status: REOPENED → NEW
The only workaround for this problem is the user to use the 'Clear private data' option in the settings with just the 'Cache' entry checked.
Clarifying after discussion in #mobile: long-pressing the refresh button does fix the problem described in comment 7. So forcing refresh to be a full refresh after we go from offline->online mode would be a fix (though I guess harder to implement than this description).

(In reply to Kevin Brosnan [:kbrosnan] from comment #9)
> [Tracking Requested - why for this release]: unacceptable behavior
> 
> Had this happen to me. Looked into about:cache for the page. Expire date was
> 1969-12-31 16:00 in Pacific time. Which is 1970-01-01 00:00 UTC. So it looks
> like cache entries with a Unix time of 0 are not being invalidated (or maybe
> the negative Unix time due to timezones is the problem).

Random thought: I hope this isn't an issue with the Android cache filesystem not doing modification times properly.
Tracking since this sounds like a user-facing feature. May be a recent regression. 
gcp or margaret, can you take a closer look, or help find an owner for this bug?
Does it affect beta 45?
Just want to say that my smartphone, a MotoG with Android 5.1 is also affected. I never shut down Wifi on this device, and it is always connected to the mobile net.
And still, sometimes pages are not updated/refreshed. There is also no notification that the cache is in use in those cases.
(But I saw this notification a single time)
(In reply to Jan Manthay from comment #13)
> Just want to say that my smartphone, a MotoG with Android 5.1 is also
> affected. I never shut down Wifi on this device, and it is always connected
> to the mobile net.
> And still, sometimes pages are not updated/refreshed. There is also no
> notification that the cache is in use in those cases.
> (But I saw this notification a single time)

Can you run with "browser.tabs.useCache" = false for a few days and see if that makes the issue go away? Or are you still running with "browser.tabs.useCache" = false and still seeing the issue?

GCP - Can you flip "browser.tabs.useCache" to false as well and report back?
Flags: needinfo?(gpascutto)
(In reply to Mark Finkle (:mfinkle) from comment #14)
> GCP - Can you flip "browser.tabs.useCache" to false as well and report back?

Seems to "fix" it so far.
Flags: needinfo?(gpascutto)
mfinkle, what do you want to do about this? Right now we're set to ship this feature in 46, but we should make sure we understand what's going on here.
Flags: needinfo?(mark.finkle)
Given we don't know what's happening yet, we should RELEASE_BUILD the pref to false
Flags: needinfo?(mark.finkle)
Depends on: 1253598
(In reply to Mark Finkle (:mfinkle) from comment #17)
> Given we don't know what's happening yet, we should RELEASE_BUILD the pref
> to false

I filed bug 1253598 about this.

Let's come up with an action plan for how we can figure out what's going wrong here. Is there some debug logging we can add for more useful bug reports?
snorp, can you help us figure this out? See comment 9.
Flags: needinfo?(snorp)
We should let the network guys take a crack at it first before I start flailing around. Nick, maybe you know who the best person to look at it? This is blocking useful offline-mode stuff in Fennec.
Flags: needinfo?(snorp) → needinfo?(hurley)
Honza and Michal are the guys to go to for cache issues
Flags: needinfo?(michal.novotny)
Flags: needinfo?(hurley)
Flags: needinfo?(honzab.moz)
Run the patch for bug 935190 through :bagder's review.  I think what is wrong here is the result (of not always 100% reliable) network link service.  :bagder is the current maintainer of that code.

Anyway, next time when you play with load flags and network APIs it may be good to give such a patch to a necko peer for at least a feedback.  May save us all some headaches.
Flags: needinfo?(michal.novotny)
Flags: needinfo?(mark.finkle)
Flags: needinfo?(honzab.moz)
(In reply to Honza Bambas (:mayhemer) from comment #22)
> Run the patch for bug 935190 through :bagder's review.  I think what is
> wrong here is the result (of not always 100% reliable) network link service.
> :bagder is the current maintainer of that code.

Thanks for the info.

bagder, can you help us determine if we're doing something wrong in this patch that landed in bug 935190?

https://hg.mozilla.org/mozilla-central/rev/2aad6036a8e5

It sounds like we're not receiving a "network:link-status-changed" notification when we think we should be.

FYI, I just landed a patch to put this feature behind a switchboard flag in bug 1253598, so we don't need to worry about this accidentally going to release. We should figure out the right way to do this before shipping it.
Assignee: nobody → margaret.leibovic
Flags: needinfo?(mark.finkle) → needinfo?(daniel)
(In reply to :Margaret Leibovic from comment #23)
> (In reply to Honza Bambas (:mayhemer) from comment #22)
> > Run the patch for bug 935190 through :bagder's review.  I think what is
> > wrong here is the result (of not always 100% reliable) network link service.
> > :bagder is the current maintainer of that code.
> 
> Thanks for the info.
> 
> bagder, can you help us determine if we're doing something wrong in this
> patch that landed in bug 935190?
> 
> https://hg.mozilla.org/mozilla-central/rev/2aad6036a8e5
> 
> It sounds like we're not receiving a "network:link-status-changed"
> notification when we think we should be.

We send that event in response to connectivity change intents from Android[0]. I think that it's possible we are getting a string containing 'unknown', which would cause us to use the cache according to the change you linked to. IMHO, we should probably only force the cache if we know for sure it's "down". Maybe Finkle saw that "unknown" was commonly "down", though?

[0] https://dxr.mozilla.org/mozilla-central/source/mobile/android/base/java/org/mozilla/gecko/GeckoConnectivityReceiver.java#85
Flags: needinfo?(mark.finkle)
Assignee: margaret.leibovic → gkruglov
Flags: needinfo?(daniel)
See Also: → 1258428
Hi Margaret, I am just trying to follow up on Fx47 tracked bugs and this one also affects 46. Is this something we have a potential fix for? Thanks!
Flags: needinfo?(margaret.leibovic)
It is disabled by switchboard, in bug 1253598 though I am not clear on if that solves it for the 47 release as well.
(In reply to Kevin Brosnan [:kbrosnan] from comment #26)
> It is disabled by switchboard, in bug 1253598 though I am not clear on if
> that solves it for the 47 release as well.

This feature is held on beta until we decide to let it ride to release, so we don't need to feel rushed to fix this before a certain release.
Flags: needinfo?(margaret.leibovic)
(In reply to James Willcox (:snorp) (jwillcox@mozilla.com) from comment #24)
> (In reply to :Margaret Leibovic from comment #23)
> > (In reply to Honza Bambas (:mayhemer) from comment #22)
> > > Run the patch for bug 935190 through :bagder's review.  I think what is
> > > wrong here is the result (of not always 100% reliable) network link service.
> > > :bagder is the current maintainer of that code.
> > 
> > Thanks for the info.
> > 
> > bagder, can you help us determine if we're doing something wrong in this
> > patch that landed in bug 935190?
> > 
> > https://hg.mozilla.org/mozilla-central/rev/2aad6036a8e5
> > 
> > It sounds like we're not receiving a "network:link-status-changed"
> > notification when we think we should be.
> 
> We send that event in response to connectivity change intents from
> Android[0]. I think that it's possible we are getting a string containing
> 'unknown', which would cause us to use the cache according to the change you
> linked to. IMHO, we should probably only force the cache if we know for sure
> it's "down". Maybe Finkle saw that "unknown" was commonly "down", though?
> 
> [0]
> https://dxr.mozilla.org/mozilla-central/source/mobile/android/base/java/org/
> mozilla/gecko/GeckoConnectivityReceiver.java#85

I only look for "down", "up", or "unknown" strings [1], otherwise, I ignore the notification. I treat "down" and "unknown" as "offline". It would be a good idea to do more logging of the explicit states to see if we are missing a transition change.

[1] http://mxr.mozilla.org/mozilla-central/source/mobile/android/chrome/content/browser.js#7209
Flags: needinfo?(mark.finkle)
Grisha - I can help answer questions and walk you through what I know. Ping me sometime.
A brief look and poking at it in webide, I see one quick problem:
- One of the notifications we receive is "changed", which is currently ignored. From [1], "changed" is: "isLinkUp is still true, but the network setup is modified". So say if we were in "useCache=true" mode, and "changed" came in without a follow up "up" event (this might be possible?), we'd get erroneously stuck in the cache mode.

So at the very least we should treat "changed" properly, which is a tiny fix, and I'll keep trying to debug it further.

[1] https://dxr.mozilla.org/mozilla-central/source/netwerk/base/nsINetworkLinkService.idl#68
> "changed" came in without a follow up "up" event (this might be possible?)

Yes, it should be possible. If the network is up and running and we detect a network change, it will inform about this with a "changed" but it will not change the up/down state of the link.
Summary: Nightly 46 has a cache problem, (some) sites do not update anymore → Network change notification issue causes a caching problem, (some) sites do not update anymore
(In reply to Daniel Stenberg [:bagder] from comment #31)
> > "changed" came in without a follow up "up" event (this might be possible?)
> 
> Yes, it should be possible. If the network is up and running and we detect a
> network change, it will inform about this with a "changed" but it will not
> change the up/down state of the link.

This in itself will only be a problem if either:
1. network was down, and a "changed" event was sent to indicate that is back up now without a follow up "up".
2. fennec's tabs code thought that network was down (which would be a problem in itself) and was in a "use cache" mode, and a "changed" event was sent without an "up".

Can you confirm that (1) shouldn't happen? It seems like it shouldn't. I'm looking to see if (2) is possible.
Comment 30 sounds like what I am seeing with bug 1258428.
(In reply to :Grisha Kruglov from comment #32)

> This in itself will only be a problem if either:
> 1. network was down, and a "changed" event was sent to indicate that is back
> up now without a follow up "up".

The point is that "changed" doesn't imply "up" nor "down". It just implies that the network changed and it remains in the same up-down state as before. I don't think we have anything that forbids a changed to happen while we're in "down" state but I doubt it actually ever happen.
tracking-fennec: 46+ → 47+
Duplicate of this bug: 1256020
As it's getting late in beta now, is this something we should disable for 46 before release? Or should it not block offline caching?
Flags: needinfo?(margaret.leibovic)
(In reply to Liz Henry (:lizzard) (needinfo? me) from comment #36)
> As it's getting late in beta now, is this something we should disable for 46
> before release? Or should it not block offline caching?

It is controlled by switchboard, so it's automatically held on beta. It will not be enabled on release when 46 goes to release.
Flags: needinfo?(margaret.leibovic)
Marking as disabled for 46 since it's currently built in to be disabled once 46 moves to release.
- specifying states, events and transition side-effects explicitely makes this code easier to read/maintain
- move bunch of network state helper methods into NetworkUtils
- ensure to update both network state (up/down/unknown), as well as connection type/subtype every time we need to update network state
-- this should fix the buggy behaviour when we'd miss certain network state transitions
- tests for the FSM transition matrix, and everything in the NetworkUtils

Review commit: https://reviewboard.mozilla.org/r/49571/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/49571/
Attachment #8746792 - Flags: review?(mark.finkle)
Attachment #8746793 - Flags: review?(mark.finkle)
Comment on attachment 8746792 [details]
MozReview Request: Bug 1236130 - Part 1: Use an explicit state machine to control GeckoNetworkManager r=mcomella

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/49571/diff/1-2/
Comment on attachment 8746793 [details]
MozReview Request: Bug 1236130 - Part 2: Only use cache for tabs when network state is "down" r=mcomella

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/49573/diff/1-2/
Maybe Sebastian would also be a better reviewer for this.
Flags: needinfo?(s.kaspari)
Comment on attachment 8746792 [details]
MozReview Request: Bug 1236130 - Part 1: Use an explicit state machine to control GeckoNetworkManager r=mcomella

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/49571/diff/2-3/
Attachment #8746792 - Attachment description: MozReview Request: Bug 1236130 - Part 1: Use an explicit state machine to control GeckoNetworkManager r=mfinkle → MozReview Request: Bug 1236130 - Part 1: Use an explicit state machine to control GeckoNetworkManager r=mcomella
Attachment #8746793 - Attachment description: MozReview Request: Bug 1236130 - Part 2: Only use cache for tabs when network state is "down" r=mfinkle → MozReview Request: Bug 1236130 - Part 2: Only use cache for tabs when network state is "down" r=mcomella
Attachment #8746792 - Flags: review?(mark.finkle) → review?(michael.l.comella)
Attachment #8746793 - Flags: review?(mark.finkle) → review?(michael.l.comella)
Comment on attachment 8746793 [details]
MozReview Request: Bug 1236130 - Part 2: Only use cache for tabs when network state is "down" r=mcomella

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/49573/diff/2-3/
Mike volunteered to review this.
Flags: needinfo?(s.kaspari)
Attachment #8746792 - Flags: review?(michael.l.comella) → review+
Comment on attachment 8746792 [details]
MozReview Request: Bug 1236130 - Part 1: Use an explicit state machine to control GeckoNetworkManager r=mcomella

https://reviewboard.mozilla.org/r/49571/#review47363

Nice tests!

The code changed too much for me to verify the behavior is the same as it was before, but in isolation, this looks good. You may want to ask jchen or someone else on platform for a quick skim to ensure the functionality looks the same.

::: mobile/android/base/java/org/mozilla/gecko/GeckoNetworkManager.java:29
(Diff revision 3)
>  /*
> - * A part of the work of GeckoNetworkManager is to give an general connection
> + * Provides connection type, subtype and general network status (up/down).

`/**` starts a javadoc block. This is important because if we try to generate the docs, a `/*` comment will be ignored.

::: mobile/android/base/java/org/mozilla/gecko/GeckoNetworkManager.java:40
(Diff revision 3)
>   */
>  
>  public class GeckoNetworkManager extends BroadcastReceiver implements NativeEventListener {

nit: rm newline

::: mobile/android/base/java/org/mozilla/gecko/GeckoNetworkManager.java:43
(Diff revision 3)
> -    private static final String LINK_TYPE_WIFI = "wifi";
> -    private static final String LINK_TYPE_WIMAX = "wimax";
> -    private static final String LINK_TYPE_2G = "2g";
> -    private static final String LINK_TYPE_3G = "3g";
> -    private static final String LINK_TYPE_4G = "4g";
>      private static final String LOGTAG = "GeckoNetworkManager";

nit: We've been using `GeckoNetworkManager.class.getSimpleName()` (but to fair, I'm not clear that's a best practice).

::: mobile/android/base/java/org/mozilla/gecko/GeckoNetworkManager.java:71
(Diff revision 3)
> +    private ManagerState currentState = ManagerState.OffNoListeners;
> +    private ConnectionType currentConnectionType = ConnectionType.NONE;
> +    private NetworkStatus currentNetworkStatus = NetworkStatus.UNKNOWN;
> +    private ConnectionSubType currentConnectionSubtype = ConnectionSubType.UNKNOWN;

These should use hungarian notation because the file is already using hungarian notation.

*However*, since we're basically rewriting the file, we could remove the notation on `sInstance` and `mApplicationContext`, which I slightly prefer.

::: mobile/android/base/java/org/mozilla/gecko/GeckoNetworkManager.java:95
(Diff revision 3)
> -    // Information happened.
> -    private boolean mShouldNotify;
> -
> -    // The application context used for registering receivers, so
> -    // we can unregister them again later.
>      private volatile Context mApplicationContext;

nit: -> `applicationContext`

::: mobile/android/base/java/org/mozilla/gecko/GeckoNetworkManager.java:166
(Diff revision 3)
> +        switch (currentState) {
> +            case OffNoListeners:
> +                switch (event) {
> +                    case start:                 return ManagerState.OnNoListeners;
> +                    case enableNotifications:   return ManagerState.OffWithListeners;
> +                    default: return null;
> -        }
> +                }

It's fragile to use switch statements without explicit `break` statements (even after `return`) and comments like `// fallthrough`. It's a common source of errors when the code is being refactored (e.g. because someone forgets about fall-throughs).

I think you should either add explicit breaks or if/else.

::: mobile/android/base/java/org/mozilla/gecko/GeckoNetworkManager.java:174
(Diff revision 3)
> +                switch (event) {
> +                    case stop:                  return ManagerState.OffNoListeners;
> +                    case enableNotifications:   return ManagerState.OnWithListeners;
> +                    default: return null;

Instead of returning null, I think we can be more specific/explicit:
* if `currentState` doesn't match any of these states, we should throw
* We should consider returning the current state, rather than null, in the event that we don't intend to transition (and the performActionForNextState can early return if the state is the same).

::: mobile/android/base/java/org/mozilla/gecko/GeckoNetworkManager.java:206
(Diff revision 3)
> +     * @param currentState State which we are leaving
> +     * @param event Event which is causing us to leave the state
> +     */
> +    private void performActionsForStateEvent(ManagerState currentState, ManagerEvent event) {
> +        switch (currentState) {
> +            case OffNoListeners:

I'm concerned that the next-state branching and the perform transition branching kind of duplicate each other, but without overcomplicating this (e.g. using Objects to represent State, or writing a lot of extra Runnables for state transitions), I can't think of a better solution.

Worth mentioning I otherwise do really like the separation of getNextState & performing transition.

::: mobile/android/base/java/org/mozilla/gecko/GeckoNetworkManager.java:208
(Diff revision 3)
> +     */
> +    private void performActionsForStateEvent(ManagerState currentState, ManagerEvent event) {
> +        switch (currentState) {
> +            case OffNoListeners:
> -        switch (event) {
> +                switch (event) {
> -            case "Wifi:Enable": {
> +                    case start:

It's generally good to have a `// fallthrough` comment when you intend to fallthrough because it's often hard to tell if it's intentional or not

::: mobile/android/base/java/org/mozilla/gecko/GeckoNetworkManager.java:209
(Diff revision 3)
> -                final WifiManager mgr = (WifiManager) mApplicationContext.getSystemService(Context.WIFI_SERVICE);
> -
> +                    case enableNotifications:
> +                        updateNetworkStateAndConnectionType();

Why do we want to update the network state when enableNotifications is called?

::: mobile/android/base/java/org/mozilla/gecko/GeckoNetworkManager.java:212
(Diff revision 3)
>                  }
> +            case OnNoListeners:

You're missing `break`s in the outer case statement. The `switch` inside a `switch` is confusing – I'd recommend making one of them an if statement.

::: mobile/android/base/java/org/mozilla/gecko/GeckoNetworkManager.java:256
(Diff revision 3)
> +                Context.CONNECTIVITY_SERVICE);
> +        // Type/status getters below all have a defined behaviour for when connectivityManager == null
> +        if (connectivityManager == null) {
> +            Log.e(LOGTAG, "ConnectivityManager does not exist.");
> +        }
> +        Log.d(LOGTAG, "Old network state: " + currentNetworkStatus + ", " + currentConnectionType + ", " + currentConnectionSubtype);

Between this and the state transition logs, it sounds like we're going to get a lot of log spam on transitions – is there a way to reduce this?

::: mobile/android/base/java/org/mozilla/gecko/GeckoNetworkManager.java:277
(Diff revision 3)
> +                            wifiDhcpGatewayAddress(mApplicationContext),
> +                            currentConnectionSubtype.value
> +                    )
> +            );
>  
> -    private void stopListening() {
> +            GeckoAppShell.sendEventToGecko(GeckoEvent.createNetworkLinkChangeEvent(currentNetworkStatus.value));

Could we use `currentNetworkStatus.name()` instead? So we don't have to duplicate the name twice.

Alternatively, if that does something silly like MAKE_THE_NAME_ALL_CAPS, consider storing `name().toLowerCase()` or overriding the `name` function to return lower case.

Same for currentConnectionSubtype.

::: mobile/android/base/java/org/mozilla/gecko/util/NetworkUtils.java:28
(Diff revision 3)
> +        ETHERNET("ethernet"),
> +        WIFI("wifi"),
> +        WIMAX("wimax"),
> +        UNKNOWN("unknown");
> +
> +        public String value;

`final`

::: mobile/android/base/java/org/mozilla/gecko/util/NetworkUtils.java:51
(Diff revision 3)
> +    // TODO: ConnectionType enum as defined in the w3c spec is different - also contains wimax, mixed, unknown.
> +    // See: http://w3c.github.io/netinfo/#the-connectiontype-enum
> +    // Should this conform?

What is the resolution for this TODO? Is there a follow-up for this? If so, there should be a bug number.

::: mobile/android/base/java/org/mozilla/gecko/util/NetworkUtils.java:62
(Diff revision 3)
> +        ETHERNET(2),
> +        WIFI(3),
> +        OTHER(4),
> +        NONE(5);
> +
> +        public final int value;

nit: is there a more specific name we can use here? What's this value intended to be used for? It's probably because Gecko won't let us pass the Enum directly, right? So perhaps geckoValue?

::: mobile/android/base/java/org/mozilla/gecko/util/NetworkUtils.java:101
(Diff revision 3)
> -            return false;
> +            return ConnectionSubType.UNKNOWN;
> +        }
> +
> +        switch (networkInfo.getType()) {
> +            case ConnectivityManager.TYPE_ETHERNET:
> +                return ConnectionSubType.ETHERNET;

I prefer explicit breaks here and potentially comments like, `// fallthrough x5`

::: mobile/android/base/java/org/mozilla/gecko/util/NetworkUtils.java:146
(Diff revision 3)
> +            return ConnectionType.NONE;
> +        }
> +
> +        switch (networkInfo.getType()) {
> +            case ConnectivityManager.TYPE_BLUETOOTH:
> +                return ConnectionType.BLUETOOTH;

nit: explicit breaks

::: mobile/android/base/java/org/mozilla/gecko/util/NetworkUtils.java:159
(Diff revision 3)
> +            default:
> +                return ConnectionType.OTHER;
> +        }
> +    }
> +
> +    public static NetworkStatus getNetworkStatus(ConnectivityManager connectivityManager) {

I don't think the caller will ever want to pass in a `ConnectivityManager`, except for efficiency, but I don't think it's worth littering the public-facing code with both `doSomething(ConnectivityManager)` and `doSomething()` for every use-case.

However, I see the testing use-case – do you think it'd be better to make `doSomething(ConnectivityManager)` only `@VisibleForTesting` but otherwise use non-argument versions?

::: mobile/android/tests/background/junit4/src/org/mozilla/gecko/GeckoNetworkManagerTest.java:2
(Diff revision 3)
> +/* Any copyright is dedicated to the Public Domain.
> +   http://creativecommons.org/publicdomain/zero/1.0/ */

nit: MPL

::: mobile/android/tests/background/junit4/src/org/mozilla/gecko/GeckoNetworkManagerTest.java:14
(Diff revision 3)
> +import org.mozilla.gecko.GeckoNetworkManager.ManagerState;
> +import org.mozilla.gecko.GeckoNetworkManager.ManagerEvent;
> +
> +import static org.junit.Assert.*;
> +
> +

nit: extra ws

::: mobile/android/tests/background/junit4/src/org/mozilla/gecko/GeckoNetworkManagerTest.java:16
(Diff revision 3)
> +
> +import static org.junit.Assert.*;
> +
> +
> +@RunWith(TestRunner.class)
> +public class GeckoNetworkManagerTest {

nit: -> `TestGeckoNetworkManager`

::: mobile/android/tests/background/junit4/src/org/mozilla/gecko/GeckoNetworkManagerTest.java:18
(Diff revision 3)
> +
> +
> +@RunWith(TestRunner.class)
> +public class GeckoNetworkManagerTest {
> +    @Test
> +    public void testTransitionMatrix() {

nit: I think it's more explicit to have the test name represent the method you're testing (e.g. so you don't accidentally test multiple methods).

`testGetNextState` or even `testGetNextStateTransitionMatrix`

::: mobile/android/tests/background/junit4/src/org/mozilla/gecko/GeckoNetworkManagerTest.java:22
(Diff revision 3)
> +        assertNull(GeckoNetworkManager.getNextState(testingState, ManagerEvent.disableNotifications));
> +        assertNull(GeckoNetworkManager.getNextState(testingState, ManagerEvent.stop));
> +        assertNull(GeckoNetworkManager.getNextState(testingState, ManagerEvent.receivedUpdate));

(If we returned the current state on no state transition, this could be a better test because we're sure the state didn't change, as opposed to being passed a bad state)

::: mobile/android/tests/background/junit4/src/org/mozilla/gecko/util/NetworkUtilsTest.java:2
(Diff revision 3)
> +/* Any copyright is dedicated to the Public Domain.
> +   http://creativecommons.org/publicdomain/zero/1.0/ */

nit: mpl

::: mobile/android/tests/background/junit4/src/org/mozilla/gecko/util/NetworkUtilsTest.java:24
(Diff revision 3)
> +import org.robolectric.shadows.ShadowNetworkInfo;
> +
> +import static org.junit.Assert.*;
> +
> +@RunWith(TestRunner.class)
> +public class NetworkUtilsTest {

nit: -> `TestNetworkUtils`
Attachment #8746793 - Flags: review?(michael.l.comella) → review+
Comment on attachment 8746793 [details]
MozReview Request: Bug 1236130 - Part 2: Only use cache for tabs when network state is "down" r=mcomella

https://reviewboard.mozilla.org/r/49573/#review47385

All of that for this? ;P
https://reviewboard.mozilla.org/r/49571/#review47363

> nit: We've been using `GeckoNetworkManager.class.getSimpleName()` (but to fair, I'm not clear that's a best practice).

The pattern you mention - `LOGTAG = ... .getSimpleName()` - doesn't appear all that much in my quick look through dxr. It seems super common to just assign an explicit string. Let me know if you feel strongly otherwise, or if there's a styling/convention doc someplace that covers stuff like this.

> It's fragile to use switch statements without explicit `break` statements (even after `return`) and comments like `// fallthrough`. It's a common source of errors when the code is being refactored (e.g. because someone forgets about fall-throughs).
> 
> I think you should either add explicit breaks or if/else.

Agreed; in fact, I got confused by this myself as I was reading this code the next day :-)

> Instead of returning null, I think we can be more specific/explicit:
> * if `currentState` doesn't match any of these states, we should throw
> * We should consider returning the current state, rather than null, in the event that we don't intend to transition (and the performActionForNextState can early return if the state is the same).

I want to stick with null for invalid transitions. I think it's very explicit - for a given state/event, there is no valid state to transition into; it's responsibility of the caller to process this then; it's perfectly valid to have transitions from state onto itself (receivedUpdate for example), and I'd like to have an unambiguous way of modelling and testing that.
Agreed on the throw.

> I'm concerned that the next-state branching and the perform transition branching kind of duplicate each other, but without overcomplicating this (e.g. using Objects to represent State, or writing a lot of extra Runnables for state transitions), I can't think of a better solution.
> 
> Worth mentioning I otherwise do really like the separation of getNextState & performing transition.

This separation keeps everything simple, and lets me test the transition matrix without any fuss, without worrying about any side-effects. Ideally I'd want to test these transitions as well, as it should be trivial with spy wrappers/mocks of sorts around the functions doing actual work (i'm still in the javascript mindset...).

> Why do we want to update the network state when enableNotifications is called?

Added a comment explaining this. I'll copy it here:
        // NB: network state might be queried via getCurrentInformation at any time; pre-rewrite behaviour was
        // that network state was updated whenever enableNotifications was called. To avoid deviating
        // from previous behaviour and causing weird side-effects, we call updateNetworkStateAndConnectionType
        // whenever notifications are enabled.

> You're missing `break`s in the outer case statement. The `switch` inside a `switch` is confusing – I'd recommend making one of them an if statement.

Ahh, missing breaks! I think that fixed some weird issues I was having with this stuff. Thanks for the catch. Indeed, this was easy to miss - I've converted inner switch-es into a series of if-statements - and explicit breaks everywhere.

> Between this and the state transition logs, it sounds like we're going to get a lot of log spam on transitions – is there a way to reduce this?

the "old network state" was more of a debug statement. i think the "new network state" could be useful though if/when problems arise with consumers of this network information, so i've left that one. So we're down to "Incoming event" and "New network state" - so at most two per transition, but usually less (not every transition causes network state update). I think that's ok?

> Could we use `currentNetworkStatus.name()` instead? So we don't have to duplicate the name twice.
> 
> Alternatively, if that does something silly like MAKE_THE_NAME_ALL_CAPS, consider storing `name().toLowerCase()` or overriding the `name` function to return lower case.
> 
> Same for currentConnectionSubtype.

We can't do `.name()` for `ConnectionSubType`, as we need to return types such as "2g", "3g", "4g", which aren't valid enum names - hence `.value`. `ConnectionType` has to use a similar approach for the same reason. `NetworkStatus` is using `.value` to be consistent with the other two. I think consistent/predictable pattern here is worth it, even with perhaps less than specific naming. You can also think of it slightly differently - enums are used as a typing mechanism internally, and `.value` is for external consumption of these enums. (for some vague defintitions of internal/external, as these are all public).

> What is the resolution for this TODO? Is there a follow-up for this? If so, there should be a bug number.

Couldn't find anything relevant, so filed Bug 1270401.

> I prefer explicit breaks here and potentially comments like, `// fallthrough x5`

I've cleaned this stuff up a bit, so `return`s instead of `break`s do not look potentially confusing anymore. Added comments for any fallthroughs.

> I don't think the caller will ever want to pass in a `ConnectivityManager`, except for efficiency, but I don't think it's worth littering the public-facing code with both `doSomething(ConnectivityManager)` and `doSomething()` for every use-case.
> 
> However, I see the testing use-case – do you think it'd be better to make `doSomething(ConnectivityManager)` only `@VisibleForTesting` but otherwise use non-argument versions?

For the only caller right now - `GeckoNetworkManager` - it makes perfect sense to pass in the `ConnectivityManager`. The alternative is passing in `context`, but that is uglier for the caller, and then testing isn't as simple. So I feel that this is fine as is.

With `isConnected` we have `isConnected(@NonNull Context context)` and `isConnected(ConnectivityManager connectivityManager)`, but they're used in very different circumstances, so I feel that the split is justified.

> nit: MPL

https://bugzilla.mozilla.org/show_bug.cgi?id=1265525#c62 ?

> nit: -> `TestGeckoNetworkManager`

I'm following AndroidStudio's standard naming suggestion when creating tests. I feel that it's better to follow IDE's standard naming practices, unless there are other strong reasons at play.

> (If we returned the current state on no state transition, this could be a better test because we're sure the state didn't change, as opposed to being passed a bad state)

This method (getNextState) isn't responsible for keeping state machine sane in light of possibly improper events for a given state, it just defines a matrix. I feel like that would be mixing separate concerns together. Ideally we'd want to test handleManagerEvent separetely to ensure that improper events are no-ops.

> nit: mpl

Margaret seems to disagree? https://bugzilla.mozilla.org/show_bug.cgi?id=1265525#c62

> nit: -> `TestNetworkUtils`

Same comment RE standard IDE naming scheme.
Comment on attachment 8746792 [details]
MozReview Request: Bug 1236130 - Part 1: Use an explicit state machine to control GeckoNetworkManager r=mcomella

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/49571/diff/3-4/
Comment on attachment 8746793 [details]
MozReview Request: Bug 1236130 - Part 2: Only use cache for tabs when network state is "down" r=mcomella

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/49573/diff/3-4/
Any concerns about this change, before I land it?
mcomella suggested I ping you for a quick glimpse at this. My intention with this rewrite was to leave the behaviour of GeckoNetworkManager the same, as far as the outside world is concerned - other than fixing buggy behaviour, namely not sending network status update events to gecko in some cases.

Thanks!
Flags: needinfo?(nchen)
Status: NEW → ASSIGNED
Looks fine!
Flags: needinfo?(nchen)
https://hg.mozilla.org/integration/fx-team/rev/3cac818d4803432669bbe3ed916af12172b56d2e
Bug 1236130 - Part 1: Use an explicit state machine to control GeckoNetworkManager r=mcomella
https://hg.mozilla.org/integration/fx-team/rev/bbda46c289f1c84303db1f8efc89094b4b78ccff
Bug 1236130 - Part 2: Only use cache for tabs when network state is "down" r=mcomella
https://hg.mozilla.org/mozilla-central/rev/3cac818d4803
https://hg.mozilla.org/mozilla-central/rev/bbda46c289f1
Status: ASSIGNED → RESOLVED
Closed: 4 years ago4 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 49
(In reply to :Grisha Kruglov from comment #52)
> The pattern you mention - `LOGTAG = ... .getSimpleName()` - doesn't appear
> all that much in my quick look through dxr. It seems super common to just
> assign an explicit string. Let me know if you feel strongly otherwise, or if
> there's a styling/convention doc someplace that covers stuff like this.

There's no doc but I've been doing `getSimpleName` since I've been around as I believe others have. Setting an explicit String typically occurs on old files (of which there are many). That being said, we can eventually resolve this in bug 1242091.

> > nit: -> `TestGeckoNetworkManager`
> 
> I'm following AndroidStudio's standard naming suggestion when creating
> tests. I feel that it's better to follow IDE's standard naming practices,
> unless there are other strong reasons at play.

All of our other tests are "Test*" and I'd rather not divide our conventions now.

I didn't realize IJ had a convention and would be up for renaming everything to "*Test" for convenience. However, I'm not sure others would like the churn and the amount of time it'd take to do that.

Margaret, what do you think? Should we keep all of our tests called "Test*", have both conventions, or rename them to "*Test"?

In any case, we should file a follow-up to do one of ^.
Flags: needinfo?(margaret.leibovic)
(In reply to Michael Comella (:mcomella) from comment #60)
> (In reply to :Grisha Kruglov from comment #52)
> > The pattern you mention - `LOGTAG = ... .getSimpleName()` - doesn't appear
> > all that much in my quick look through dxr. It seems super common to just
> > assign an explicit string. Let me know if you feel strongly otherwise, or if
> > there's a styling/convention doc someplace that covers stuff like this.
> 
> There's no doc but I've been doing `getSimpleName` since I've been around as
> I believe others have. Setting an explicit String typically occurs on old
> files (of which there are many). That being said, we can eventually resolve
> this in bug 1242091.

Are there strong arguments for using this pattern? I'm a bit concerned about it adding extra code to the static initializer and making class loading a bit slower (unless ProGuard is able to optimize this away).
(In reply to Michael Comella (:mcomella) from comment #60)

> Margaret, what do you think? Should we keep all of our tests called "Test*",
> have both conventions, or rename them to "*Test"?

I don't have an opinion. Do whatever makes life easiest for developers who write these tests, but in general I'm not a fan of renaming things unless there's a reason.
Flags: needinfo?(margaret.leibovic)
Hi Margaret, is this something that would be considered critical from an end-user point of view? It seems that way to me. Should we consider uplifting to Beta 47 and Aurora48? Thanks!
Flags: needinfo?(margaret.leibovic)
(In reply to Ritu Kothari (:ritu) from comment #63)
> Hi Margaret, is this something that would be considered critical from an
> end-user point of view? It seems that way to me. Should we consider
> uplifting to Beta 47 and Aurora48? Thanks!

This feature is behind a switchboard flag, and it is disabled on release. I think we can let this fix ride the trains and flip the flag when it hits release.
Flags: needinfo?(margaret.leibovic)
(In reply to :Margaret Leibovic from comment #64)
> (In reply to Ritu Kothari (:ritu) from comment #63)
> > Hi Margaret, is this something that would be considered critical from an
> > end-user point of view? It seems that way to me. Should we consider
> > uplifting to Beta 47 and Aurora48? Thanks!
> 
> This feature is behind a switchboard flag, and it is disabled on release. I
> think we can let this fix ride the trains and flip the flag when it hits
> release.

Ok that's reassuring to know. Thanks Margaret. This is disabled for Fx47.
You need to log in before you can comment on or make changes to this bug.