Closed Bug 1420802 Opened 7 years ago Closed 7 years ago

New link detection may not be reliable causing us to not Sync when we should

Categories

(Firefox :: Sync, enhancement, P1)

enhancement

Tracking

()

RESOLVED FIXED
Firefox 59
Tracking Status
firefox59 --- fixed

People

(Reporter: markh, Assigned: eoger)

References

Details

Attachments

(1 file)

On a local VM that's slow to come out of sleep, I saw sync failing to work due to it believing the network is down when it wasn't. I could see this ~50% of the time when the VM was on a spinning disk, but haven't yet seen it since I moved the VM to a SDD.

NetworkLinkService.linkStatusKnown was true (although we don't check that - we should)

NetworkLinkService.isLinkUp was false, even though I could use the browser just fine.

Ed notes this probably means something is wrong in https://searchfox.org/mozilla-central/source/netwerk/system/win32/nsNotifyAddrListener.cpp#613, but that's going to be a bit tricky in my VM and may turn out to be something out of our control (eg, Windows may not be notifying us).

This may be vmware specific, but it worries me - and the impact is very bad - not syncing.

Off the top of my head, I wonder if we could do something like:

* Put the check for this behind a pref, defaulting it to disabled.
* At some point when we know the network must be up, check the NetworkLinkService's status, and if it reports the network is *not* up, emit a telemetry event.
* When we get confidence that it is doing no harm, flip the pref to true.

WDYT?
ISTM like another option would be:

* Remove it from the offline check in policies.js
* Still listen for the link status events, and sync when it comes back up.

E.g. don't trust it to tell us if we're offline, but use its connection notifications. This seems like it would do no harm, but still achieve the goal of syncing as soon as we can when the network comes back up.

> and may turn out to be something out of our control (eg, Windows may not be notifying us).

I was going to say something along the lines of that this seems unlikely to me, since that would be roughtly equivalent to a bug in the OS. To back that up, I looked through the comments on https://bugzilla.mozilla.org/show_bug.cgi?id=939318, but they seem to indicate that I was wrong, and that it might not work in a VM. Or at least there are a few known issues around them, and it isn't 100% clear to me that they were ironed out.

ISTM like there might be a more modern way that works better, but my searches on MSDN don't seem to turn up anything we don't already use in some form or another (not that I looked *that* hard).
(In reply to Thom Chiovoloni [:tcsc] from comment #1)
> ISTM like another option would be:
> 
> * Remove it from the offline check in policies.js
> * Still listen for the link status events, and sync when it comes back up.
> 
> E.g. don't trust it to tell us if we're offline, but use its connection
> notifications. This seems like it would do no harm, but still achieve the
> goal of syncing as soon as we can when the network comes back up.

Yeah, that sounds like a reasonable plan.

> I was going to say something along the lines of that this seems unlikely to
> me, since that would be roughtly equivalent to a bug in the OS. To back that
> up, I looked through the comments on
> https://bugzilla.mozilla.org/show_bug.cgi?id=939318, but they seem to
> indicate that I was wrong, and that it might not work in a VM. Or at least
> there are a few known issues around them, and it isn't 100% clear to me that
> they were ironed out.

Nice find - and yeah, that does look like it remains a little messy. There's also no one else using this interface "in anger" in the tree which worries me a little - eg, I think we'd also want to ensure that the above checks don't cause us to sync very regularly if there are spurious notifications, etc
The impact of this is not syncing at all, even when we should be. As Thom says in comment 1, this is a good indicator for when to try syncing, but shouldn't let this block sync. It would be good to know which platforms this works on, but we don't have much insight into that with the ping now.
Priority: -- → P1
Assignee: nobody → eoger
Status: NEW → ASSIGNED
Pushed by eoger@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/6fe2f6075e22
Remove NetworkLinkService from offline check. r=tcsc
It's unfortunately notmirrored here, but the review happened here: https://phabricator.services.mozilla.com/D300
https://hg.mozilla.org/mozilla-central/rev/6fe2f6075e22
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 59
Comment on attachment 8933029 [details]
Bug 1420802 - Remove NetworkLinkService from offline check. r=tcsc

Thom Chiovoloni [:tcsc] has approved the revision.

https://phabricator.services.mozilla.com/D300#7286
Attachment #8933029 - Flags: review+
Attachment #8933029 - Attachment description: Bug 1420802 - Remove NetworkLinkService from offline check. r?tcsc → Bug 1420802 - Remove NetworkLinkService from offline check. r=tcsc
Depends on: 1453887
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: