Open Bug 1962316 Opened 10 months ago Updated 6 months ago

[Android] Using a (wireguard-based) VPN while changing network bearers leads to incomplete requests

Categories

(Core :: Networking, defect, P2)

defect
Points: 5

People

(Reporter: jonalmeida, Unassigned, NeedInfo)

References

(Blocks 1 open bug)

Details

(Whiteboard: [necko-triaged][necko-priority-queue])

Steps to reproduce

  1. Have a wireguard-based VPN profile enabled.
  2. Start a page load.
  3. Before the page load is complete, change network bearers (mobile data <-> wifi).
  4. Observe.
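
Step 3 can be scripted for repeatable runs. The following is a minimal sketch, not part of the original report. It assumes a device attached over adb and that the Android `svc wifi` and `svc data` shell commands are permitted on that device, which may vary by OS build. The helper names `bearer_switch_commands` and `switch_bearer` are hypothetical.

```python
import subprocess


# Hypothetical helper (not from the original report): builds the adb
# command lines that move an attached Android device onto one bearer.
def bearer_switch_commands(target):
    """Return adb command lines that switch the device to `target`.

    `target` is "wifi" or "mobile". Assumes `svc wifi` and `svc data`
    are usable from the adb shell on the device under test.
    """
    if target == "wifi":
        enable, disable = "wifi", "data"
    elif target == "mobile":
        enable, disable = "data", "wifi"
    else:
        raise ValueError(f"unknown bearer: {target!r}")
    # Enable the new bearer first, then disable the old one.
    return [
        ["adb", "shell", "svc", enable, "enable"],
        ["adb", "shell", "svc", disable, "disable"],
    ]


def switch_bearer(target, dry_run=True):
    """Print the commands; run them only when dry_run is False."""
    commands = bearer_switch_commands(target)
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
    return commands
```

Calling `switch_bearer("mobile")` only prints the commands. Pass `dry_run=False` right after starting the page load to perform the switch on the device.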

Expected behaviour

  • The page load should complete within 1-2 seconds of the network bearer switch succeeding, once other apps start to work again.

Actual behaviour

  • The page load remains in the same state and does not change.
  • Refreshing has no effect on the page load completing.
  • Swiping the app away to kill it and then bringing it back is typically what I do to get the page to load.

Additional information

  • Device info: Android 15, Pixel 8a, running Focus and Fenix 137 & 139 (current Nightly).
  • I say "wireguard-based", but I'm not certain that is truly the cause. It just happens to be the common factor in the two setups where I've experienced this: Tailscale and a homegrown wireguard server.
  • Unconfirmed, but I will try to reproduce this with apps that use an Android VPN profile for device-wide tracking protection but have no real VPN servers routing your traffic. This might help narrow the cause to Android OS vs wireguard client apps.
  • Will gather and link to a network profile when I reproduce this again. Let me know if I can add any more information to this.

I believe about:logging can now be used to capture profiles with logs.
I'd really appreciate it if you could use that. It should give us some hints about what happens to the sockets when the network transitions and where something gets stuck. Thanks!

Severity: -- → S3
Points: --- → 5
Rank: 1
Flags: needinfo?(jonalmeida942)
Priority: -- → P2
Whiteboard: [necko-triaged][necko-priority-next][necko-priority-review]
Whiteboard: [necko-triaged][necko-priority-next][necko-priority-review] → [necko-triaged][necko-priority-next]
Whiteboard: [necko-triaged][necko-priority-next] → [necko-triaged][necko-priority-queue]

I think I can reproduce the situation with a different set of steps as well. I will get logs with these STR when I try again:

Steps to reproduce

  1. Have a wireguard-based VPN profile disabled.
  2. Open Fenix/Focus and start to load a page.
  3. Enable the VPN while the page is loading, or as quickly as possible after it has finished loading.
  4. Click a link on the same page.

Hey valentin, these are the logs I was able to capture while on the (terrible) Toronto subway, which doesn't have network connectivity between stations. I haven't looked into them, but I'm hoping they have enough context to help.

While taking these logs, I observed that other apps (like Slack) reacted more quickly when the network bars went from no signal back to some strength, so I wonder if there is an integration problem in how GeckoView learns that there is connectivity and passes that signal on to Gecko. Just some guesswork though. ¯\_(ツ)_/¯

Flags: needinfo?(jonalmeida942) → needinfo?(valentin.gosu)

On the surface, this looks similar to bug 1960421.

See Also: → 1960421