Open Bug 1979279 Opened 4 months ago Updated 1 month ago

High packet loss on QUIC on Windows with GSO

Categories

(Core :: Networking, defect, P2)

defect

Tracking


ASSIGNED
Tracking Status
firefox-esr128 --- unaffected
firefox-esr140 --- unaffected
firefox141 --- unaffected
firefox142 + disabled
firefox143 --- disabled
firefox144 --- disabled

People

(Reporter: mail, Assigned: mail)

References

(Regression)

Details

(Keywords: leave-open, regression, Whiteboard: [necko-triaged] )

Attachments

(4 files)

Seeing high packet loss on QUIC connections on Windows since Neqo v0.14.0 landed.

See the following graph. Note the sharp uptick in the 95th percentile on 2025-07-14, from ~0.5% packet loss to ~6.5% packet loss, the same day Bug 1975873 landed.

https://glam.telemetry.mozilla.org/firefox/probe/http3_loss_ratio/explore?os=Windows&process=parent&visiblePercentiles=%5B95%2C75%2C50%5D

Assignee: nobody → mail
No longer blocks: 1927797
Status: NEW → ASSIGNED
Type: task → defect
No longer depends on: 1975873
Keywords: regression
Regressed by: 1975873
Attached image image.png

Hypothesis thus far:

Set release status flags based on info from the regressing bug 1975873

Disable HTTP3 QUIC UDP GSO IO, i.e. sending multiple UDP datagrams in
one sys call, on Windows. See Bug for details.
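
To illustrate what the patch description means by "sending multiple UDP datagrams in one sys call", here is a minimal sketch. It is not the Firefox or Neqo code: `gso_enabled` and `plan_sends` are hypothetical names, and the platform check is only an assumption about how such a gate could look. The sketch models how many system calls a payload needs with and without batching; it sends nothing.

```python
import sys


def gso_enabled(platform: str = sys.platform) -> bool:
    """Hypothetical gate: batching is off on Windows, on elsewhere."""
    return not platform.startswith("win")


def plan_sends(payload: bytes, segment_size: int, use_gso: bool) -> list[list[bytes]]:
    """Return one inner list per system call, holding the datagrams that call carries."""
    if segment_size <= 0:
        raise ValueError("segment_size must be positive")
    # Split the payload into datagrams of at most segment_size bytes.
    datagrams = [
        payload[i:i + segment_size] for i in range(0, len(payload), segment_size)
    ]
    if not datagrams:
        return []
    if use_gso:
        # With batching, all segments are handed to the OS in a single call.
        return [datagrams]
    # Without batching, each datagram costs its own call.
    return [[d] for d in datagrams]
```

With batching disabled, the same datagrams reach the network but each one costs a separate system call, which is the trade-off the patch accepts on Windows.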

I suggest the following:

  1. disable GSO (a.k.a. USO) on Windows on Firefox Nightly (see patch above)
  2. if it fixes the issue:
    1. backport the fix to Beta
    2. investigate why GSO is failing on (some) Windows devices
  3. if it doesn't fix the issue:
    1. find a new hypothesis

FYI: GSO might also be the root cause of bug 1978821.

See Also: → 1978821
Keywords: leave-open
Duplicate of this bug: 1978821
Severity: -- → S3
Priority: -- → P2

Potential fix landed. Once metrics on Nightly are back to normal, I will propose a back-port for Beta.

The latest data point on GLAM is 2025-07-24. Thus, there is not yet any data to confirm or refute that the above patch fixed the issue. Will check back tomorrow.

The bug is marked as tracked for firefox142 (beta). However, the bug still has low severity.

:ghess, could you please increase the severity for this tracked bug? If you disagree with the tracking decision, please talk with the release managers.

For more information, please visit BugBot documentation.

Flags: needinfo?(ghess)

Disable HTTP3 QUIC UDP GSO IO, i.e. sending multiple UDP datagrams in
one sys call, on Windows. See Bug for details.

Original Revision: https://phabricator.services.mozilla.com/D258686

Attachment #9503334 - Flags: approval-mozilla-beta?

firefox-beta Uplift Approval Request

  • User impact if declined: Degraded network performance
  • Code covered by automated testing: yes
  • Fix verified in Nightly: yes
  • Needs manual QE test: no
  • Steps to reproduce for manual QE testing: no manual QE testing
  • Risk associated with taking this patch: Not aware of additional risks. Disables a new feature (i.e. GSO)
  • Explanation of risk level: Not sure how to classify.
  • String changes made/needed: no
  • Is Android affected?: no
Attached image image.png

Status update:

  • since the patch to Nightly, the loss_ratio metric on Nightly has recovered
  • thus I requested a Beta uplift https://phabricator.services.mozilla.com/D258916
  • I tried to reproduce the bug on 6 different Windows machines (x86-64 and ARM) without success
Flags: needinfo?(ghess)
Attachment #9503334 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Summary: High packet loss on QUIC on Windows → High packet loss on QUIC on Windows with GSO
Depends on: 1992919
