Closed Bug 1648553 Opened 2 months ago Closed 1 month ago

Sometimes page loads start hanging in all tabs with http3

Categories

(Core :: Networking: HTTP, defect, P2)

defect

Tracking

()

RESOLVED FIXED
mozilla80
Tracking Status
firefox-esr68 --- unaffected
firefox77 --- disabled
firefox78 --- disabled
firefox79 --- fixed
firefox80 --- fixed

People

(Reporter: cpeterson, Assigned: dragana)

References

(Regression)

Details

(Keywords: regression, Whiteboard: [necko-triaged])

Attachments

(1 file)

I re-enabled http3 in today's Nightly and sometimes page loads start hanging in all tabs. Nightly never recovers. I have to quit the browser, which can take 1-2 minutes and often ends by hitting nsHttpConnectionMgr::Shutdown crash bug 1633342:

bp-05b492a2-92ac-4428-b79d-c21dc0200625
bp-a50500b2-83e4-43db-b27e-494f00200625
bp-d5bd4327-3687-46a9-9124-0566e0200625

I didn't have this problem when using http3 with neqo 0.4.1. I don't know if this is a regression in the recent neqo 0.4.4 update (bug 1633342) or the previous 0.4.3 update (because I didn't test that one).

I think we have llop in Http3Session. There are 2 changes that could cause it.

Assignee: nobody → dd.mozilla
Status: NEW → ASSIGNED
Regressed by: 1624105
No longer regressed by: 1633342
See Also: 1647769

I could easily reproduce this in Mac: load a http3 page, put laptop to hibernation, turn it back on.
The underlying problem was that the quic connection was closed because of timeout and necko has tried to add a new stream: it reads headers(GetHeadersString) and calls TryActivating that fails failed (line).
The transaction has written some data so the session calls ReadSegment again to see if the transaction will write more data:

  • Http3Stream is still in PREPARING_HEADERS state and calls GetHeadersString again here (because mActivatingFailed is false)
  • GetHeadersString will read some data because it is looking for the end-of-headers determinator but it can not find it in the request body (note the headers are already read in the first call).
  • here ReadSegment return NS_OK and also countRead is > 0 so Http3Session calls ReadSegment again,
  • and so no...

The easy fix was to not call GetHeadersString when mRequestHeadersDone==true.

I have decided to refactor Http3Stream states:

  • PREPARING_HEADER - this state only read the headers from the transaction and switches to WAITING_TO_ACTIVATE state when it is done reading,
  • WAITING_TO_ACTIVATE only calls OnReadSegment directly (not through transaction) because it may happen that transaction will not call OnReadSegment
  • and the rest are the same

In normal case when reading headers from transaction is done TryActivating will be call immediately (PREPARING_HEADER falls through to WAITING_TO_ACTIVATE in OnReadSegment) in most cases the transaction will be immediately activated and Http3Stream will switch to the SENDING_BODY or SEND_DONE state . If TryActivating fails next time calling ReadSegment will only call OnReadSegment directly to try to activate the transaction again.
The 2 code paths:

  1. Call ReadSegment -> in OnReadSegment all headers are read and in in switch it fails through and calls TryActivating. The stream is immediately activated -> Http3Session will call ReadSegment immediately again to send request body
  2. Call ReadSegment -> in OnReadSegment all headers are read and in the switch-case it fails through and calls TryActivating. TryActivating fails. The stream is in WAITING_TO_ACTIVATTE state. When new streams can be activated again Http3Session calls ReadSegment for this stream again. In ReadSegment a fake OnReasSegment is called directly. If it succeeds it fails through to SENDING_BODY.
Severity: -- → S3
Priority: -- → P2
Whiteboard: [necko-triaged]
Pushed by ddamjanovic@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/597ca7dea687
Improve Http3Stream sending states. r=michal,necko-reviewers
Status: ASSIGNED → RESOLVED
Closed: 1 month ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla80
Duplicate of this bug: 1651565
Duplicate of this bug: 1651541

Should we consider uplift this to beta given the crash rate is high?

Comment on attachment 9159795 [details]
Bug 1648553 - Improve Http3Stream sending states.

Beta/Release Uplift Approval Request

  • User impact if declined: This fix a crash. The crash is in feature(HTTP3) that is disabled by default, but the rate of crashes on beta is high (Bug 1651625 comment 16, out of first 50 crashes 23 where cause by the bug that this patch fixes).
  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): The patch changes the code that is only executed if HTTP3 is on (the code in Http3Stream). Therefore if http3 is disabled this code will never be executed and http3 is disabled by default.
  • String changes made/needed: none
Attachment #9159795 - Flags: approval-mozilla-beta?

Comment on attachment 9159795 [details]
Bug 1648553 - Improve Http3Stream sending states.

HTTP3 crash fix, approved for 79.0b6.

Attachment #9159795 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
You need to log in before you can comment on or make changes to this bug.