Closed Bug 1562315 Opened 2 months ago Closed 2 months ago

TLSFilterTransaction::WriteSegments called from tunneling nsHttpConnection::OnSocketReadable may cause an indefinite loop

Categories

(Core :: Networking: HTTP, defect, P2)

defect

Tracking

()

VERIFIED FIXED
mozilla70
Tracking Status
firefox-esr68 68+ fixed
firefox68 --- fixed
firefox69 --- fixed
firefox70 --- fixed

People

(Reporter: mayhemer, Assigned: mayhemer)

References

(Blocks 1 open bug)

Details

(Keywords: hang, Whiteboard: [necko-triaged][secure-proxy-mvp])

Attachments

(1 file, 1 obsolete file)

Testing with a localhost download exhauster and common websites I got a loop in nsHttpConnection::OnSocketReadable, twice. Happens because the again=false set by the session is forgotten by the TLSFilterTransaction default WriteSegmentsAgain implementation: nsAHttpTransaction.h - mozsearch.

Priority: -- → P1
Whiteboard: [necko-triaged][secure-proxy-mvp]
Pushed by honzab.moz@firemni.cz:
https://hg.mozilla.org/integration/autoland/rev/3eea5e6ac99e
Respect again=false indicated by http2 session when calling through TLSFilterTransaction, r=dragana
Status: ASSIGNED → RESOLVED
Closed: 2 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla69
Keywords: hang
Regressions: 1563483

This is log of the loop:

2019-06-28 19:43:42.854000 UTC - [Parent 15044: Socket Thread]: V/nsHttp nsHttpConnection::OnSocketReadable 0000026A87076C00 trans->ws rv=0 n=0 socketin=0 loopCounter=10
2019-06-28 19:43:42.854000 UTC - [Parent 15044: Socket Thread]: D/nsHttp TLSFilterTransaction::WriteSegments 0000026A9033E020 max=32768
2019-06-28 19:43:42.854000 UTC - [Parent 15044: Socket Thread]: I/nsHttp Http2Session::WriteSegments 0000026A83949800 InternalState 4
2019-06-28 19:43:42.854000 UTC - [Parent 15044: Socket Thread]: I/nsHttp Http2Stream::WriteSegments 0000026A87031740 count=32768 state=4
2019-06-28 19:43:42.854000 UTC - [Parent 15044: Socket Thread]: V/nsHttp nsHttpTransaction::WriteSegments 0000026A85D8FC00
2019-06-28 19:43:42.854000 UTC - [Parent 15044: Socket Thread]: V/nsHttp nsHttpConnectionMgr::ShouldThrottle trans=0000026A85D8FC00
2019-06-28 19:43:42.854000 UTC - [Parent 15044: Socket Thread]: I/nsHttp Http2Session::WriteSegments session=0000026A83949800 id 0x13 needscleanup=0000000000000000. cleanup stream based on stream->writeSegments returning code 804b0002
2019-06-28 19:43:42.854000 UTC - [Parent 15044: Socket Thread]: I/nsHttp Http2Session::CleanupStream 0000026A83949800 by ID 0x13 to stream 0000026A87031740
2019-06-28 19:43:42.854000 UTC - [Parent 15044: Socket Thread]: I/nsHttp Http2Session::CleanupStream 0000026A83949800 0000026A87031740 0x13 0
2019-06-28 19:43:42.854000 UTC - [Parent 15044: Socket Thread]: I/nsHttp Http2Session::CleanupStream 0x13 deferred
2019-06-28 19:43:42.854000 UTC - [Parent 15044: Socket Thread]: V/nsHttp nsHttpConnection::ResumeRecv [this=0000026A87076C00]
2019-06-28 19:43:42.854000 UTC - [Parent 15044: Socket Thread]: D/nsHttp InputStreamShim::AsyncWait 0000026A8EF50A00 callback 0000026A87076C10
2019-06-28 19:43:42.854000 UTC - [Parent 15044: Socket Thread]: D/nsHttp TLSFilterTransaction 0000026A9033E020 called trans->WriteSegments rv=0 0
2019-06-28 19:43:42.854000 UTC - [Parent 15044: Socket Thread]: V/nsHttp nsHttpConnection::OnSocketReadable 0000026A87076C00 trans->ws rv=0 n=0 socketin=0 loopCounter=10

I see this is targeting 69, does it need to be uplifted to 68?

Flags: needinfo?(honzab.moz)

This should not be uplifted to 68, as it caused a regression and is being backed out of 69 (see bug 1563803)

Flags: needinfo?(honzab.moz)

(In reply to Nhi Nguyen (:nhi) from comment #6)

This should not be uplifted to 68, as it caused a regression and is being backed out of 69 (see bug 1563803)

Yes, this is going to be backed out in bug 1563803. I will then reopen this (with a reference to b-o cset) and leave open. This doesn't block MVP any longer (was not able to repro after whole afternoon of testing.)

Priority: P1 → P2
Whiteboard: [necko-triaged][secure-proxy-mvp] → [necko-triaged][secure-proxy-skyline]

Dragana and me figure this probably out.

The problem is that we cancel an (h2) transaction on the main thread with e.g. NS_ERROR_PARSED_DATA_CACHED, while we are also going to just read it on the socket thread (already looping onsocketready of the proxy connection). OnMsgCancelTransaction is never delivered to the socket thread, we start looping because reading the transaction's pipe fails (the error is visible there) and the logic ignores *again = false then.

Note that we get stuck at:
https://searchfox.org/mozilla-central/rev/4436573fa3d5c4311822090e188504c10c916c6f/xpcom/io/nsPipe3.cpp#1641

the returned error is then propagated back to:
https://searchfox.org/mozilla-central/rev/4436573fa3d5c4311822090e188504c10c916c6f/netwerk/protocol/http/Http2Session.cpp#3416

To fix the regression, we need to call ForceRecv() on the tunneling connection, I added a note to phab.

Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Status: REOPENED → ASSIGNED
Attachment #9074881 - Attachment is obsolete: true
Pushed by honzab.moz@firemni.cz:
https://hg.mozilla.org/integration/autoland/rev/99a239deee75
Respect again=false indicated by http2 session when calling through TLSFilterTransaction, r=dragana

Comment on attachment 9076769 [details]
Bug 1562315 - Respect again=false indicated by http2 session when calling through TLSFilterTransaction, r=dragana

Beta/Release Uplift Approval Request

  • User impact if declined: requesting approvat at Dragana's request.

an indefinite loop of the socket thread totally disables the network functionality of the browser (can't load pages) until restart

  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): low - because this is again isolated to tunneling only and adds the necessary kick to move the end to end session's logic forward to fix the regression this patch was previously backed out for.

I also managed to check the bug is actually fixed by this patch by hitting the code path.

  • String changes made/needed:
Attachment #9076769 - Flags: approval-mozilla-release?
Whiteboard: [necko-triaged][secure-proxy-skyline] → [necko-triaged][secure-proxy-mvp]

Comment on attachment 9076769 [details]
Bug 1562315 - Respect again=false indicated by http2 session when calling through TLSFilterTransaction, r=dragana

Beta/Release Uplift Approval Request

m-c has been already merged to m-b, so we need an uplift there as well
see Comment 12

Attachment #9076769 - Flags: approval-mozilla-beta?
Target Milestone: mozilla69 → ---
Status: ASSIGNED → RESOLVED
Closed: 2 months ago2 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla70

Comment on attachment 9076769 [details]
Bug 1562315 - Respect again=false indicated by http2 session when calling through TLSFilterTransaction, r=dragana

Approved for 69.0b4.

Attachment #9076769 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Blocks: 1565518

Please add repro steps with expected results.

(In reply to Rebecca Billings [:rbillings] from comment #17)

Please add repro steps with expected results.

I'm afraid this one is extremely hard to reproduce (never was able to hit it under normal+only the proxy setup). An attempt to reproduce would be similar to https://bugzilla.mozilla.org/show_bug.cgi?id=1563695#c12.

To push this a bit, you can start logging according https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/HTTP_logging. But that is not a standard setting for users and also may cause other problems (bug 1463729) and generally change the behavior (timing/racing) of the http stack, so it's not an objective testing.

There were big differences here when testing upload/download speeds between Secure Proxy and other services compared to no proxy at all.

For Windows, the differences were not as noticeable but Mac and Linux machines had slower than expected results. Linux having a really big spike in its ping as well. I've noted everything down here.

Normal browsing didn't seem to be affected all that much pages loading decently taking into consideration the huge drop from the internet speed test.

Testing on Mac 10.14.15 the differences in my results were much less. With the secure proxy mbps= 220, without secure proxy mbps=260. Normal browsing worked well. Marking as verified - unless you choose to re-open it based on the results from comment 19.

Status: RESOLVED → VERIFIED

Comment on attachment 9076769 [details]
Bug 1562315 - Respect again=false indicated by http2 session when calling through TLSFilterTransaction, r=dragana

fix for http2 proxying, needed for secure-proxy project. approved for 68.0.1 / 68.1esr

Attachment #9076769 - Flags: approval-mozilla-release?
Attachment #9076769 - Flags: approval-mozilla-release+
Attachment #9076769 - Flags: approval-mozilla-esr68+

Per discussion with jcristau, we're uplifting this to 68.0.1esr also to maintain parity with the non-ESR 68.0.1 release and hopefully avoid some confusion.

Duplicate of this bug: 1459819

Bugbug thinks this bug is a regression, but please revert this change in case of error.

Keywords: regression

(In reply to Release mgmt bot [:sylvestre / :calixte / :marco for bugbug] from comment #26)

Bugbug thinks this bug is a regression, but please revert this change in case of error.

No, it's really not, dear bot :)

Keywords: regression
You need to log in before you can comment on or make changes to this bug.