Open Bug 1897075 Opened 1 year ago Updated 1 month ago

Connection downgrades create new connection

Categories

(Core :: Networking, enhancement, P2)

enhancement

Tracking

()

REOPENED

People

(Reporter: simonf, Assigned: kershaw, NeedInfo)

References

(Blocks 1 open bug)

Details

(Whiteboard: [necko-triaged])

Attachments

(3 files, 4 obsolete files)

It seems fixing a URI creates a new connection (or load, sorry if I'm messing up the terminology). This causes a failed HTTPS upgrade: HTTP -> HTTPS -> HTTP to start a new connection and some consumers don't understand that.

Specifically, ContentPage is waiting for nsIWebProgressListener.STATE_STOP to determine that a connection has completed at XPCShellContentUtils.sys.mjs#80. Unfortunately, at DocumentLoadListener.cpp#2395 a new load is started when a URI fixup is taking place. This leads to a race between the replaced load finishing and the listener reacting to what it thinks is already complete.

There are a few other uses of STATE_STOP: https://searchfox.org/mozilla-central/search?q=STATE_STOP&path=&case=false&regexp=false so this might very well happen outside of tests.

Is there a better event for listeners to attach to or can we somehow make the downgrade transparent like the upgrade?

Example tests that fail because of this would be `

  • xpcshell-remote.toml:toolkit/components/extensions/test/xpcshell/test_ext_dnr_private_browsing.js
  • uriloader/exthandler/tests/mochitest/browser_download_idn_blocklist.js

See https://bugzilla.mozilla.org/show_bug.cgi?id=1894662 for a patch to reproduce. This problem hasn't shown up so far because HTTPS->HTTP downgrades were broken in most cases when a proxy is used.

So if I understand the problem correctly, we try to connect to a HTTPS page, there is an error, we end up calling DocumentLoadListener::OnStartRequest which in turn triggers a new HTTP connection.
The problem happens because DocumentLoadListener::OnStopRequest still triggers, causing the load to be completed before the HTTP request comes in.

Ideally, it would have been possible to trigger a redirectTo to the downgraded URI, but we can't do that at this point because the OnStartRequest callback has been called, and any listeners in between nsHttpChannel and DocumentLoadListener might not like OnStartRequest being called more than once 🙂.

Another option might be to teach DocumentLoadListener how to do this additional internal redirect - not sure how difficult that would be.

Or if that doesn't work, we could make DocumentLoadListener ignore the calls to OnStartRequest/OnDataAvailable/OnStopRequest coming from the first channel, once we've fired off the downgraded load, and wait for the next OnStopRequest.

Nika, do you think any of my previous suggestions are a valid fix for this?

Flags: needinfo?(nika)
Blocks: 1894662
No longer blocks: 1894662
Blocks: 1897136
See Also: → 1897148
Blocks: 1719271

Would it be possible for the downgrade to happen before the OnStartRequest is fired in the DocumentLoadListener or similar, such that we can do a redirectTo as you suggested?

I think I'm a bit confused as to how the other suggestions you made would help with the potential "any listeners in between nsHttpChannel and DocumentLoadListener might not like OnStartRequest being called more than once" issue you mentioned otherwise, as anything which continues handling this check in DocumentLoadListener seems like it by necessity can only be acting after those listeners are hit.

If we developed some way to allow DocumentLoadListener to do a smooth transition to a new load/channel under the hood after receiving OnStartRequest that could potentially be interesting, though I don't know how frequently we'd use it other than this (e.g. I don't think it'd be super practical to redirect from the failing channel into an error page load to make it a single load request, as we currently need things like the failing channel in the error page process).

Flags: needinfo?(nika) → needinfo?(valentin.gosu)

(In reply to Nika Layzell [:nika] (ni? for response) from comment #4)

Would it be possible for the downgrade to happen before the OnStartRequest is fired in the DocumentLoadListener or similar, such that we can do a redirectTo as you suggested?

I was thinking we could add a new "event listener" that would be called before onStartRequest, and DocumentLoadListener could have called RedirectTo. Something like this:

+---------------------+   +------------------+   +--------------------+
|nsHttpChannel        |   |nsHttpCompressConv|   |DocumentLoadListener|
|                     |   |                  |   |                    |
| OnStartRequest      |   |                  |   |                    |
|      ---------------+---+------------------+--->OnBeforeStartRequest|
|                     |   |                  |   |                    |
|       RedirectTo <--+---+------------------+---+-----               |
|         ...         |   |                  |   |                    |
|                     |   |                  |   |                    |
| OnStartRequest      |   |                  |   |                    |
|     ----------------+---+->OnStartRequest--+---+--->OnStartRequest  |
|                     |   |                  |   |                    |
+---------------------+   +------------------+   +--------------------+

I don't think it'd be super practical to redirect from the failing channel into an error page load to make it a single load request, as we currently need things like the failing channel in the error page process

That's a good point. Also I don't think redirect chain normally contains failed channels, so that might make things more complex.

Flags: needinfo?(valentin.gosu)

Since there are two cases, a failing channel which can be retried and a failing channel which cannot be retried, could we just have two separate events?

Flags: needinfo?(valentin.gosu)

I think that should be possible. I don't know DocumentLoadListener well enough to actually attempt a fix, but I believe the fix should go there.
You could look at whether not calling RejectOpenPromise in DisconnectListeners avoids the race, since it's that bit of the code that triggers nsIWebProgressListener.STATE_STOP.
Or even better, I see there's a continueNavigating parameter being passed to DisconnectListeners, which controls whether we fire STATE_STOP in in the promise rejection code

Flags: needinfo?(valentin.gosu)
Severity: -- → N/A
Priority: -- → P2
Whiteboard: [necko-triaged]
Blocks: https-first-mode
No longer blocks: 1897136
Blocks: 1897136
No longer blocks: 1719271
No longer blocks: 1897136

IIUC, if this changes the channel ID, it will deactivate the BFCache because of https://searchfox.org/mozilla-central/rev/6bd577bc99ce0d2fb8ec32c86df99d8a24bee612/docshell/base/CanonicalBrowsingContext.cpp#3008,3015-3017

I am not familiar with this code. Would it create a new channel ID, Valentin?

Flags: needinfo?(valentin.gosu)
See Also: → 1877935

(In reply to Simon Friedberger (:simonf) from comment #8)

IIUC, if this changes the channel ID, it will deactivate the BFCache because of https://searchfox.org/mozilla-central/rev/6bd577bc99ce0d2fb8ec32c86df99d8a24bee612/docshell/base/CanonicalBrowsingContext.cpp#3008,3015-3017

I am not familiar with this code. Would it create a new channel ID, Valentin?

I'm not too familiar with the BFCache either, so I don't really know if the browsing context sees both the channel ID of the HTTPS channel and of the downgraded HTTP channel. Maybe Peter knows this?

Flags: needinfo?(valentin.gosu) → needinfo?(peterv)

I don't know, but note that even then it wouldn't exactly deactivate BFCache, it would just block the page from the BFCache as long as neither of those channels have received a StopRequest.

Flags: needinfo?(peterv)

This seems fairly involved and is currently not blocking anything.

Status: NEW → RESOLVED
Closed: 10 months ago
Resolution: --- → WONTFIX

There are multiple tests that skip the pref because of this, and more in the patch to bug 1897136.

When I updated the test to be unskipped (bug 1952681), I had to skip the https_first pref on desktop too: https://searchfox.org/mozilla-central/rev/b7b6aa5e8ffc27bc70d4c129c95adc5921766b93/toolkit/components/extensions/test/xpcshell/test_ext_contentscript_context.js#172-176
... because not doing so caused the window actor to be destroyed during spawn(), likely because the test continues too quickly after loadContentPage as explained in the initial comment of this bug.

Even if this issue cannot be solved at a fundamental level, would it perhaps be feasible to update the logic in XPCShellContentUtils to account for the expected process switch attempt? Then we don't have to skip the tests in so many places. The test server in xpcshell tests does not support https anyway (bug 1742061).

See Also: → 1897136, 1952681, 1742061
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
Attached file WIP: Bug 1897075: Test sketch (obsolete) —
Assignee: nobody → sfriedberger
Attached file WIP: Bug 1897075: Break some tests (obsolete) —
Attachment #9491133 - Attachment description: Bug 1897075: Add BeforeOnStartRequest for HTTPS-First downgrades r=valentin → Bug 1897075: Downgrade HTTPS-First early r=valentin
Attachment #9491132 - Attachment is obsolete: true
Attachment #9491134 - Attachment is obsolete: true
See Also: → 1965575
Whiteboard: [necko-triaged] → [necko-triaged][necko-priority-new]
Attached file WIP: Bug 1897075: Enable tests (obsolete) —

needinfo my self to take a look.

Flags: needinfo?(kershaw)
Whiteboard: [necko-triaged][necko-priority-new] → [necko-triaged]
Attachment #9501810 - Attachment is obsolete: true
Attachment #9501807 - Attachment is obsolete: true
Assignee: sfriedberger → kershaw
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: