webRequest.StreamFilter errors for downloads
Categories
(WebExtensions :: Request Handling, defect, P3)
Tracking
(firefox104 affected, firefox105 affected, firefox106 affected)
People
(Reporter: maxime+mozilla, Unassigned)
References
Details
(Keywords: regressionwindow-wanted)
Attachments
(2 files)
User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:104.0) Gecko/20100101 Firefox/104.0
Steps to reproduce:
- From an extension, create a
webRequest.StreamFilter
which substitutes a JPEG by another.
const image = new Uint8Array([0xff, 0xd8, 0xff, 0xe0, 0x00, 0x10, 0x4a, 0x46, 0x49, 0x46, 0x00, 0x01, 0x02, 0x00, 0x00, 0x64, 0x00, 0x64, 0x00, 0x00, 0xff, 0xec, 0x00, 0x11, 0x44, 0x75, 0x63, 0x6b, 0x79, 0x00, 0x01, 0x00, 0x04, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xff, 0xee, 0x00, 0x0e, 0x41, 0x64, 0x6f, 0x62, 0x65, 0x00, 0x64, 0xc0, 0x00, 0x00, 0x00, 0x01, 0xff, 0xdb, 0x00, 0x84, 0x00, 0x1b, 0x1a, 0x1a, 0x29, 0x1d, 0x29, 0x41, 0x26, 0x26, 0x41, 0x42, 0x2f, 0x2f, 0x2f, 0x42, 0x47, 0x3f, 0x3e, 0x3e, 0x3f, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x01, 0x1d, 0x29, 0x29, 0x34, 0x26, 0x34, 0x3f, 0x28, 0x28, 0x3f, 0x47, 0x3f, 0x35, 0x3f, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0x47, 0xff, 0xc0, 0x00, 0x11, 0x08, 0x00, 0x08, 0x00, 0x19, 0x03, 0x01, 0x22, 0x00, 0x02, 0x11, 0x01, 0x03, 0x11, 0x01, 0xff, 0xc4, 0x00, 0x61, 0x00, 0x01, 0x01, 0x01, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x04, 0x02, 0x05, 0x01, 0x01, 0x01, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x02, 0x04, 0x10, 0x00, 0x02, 0x02, 0x02, 0x02, 0x03, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01, 0x02, 0x11, 0x03, 0x00, 0x41, 0x21, 0x12, 0xf0, 0x13, 0x04, 0x31, 0x11, 0x00, 0x01, 0x04, 0x03, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x21, 0x31, 0x61, 0x71, 0xb1, 0x12, 0x22, 0xff, 0xda, 0x00, 0x0c, 0x03, 0x01, 0x00, 0x02, 0x11, 0x03, 0x11, 0x00, 0x3f, 0x00, 0xa1, 0x7e, 0x6b, 0xad, 0x4e, 0xb6, 0x4b, 0x30, 0xea, 0xe0, 0x19, 0x82, 0x39, 0x91, 0x3a, 0x6e, 0x63, 0x5f, 0x99, 0x8a, 0x68, 0xb6, 0xe3, 0xea, 0x70, 0x08, 0xa8, 0x00, 0x55, 0x98, 0xee, 0x48, 0x22, 0x37, 0x1c, 0x63, 0x19, 0xaf, 0xa5, 0x68, 0xb8, 0x05, 0x24, 0x9a, 0x7e, 0x99, 0xf5, 0xb3, 0x22, 0x20, 0x55, 0xea, 0x27, 0xcd, 0x8c, 0xeb, 0x4e, 0x31, 0x91, 0x9d, 0x41, 0xff, 0xd9])
function listener(details) {
let filter = browser.webRequest.filterResponseData(details.requestId);
filter.onstart = () => {
console.log('onstart: replacing image')
console.log([...image])
filter.write(image)
}
filter.ondata = event => {
console.log('ondata: ignoring original image')
console.log([...new Uint8Array(event.data)])
}
filter.onerror = () => {
console.log('onerror: ' + filter.error)
}
filter.onstop = () => {
console.log('onstop: disconnecting filter')
filter.disconnect();
}
console.log('onBeforeRequest: created filter for request ' + details.requestId)
return {};
}
browser.webRequest.onBeforeRequest.addListener(
listener,
{urls: ["https://www.w3schools.com/images/myw3schoolsimage.jpg"]},
["blocking"]
);
-
Navigate to a page containing the substituted JPEG URL (e.g.: https://www.w3schools.com/TAGS/tryit.asp?filename=tryhtml5_a_download ) and confirm the substitution occurred. This can also be confirmed in the console (
ctrl + J
) by the presence of theonstart: replacing image
entry. -
Trigger a download of the image based on the URL (not through a right-click on the image; e.g. using the
download
attribute on thea
tag). In the above example ( https://www.w3schools.com/TAGS/tryit.asp?filename=tryhtml5_a_download ), clicking on the image will download it from its URL.
Actual results:
The downloaded image is the original image, not the substituted one. The webRequest.StreamFilter
also fired an onerror
event for Invalid request ID
even-though the request ID was obtained through the webRequest.onBeforeRequest
listener.
Expected results:
The downloaded image should be the substituted one.
Comment 1•2 years ago
|
||
The Bugbug bot thinks this bug should belong to the 'Firefox::Downloads Panel' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.
Updated•2 years ago
|
Comment 2•2 years ago
|
||
Hello,
I reproduced the issue on the latest Nightly (106.0a1/20220904213226), Beta (105.0b7/20220904185841) and Release (104.0.120220829141339) under Windows 10 x64 and Ubuntu 16.04 LTS.
After running the provided script from any installed add-on’s console, the image on https://www.w3schools.com/TAGS/tryit.asp?filename=tryhtml5_a_download will indeed change to a substitute and a onstart: replacing image
entry will be logged to the browser console.
Downloading the image via clicking on it reveals that the original image and not the substitute is actually downloaded, confirming the issue.
For more details, see the attached screenshot.
Comment 3•2 years ago
|
||
Updated•2 years ago
|
Comment 4•2 years ago
|
||
Performed a bisection spanning 2018-01-01 until the present day and found that the issue started occurring from 2019-09-09.
Unfortunately Mozregression could not bisect further into that day and stopped, however it produced a pushlog: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=ce8742e6c77b7aef8a2beba32776ca3494adacb8&tochange=bdb64cf16b68c4a7212ba16aef425bce66d8f4ca
I’ve refined the search further from 2019-09-08 until 2019-09-09, but Mozregression stopped in this case as well, not finding enough data to bisect and without producing a regressor bug or pushlog.
Comment 5•2 years ago
|
||
The severity field is not set for this bug.
:mixedpuppy, could you have a look please?
For more information, please visit auto_nag documentation.
Updated•2 years ago
|
Updated•2 years ago
|
Comment 8•4 months ago
|
||
I took a look at the issue in another bug, and posted some logging at https://bugzilla.mozilla.org/show_bug.cgi?id=1426789#c12
From the debug build logs, in https://bugzilla.mozilla.org/show_bug.cgi?id=1426789#c13, I see that the immediate trigger for disconnecting is ParentProcessDocumentOpenInfo::OnDocumentStartRequest
. That calls ParentProcessDocumentOpenInfo ::DisconnectChildListeners
, which in its turn calls DocumentLoadListener::DisconnectListeners
, which has the following comment:
// TODO: If we retargeted the stream to a non-default handler (e.g. to trigger
// a download), we currently never attach a stream filter. Should we attach a
// stream filter in those situations as well?
mStreamFilterRequests.Clear();
I suppose that the answer here would be "Yes", but I don't know how to do that.
Nika - do you have any suggestions on what to do here?
Comment 9•4 months ago
|
||
In general, we only apply stream converters within the final content process on the HttpChannelChild. When doing a download, we never create the HttpChannelChild as we directly download it, so we never end up attaching the stream listeners as you noticed.
It might be possible to have the stream converter apply unconditionally by having nsHttpChannel
apply the stream converters before DocumentChannel. It appears we already do this when working with multipart/x-mixed-replace
channels, as those needed the stream filter to run before multipart parsing happens (added in bug 1638422). It might be worth testing out a simple patch like this to see if it works (though I'm not sure if this is the approach we'd want to take as I mention below):
// https://searchfox.org/mozilla-central/rev/fa86401b80f19afb6ed9bfca127ecc5e3a6f0cdc/netwerk/protocol/http/nsHttpChannel.cpp#1844-1865
for (StreamFilterRequest& request : mStreamFilterRequests) {
if (docListener) {
mozilla::ipc::Endpoint<extensions::PStreamFilterParent> parent;
mozilla::ipc::Endpoint<extensions::PStreamFilterChild> child;
nsresult rv = extensions::PStreamFilter::CreateEndpoints(&parent, &child);
if (NS_FAILED(rv)) {
request.mPromise->Reject(false, __func__);
} else {
extensions::StreamFilterParent::Attach(this, std::move(parent));
request.mPromise->Resolve(std::move(child), __func__);
}
} else {
request.mPromise->Reject(false, __func__);
}
request.mPromise = nullptr;
}
mStreamFilterRequests.Clear();
I'm not super confident why we don't always intercept in the parent process like this, but in bug 1638422 comment 10, :mattwoodrow mentions (emphasis mine):
Listening in the parent process, before multi-part handling would solve this, and also mean that file downloads are intercepted. Unfortunately it'd also happen before content conversion (gzip etc). I'm not sure we want to do that in the parent process for all compressed pages when adblock is installed.
I guess we could attempt to intercept in the parent process, and then defer to the child once we detect that the content type isn't multipart. We're already doing decompression in the parent for compressed multi-part, so that's not a regression.
So it sounds like perhaps always intercepting would be a performance regression for normal loads due to running decompression at a worse place in the channel pipeline? I'm unsure if this is an issue we'd see in practice, but harming the performance for users of extensions like ublock seems like a bad idea.
We could potentially try to just add this interception in the case where we're actually doing a download, but there's a chance that's happening too late within the parent process for the normal interception codepath to be used. At this point, the channel has already called OnStartRequest
on its listener, so the new listener set up by https://searchfox.org/mozilla-central/rev/fa86401b80f19afb6ed9bfca127ecc5e3a6f0cdc/toolkit/components/extensions/webrequest/StreamFilterParent.cpp#214-216 probably will never have OnStartRequest
called on it because it is being installed in front of already-called stream listeners.
I expect we'd probably need to install the interception instead after the already-called stream listener, but before the one which we're creating in ParentProcessDocumentOpenInfo
(https://searchfox.org/mozilla-central/rev/fa86401b80f19afb6ed9bfca127ecc5e3a6f0cdc/netwerk/ipc/DocumentLoadListener.cpp#285-287). This might be possible by doing something like this in that function (untested pseudocode ignoring stuff like privacy or APIs):
for (StreamFilterRequest& request : mStreamFilterRequests) {
Endpoint<PStreamFilterParent> parent;
Endpoint<PStreamFilterChild> child;
nsresult rv = PStreamFilter::CreateEndpoints(&parent, &child);
if (NS_SUCCEEDED(rv)) {
auto sfp = MakeRefPtr<StreamFilterParent>();
// Manually initialize the listener and put it inline as the next stream listener to call.
sfp->mChannel = channel;
sfp->mOrigListener = m_targetStreamListener;
m_targetStreamListener = sfp;
// Make sure stream conversion is requested from our channel
// XXX: I don't think this is public at all, seems to only be used by nsITraceableChannel::SetNewListener
channel->StoreListenerRequiresContentConversion(true);
// Bind it to the newly created endpoints and resolve our request with it.
sfp->ActorThread()->Dispatch(
NewRunnableMethod<ParentEndpoint&&>("StreamFilterParent::Bind", sfp,
&StreamFilterParent::Bind,
std::move(parent)),
NS_DISPATCH_NORMAL);
request.mPromise->Resolve(std::move(child), __func__);
} else {
request.mPromise->Reject(false, __func__);
}
}
mStreamFilterRequests.Clear();
This effectively tries to do the things done when attaching a stream filter, but without actually calling nsITracableChannel::SetNewListener
as the listener needs to be specifically inserted after the current listener and before the next one. Not sure if this would work, but it's my best guess as to what would need to happen if we can't just enable stream filter in the parent process for all requests (which seems likely due to the performance concerns brought up in the previous bug).
Updated•4 months ago
|
Comment 10•4 months ago
|
||
Thanks for the pointers Nika! The sketched logic looks reasonable. I have a question about the ListenerRequiresContentConversion
flag.
(In reply to Nika Layzell [:nika] (ni? for response) from comment #9)
In general, we only apply stream converters within the final content process on the HttpChannelChild. When doing a download, we never create the HttpChannelChild as we directly download it, so we never end up attaching the stream listeners as you noticed.
Question: if a download requires further conversions, such as decompression, is that handled in the parent too?
You've suggested to call channel->StoreListenerRequiresContentConversion(true);
from ParentProcessDocumentOpenInfo::OnDocumentStartRequest
(based on the original logic in HttpBaseChannel::SetListener
). The only use of that flag is in nsHttpChannel::CallOnStartRequest
, after the stream listener's OnStartRequest
was called: https://searchfox.org/mozilla-central/rev/d3fea1aa852bb3a353a0a4a875c3685da11ce39b/netwerk/protocol/http/nsHttpChannel.cpp#1887,1904
Is the flag going to be effective when needed? I am questioning its effectiveness, because for it to be effective, the control flow must already have passed the previous point where stream filters were going to be attached: https://searchfox.org/mozilla-central/rev/d3fea1aa852bb3a353a0a4a875c3685da11ce39b/netwerk/protocol/http/nsHttpChannel.cpp#1844-1865,1882,1885,1887,1904
(and if we did attach the stream filters at that point, then we wouldn't have this bug)
( side note: Given the involvement of DocumentChannel here, I wondered what happens if the download was initiated by the extension through the browser.downloads.download
extension API, which uses Downloads.sys.mjs
internally from the parent. I can observe that the StreamFilter is attached as expected, and that the webRequest resource type is "other" instead of "main_frame"; the internal type is TYPE_SAVEAS_DOWNLOAD
. )
We could potentially try to just add this interception in the case where we're actually doing a download, but there's a chance that's happening too late within the parent process for the normal interception codepath to be used. At this point, the channel has already called
OnStartRequest
on its listener, so the new listener set up by https://searchfox.org/mozilla-central/rev/fa86401b80f19afb6ed9bfca127ecc5e3a6f0cdc/toolkit/components/extensions/webrequest/StreamFilterParent.cpp#214-216 probably will never haveOnStartRequest
called on it because it is being installed in front of already-called stream listeners.
I think that you're right. If the missing call of OnStartRequest is the only issue, I think that we could artificially call the relevant logic in OnStartRequest
. But considering that StreamFilterParent expects a listener to forward the notification to, and we shouldn't enter OnStartRequest again on the original listener, this is likely unfeasible. Moreover, the contract of nsITraceableChannel
explicitly discourages updating the listener after the OnStartRequest
call.
I expect we'd probably need to install the interception instead after the already-called stream listener, but before the one which we're creating in
ParentProcessDocumentOpenInfo
(https://searchfox.org/mozilla-central/rev/fa86401b80f19afb6ed9bfca127ecc5e3a6f0cdc/netwerk/ipc/DocumentLoadListener.cpp#285-287). This might be possible by doing something like this in that function (...)
I can confirm that the linked function could be sufficiently timely: You've linked ParentProcessDocumentOpenInfo::TryExternalHelperApp
, which does not only have access to mStreamFilterRequests
but is also called from nsDocumentOpenInfo::DispatchContent
, which in its turn is called from nsDocumentOpenInfo::OnStartRequest
(and that in its turn could be called from ParentProcessDocumentOpenInfo::OnDocumentStartRequest
which is the function I linked in comment 8). After that DispatchContent
call, m_targetStreamListener->OnStartRequest
is called.
(summary of original comment: code snippet based on
StreamFilterParent::Attach
andStreamFilterParent::Init
, hooking up withm_targetStreamListener
)This effectively tries to do the things done when attaching a stream filter, but without actually calling
nsITracableChannel::SetNewListener
as the listener needs to be specifically inserted after the current listener and before the next one. Not sure if this would work, but it's my best guess as to what would need to happen if we can't just enable stream filter in the parent process for all requests (which seems likely due to the performance concerns brought up in the previous bug).
I was not familiar with this specific area of the code related to DocumentChannel, so I took a closer look for comparison.
With the current usual use of StreamFilterParent::Attach
(+Init
), channel
's mListener
is replaced right before mListener::OnStartRequest
is called in nsHttpChannel::CallOnStartRequest
.
- Before attach: (nsHttpChannel) likely has mListener
- After attach: (nsHttpChannel) channel -> StreamFilterParent -> mOrigListener (original mListener retrieved via HttpBaseChannel::SetListener).
The proposed fix is to to do something similar, where ParentProcessDocumentOpenInfo::OnDocumentStartRequest
fills the role of nsHttpChannel::CallOnStartRequest
:
- The case relevant to this bug is when we go through
nsDocumentOpenInfo::TryExternalHelperApp
, i.e. download. - Before attach:
nsDocumentOpenInfo::OnStartRequest
callsm_targetStreamListener
, andm_targetStreamListener
in this case is an instance ofnsExternalAppHandler
). - After attach:
m_targetStreamListener
has becomeStreamFilterParent
, and StreamFilterParent'smOrigListener
is the originalnsExternalAppHandler
The logic looks comparable and reasonable to me. I wasn't able to easily confirm whether ListenerRequiresContentConversion
works by just looking at the code, so I asked for clarification at the top of my comment.
Description
•