Closed Bug 1410755 Opened 2 years ago Closed Last year

filterResponseData breaks load event in XML documents

Categories

(WebExtensions :: Request Handling, defect, P5)

60 Branch
defect

Tracking

(firefox61 verified)

RESOLVED FIXED
mozilla61
Tracking Status
firefox61 --- verified

People

(Reporter: tim.babych, Assigned: kmag)

References

Details

Attachments

(3 files)

If webextension simply uses filterResponseData API to process an XML document, its window.load event does not fire. Processed HTML documents still fire window.load events.

function listener(details) {
  let filter = browser.webRequest.filterResponseData(details.requestId);
  filter.ondata = event => {
    filter.write(event.data);
    filter.disconnect();
  }
  return {};
}

browser.webRequest.onBeforeRequest.addListener(
  listener,
  {urls: ["*://*/*.fb2*"], types: ["main_frame"]},
  ["blocking"]
);

Actual results with test addon enabled:
https://tymofij.github.io/filter-breaks-xml-load-event/example.xml (unmatched) - window.load fires

https://tymofij.github.io/filter-breaks-xml-load-event/example.fb2.xml (matched) - window.load does not fire

Code and examples attached, also available on github
https://github.com/tymofij/filter-breaks-xml-load-event

https://tymofij.github.io/filter-breaks-xml-load-event/example.fb2.html (matched) - window.load fires

Expected results:
window.load fires in all cases.
Assignee: nobody → kmaglione+bmo
Priority: -- → P5
Please notice that this bug prevents NoScript from using webRequest.filterResponseData() and then filter.ondata as the only work-around I could find to dynamically inject scripts with tabs.executeScript() and make them reliably run on document start on a specific page: any other timing, either in webRequest or in webNavigation, results in the scripts running either to early (before the document is replaced) or too late (when some page scripts have already been executed, even though the DOM is not complete yet), no matter what the runAt property asks for.
Therefore, in order not to break XML documents and, most notably, RSS feeds (onload not firing makes Firefox itself to fail rendering its feed reader UI), I'm currently forced into using contentScript.register which, for single pages identified in webRequest events, feels frankly overkill, clumsy and error prone (as rightly noticed by Rob Wu in his latest review).

Could this bug be re-prioritized and/or tabs.executeScript() be fixed to better honor runAt: "document_start"?
The load event does fire, but too early, as the modified test case shows:

Edit the test case and replace `console.log(window)` with:

console.log(document.readyState);
console.log(performance.timing.loadEventEnd);

In the working cases (HTML and unfiltered XML), the result is "loading" and "0".
In the broken case (filtered XML), the result is "complete" and a number, e.g. "1521888179666".
(FYI: The DOMContentLoaded does fire as expected in all cases, i.e. when the parser completes.)


I discovered why this happens (jump to the end for the conclusion and work-around), as follows:

I built Firefox from source (HG revision b498494cfac55b99668607280756bee3cb60791a), put a breakpoint on nsDocLoader::doStopDocumentLoad, and checked the call stack with GDB.

The HTML and unfiltered XML cases are identical:
#0 nsDocLoader::doStopDocumentLoad(nsIRequest*, nsresult) at uriloader/base/nsDocLoader.cpp:833
#1 nsDocLoader::DocLoaderIsEmpty(bool) at uriloader/base/nsDocLoader.cpp:747
#2 nsDocLoader::DocLoaderIsEmpty(bool) at uriloader/base/nsDocLoader.cpp:632
#3 nsDocLoader::OnStopRequest(nsIRequest*, nsISupports*, nsresult) at uriloader/base/nsDocLoader.cpp:632
#4 mozilla::net::nsLoadGroup::RemoveRequest(nsIRequest*, nsISupports*, nsresult) at netwerk/base/nsLoadGroup.cpp:629
#5 nsDocument::DoUnblockOnload() at dom/base/nsDocument.cpp:8496
#6 nsDocument::UnblockOnload(bool) at dom/base/nsDocument.cpp:8418
#7 nsDocument::DispatchContentLoadedEvents() at dom/base/nsDocument.cpp:5427

The filtered XML case looks different:
#0 nsDocLoader::doStopDocumentLoad(nsIRequest*, nsresult) at uriloader/base/nsDocLoader.cpp:833
#1 nsDocLoader::DocLoaderIsEmpty(bool) at uriloader/base/nsDocLoader.cpp:747
#2 nsDocLoader::DocLoaderIsEmpty(bool) at uriloader/base/nsDocLoader.cpp:632
#3 nsDocLoader::OnStopRequest(nsIRequest*, nsISupports*, nsresult) at uriloader/base/nsDocLoader.cpp:632
#4 mozilla::net::nsLoadGroup::RemoveRequest(nsIRequest*, nsISupports*, nsresult) at netwerk/base/nsLoadGroup.cpp:629
#5 mozilla::net::HttpChannelChild::DoOnStopRequest(nsIRequest*, nsresult, nsISupports*) at netwerk/protocol/http/HttpChannelChild.cpp:1307
#6 mozilla::net::HttpChannelChild::OnStopRequest(nsresult const&, mozilla::net::ResourceTimingStruct const&, mozilla::net::nsHttpHeaderArray const&) at netwerk/protocol/http/HttpChannelChild.cpp:1167

nsDocLoader::doStopDocumentLoad is called too early because RemoveRequest removed the last pending request, so  mLoadGroup->IsPending is false, and therefore nsDocLoader::IsBusy() is false:
https://searchfox.org/mozilla-central/rev/56274d0a016a6833e5da7770ce70580b6f5abb21/netwerk/protocol/http/HttpChannelChild.cpp#1298-1316
1298     mListener->OnStopRequest(aRequest, aContext, mStatus);
      // ^ Calls mozilla::extensions::StreamFilterParent::OnStopRequest
1316     mLoadGroup->RemoveRequest(this, nullptr, mStatus);
      // ^ Oops, that was the last pending request - "load" event fires.

This does not happen for HTML, because a pending request is added during OnStartRequest:
#0 mozilla::net::nsLoadGroup::AddRequest(nsIRequest*, nsISupports*) at netwerk/base/nsLoadGroup.cpp:444
#1 nsDocument::BlockOnload() at dom/base/nsDocument.cpp:8390
#2 nsContentSink::WillBuildModelImpl() at dom/base/nsContentSink.cpp:1630
#3 nsHtml5TreeOpExecutor::WillBuildModel(nsDTDMode) at parser/html/nsHtml5TreeOpExecutor.cpp:145
#4 nsHtml5StreamParser::OnStartRequest(nsIRequest*, nsISupports*) at parser/html/nsHtml5StreamParser.cpp:947
#5 nsDocumentOpenInfo::OnStartRequest(nsIRequest*, nsISupports*) at uriloader/base/nsURILoader.cpp:324
#6 mozilla::extensions::StreamFilterParent::OnStartRequest(nsIRequest*, nsISupports*) at toolkit/components/extensions/webrequest/StreamFilterParent.cpp:419

For XML, on the other hand, a pending request is added during OnDataAvailable:
#0  0x00007fffe73c6020 in mozilla::net::nsLoadGroup::AddRequest(nsIRequest*, nsISupports*) at netwerk/base/nsLoadGroup.cpp:444
#1 nsDocument::BlockOnload() at dom/base/nsDocument.cpp:8390
#2 nsContentSink::WillBuildModelImpl() at dom/base/nsContentSink.cpp:1630
#3 nsXMLContentSink::WillBuildModel(nsDTDMode) at dom/xml/nsXMLContentSink.cpp:167
#4 nsParser::WillBuildModel(nsTString<char16_t>&) at parser/htmlparser/nsParser.cpp:463
#5 nsParser::ResumeParse(bool, bool, bool) at parser/htmlparser/nsParser.cpp:1033
#6 nsParser::ResumeParse(bool, bool, bool) at parser/htmlparser/nsParser.cpp:1031
#7 nsParser::OnDataAvailable(nsIRequest*, nsISupports*, nsIInputStream*, unsigned long, unsigned int) at parser/htmlparser/nsParser.cpp:1437
#8 mozilla::extensions::StreamFilterParent::Write(nsTArray<unsigned char>&) at toolkit/components/extensions/webrequest/StreamFilterParent.cpp:390

When webRequest.filterResponseData is used, OnDataAvailable is not called until the extension responds. Under these circumstances, the server can finish sending a response before the extension responds, and OnStopRequest is called before OnDataAvailable. Consequently the pending request count drops to 0, and the "load" event is fired too early.


A work-around for extension authors is to make sure that OnDataAvailable is triggered as soon as possible, e.g. by writing data at the filter.onstart event. An empty byte sequence is sufficient:
filter.onstart = () => {
  filter.write(new Uint8Array());
};
(In reply to Rob Wu [:robwu] from comment #2)
> For XML, on the other hand, a pending request is added during
> OnDataAvailable:

Well that's annoying. Thanks for digging into this.
(In reply to Rob Wu [:robwu] from comment #2)

> A work-around for extension authors is to make sure that OnDataAvailable is
> triggered as soon as possible, e.g. by writing data at the filter.onstart
> event. An empty byte sequence is sufficient:
> filter.onstart = () => {
>   filter.write(new Uint8Array());
> };

Thank you!
Comment on attachment 8962112 [details]
Bug 1410755: Add StreamFilterParent as a request to the channel load group.

https://reviewboard.mozilla.org/r/230956/#review236606
Attachment #8962112 - Flags: review?(mixedpuppy) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/a13dfc14b4c27834ef6fb151cfc1a1c83b9b1d48
Bug 1410755: Add StreamFilterParent as a request to the channel load group. r=mixedpuppy
Verified as fixed in latest Firefox 61. I can see that the event was fired for the XML. I will add a screenshot for the fix.
Status: RESOLVED → VERIFIED
Attached image Postfix screenshot
Product: Toolkit → WebExtensions
> A work-around for extension authors is to make sure that OnDataAvailable is
> triggered as soon as possible, e.g. by writing data at the filter.onstart
> event. An empty byte sequence is sufficient:
> filter.onstart = () => {
>   filter.write(new Uint8Array());
> };

Unfortunately the work around doesn't appear to work in ESR60, breaking XML feeds when NoScript is installed and impacting the ability of the Tor Browser of integrating it in next release.

Can the fix be backported on 60?
Status: VERIFIED → REOPENED
Resolution: FIXED → ---
Target Milestone: mozilla61 → mozilla60
Version: 58 Branch → 60 Branch
Comment on attachment 8962112 [details]
Bug 1410755: Add StreamFilterParent as a request to the channel load group.

[Approval Request Comment]
If this is not a sec:{high,crit} bug, please state case for ESR consideration: This is pretty broken for some important extensions.  As 60 esr will live for another year, I think we should consider the patch here for esr.
User impact if declined: extensions like tor/noscript/etc will not work properly, see comment 14.
Fix Landed on Version: 61
Risk to taking this patch (and alternatives if risky): I think it's low, this patch has baked for a long time on 61.
String or UUID changes made by this patch: none

See https://wiki.mozilla.org/Release_Management/ESR_Landing_Process for more info.
Attachment #8962112 - Flags: approval-mozilla-esr60?
Status: REOPENED → RESOLVED
Closed: 2 years agoLast year
Resolution: --- → FIXED
(In reply to Giorgio Maone [:mao] from comment #14)

Target Milestone tracks when a fix landed on m-c. Please don't change it because you want a fix uplifted.
Target Milestone: mozilla60 → mozilla61
Comment on attachment 8962112 [details]
Bug 1410755: Add StreamFilterParent as a request to the channel load group.

This needs a rebased patch for ESR60 uplift consideration. Please fold in the follow-up fixes that landed too.
Flags: needinfo?(kmaglione+bmo)
Attachment #8962112 - Flags: approval-mozilla-esr60? → approval-mozilla-esr60-
Shane, can you please help with the rebase?
Flags: needinfo?(mixedpuppy)
I started to look at doing so, then realized that this was on top of some changes in bug 1444680, specifically:

https://hg.mozilla.org/mozilla-central/rev/09006f4350b6
https://hg.mozilla.org/mozilla-central/rev/dfdfc8e56c31

And that we probably should also add Bug 1474296

https://hg.mozilla.org/mozilla-central/rev/4bf70a81f42f

It would be great if Kris would chime in here, but otherwise I think either we live with esr60 as-is (not great) or uplift the additional fixes here.
Flags: needinfo?(mixedpuppy) → needinfo?(ryanvm)
If we can roll in just the changes from those 3 commits and still have everything build and pass tests, that doesn't sound *too* terrible? But yeah, I'd love to see a risk analysis for that.
Flags: needinfo?(ryanvm)
I don't see any other necessary changes.
(In reply to Shane Caraveo (:mixedpuppy) from comment #21)
> I don't see any other necessary changes.

I should have added, those fixes for a race condition and a crash.  The changes are pretty minimal and have been running on nightly.

I have the patches ready with some minor changes in test files, I'll pushed those to try to see how it goes.  

https://hg.mozilla.org/try/pushloghtml?changeset=ad9783f21f3dd0ed046c01c64dd7010b4c10a951
Hey Shane, any updates on this bug?
Flags: needinfo?(kmaglione+bmo) → needinfo?(mixedpuppy)
leaving as-is per a chat with kris a while back.
Flags: needinfo?(mixedpuppy)

This bug is not really fixed!

let txt_html_doc = `
<html>
<head>
<script>
  window.onload = function() { document.body.innerText = "replacement content"; };
</script>
</head>
<body>
</body>
</html>
`

function listener(details) {
  let filter = browser.webRequest.filterResponseData(details.requestId);

  filter.onstart = event => {
    console.log("started");
    let encoder = new TextEncoder();
    filter.write(encoder.encode(txt_html_doc));
    filter.close();
  }
}

browser.webRequest.onBeforeRequest.addListener(
  listener,
  {urls: ["https://example.com/"], types: ["main_frame"]},
  ["blocking"]
);

The window.onload handler will sometimes not be called if you reload or load the page for some time. You will see that the "replacement content" text will not be written to the page!

Flags: needinfo?(rob)

My Firefox version: 69.0a1 (2019-06-22) (64-Bit)

The page from https://example.com/ is a html document.

In the future, could you open a new bug instead of commenting on an old bug? This bug is about XML documents, whereas your bug is clearly not about a XML document.

I can't immediately find the cause of your bug, especially because it is not consistently reproducible. I posted the little parts I knew in bug 1561604. If you have any more information, please add it to that bug. Debugging would be much easier if there is a solid test case to trigger the issue.

Flags: needinfo?(rob)
See Also: → 1561604
You need to log in before you can comment on or make changes to this bug.