Closed Bug 1949623 Opened 1 year ago Closed 5 months ago

declarativeNetRequest shouldn't be affected by in-memory cache

Categories

(WebExtensions :: Request Handling, task, P2)

task

Tracking

(firefox149 fixed)

RESOLVED FIXED
149 Branch
Tracking Status
firefox149 --- fixed

People

(Reporter: arai, Assigned: arai)

References

(Blocks 1 open bug)

Details

Attachments

(2 files)

declarativeNetRequest uses WebRequest.sys.mjs as the underlying implementation, and it requires the actual network activity.
This means the rule is not applied when in-memory cache is used, for stylesheets and images, and also for scripts once bug 1670403 is fixed.

https://searchfox.org/mozilla-central/rev/ae4d3bc2af37eef804113b621455883a92a29e9c/toolkit/components/extensions/webrequest/WebRequest.sys.mjs#756-773

observe(subject, topic, data) {
  let channel = this.getWrapper(subject);
  switch (topic) {
    case "http-on-modify-request":
      this.runChannelListener(channel, "onBeforeRequest");
      break;
    case "http-on-before-connect":
      this.runChannelListener(channel, "onBeforeSendHeaders");
      break;
    case "http-on-examine-cached-response":
    case "http-on-examine-merged-response":
      channel.fromCache = true;
    // falls through
    case "http-on-examine-response":
      this.examine(channel, topic, data);
      break;
  }
},

webRequest API is also affected by the cache, but webRequest doesn't guarantee the events on cached case (https://developer.chrome.com/docs/extensions/reference/api/webRequest#caching), and Chrome also doesn't trigger the events for cached case, and also there's explicit API for clearing the cache webRequest.handlerBehaviorChanged (bug 1657575).
So, for webRequest API, extensions are responsible for clearing cache with webRequest.handlerBehaviorChanged when they add a new rule.

But Chrome's declarativeNetRequest doesn't seem to be affected by cache (cached response is not used when a new redirect rule is added).
This means the extensions aren't responsible for clearing cache for declarativeNetRequest API, and browser is responsible for making it work automatically.

Possible option would be to clear the in-memory cache (stylesheets, scripts, and images) when the rule is added or modified.

This is the DNR equivalent of bug 1657575.

Assignee: nobody → arai.unmht
Severity: -- → N/A
Status: NEW → ASSIGNED
Priority: -- → P2
See Also: → 1657575
Attachment #9467690 - Attachment description: WIP: Bug 1949623 - Clear the in-memory resource cache when modifying the declarativeNetRequest rules. r?robwu! → Bug 1949623 - Clear the in-memory resource cache when modifying the declarativeNetRequest rules. r?robwu!

Upon seeing the patch, I wonder whether we should really do this. The proposed patch proposes to unconditionally wipe the cache whenever any DNR rule changes. This is a very big hammer, and does not have the rate-limiting that was added in bug 1657575. Even without an early fix now, extensions have the work-around of calling the webRequest.handlerBehaviorChanged method if they want the same outcome.

The declarativeNetRequest API allows extensions to specify how network requests should be handled. By design, this is declarative, and extensions do not get (event) notifications for triggered requests. This does not mean that the DNR rules are unobservable, because obviously it is possible for an extension to see the effect of (not) modifying network requests.

Unlike webRequest (through webRequest.filterResponseData), DNR does not allow modification of response bodies. Its only request-modifying actions are to block a request, redirect a request, modify a response header, modify a request header. Are these significant enough to warrant wiping all memory caches?

Let's take same concrete examples and see what options are available:

  • Extensions that block/redirect requests to prevent servers from detecting the client (e.g. Decentraleyes and LocalCDN). Serving from cache is acceptable.
  • Ad blocker wants to block a request, to block ad resources or scripts. Ad/content blocking is only guaranteed to be effective after a page reload.
  • Content blocker wants to replace (redirect) a script request, to change its functionality. E.g. Firefox's Smart Blocking feature of Enhanced Tracking Protection redirects script requests to a substitute that offers a minimal API to unbreak pages that expect an API. (this currently uses webRequest by the way.)

With the above scenarios, how likely would it be for users to negatively be affected by the in-memory cache? I was hoping to say "very unlikely", but I'm not sure if I can say that.

If we wanted to fix this properly without dropping the benefits of the cache, a potential alternative may be run DNR rule evaluation for each cached resource and see if there is a change in the rule action. And if so, to then wipe individual entries (maybe related to bug 1946348 and bug 1948288).

Another option could be to mark resources as potentially dirty and revalidate what should happen before the cached entry can be used.

Flags: needinfo?(arai.unmht)

Thank you for the thorough review on the approach!

(In reply to Rob Wu [:robwu] from comment #3)

Upon seeing the patch, I wonder whether we should really do this. The proposed patch proposes to unconditionally wipe the cache whenever any DNR rule changes. This is a very big hammer, and does not have the rate-limiting that was added in bug 1657575.

Yes, not having the rate-limit can be problematic.
This patch implements delayed-batching with 10ms, but it still allows 3000x calls than webRequest.handlerBehaviorChanged's limit (which is 20 calls per 10 minutes).

Even without an early fix now, extensions have the work-around of calling the webRequest.handlerBehaviorChanged method if they want the same outcome.

The goal here was to have a parity with Chrome, where Chrome's DNR doesn't seem to require manually clearing the cache.
So, I'm not sure having webRequest.handlerBehaviorChanged validates the effect of cache,
and having the difference here might introduce the future interoperability issue.

See the DNR sections in https://docs.google.com/document/d/1f7-eOtNAi5bBBKKCJYJwwcv3ROT05u-KaQOU83RAVlg/edit?tab=t.0#heading=h.i1mj8v6jgeb6 for more details and the experiment results.
Given the document was initially targeting the webRequest API, we might need to investigate some more on the Chrome's behavior tho...

Unlike webRequest (through webRequest.filterResponseData), DNR does not allow modification of response bodies. Its only request-modifying actions are to block a request, redirect a request, modify a response header, modify a request header. Are these significant enough to warrant wiping all memory caches?

At least the redirecting case and the blocking case are affected (see below).
I'm not sure about how critical the modifying the request/response headers case are, but if modifying the request header is supposed to result in receiving a different content (such as, the server returns different content based on User-Agent or something), then it's also affected.

The problem here is the case where cache is created before the rule is added, or modified to different one.

For redirecting case, the scenario is the following:

  1. An user visits the page that contains a reference to a file A
  2. The user enables a rule that redirects the file A to file B
  3. The user reloads the page

Here, the file A is stored into the in-memory cache at the step 1, and if the cache is not evicted at the step 2, the reload at the step 3 will use the cache, and the redirect doesn't happen, and the user will keep seeing the page with the file A at the step 3.

Possible workaround on the user's side is to either perform the force-reload, or clear the cache manually, or restart the browser.
The force-reload bypasses the in-memory cache as well and the in-memory cache will be overridden with the redirect (or evicted if the redirect response is not cacheable).

Same thing happens for blocking scenario.
If the file is stored into the in-memory cache before the rule is added, the subsequent load for the same resource won't be blocked unless either the cache is explicitly evicted, or the cache expires.

So,

  • If people frequently enable/disable rules, it's more likely they hit the issue where the rule is not reflected to the subsequent normal reload
  • If people are familiar with force-reload, this might be smaller problem

Let's take same concrete examples and see what options are available:

  • Extensions that block/redirect requests to prevent servers from detecting the client (e.g. Decentraleyes and LocalCDN). Serving from cache is acceptable.

So, the goal of those extensions is to avoid the actual network request,
and it's not targeting the execution of the response, right?

In that case, yes, this case is not affected.
Serving the file from cache won't go the actual network request,
and if the server response is marked not-cacheable to make the subsequent request reach the network, then the cache is not used at all.

  • Ad blocker wants to block a request, to block ad resources or scripts. Ad/content blocking is only guaranteed to be effective after a page reload.

This is problematic, depends on how the Ad blocker works.

For images, and if the ad blocker just blocks the request, without modifying the DOM tree or styles to actually hide the elements, then the cached ad resources will be used in the subsequent normal reloads.
Those ad resources will be evicted from the cache on the subsequent force-reload, or when the user restarts the browser, or when the content process shuts down.
And this will look like the ad blocker doesn't work immediately after you add the rule.

For images, and if the ad blocker also modifies the DOM tree and removes/hides the element, then the negative effect is almost negligible, except for that the resource might be visible until the DOM tree modification is applied while the page load.

For scripts, the script that modifies the document will keep running until the cache is evicted. Of course the ad blocker may block other resources that's used by the script, so the script may work differently, such as the script inserts ad image, but the ad image isn't loaded, but it will look like the ad blocker is partially broken.

  • Content blocker wants to replace (redirect) a script request, to change its functionality. E.g. Firefox's Smart Blocking feature of Enhanced Tracking Protection redirects script requests to a substitute that offers a minimal API to unbreak pages that expect an API. (this currently uses webRequest by the way.)

Isn't it same as the bug 1948288's feature ?

If it's something different, then it is also affected.
Until the cache entry is evicted, or until the restart, the cached original script will be used in subsequent loads.
So, this will look like the on/off switch doesn't work as expected.

If it's the same thing as the bug 1948288's feature, then that part is fixed by the patch there, but it's also clearing the entire in-memory cache on modification.
This is done via experimental API, and it's not exposed to other extensions, it's not possible to abuse the API tho, still this may regress the experience if the rule is modified frequently.

If we wanted to fix this properly without dropping the benefits of the cache, a potential alternative may be run DNR rule evaluation for each cached resource and see if there is a change in the rule action. And if so, to then wipe individual entries (maybe related to bug 1946348 and bug 1948288).

There's a prototype for the API that receives the URL pattern and evict only the matching entries, in https://phabricator.services.mozilla.com/D238358 .
So, we could look into reviving the patch (rebasing onto ChromeUtils.clearResourceCache API), and use it instead.
(still we might need to look into rate-limiting separately...)

The concern with the URL pattern approach was how effective it is, compared to the complexity introduced by the above patch.

At first, the patch was created assuming it will be used for webRequest API (I hadn't noticed the existence of webRequest.handlerBehaviorChanged at that point and the no-guarantee on the cached case), and at least for webRequest APIs, the extensions can just add a rule that matches all URLs, and that will effectively clear all caches, with extra cost of the pattern matching.

But now that webRequest API's case doesn't need it.
And for DNR, the situation might be better, given it's rule-based and having a pattern that matches all URLs won't make much sense?

It would be nice to re-evaluate this. I'll look into it.

Another option could be to mark resources as potentially dirty and revalidate what should happen before the cached entry can be used.

Unfortunately, there's no "dirty" state currently.

Then, there's a plan to implement the cache revalidation after the in-memory cache expires (so, when it exceeds the Cache-Control header).
Currently, when the cache entry expires, the cache entry is effectively removed from the cache, and the subsequent load requires IPC to the necko, and communication with the remote server, and sending the file over the network and IPC again, and recompile it.
Here, it's possible that the subsequent load results in the same file, and in that case the IPC and recompile is basically unnecessary, and we can instead revive the cache.

So, once the above is implemented, we could have similar "dirty" state, and use it instead of clearing, that way at least the IPC cost and recompile cost is reduced even if we mark all entries "dirty", but still it requires IPC tho.

Flags: needinfo?(arai.unmht)

I was about to apply the "URL pattern" way, and just realized that the "condition" used in DNR is completely different than the webRequest's url pattern matching, and it has way more complex structure.
https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/API/declarativeNetRequest/RuleCondition

I'm not sure if replicating the same logic inside the JS/CSS/image cache handling (which is C++) and applying it for each cached item is reasonable approach.
At least the urlFilter and regexFilter handling needs to be implemented in C++, in performant way.

So, if we can go with different way, that would be better.

I have some questions regarding this, and it would be nice if we have some data that answers them:

  • how often the DNR rule is modified in regular browsing scenario?
  • is the rule modification triggered by user's explicit action for the rule modification itself? or are there any case the rule is modified on the other events, such as simple navigation?

If the rule is modified by the user's explicit interaction on the rule list, and if the interaction doesn't happen frequently, then I think the cost of clearing the all in-memory cache is not a critical problem.

robwu, can I have your input here?

Flags: needinfo?(rob)

(In reply to Tooru Fujisawa [:arai] from comment #4)

  • Content blocker wants to replace (redirect) a script request, to change its functionality. E.g. Firefox's Smart Blocking feature of Enhanced Tracking Protection redirects script requests to a substitute that offers a minimal API to unbreak pages that expect an API. (this currently uses webRequest by the way.)

Isn't it same as the bug 1948288's feature ?

The example, yes. I was describing a general class of use cases, with a known example. Having a specific work-around inside that built-in extension does not apply broadly.

(In reply to Tooru Fujisawa [:arai] from comment #5)

I was about to apply the "URL pattern" way, and just realized that the "condition" used in DNR is completely different than the webRequest's url pattern matching, and it has way more complex structure.
https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/API/declarativeNetRequest/RuleCondition

I'm not sure if replicating the same logic inside the JS/CSS/image cache handling (which is C++) and applying it for each cached item is reasonable approach.
At least the urlFilter and regexFilter handling needs to be implemented in C++, in performant way.

So, if we can go with different way, that would be better.

I never got to fully optimizing it, it is tracked by bug 1745768 and bug 1745769.

I think that the most viable approach would be the dirty approach (end of comment 4).

(In reply to Tooru Fujisawa [:arai] from comment #6)

I have some questions regarding this, and it would be nice if we have some data that answers them:

  • how often the DNR rule is modified in regular browsing scenario?

Among the public extensions listed on AMO in January, 366 extensions use the DNR API, of which 152 use static rules. When an extension with static rules are updated, all DNR rules are replaced in bulk.

There are extensions that register a temporary rule with declarativeNetRequest.updateSessionRules with the objective of modifying the behavior of one request. We should ideally not trash the whole cache for one request.

  • is the rule modification triggered by user's explicit action for the rule modification itself? or are there any case the rule is modified on the other events, such as simple navigation?

It varies. Some extensions declare tab-specific rules in response to user feedback. Rules can be based on static data (e.g. content blocker filters), dynamic data and/or user-defined data. The extension API has no way to tell the difference.

If the rule is modified by the user's explicit interaction on the rule list, and if the interaction doesn't happen frequently, then I think the cost of clearing the all in-memory cache is not a critical problem.

We cannot assume that the rule modifications are user-triggered or infrequently.

Flags: needinfo?(rob)

Thank you for the detailed answer!
The session rules part is really helpful, and yes, that sounds like the rule is modified more frequently than I expected.

Okay, I'll look into other options than just clearing the entire in-memory cache.
If the "dirty cache" way turns out to be the best option, then this bug is going to be postponed until the underlying necko functionality gets implemented.

Then, I have one more questions, in order to verify my understanding around the Chrome's DNR behavior and the expectation for the DNR API.
(I'll also do some more experiments with the Chrome's behavior)
Inside the extensions that uses DNR, do these extensions call webRequest.handlerBehaviorChanged or browsingData.remove* API, or something that touches the cache, after modifying the rule? Or do they expect the rule being applied without any other API call?

Flags: needinfo?(rob)

(In reply to Tooru Fujisawa [:arai] from comment #8)

Then, I have one more questions, in order to verify my understanding around the Chrome's DNR behavior and the expectation for the DNR API.
(I'll also do some more experiments with the Chrome's behavior)
Inside the extensions that uses DNR, do these extensions call webRequest.handlerBehaviorChanged or browsingData.remove* API, or something that touches the cache, after modifying the rule? Or do they expect the rule being applied without any other API call?

I spot-checked two extensions in Chrome:

... and they both call handlerBehaviorChanged. These are developed by experienced extension developers. I would expect most DNR extensions to be unaware of the cache interactions, especially because it is not documented anywhere in Chrome's declarativeNetRequest documentation.

Flags: needinfo?(rob)

Thank you again for investigating!

That sounds like there's a chance that there's some case where DNR is affected by in-memory cache.
I'll investigate the Chrome's behavior for each API usage and scenario.
If it turns out it's affected by cache, then we might not have to handle the cache for the rule modification.

Updated the testcases(6, 7, 8) in https://docs.google.com/document/d/1f7-eOtNAi5bBBKKCJYJwwcv3ROT05u-KaQOU83RAVlg/edit?tab=t.0

So, there's indeed a case where the cache affects DNR.

  • modifyHeaders action is affected by the cache, and the rule is not reflected until force-reload, both for addition and removal
  • block action is not affected by the cache, and the addition/removal are immediately reflected to the subsequent normal reload
  • redirect action is also not affected by the cache

So, still, the likely-common cases ("block" and "redirect" actions) need some automatic handling for the cache. otherwise we might hit some compat issues.
If requiring the handlerBehaviorChanged call is acceptable (so, even if it works on Chrome, porting the extension to Firefox requires adding the explicit call on handlerBehaviorChanged), then we can go with the same way as webRequest API.

I'll revisit this after bug 1879123.

Depends on: 1879123
Depends on: 1998925
No longer depends on: 1879123
Attachment #9467690 - Attachment description: Bug 1949623 - Clear the in-memory resource cache when modifying the declarativeNetRequest rules. r?robwu! → WIP: Bug 1949623 - Invalidate the in-memory resource cache when modifying the declarativeNetRequest rules. r?robwu!

The updated patch uses the dirty flag.
Currently it's implemented only in JS cache.
So the testcase is partially disabled for css and images, but this is pre-existing behavior and nothing changed around the patch stack.

If the cached css and images are causing similar trouble, then we can extend the dirty flag also to css (so, move it to SharedSubResourceCache), and also to image somehow.

emilio, I want to have your opinion about stylesheet cache here.

This bug is the WebExtensions declarativeNetRequest (DNR) version of bug 1657575,
where WebExtensions can modify the requests so that the in-memory cache items become outdated.
With DNR, it happens in more implicit way, where WebExtensions just modifies the rule, and the rules are applied for each request on the necko side.
And also DNR API doesn't have an explicit API for clearing the cache, like webRequest.handlerBehaviorChanged.

Currently, all in-memory cached resources are affected by this issue,
including stylesheets, images, and also JavsScript if the navigation cache is enabled.

As a solution for this on JS, bug 1998925 implements a "dirty" flag on SharedScriptCache items,
using the nsICacheInfoChannel::getCacheEntryId (with bug 1998926 fix) to distinguish the different responses for certain request.

If WebExtensions DNR modifies the rule, we mark all in-memory cached scripts "dirty" (we don't selectively mark or clear based on the rule, because the rule is complex).
When a subsequent request finds a cached script with the "dirty" flag, ScriptLoader performs a regular request to necko.
When the response is received, ScriptLoader checks if the response matches the cached (dirty) script,
by comparint the cacheEntryId between the cached script vs the response.

If those IDs match, ScriptLoader discards the response, revives the cached script, and uses the cached script for the request.
If they don't match, ScriptLoader evicts the cached script, and compiles the response and use it for the request.

This fixes the outdated cache issue, with the cost of extra IPC between necko,
while keeping the cached items as long as the cache turns out to be valid,
so there's no extra cost of compilation.
(the extra IPC cost can be reduced by bug 1879123 work, but it's future work)

Then, I wonder if we should implement similar things on stylesheets and images
(and possibly move the "dirty" flag logic to SharedSubResourceCache).
The testcase in https://phabricator.services.mozilla.com/D239146 has tests for stylesheets and images, with some part commented out due to this issue.

Is there any already-known compat issue happening on stylesheets with WebExtensions declarativeNetRequest API?
Could the above "dirty" flag and the "revive" behavior be implemented also on stylesheet caches easily?

Flags: needinfo?(emilio)

(In reply to Tooru Fujisawa [:arai] from comment #14)

Is there any already-known compat issue happening on stylesheets with WebExtensions declarativeNetRequest API?

Not afaict.

Could the above "dirty" flag and the "revive" behavior be implemented also on stylesheet caches easily?

Could? Yes. But it seems like a lot of work / special code for something that should hopefully be rare?

How likely are session rules to change? If they change rarely, I'd rather over-invalidate than maintain this extra complicated path.

Flags: needinfo?(emilio) → needinfo?(arai.unmht)

Thank you!

(In reply to Emilio Cobos Álvarez [:emilio] from comment #15)

(In reply to Tooru Fujisawa [:arai] from comment #14)

Is there any already-known compat issue happening on stylesheets with WebExtensions declarativeNetRequest API?

Not afaict.

Good to hear that it's not causing problems right now :)

Could the above "dirty" flag and the "revive" behavior be implemented also on stylesheet caches easily?

Could? Yes. But it seems like a lot of work / special code for something that should hopefully be rare?

How likely are session rules to change? If they change rarely, I'd rather over-invalidate than maintain this extra complicated path.

the comment #7 has some context around the situation and likely-ness.
I don't know the exact number of the frequency or population tho, at least it's possible that the rule is modified for each navigation.

Then, yes, clearing the entire resource is one option.
The previous version of patch does so, for all resources (JS/CSS/image).
So, if there's any issue in CSS/images and if the cost of clearing all in-memory cache is acceptable, we could apply it to CSS and images,
maybe while keeping the dirty flag approach for JS, or maybe clear the JS cache as well if we confirmed the situation doesn't happen so frequently.

Flags: needinfo?(arai.unmht)
Attachment #9467690 - Attachment description: WIP: Bug 1949623 - Invalidate the in-memory resource cache when modifying the declarativeNetRequest rules. r?robwu! → Bug 1949623 - Invalidate the in-memory resource cache when modifying the declarativeNetRequest rules. r?robwu!
Depends on: 2001368
Attachment #9467690 - Attachment description: Bug 1949623 - Invalidate the in-memory resource cache when modifying the declarativeNetRequest rules. r?robwu! → Bug 1949623 - Part 2: Invalidate the in-memory resource cache when modifying the declarativeNetRequest rules. r?robwu!
Pushed by arai_a@mac.com: https://github.com/mozilla-firefox/firefox/commit/4a424b3fd9b7 https://hg.mozilla.org/integration/autoland/rev/dac72462baf5 Part 1: Add the reload method to xpcshell ContentPage. r=robwu https://github.com/mozilla-firefox/firefox/commit/28ec8145a7be https://hg.mozilla.org/integration/autoland/rev/36f4876db153 Part 2: Invalidate the in-memory resource cache when modifying the declarativeNetRequest rules. r=robwu
Status: ASSIGNED → RESOLVED
Closed: 5 months ago
Resolution: --- → FIXED
Target Milestone: --- → 149 Branch
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: